Advanced Postgres RLS: Dynamic Data Masking in Multi-Tenant SaaS

18 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Application-Layer Security Fallacy in Multi-Tenant Architectures

In any non-trivial multi-tenant SaaS application, the problem of data segregation and access control is paramount. A common approach is to bake this logic directly into the application layer. You might have services that check a user's role and tenant ID from a JWT, then manually add WHERE tenant_id = ? clauses to every single database query. For data masking, you might fetch the full record and then selectively nullify or transform sensitive fields before serializing the API response.

This approach, while functional, is fundamentally flawed for several reasons:

  • Brittleness: A single forgotten WHERE clause in a complex JOIN can lead to a catastrophic cross-tenant data leak.
  • Code Duplication: The same authorization and masking logic is repeated across multiple services, endpoints, and background jobs, violating the DRY principle.
  • Lack of Centralization: Security policy is scattered throughout the codebase, making it impossible to audit or update consistently.
  • Performance Inefficiency: You often fetch sensitive data from the database only to discard or mask it in the application, wasting network I/O and CPU cycles.
  • For senior engineers, the goal is to build systems that are secure by default. This means pushing security enforcement to the lowest possible, most centralized layer: the database itself. PostgreSQL's Row-Level Security (RLS) is the canonical tool for this, but implementing a sophisticated strategy that goes beyond simple row filtering requires a deeper understanding of its mechanics, especially when dynamic data masking is a requirement.

    This post will dissect a production-grade pattern for implementing both strict tenant isolation and dynamic, role-based data masking using RLS, session context, and SECURITY BARRIER views. We're not talking about basic USING (tenant_id = current_tenant_id()). We're implementing a system where the same query (SELECT FROM users) returns different results—and even different data within the same column*—based on the application-level role of the user executing the query.


    The Scenario: A Multi-Tenant CRM

    To ground our discussion, let's model a simple multi-tenant CRM. We have tenants (companies), users within those tenants, and internal support staff who need limited, PII-masked access for troubleshooting.

    Our Access Control Requirements:

  • Strict Tenancy: A user must never see data from another tenant.
  • Role-Based Row Visibility:
  • * A tenant_user can only see their own user record.

    * A tenant_admin can see all user records within their tenant.

  • Dynamic PII Masking:
  • * tenant_admin and tenant_user roles can see sensitive PII (e.g., Social Security Number) unmasked.

    An internal_support role can see records for any tenant (for troubleshooting), but all PII fields must* be masked.

    Here's our foundational schema:

    sql
    -- Enable the pgcrypto extension for UUIDs
    CREATE EXTENSION IF NOT EXISTS "pgcrypto";
    
    -- Tenants table
    CREATE TABLE tenants (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        name TEXT NOT NULL
    );
    
    -- Application user roles
    CREATE TYPE app_role AS ENUM ('tenant_user', 'tenant_admin', 'internal_support');
    
    -- Users table with sensitive PII
    CREATE TABLE users (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
        email TEXT NOT NULL,
        full_name TEXT NOT NULL,
        role app_role NOT NULL,
        ssn TEXT, -- Our sensitive PII column
        UNIQUE(tenant_id, email)
    );
    
    -- Index for efficient RLS lookups
    CREATE INDEX idx_users_tenant_id ON users(tenant_id);
    
    -- Populate with sample data
    INSERT INTO tenants (id, name) VALUES
        ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 'Innovate Inc.'),
        ('b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'Synergy Corp');
    
    INSERT INTO users (tenant_id, email, full_name, role, ssn) VALUES
        -- Innovate Inc. users
        ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', '[email protected]', 'Alice Admin', 'tenant_admin', '111-00-1111'),
        ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', '[email protected]', 'Bob User', 'tenant_user', '222-00-2222'),
    
        -- Synergy Corp users
        ('b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', '[email protected]', 'Charles Admin', 'tenant_admin', '333-00-3333'),
        ('b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', '[email protected]', 'Diana User', 'tenant_user', '444-00-4444');

    The Crucial Link: Passing Application Context to Postgres

    RLS policies are just SQL conditions. For them to be dynamic, they need context about the current application user. How do we securely pass user_id, tenant_id, and role from our application to a Postgres session?

    The most robust and secure method is to use session-level configuration parameters. These are key-value pairs that are namespaced, exist only for the duration of a session, and can be set at the beginning of a transaction.

    We'll use the current_setting(setting_name, missing_ok) function. Let's define our context variables:

    * rls.user_id: The UUID of the application user making the request.

    * rls.tenant_id: The UUID of the user's tenant.

    * rls.role: The application role (tenant_user, tenant_admin, etc.).

    From the application (e.g., a Node.js Express middleware using the pg library), the flow for every request looks like this:

    javascript
    // Example application middleware
    async function setRlsContext(req, res, next) {
        const { userId, tenantId, role } = req.user; // From a JWT or session
        const client = await pool.connect();
        req.dbClient = client; // Attach client to request object
    
        try {
            // IMPORTANT: Use SET LOCAL to scope the settings to the current transaction
            await client.query('BEGIN');
            await client.query(`SET LOCAL rls.user_id = '${userId}';`);
            await client.query(`SET LOCAL rls.tenant_id = '${tenantId}';`);
            await client.query(`SET LOCAL rls.role = '${role}';`);
            next();
        } catch (err) {
            await client.query('ROLLBACK');
            client.release();
            next(err);
        }
    }
    
    // In your route handler, you'd commit or rollback and then release the client
    app.get('/users', setRlsContext, async (req, res) => {
        try {
            const { rows } = await req.dbClient.query('SELECT * FROM users;');
            await req.dbClient.query('COMMIT');
            res.json(rows);
        } catch (err) {
            await req.dbClient.query('ROLLBACK');
            // error handling...
        } finally {
            req.dbClient.release();
        }
    });

    Why SET LOCAL? Using SET LOCAL ensures these settings are automatically reverted at the end of the transaction (COMMIT or ROLLBACK). This is critical in applications with connection pooling. If you used SET (session-level), a setting could leak from one user's request to another if the same pooled connection is reused, creating a massive security hole.

    Now, let's create some helper functions in SQL to safely access these settings. This encapsulates logic and provides type safety.

    sql
    -- Helper function to get the current tenant_id, returns NULL if not set
    CREATE OR REPLACE FUNCTION get_current_tenant_id() RETURNS UUID AS $$
    BEGIN
        RETURN current_setting('rls.tenant_id', true)::UUID;
    EXCEPTION
        WHEN OTHERS THEN
            RETURN NULL;
    END;
    $$ LANGUAGE plpgsql STABLE;
    
    -- Helper function to get the current user_id
    CREATE OR REPLACE FUNCTION get_current_user_id() RETURNS UUID AS $$
    BEGIN
        RETURN current_setting('rls.user_id', true)::UUID;
    EXCEPTION
        WHEN OTHERS THEN
            RETURN NULL;
    END;
    $$ LANGUAGE plpgsql STABLE;
    
    -- Helper function to get the current user's role
    CREATE OR REPLACE FUNCTION get_current_user_role() RETURNS app_role AS $$
    BEGIN
        RETURN current_setting('rls.role', true)::app_role;
    EXCEPTION
        WHEN OTHERS THEN
            RETURN NULL;
    END;
    $$ LANGUAGE plpgsql STABLE;

    Marking these functions as STABLE is a performance hint to Postgres, indicating that they return the same result for the same arguments within a single statement, allowing the planner to optimize calls.

    Implementing the Multi-Layered RLS Policy

    With our context mechanism in place, we can now define the RLS policy on the users table. First, we must enable RLS on the table.

    sql
    ALTER TABLE users ENABLE ROW LEVEL SECURITY;
    -- IMPORTANT: By default, there is a DENY ALL policy. No rows will be returned
    -- until we create a permissive policy.

    Now, let's build our single, comprehensive policy. It's often better to have one policy that handles all roles rather than multiple conflicting policies, as it makes the logic easier to reason about.

    sql
    CREATE POLICY users_access_policy ON users
    FOR ALL -- Applies to SELECT, INSERT, UPDATE, DELETE
    USING (
        -- Rule 1: Internal support can access any tenant
        (get_current_user_role() = 'internal_support')
        OR
        -- Rule 2: All other roles are strictly scoped to their own tenant
        (tenant_id = get_current_tenant_id() AND (
            -- Rule 2a: Admins can see all users in their tenant
            get_current_user_role() = 'tenant_admin'
            OR
            -- Rule 2b: Regular users can only see their own record
            (get_current_user_role() = 'tenant_user' AND id = get_current_user_id())
        ))
    )
    WITH CHECK (
        -- For INSERT/UPDATE, ensure users can only modify data within their own tenant
        tenant_id = get_current_tenant_id()
    );

    Let's break down the USING clause, which governs row visibility for SELECT:

    * The top-level OR separates the logic for our privileged internal_support role from the tenant-scoped roles.

    * If the user is internal_support, the policy returns true, and the tenant_id check is bypassed.

    If not, the user must* match the tenant_id. Within that tenant, a nested check determines if they are an admin (can see all rows in the tenant) or a regular user (can only see their own row, matching on id).

    The WITH CHECK clause applies to INSERT and UPDATE commands, preventing a user from, for example, inserting a new user into a different tenant. It reuses the tenancy isolation part of the USING clause.

    This policy correctly handles row visibility. But it does nothing for data masking. An internal_support user can currently see the raw ssn column, violating our requirements.

    The `SECURITY BARRIER` View: Preventing Side-Channel Attacks

    Our first instinct for masking might be to create a view that uses a CASE statement:

    sql
    -- NAIVE, INSECURE APPROACH - DO NOT USE
    CREATE VIEW insecure_users_view AS
    SELECT
        id, tenant_id, email, full_name, role,
        CASE
            WHEN get_current_user_role() = 'internal_support' THEN '***-**-****'
            ELSE ssn
        END AS ssn
    FROM users;

    Now, if an attacker with the internal_support role queries this view like this:

    SELECT * FROM insecure_users_view WHERE ssn LIKE '111%';

    You might think this is safe. The CASE statement should mask the ssn before the WHERE clause is evaluated. This is dangerously wrong.

    The PostgreSQL query planner is incredibly smart. It might rewrite the query by pushing the WHERE clause predicate down into the view's underlying table scan before the masking CASE statement is applied. The effective query could become:

    SELECT ... FROM users WHERE ssn LIKE '111%';

    This leak happens because the planner sees an opportunity to filter rows early for performance. An attacker could use this to exfiltrate the sensitive ssn data character by character via a timing or boolean-based side-channel attack.

    This is where the SECURITY BARRIER attribute comes in. When applied to a view (or a function), it creates a firewall for the query planner.

    A view with SECURITY BARRIER prevents operators, functions, and predicates from the user's query from being pushed down into the view's definition. They are only evaluated after the view's rows (including the RLS policies on the underlying table) have been produced.

    Let's create the secure view:

    sql
    -- Create a simple masking function for reusability
    CREATE OR REPLACE FUNCTION mask_pii(data TEXT) RETURNS TEXT AS $$
    BEGIN
        RETURN '***-**-' || substr(data, 8, 4);
    END;
    $$ LANGUAGE plpgsql IMMUTABLE;
    
    -- The SECURE implementation
    CREATE VIEW secure_users WITH (security_barrier = true) AS
    SELECT
        id,
        tenant_id,
        email,
        full_name,
        role,
        CASE
            WHEN get_current_user_role() = 'internal_support' THEN mask_pii(ssn)
            ELSE ssn
        END AS ssn
    FROM users;
    
    -- Grant access to the view, not the underlying table
    -- We'll need a role for our application to use
    CREATE ROLE app_user;
    GRANT SELECT, INSERT, UPDATE, DELETE ON secure_users TO app_user;
    -- IMPORTANT: Do NOT grant direct access to the `users` table!

    Now, our application will only ever query secure_users. The users_access_policy RLS on the users table is evaluated first to determine which rows are visible. Then, for those visible rows, the CASE statement in the security_barrier view safely masks the ssn column for the internal_support role. The query planner is forbidden from reordering these operations, closing the side-channel vulnerability.

    Verification: Putting It All Together

    Let's simulate API calls from three different users by setting their context and running the exact same query.

    Scenario 1: Alice, the tenant_admin from Innovate Inc.

    sql
    BEGIN;
    SET LOCAL rls.user_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'; -- This is actually Alice's user ID
    SET LOCAL rls.tenant_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
    SET LOCAL rls.role = 'tenant_admin';
    
    -- The application runs this query
    SELECT * FROM secure_users;
    
    -- Result:
    -- id                                   | tenant_id                            | email              | full_name   | role         | ssn
    -- ------------------------------------|--------------------------------------|--------------------|-------------|--------------|------------
    -- a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 | a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 | [email protected] | Alice Admin | tenant_admin | 111-00-1111
    -- a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 | a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 | [email protected]  | Bob User    | tenant_user  | 222-00-2222
    
    COMMIT;

    Alice sees both users in her tenant, with unmasked SSNs. The RLS policy worked perfectly.

    Scenario 2: Bob, the tenant_user from Innovate Inc.

    sql
    BEGIN;
    SET LOCAL rls.user_id = '...'; -- Bob's actual user ID
    SET LOCAL rls.tenant_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
    SET LOCAL rls.role = 'tenant_user';
    
    -- The application runs the exact same query
    SELECT * FROM secure_users;
    
    -- Result:
    -- id                                   | tenant_id                            | email             | full_name | role        | ssn
    -- ------------------------------------|--------------------------------------|-------------------|-----------|-------------|------------
    -- ... (Bob's user ID)                   | a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 | [email protected] | Bob User  | tenant_user | 222-00-2222
    
    COMMIT;

    Bob sees only his own record, with his SSN unmasked. The id = get_current_user_id() part of the policy worked.

    Scenario 3: Sam, the internal_support agent

    sql
    BEGIN;
    SET LOCAL rls.user_id = '...'; -- Sam's user ID (from an internal user table)
    SET LOCAL rls.tenant_id = NULL; -- Support doesn't belong to a tenant
    SET LOCAL rls.role = 'internal_support';
    
    -- The application runs the exact same query
    SELECT * FROM secure_users;
    
    -- Result:
    -- id                                   | tenant_id                            | email              | full_name     | role         | ssn
    -- ------------------------------------|--------------------------------------|--------------------|---------------|--------------|-------------
    -- ... (Alice's ID)                     | a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 | [email protected] | Alice Admin   | tenant_admin | ***-**-1111
    -- ... (Bob's ID)                       | a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11 | [email protected]  | Bob User      | tenant_user  | ***-**-2222
    -- ... (Charles' ID)                    | b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12 | [email protected]  | Charles Admin | tenant_admin | ***-**-3333
    -- ... (Diana's ID)                     | b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12 | [email protected]   | Diana User    | tenant_user  | ***-**-4444
    
    COMMIT;

    Success! Sam can see all users across all tenants, but the ssn column is dynamically masked by the security_barrier view. The application code remains blissfully unaware of this complexity; it just runs a simple SELECT.

    Performance Analysis and Optimization

    RLS is not a zero-cost abstraction. Every query against an RLS-protected table will incur the overhead of executing the policy logic. For this pattern to be production-viable, we must analyze and optimize it.

    Let's use EXPLAIN ANALYZE to see what's happening under the hood for a tenant_admin query.

    sql
    BEGIN;
    SET LOCAL rls.tenant_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
    SET LOCAL rls.role = 'tenant_admin';
    
    EXPLAIN ANALYZE SELECT * FROM secure_users;
    
    COMMIT;

    The query plan will reveal a few key things:

    text
    Seq Scan on users  (cost=0.00..35.50 rows=1 width=120) (actual time=0.030..0.032 rows=2 loops=1)
      Filter: ((tenant_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid) AND ...)
      Rows Removed by Filter: 2
    Planning Time: 0.150 ms
    Execution Time: 0.050 ms

    The key line is Filter: .... This is the RLS policy being applied. Notice that it's using our indexed tenant_id column. On a large table, this would be an Index Scan instead of a Seq Scan, which is critical for performance. Always index the columns used in your RLS policies.

    Key Performance Considerations:

  • Function Volatility: Our helper functions were marked STABLE. If they performed any table writes or depended on outside state beyond session GUCs, they would need to be VOLATILE, which severely limits the planner's ability to optimize. Keep policy logic as pure as possible.
  • Policy Complexity: Avoid complex subqueries or joins inside the RLS policy expression itself. The logic is executed for every potential row access. If you need complex logic, consider denormalizing a required attribute onto the table you're protecting.
  • The security_barrier Overhead: The security_barrier attribute forces a planning fence, which can sometimes prevent more optimal plans. The cost is usually negligible compared to the security benefit, but for extremely high-throughput OLTP workloads, you must benchmark. In our case, the alternative (insecure application-layer masking) is far more costly in terms of I/O and risk.
  • Advanced Edge Cases and Gotchas

    Superusers and BYPASSRLS: Database superusers and any role with the BYPASSRLS attribute will completely ignore all RLS policies. Your application role (app_user) must not* have this privilege. This is a common misconfiguration.

    SECURITY DEFINER Functions: If your RLS policy calls a function, that function runs with the permissions of the user calling the query. If that function needs to access a table the user doesn't have permissions for (e.g., a central authorization mapping table), you must define the function with SECURITY DEFINER. This makes it run with the permissions of the user who defined* the function. Use this with extreme caution, as it's a privilege escalation vector if not written carefully. Always use SET search_path = '' inside SECURITY DEFINER functions to prevent hijacking.

    * Views on Views: If you stack views (view_c -> view_b -> secure_users), RLS policies are checked based on the view's owner unless the view is created with security_invoker = true (PostgreSQL 15+). This is a subtle trap. The security_barrier on our base view is the most robust way to ensure the policy is applied predictably.

    * Connection Pooling and DISCARD ALL: While SET LOCAL is transaction-scoped, some aggressive connection poolers might not run a full DISCARD ALL between checkouts. It's a best practice to explicitly RESET rls.user_id; etc. at the end of a request, or ensure your pooler is configured to fully clean session state.

    Conclusion: Security by Design

    By moving multi-tenancy and data masking logic from the application to the database, we've built a system that is more secure, maintainable, and often more performant. The combination of Row-Level Security policies for row visibility and SECURITY BARRIER views for column-level masking provides a comprehensive, centralized enforcement mechanism.

    This pattern allows application developers to write simple, business-logic-focused queries (SELECT * FROM secure_users) with the confidence that the correct data segregation and masking rules are being applied automatically and securely for every user on every request. It transforms security from a repetitive, error-prone task into a foundational, verifiable property of the data layer itself.

    Found this article helpful?

    Share it with others who might benefit from it.

    More Articles