Postgres Dynamic RLS Policies for Complex Multi-Tenant SaaS

18 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Illusion of Application-Layer Tenancy

For years, the standard pattern for multi-tenant data isolation has been a tenant_id column on every relevant table and a religious adherence to adding WHERE tenant_id = :current_tenant_id to every single database query. This works, but it's brittle. A single missing WHERE clause in a complex JOIN or a forgotten check in a new API endpoint can lead to catastrophic data leakage. The security model is entirely dependent on the vigilance of every developer, on every commit, forever.

This approach places the burden of security enforcement on the application layer, which is fundamentally untrusted. The database, the true guardian of the data, is unaware of the tenancy concept; it simply executes the queries it's given. For senior engineers building mission-critical SaaS platforms, this should feel uncomfortable. We can do better by pushing the security enforcement down into the database itself.

This is where PostgreSQL's Row-Level Security (RLS) becomes a game-changer. However, most RLS tutorials stop at simple, static rules. We're going further. This post details a production-grade architecture for implementing dynamic RLS policies driven by application-level context, such as claims from a JSON Web Token (JWT). We'll build a system where the database understands not just who the user is, but their role, their organization, and their permissions within a specific request's context, making data leakage through application-layer bugs nearly impossible.


Core Architecture: The Context-Aware Transaction

The entire pattern hinges on a single, powerful concept: safely passing request-specific context from the application to the PostgreSQL session. We cannot pass this context as query parameters because RLS policies don't have access to them. Instead, we use transaction-scoped configuration variables.

The mechanism involves two key PostgreSQL functions:

  • SET LOCAL setting_name TO 'value';: This command sets a configuration parameter for the current transaction only. It's automatically reverted upon COMMIT or ROLLBACK, making it perfectly safe for use in connection-pooled environments. The LOCAL keyword is critical; without it, SET would affect the entire session, causing context to leak between requests sharing the same connection.
  • current_setting('setting_name', true): This function retrieves the value of a configuration parameter. The second argument, true, tells PostgreSQL not to throw an error if the setting is not found, returning NULL instead. This is crucial for writing robust policies.
Our application's data access layer will follow this sequence for every authenticated API request:

  1. Begin a new database transaction.
  2. Extract user claims from the validated JWT (e.g., user_id, organization_id, role).
  3. Execute SET LOCAL commands to store these claims as transaction-scoped variables (e.g., SET LOCAL app.jwt.claims.user_id = '...'). We use a namespace like app.jwt.claims to avoid conflicts with standard PostgreSQL settings.
  4. Execute all business logic queries for the request.
  5. Commit or roll back the transaction.

This creates a secure, ephemeral context for every operation, which our RLS policies can then leverage.
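The SET LOCAL step can be sketched as a small helper that prepares a parameterized set_config call (the helper name and claim shape here are illustrative, not part of any library):

```javascript
// Package the JWT claims into a single parameterized set_config call.
// Binding the JSON as $1 avoids SQL injection; the third argument of
// set_config (true) makes the setting transaction-local, matching the
// semantics of SET LOCAL.
function buildClaimsQuery(claims) {
  return {
    text: "SELECT set_config('app.jwt.claims', $1, true)",
    values: [JSON.stringify(claims)],
  };
}

// Example: the query object a pg client would receive via client.query(query)
const query = buildClaimsQuery({
  user_id: '00000000-0000-0000-0000-000000000001',
  organization_id: '00000000-0000-0000-0000-000000000002',
  role: 'MEMBER',
});
console.log(query.text); // SELECT set_config('app.jwt.claims', $1, true)
```

Because the claims travel as a bind parameter, nothing from the request body is ever concatenated into SQL.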

    A Realistic Multi-Tenant Schema

    Let's define a schema that reflects a typical SaaS application. It's more complex than a simple users and posts table, allowing us to explore realistic permission models.

    sql
    -- Custom settings live in a dotted namespace (e.g., app.settings.*), which
    -- prevents collisions with built-in Postgres settings. No registration is
    -- required, but a database-level default can be seeded like this
    -- (illustrative value; don't store real secrets in database settings):
    ALTER DATABASE my_saas_db SET app.settings.jwt_secret = 'your-super-secret-jwt-key';
    
    -- Organizations are the top-level tenants
    CREATE TABLE organizations (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        name TEXT NOT NULL,
        created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
    );
    
    -- Users can belong to multiple organizations
    CREATE TABLE users (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        email TEXT UNIQUE NOT NULL,
        created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
    );
    
    -- Junction table defining user roles within an organization
    CREATE TYPE organization_role AS ENUM ('MEMBER', 'ADMIN');
    
    CREATE TABLE organization_memberships (
        user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
        organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
        role organization_role NOT NULL DEFAULT 'MEMBER',
        PRIMARY KEY (user_id, organization_id)
    );
    
    -- Projects belong to an organization
    CREATE TABLE projects (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
        name TEXT NOT NULL,
        created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
    );
    
    -- Tasks belong to a project
    CREATE TABLE tasks (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
        title TEXT NOT NULL,
        assignee_id UUID REFERENCES users(id) ON DELETE SET NULL,
        completed BOOLEAN NOT NULL DEFAULT FALSE
    );
    
    -- Enable RLS on all tenant-aware tables
    ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
    ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;
    ALTER TABLE organization_memberships ENABLE ROW LEVEL SECURITY;

    Application Middleware: Injecting the JWT Context

    Before we write any policies, we need the application to provide the context. Here’s an example of what this looks like in a Node.js Express application using the pg library. The same principle applies to any language or framework (Go, Python/Django, Rust/Actix, etc.).

    javascript
    // middleware/db-context.js
    const { pool } = require('../db'); // Your configured pg.Pool instance
    const jwt = require('jsonwebtoken');
    
    async function setDbContext(req, res, next) {
        const client = await pool.connect();
        req.dbClient = client; // Attach client to request object
    
        try {
            await client.query('BEGIN');
    
            // Assume JWT is already verified and decoded by a previous middleware
            if (req.user) { // req.user is the decoded JWT payload
                const claims = {
                    user_id: req.user.sub,
                    organization_id: req.user.org_id,
                    role: req.user.role,
                    // Add any other relevant claims
                };
    
                // Use JSON to pass all claims in a single, namespaced setting
                // This is more scalable than setting each claim individually
                const claimsJson = JSON.stringify(claims);
                await client.query(`SELECT set_config('app.jwt.claims', $1, true)`, [claimsJson]);
    
            // For convenience, you can also set individual, frequently accessed
            // claims -- but use set_config with a bind parameter, since SET LOCAL
            // cannot take parameters and string interpolation risks SQL injection:
            // await client.query(`SELECT set_config('app.current_user_id', $1, true)`, [req.user.sub]);
            } else {
                // Handle unauthenticated requests if necessary
                await client.query(`SELECT set_config('app.jwt.claims', '{}', true)`);
            }
    
            // When the response finishes, commit or roll back, then release the
            // client back to the pool no matter what happens
            res.on('finish', async () => {
                try {
                    if (res.statusCode >= 400) {
                        await client.query('ROLLBACK');
                    } else {
                        await client.query('COMMIT');
                    }
                } finally {
                    client.release();
                }
            });
    
            next();
        } catch (err) {
            await client.query('ROLLBACK');
            client.release();
            next(err); // Pass error to Express error handler
        }
    }
    
    module.exports = setDbContext;

    This middleware ensures that every request operates within a transaction that is securely sandboxed with the user's permissions. Now, let's use this context in our database policies.

    Crafting Dynamic RLS Policies

    We'll start simple and build up to a more complex, realistic policy.

    Policy 1: Basic Organization-Level Access

    A user can see a project if they belong to the organization that owns it.

    sql
    -- Helper function to extract a text value from the JWT claims JSON
    -- Using a function is cleaner than repeating the JSON logic in every policy.
    CREATE OR REPLACE FUNCTION app_get_claim(claim_name TEXT) RETURNS TEXT AS $$
    BEGIN
        RETURN current_setting('app.jwt.claims', true)::jsonb ->> claim_name;
    END;
    $$ LANGUAGE plpgsql STABLE;
    
    -- Helper to cast claim to UUID
    CREATE OR REPLACE FUNCTION app_get_claim_uuid(claim_name TEXT) RETURNS UUID AS $$
    DECLARE
        claim_val TEXT := app_get_claim(claim_name);
    BEGIN
        IF claim_val IS NULL THEN
            RETURN NULL;
        END IF;
        RETURN claim_val::UUID;
    END;
    $$ LANGUAGE plpgsql STABLE;
    
    CREATE POLICY select_projects_for_organization_members ON projects
    FOR SELECT
    USING (
        EXISTS (
            SELECT 1
            FROM organization_memberships om
            WHERE om.organization_id = projects.organization_id
            AND om.user_id = app_get_claim_uuid('user_id')
        )
    );

    This policy is a good start. For any SELECT on projects, PostgreSQL will implicitly append a WHERE clause that checks if a record exists in organization_memberships linking the current user (from our session context) to the project's organization. An unauthenticated user, or a user from another organization, will see zero rows.
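One way to sanity-check the policy is from psql, seeding the context manually inside a transaction. Run this as a non-owner role without BYPASSRLS (table owners skip RLS by default); the UUID is a placeholder:

```sql
BEGIN;
-- Simulate what the application middleware does
SELECT set_config('app.jwt.claims',
                  '{"user_id": "00000000-0000-0000-0000-000000000001"}',
                  true);

-- Returns only projects in organizations this user belongs to
SELECT id, name FROM projects;
ROLLBACK;
```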

    However, this has a major performance problem. For every row in the projects table being considered, PostgreSQL has to run a subquery. This can be devastatingly slow on large tables.

    The `SECURITY DEFINER` Function Pattern for Performance

    RLS policies with complex subqueries or joins are a notorious performance trap. The query planner often struggles to optimize them effectively. The solution is to encapsulate the permission logic into a dedicated function. To allow this function to query tables that the calling user might not have direct access to (like organization_memberships), we must mark it as SECURITY DEFINER.

    A SECURITY DEFINER function executes with the privileges of the role that owns the function, not the role that calls it. This is a powerful tool that must be used with extreme caution to prevent privilege escalation vulnerabilities. Our use case is a textbook example of its safe and intended application.

    Let's create a centralized permission-checking function.

    sql
    CREATE OR REPLACE FUNCTION check_user_project_access(user_id_in UUID, project_id_in UUID) RETURNS BOOLEAN AS $$
    DECLARE
        v_organization_id UUID;
    BEGIN
        -- A NULL user_id means no access
        IF user_id_in IS NULL THEN
            RETURN FALSE;
        END IF;
    
        -- Get the project's organization
        SELECT organization_id INTO v_organization_id
        FROM projects
        WHERE id = project_id_in;
    
        -- A project that doesn't exist can't be accessed
        IF NOT FOUND THEN
            RETURN FALSE;
        END IF;
    
        -- Now check if the user is a member of that organization
        RETURN EXISTS (
            SELECT 1
            FROM organization_memberships om
            WHERE om.organization_id = v_organization_id
            AND om.user_id = user_id_in
        );
    END;
    $$ LANGUAGE plpgsql STABLE SECURITY DEFINER;
    
    -- IMPORTANT: Set the search_path to prevent hijacking.
    -- This ensures that when the function runs, it uses tables from the `public` schema
    -- and not from any schema a malicious user might have created.
    ALTER FUNCTION check_user_project_access(UUID, UUID) SET search_path = public;
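Since the function runs with its owner's privileges, it is also worth restricting who can call it at all. A sketch, assuming the application connects as a role named app_user:

```sql
-- Only the application role may call the permission helper
REVOKE ALL ON FUNCTION check_user_project_access(UUID, UUID) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION check_user_project_access(UUID, UUID) TO app_user;
```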

    Now we can rewrite our policy to be much cleaner and more performant:

    sql
    -- Drop the old policy first
    DROP POLICY select_projects_for_organization_members ON projects;
    
    -- Create the new, optimized policy
    CREATE POLICY select_projects ON projects
    FOR SELECT
    USING ( check_user_project_access(app_get_claim_uuid('user_id'), id) );

    Why is this faster? The planner no longer has to evaluate a correlated subplan inside the USING clause for every candidate row; it calls a single STABLE function whose internal queries are simple, indexed lookups. Furthermore, this encapsulates our business logic, making our policies declarative and easy to read.

    Expanding to Write Operations and Hierarchical Roles

    Our security model is incomplete without controlling writes (INSERT, UPDATE, DELETE) and handling more granular permissions.

    Let's introduce a new requirement: Only an ADMIN of an organization can create new projects. Regular MEMBERs can view them, but not create them.

    First, we enhance our permission-checking function:

    sql
    -- We can overload the function or add a parameter for the required permission level
    CREATE OR REPLACE FUNCTION check_user_org_permission(
        user_id_in UUID,
        organization_id_in UUID,
        required_role organization_role
    ) RETURNS BOOLEAN AS $$
    BEGIN
        IF user_id_in IS NULL OR organization_id_in IS NULL THEN
            RETURN FALSE;
        END IF;
    
        RETURN EXISTS (
            SELECT 1
            FROM organization_memberships om
            WHERE om.user_id = user_id_in
            AND om.organization_id = organization_id_in
            AND om.role >= required_role -- Enum comparison works here: ADMIN > MEMBER
        );
    END;
    $$ LANGUAGE plpgsql STABLE SECURITY DEFINER;
    
    ALTER FUNCTION check_user_org_permission(UUID, UUID, organization_role) SET search_path = public;

    Now, we can add a WITH CHECK policy for INSERT operations on the projects table.

    * The USING clause filters which existing rows a command can see or act on (SELECT, and the rows targeted by UPDATE or DELETE).

    * The WITH CHECK clause validates rows that are being written: new rows from INSERT, and the updated versions produced by UPDATE. It ensures the new data conforms to the policy.

    sql
    -- Policy for creating projects
    CREATE POLICY insert_projects_for_org_admins ON projects
    FOR INSERT
    WITH CHECK (
        check_user_org_permission(
            app_get_claim_uuid('user_id'),
            organization_id, -- 'organization_id' here refers to the column of the new row being inserted
            'ADMIN'
        )
    );
    
    -- We also need an UPDATE policy
    CREATE POLICY update_projects_for_org_admins ON projects
    FOR UPDATE
    USING ( check_user_project_access(app_get_claim_uuid('user_id'), id) ) -- Can you see the row to begin with?
    WITH CHECK ( -- Is your proposed change valid?
        check_user_org_permission(
            app_get_claim_uuid('user_id'),
            organization_id,
            'ADMIN'
        )
    );
    
    -- And a DELETE policy
    CREATE POLICY delete_projects_for_org_admins ON projects
    FOR DELETE
    USING (
        check_user_org_permission(
            app_get_claim_uuid('user_id'),
            organization_id,
            'ADMIN'
        )
    );

    With these policies in place, if a user with a MEMBER role attempts to INSERT a project, PostgreSQL will reject it with the error `new row violates row-level security policy for table "projects"`, even if the application code mistakenly allows it.
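On the application side, that rejection surfaces as a query error with SQLSTATE 42501 (insufficient_privilege). A minimal sketch of translating it into an HTTP status; the helper name and mapping choices are assumptions, not a standard API:

```javascript
// Translate PostgreSQL errors into HTTP status codes at the API boundary.
// An RLS rejection ("new row violates row-level security policy") is
// reported with SQLSTATE 42501 (insufficient_privilege).
function httpStatusForPgError(err) {
  switch (err && err.code) {
    case '42501': // insufficient_privilege, includes RLS violations
      return 403;
    case '23505': // unique_violation
      return 409;
    case '23503': // foreign_key_violation
      return 409;
    default:
      return 500;
  }
}
```

In an Express error handler this becomes something like res.status(httpStatusForPgError(err)).json({ error: 'Forbidden' }), so an RLS denial reads as a clean 403 rather than a generic 500.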

    Performance Analysis: `EXPLAIN ANALYZE`

    Let's prove the performance difference. Imagine a projects table with 1 million rows and an organization_memberships table with 100,000 rows.

    Query: SELECT count(*) FROM projects;

    1. With the subquery-based RLS policy:

    text
    EXPLAIN ANALYZE SELECT count(*) FROM projects;
    
                                                                  QUERY PLAN
    --------------------------------------------------------------------------------------------------------------------------------------
     Aggregate  (cost=86762.90..86762.91 rows=1 width=8) (actual time=1254.321..1254.322 rows=1 loops=1)
       ->  Seq Scan on projects  (cost=0.00..84262.90 rows=1000000 width=0) (actual time=0.154..1198.718 rows=50000 loops=1)
             Filter: (SubPlan 1)
             SubPlan 1
           ->  Index Only Scan using organization_memberships_pkey on organization_memberships om  (cost=0.42..8.44 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1000000)
                     Index Cond: ((organization_id = projects.organization_id) AND (user_id = '...'))
     Planning Time: 0.213 ms
     Execution Time: 1254.389 ms

    Notice the Seq Scan and the high cost. The planner has to invoke the subplan for a huge number of rows.

    2. With the SECURITY DEFINER function-based policy:

    text
    EXPLAIN ANALYZE SELECT count(*) FROM projects;
    
                                                             QUERY PLAN
    ----------------------------------------------------------------------------------------------------------------------------- 
     Aggregate  (cost=21763.90..21763.91 rows=1 width=8) (actual time=188.453..188.454 rows=1 loops=1)
       ->  Seq Scan on projects  (cost=0.00..19263.90 rows=1000000 width=0) (actual time=0.035..145.211 rows=50000 loops=1)
             Filter: check_user_project_access('...'::uuid, id)
     Planning Time: 0.155 ms
     Execution Time: 188.501 ms

    The execution time is dramatically lower. The function call is treated more like a black box by the planner, but the execution within the function is highly optimized, especially if project_id_in is used to perform an indexed lookup. The overall plan is simpler and faster.

    Key takeaway: Always benchmark your RLS policies. Encapsulate non-trivial logic in SECURITY DEFINER functions and ensure those functions are internally efficient and use appropriate indexes.
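Concretely, "internally efficient" means the helper's lookups hit indexes. For this schema, that might look like the following (index names are illustrative; the composite primary key on organization_memberships already covers user-first lookups):

```sql
-- Supports: WHERE om.organization_id = ? AND om.user_id = ?
CREATE INDEX IF NOT EXISTS idx_org_memberships_org_user
    ON organization_memberships (organization_id, user_id);

-- Supports org-scoped project listings and the org lookup inside the helper
CREATE INDEX IF NOT EXISTS idx_projects_organization_id
    ON projects (organization_id);
```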

    Edge Case Handling: Superusers and Internal Tooling

    What about a background worker that needs to process data across all tenants? Or a customer support tool where an employee needs to view a customer's data? RLS will block them by default.

    There are two primary patterns for this:

    1. The BYPASSRLS Attribute:

    You can create a special database role for these processes and grant it the BYPASSRLS attribute. This role will completely ignore all RLS policies.

    sql
    -- Role for a background processing service
    CREATE ROLE background_worker LOGIN PASSWORD '...';
    ALTER ROLE background_worker BYPASSRLS;
    
    -- Role for customer support staff
    CREATE ROLE support_agent LOGIN PASSWORD '...';
    -- NO BYPASSRLS for this role!

    Reserve BYPASSRLS for fully trusted, non-user-facing processes, and grant it with extreme caution. For the support agent, we need a more controlled approach.
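A related footgun: table owners bypass RLS by default, even without BYPASSRLS. If your application happens to connect as the role that owns the tables, force the policies on:

```sql
-- Apply RLS even to the table owner (superusers and BYPASSRLS roles still bypass)
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
ALTER TABLE tasks FORCE ROW LEVEL SECURITY;
ALTER TABLE organization_memberships FORCE ROW LEVEL SECURITY;
```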

    2. Controlled RLS-Bypass within Policies:

    We can build a mechanism for temporary, audited privilege escalation directly into our policies. This is perfect for the support agent scenario.

    First, we add a special setting to our context:

    javascript
    // In our Node.js middleware, for a support agent:
    if (req.user.is_support_agent && req.body.impersonated_user_id) {
        // Log this action for auditing!
        console.log(`AUDIT: Support agent ${req.user.sub} is impersonating ${req.body.impersonated_user_id}`);

        // Store the impersonated user's ID in the claims JSON that the RLS
        // policies actually read. Always pass values via set_config with a
        // bind parameter: SET LOCAL cannot be parameterized, and string
        // interpolation here would be a SQL injection vector.
        const claims = { user_id: req.body.impersonated_user_id, role: req.user.role };
        await client.query(`SELECT set_config('app.jwt.claims', $1, true)`, [JSON.stringify(claims)]);

        // Set a flag to indicate this is an audited impersonation session
        await client.query(`SELECT set_config('app.is_impersonating', 'true', true)`);
    }
    // Standard (non-impersonating) requests keep the claims already set by the main middleware.

    Now, our RLS policies can be updated to recognize this state. This is often better handled in the central permission function.

    sql
    -- A more robust, top-level permission function
    CREATE OR REPLACE FUNCTION current_user_can_access_project(project_id_in UUID) RETURNS BOOLEAN AS $$
    DECLARE
        v_current_user_id UUID := app_get_claim_uuid('user_id');
        v_is_impersonating BOOLEAN := (current_setting('app.is_impersonating', true) = 'true');
        v_user_role TEXT := app_get_claim('role');
    BEGIN
        -- Condition 1: Support agent with an active impersonation session
        -- We could add more checks here, e.g., logging to an audit table.
        IF v_is_impersonating AND v_user_role = 'SUPPORT_AGENT' THEN
            RETURN TRUE;
        END IF;
    
        -- Condition 2: Regular user access check
        IF v_current_user_id IS NOT NULL THEN
            RETURN check_user_project_access(v_current_user_id, project_id_in);
        END IF;
    
        RETURN FALSE;
    END;
    $$ LANGUAGE plpgsql STABLE SECURITY DEFINER;
    
    ALTER FUNCTION current_user_can_access_project(UUID) SET search_path = public;
    
    -- Update the policy to use the new top-level function
    DROP POLICY select_projects ON projects;
    CREATE POLICY select_projects ON projects FOR SELECT
    USING ( current_user_can_access_project(id) );

    This pattern provides a secure, auditable, and explicit way to grant temporary elevated access without punching a permanent hole with BYPASSRLS.

    Conclusion: A New Foundation for SaaS Security

    By moving tenancy and permission enforcement from the application to the database, we build a far more resilient system. Application-level bugs are less likely to cause catastrophic data breaches because the database itself acts as the final line of defense.

    The dynamic RLS pattern using transaction-local context is a powerful architecture for modern SaaS applications. It provides:

    * Centralized Logic: Security rules live in one place—the database schema—not scattered across dozens of services and repositories.

    * Unbypassable Enforcement: Developers cannot accidentally forget a WHERE clause; the policy is always applied.

    * Flexibility: The use of SECURITY DEFINER functions allows for arbitrarily complex permission logic, from simple role checks to deep hierarchical graph traversals.

    * Performance: When implemented correctly with function encapsulation and proper indexing, the performance overhead is manageable and often negligible compared to the security benefits.

    This is not a beginner's pattern. It requires a deep understanding of PostgreSQL, transaction management, and security principles. But for senior engineers tasked with building the next generation of secure, scalable, multi-tenant platforms, it is an indispensable tool in the arsenal.
