Advanced PostgreSQL RLS for Hierarchical Multi-Tenant SaaS

20 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Scaling Challenge of RLS in Hierarchical Multi-Tenancy

Row-Level Security (RLS) in PostgreSQL is a powerful mechanism for enforcing data partitioning at the database layer, making it a natural fit for multi-tenant SaaS applications. The canonical example—a policy like USING (tenant_id = current_setting('app.tenant_id'))—is effective for flat tenancy models where a user belongs to a single, isolated tenant.

However, real-world SaaS products rarely have such simple data models. They often feature nested or hierarchical tenancy: a user might be an admin of an organization, granting them implicit access to all teams and projects within that organization, while also being a direct member of a specific team.

Implementing RLS for these models presents a significant architectural and performance challenge. Naive policies that use subqueries or JOINs to traverse the hierarchy on every data access are a recipe for catastrophic performance degradation. The query planner struggles to optimize these dynamic checks, leading to sequential scans and unscalable query times as the number of tenants and users grows.

This article presents a production-proven pattern for implementing highly performant RLS in a hierarchical multi-tenant environment. We will bypass expensive per-query hierarchy traversals by pre-calculating a user's access rights upon authentication, embedding them in a JWT, and using this information to populate a PostgreSQL session context. This approach transforms complex, dynamic RLS checks into simple, index-friendly lookups, ensuring both security and scalability.

We assume you are already familiar with PostgreSQL RLS fundamentals, multi-tenancy concepts, and JWT-based authentication.

Section 1: Defining the Hierarchical Data Model

To ground our discussion, let's define a concrete schema that represents a common SaaS hierarchy. A user's access is determined by their memberships in various organizations and teams, each with a specific role.

  • organizations: The top-level tenant.
  • teams: Belong to an organization.
  • projects: Belong to a team. This is the primary resource we want to protect.
  • users: The actors in the system.
  • memberships: A join table defining a user's role within an organization or a team.
  • Here is the SQL DDL to create this structure:

    sql
    -- Enable UUID generation
    CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
    
    -- Top-level tenants
    CREATE TABLE organizations (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        name TEXT NOT NULL
    );
    
    -- Teams belong to organizations
    CREATE TABLE teams (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
        name TEXT NOT NULL
    );
    CREATE INDEX ON teams(organization_id);
    
    -- Projects are the primary resource we'll protect with RLS
    CREATE TABLE projects (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        team_id UUID NOT NULL REFERENCES teams(id) ON DELETE CASCADE,
        name TEXT NOT NULL,
        data JSONB
    );
    CREATE INDEX ON projects(team_id);
    
    -- Users
    CREATE TABLE users (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        email TEXT NOT NULL UNIQUE
    );
    
    -- Membership roles
    CREATE TYPE membership_role AS ENUM ('org_admin', 'team_member');
    
    -- Join table for user permissions
    CREATE TABLE memberships (
        id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
        user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
        -- A membership can be at the organization or team level
        organization_id UUID REFERENCES organizations(id) ON DELETE CASCADE,
        team_id UUID REFERENCES teams(id) ON DELETE CASCADE,
        role membership_role NOT NULL,
        
        -- Ensure a membership is tied to either an org or a team, but not both
        CONSTRAINT org_or_team_check CHECK (
            (organization_id IS NOT NULL AND team_id IS NULL) OR 
            (organization_id IS NULL AND team_id IS NOT NULL)
        )
    );
    CREATE INDEX ON memberships(user_id);
    CREATE INDEX ON memberships(organization_id);
    CREATE INDEX ON memberships(team_id);

    Our access rules are:

  • A team_member can access projects belonging to their specific team.
  • An org_admin can access all projects within any team under their organization.
  • Section 2: The Naive RLS Implementation and Its Performance Pitfalls

    With our schema defined, a first attempt at an RLS policy for the projects table might involve a complex subquery to check for permissions.

    First, let's enable RLS on the projects table:

    sql
    ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
    -- Prevent access by default
    ALTER TABLE projects FORCE ROW LEVEL SECURITY;

    Now, let's create a role for our application users. This role will not have any superuser or bypass privileges.

    sql
    CREATE ROLE app_user;
    -- Grant necessary permissions to the role
    GRANT USAGE ON SCHEMA public TO app_user;
    GRANT SELECT, INSERT, UPDATE, DELETE ON projects, teams, organizations, users, memberships TO app_user;

    Here is the naive RLS policy. It attempts to check both direct team membership and hierarchical organization membership within the policy's USING clause.

    sql
    CREATE OR REPLACE FUNCTION get_current_user_id() RETURNS UUID AS $$
    BEGIN
        -- In a real app, this would be derived securely from the session context
        -- We cast to UUID to ensure type safety.
        RETURN current_setting('app.user_id')::uuid;
    EXCEPTION
        WHEN OTHERS THEN
            RETURN NULL;
    END;
    $$ LANGUAGE plpgsql STABLE;
    
    CREATE POLICY select_projects ON projects FOR SELECT TO app_user
    USING (
        team_id IN (
            -- Case 1: Direct team membership
            SELECT m.team_id
            FROM memberships m
            WHERE m.user_id = get_current_user_id()
              AND m.role = 'team_member'
              AND m.team_id IS NOT NULL
            
            UNION
            
            -- Case 2: Hierarchical org admin membership
            SELECT t.id
            FROM teams t
            JOIN memberships m ON t.organization_id = m.organization_id
            WHERE m.user_id = get_current_user_id()
              AND m.role = 'org_admin'
              AND m.organization_id IS NOT NULL
        )
    );

    Why this is a performance disaster:

    For every single query against the projects table (e.g., SELECT FROM projects WHERE name = '...'), PostgreSQL must execute the complex subquery inside the USING clause for each row it considers*. The query planner cannot effectively optimize this. It often results in nested loops and an inability to use indexes on the projects table efficiently.

    Let's demonstrate with EXPLAIN ANALYZE. First, populate some data:

    sql
    -- Create a user, org, teams, and projects
    INSERT INTO users (id, email) VALUES ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', '[email protected]');
    INSERT INTO organizations (id, name) VALUES ('b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'Org A');
    INSERT INTO teams (id, organization_id, name) VALUES 
        ('c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13', 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'Team A1'),
        ('d3eebc99-9c0b-4ef8-bb6d-6bb9bd380a14', 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'Team A2');
    
    -- Make the user an admin of Org A
    INSERT INTO memberships (user_id, organization_id, role) VALUES ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'org_admin');
    
    -- Create 10,000 projects spread across the two teams
    INSERT INTO projects (team_id, name, data) 
    SELECT 
        CASE WHEN i % 2 = 0 THEN 'c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13'::uuid ELSE 'd3eebc99-9c0b-4ef8-bb6d-6bb9bd380a14'::uuid END,
        'Project ' || i,
        '{}'
    FROM generate_series(1, 10000) s(i);
    
    ANALYZE projects;
    ANALYZE teams;
    ANALYZE memberships;

    Now, let's run a query as this user:

    sql
    -- Simulate an application connection
    SET ROLE app_user;
    SET app.user_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
    
    EXPLAIN ANALYZE SELECT count(*) FROM projects;

    The query plan will be complex and slow, likely involving multiple scans of the memberships and teams tables for every row check on projects. The cost will scale poorly with the number of users, teams, and organizations.

    Section 3: The Session Context Pattern with JWTs

    The performant solution is to shift the complex work of calculating permissions to a single point in time: user authentication. We will compute the user's entire set of accessible entity IDs and store them in a stateless token (JWT). On each API request, we'll pass these IDs to PostgreSQL via session variables.

    Step 1: The Permissions Calculation Query

    Upon successful login, we need a single, efficient query to gather all team_ids the user can access, both directly and hierarchically. A recursive Common Table Expression (CTE) is not ideal here, but a UNION is perfect.

    sql
    -- Query to get all accessible team IDs for a given user_id
    SELECT team_id FROM (
        -- Direct team memberships
        SELECT m.team_id
        FROM memberships m
        WHERE m.user_id = $1 -- parameter for user_id
          AND m.role = 'team_member'
          AND m.team_id IS NOT NULL
        
        UNION
        
        -- Teams accessible via organization admin role
        SELECT t.id AS team_id
        FROM teams t
        JOIN memberships m ON t.organization_id = m.organization_id
        WHERE m.user_id = $1 -- parameter for user_id
          AND m.role = 'org_admin'
          AND m.organization_id IS NOT NULL
    ) AS accessible_teams;

    This query is run once per session/login. It's far more efficient to run this single complex query than to embed it within an RLS policy.

    Step 2: Embedding Permissions in the JWT

    Your authentication service will execute the above query and embed the results into the JWT payload. This keeps your application stateless while providing the necessary context for the database.

    Here's an example of a JWT payload:

    json
    {
      "sub": "a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11",
      "email": "[email protected]",
      "exp": 1678886400,
      "https://myapp.com/claims": {
        "user_id": "a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11",
        "accessible_teams": [
          "c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13",
          "d3eebc99-9c0b-4ef8-bb6d-6bb9bd380a14"
        ]
      }
    }

    Step 3: Populating the PostgreSQL Session Context

    In your application's API middleware, before executing any business logic, you must extract these claims from the validated JWT and set them as PostgreSQL session variables. It is critical to use SET LOCAL to ensure these variables are scoped only to the current transaction, preventing state leakage between requests in a connection pool.

    Here is a Node.js (with node-postgres) example:

    javascript
    // Middleware in an Express.js or similar framework
    async function setDbSessionContext(req, res, next) {
        if (!req.user || !req.user.claims) {
            return next(); // No user or claims, proceed without context
        }
    
        const { user_id, accessible_teams } = req.user.claims;
    
        // Get a client from the pool
        const client = await pool.connect();
        req.dbClient = client; // Attach client to the request object
    
        try {
            // Start a transaction
            await client.query('BEGIN');
    
            // Use SET LOCAL for transaction-level scope
            // Pass values as parameters to prevent SQL injection
            await client.query(`SET LOCAL app.user_id = $1`, [user_id]);
            
            // Convert array of UUIDs to a comma-separated string for PostgreSQL
            const teamsString = accessible_teams.join(',');
            await client.query(`SET LOCAL app.accessible_teams = $1`, [teamsString]);
    
            next();
        } catch (err) {
            // If setting context fails, release the client and send an error
            client.release();
            res.status(500).send('Failed to set database session context');
        }
    }
    
    // In your route handler
    app.get('/projects', setDbSessionContext, async (req, res) => {
        const client = req.dbClient;
        try {
            const result = await client.query('SELECT * FROM projects');
            await client.query('COMMIT');
            res.json(result.rows);
        } catch (err) {
            await client.query('ROLLBACK');
            res.status(500).send('Error executing query');
        } finally {
            if (client) {
                client.release();
            }
        }
    });

    Section 4: High-Performance RLS Policies with Session Context

    Now we can rewrite our RLS policy to be incredibly simple and fast. It no longer contains any subqueries or joins.

    sql
    -- Helper function to safely get the teams array from the session setting
    CREATE OR REPLACE FUNCTION get_accessible_teams() RETURNS UUID[] AS $$
    BEGIN
        -- The 'true' flag in current_setting makes it return NULL if the setting is not found,
        -- preventing an error.
        RETURN string_to_array(current_setting('app.accessible_teams', true), ',')::uuid[];
    EXCEPTION
        -- If the setting is empty or invalid, return an empty array
        WHEN OTHERS THEN
            RETURN ARRAY[]::uuid[];
    END;
    $$ LANGUAGE plpgsql STABLE;
    
    -- Drop the old, slow policy
    DROP POLICY IF EXISTS select_projects ON projects;
    
    -- Create the new, high-performance policy
    CREATE POLICY select_projects_performant ON projects FOR SELECT TO app_user
    USING (
        team_id = ANY(get_accessible_teams())
    );
    
    -- It's good practice to create policies for other operations too
    CREATE POLICY insert_projects_performant ON projects FOR INSERT TO app_user
    WITH CHECK (
        team_id = ANY(get_accessible_teams())
    );
    
    -- Similar policies for UPDATE and DELETE
    CREATE POLICY update_projects_performant ON projects FOR UPDATE TO app_user
    USING (
        team_id = ANY(get_accessible_teams())
    );
    
    CREATE POLICY delete_projects_performant ON projects FOR DELETE TO app_user
    USING (
        team_id = ANY(get_accessible_teams())
    );

    The Performance Difference:

    Let's re-run our EXPLAIN ANALYZE with the new setup:

    sql
    -- Simulate an application connection
    SET ROLE app_user;
    
    -- Set the session context as the middleware would
    SET app.user_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
    SET app.accessible_teams = 'c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13,d3eebc99-9c0b-4ef8-bb6d-6bb9bd380a14';
    
    EXPLAIN ANALYZE SELECT count(*) FROM projects;

    The query plan will now be vastly simpler. PostgreSQL sees team_id = ANY(...) with a constant array. It can use the b-tree index on projects(team_id) to perform a highly efficient Bitmap Index Scan or similar plan. The query cost is no longer dependent on the complexity of your permission hierarchy; it only depends on the number of projects being accessed. This pattern scales beautifully.

    Section 5: Edge Case - Handling "Super Admins" and Bypassing RLS

    Your application will inevitably need a way to bypass RLS for administrative tasks, data migrations, or customer support tooling.

    The BYPASSRLS attribute on a role is the correct tool for this.

  • Create a specific, privileged role:
  • sql
        CREATE ROLE internal_tool_role BYPASSRLS;
        -- Grant this role only the permissions it needs
        GRANT SELECT, UPDATE ON users TO internal_tool_role;
  • Use a separate connection pool: Your internal tools should connect to the database with a user belonging to this internal_tool_role, completely separate from the app_user pool.
  • Temporarily escalate privileges (use with extreme caution): In some cases, the main application might need to perform a bypass operation. This can be done by having the main application user be a member of a role that can SET ROLE to the bypass role.
  • sql
        -- The main application user
        CREATE USER my_app_user WITH PASSWORD '...';
        GRANT app_user TO my_app_user;
        GRANT internal_tool_role TO my_app_user;
        
        -- In a specific, secure part of your application code:
        -- BEGIN;
        -- SET LOCAL ROLE internal_tool_role;
        -- -- Perform operation that needs to bypass RLS
        -- SELECT * FROM projects WHERE ...; -- This will bypass all RLS policies
        -- COMMIT; -- The role change is reverted at transaction end

    This approach must be meticulously audited and restricted to prevent privilege escalation vulnerabilities.

    Section 6: Edge Case - The "Stale JWT" Problem

    A critical flaw in this pattern is that permissions are only calculated when the JWT is minted. If a user's permissions change (e.g., they are removed from a team), their existing JWT will contain stale accessible_teams data, granting them access they should no longer have until the token expires.

    Here are several strategies to mitigate this, from simplest to most robust:

  • Short-Lived JWTs: Use very short expiry times (e.g., 5-15 minutes) and a robust refresh token mechanism. This limits the window of stale access but doesn't eliminate it.
  • Revocation Lists: Maintain a blacklist of revoked tokens (e.g., in Redis or a database table). This adds complexity and a stateful component to your otherwise stateless authentication.
  • The Permissions Versioning Pattern (Recommended): This is a highly effective and performant solution.
  • a. Add a versioning column to your users table (or a dedicated permissions table).

    sql
            ALTER TABLE users ADD COLUMN permissions_version INTEGER NOT NULL DEFAULT 1;

    b. Whenever a user's memberships or roles change, increment this version number in the same transaction.

    sql
            -- When removing a user from a team
            BEGIN;
            DELETE FROM memberships WHERE user_id = '...' AND team_id = '...';
            UPDATE users SET permissions_version = permissions_version + 1 WHERE id = '...';
            COMMIT;

    c. Include the permissions_version in the JWT claims when it's created.

    json
            "https://myapp.com/claims": {
              "user_id": "...",
              "accessible_teams": [...],
              "permissions_version": 5
            }

    d. In your API middleware, after validating the JWT, perform a single, lightweight check against the database.

    javascript
            // In the setDbSessionContext middleware
            const { user_id, permissions_version: jwt_version } = req.user.claims;
            
            const { rows } = await client.query('SELECT permissions_version FROM users WHERE id = $1', [user_id]);
            
            if (rows.length === 0 || rows[0].permissions_version !== jwt_version) {
                // Mismatch! Permissions are stale.
                client.release();
                return res.status(401).send('Permissions have changed. Please log in again.');
            }
            
            // If versions match, proceed to set session context...

    This pattern ensures that any permission change immediately invalidates all of a user's active sessions on their next request, forcing a re-authentication to get a fresh JWT with up-to-date claims.

    Section 7: Integration with ORMs and Connection Pooling

    The use of SET LOCAL is non-negotiable for this pattern's safety in a concurrent environment. This has a direct and critical implication for your connection pooler, especially external poolers like PgBouncer.

  • PgBouncer in Transaction Pooling Mode: This mode is the most efficient but is incompatible with our pattern. In transaction mode, PgBouncer can return a connection to the pool immediately after a COMMIT or ROLLBACK. Another request could then receive that same connection with the previous request's app.* settings still active for a brief moment before the next transaction starts, leading to data leakage. SET LOCAL helps, but it's not a complete guarantee in all edge cases of this mode.
  • PgBouncer in Session Pooling Mode: This is the required mode. In session mode, a connection is assigned to a client for the entire duration of that client's connection. This ensures that SET LOCAL works as expected, as the session state is reliably contained. The downside is slightly lower performance, as connections are held for longer.
  • Application-Level Connection Pools: Most built-in application connection pools (like in node-postgres, HikariCP for Java, or SQLAlchemy's pool) effectively operate in a manner similar to session pooling. You get a connection, use it for a request (wrapped in a transaction), and then release it. The key is to ensure your SET LOCAL commands are executed at the beginning of the transaction for every request.
  • Here is a more robust node-postgres example showing proper transaction and client lifecycle management:

    javascript
    // A utility function to wrap the entire request lifecycle
    async function withDbClient(req, handler) {
        const client = await pool.connect();
        try {
            await client.query('BEGIN');
    
            if (req.user && req.user.claims) {
                const { user_id, accessible_teams } = req.user.claims;
                const teamsString = accessible_teams.join(',');
                await client.query(`SET LOCAL app.user_id = $1`, [user_id]);
                await client.query(`SET LOCAL app.accessible_teams = $1`, [teamsString]);
            }
    
            const result = await handler(client);
            await client.query('COMMIT');
            return result;
        } catch (err) {
            await client.query('ROLLBACK');
            throw err; // Re-throw the error to be handled by an error middleware
        } finally {
            client.release();
        }
    }
    
    // Usage in a route
    app.get('/projects', async (req, res, next) => {
        try {
            const projects = await withDbClient(req, (client) => {
                return client.query('SELECT id, name FROM projects');
            });
            res.json(projects.rows);
        } catch (err) {
            next(err); // Pass to error handler
        }
    });

    Conclusion: A Scalable Architecture for Secure Data Access

    By moving complex permission logic out of RLS policies and into a single, upfront authentication step, we transform an unscalable security model into a highly performant one. The session context pattern, powered by JWT claims, provides the database with all the information it needs to make simple, index-friendly access control decisions.

    This architecture provides several key benefits:

  • High Performance: RLS policies become trivial, allowing the query planner to use indexes effectively.
  • Scalability: Performance is not degraded as the number of tenants, users, or the complexity of the hierarchy grows.
  • Centralized Logic: Permission calculation is centralized in your authentication service, not scattered across database policies.
  • Maintainability: RLS policies are simple and rarely need to change, even if your business logic for permissions evolves.
  • While this pattern introduces new considerations, such as handling stale JWTs and ensuring correct connection pool configuration, the trade-offs are well worth the dramatic gains in performance and scalability for any non-trivial multi-tenant SaaS application built on PostgreSQL.

    Found this article helpful?

    Share it with others who might benefit from it.

    More Articles