Dynamic PostgreSQL RLS for Hierarchical Multi-Tenant SaaS

16 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

Beyond Flat Tenancy: The Challenge of Hierarchical RLS

For senior engineers building multi-tenant SaaS platforms, PostgreSQL's Row-Level Security (RLS) is a powerful tool for data isolation. The canonical example—a simple policy checking a tenant_id column against a session variable—is a well-trodden path:

sql
-- The simple, common approach
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;

CREATE POLICY projects_isolation_policy ON projects
FOR ALL
USING (tenant_id = current_setting('app.current_tenant_id')::uuid);

This pattern works flawlessly for applications with a flat tenancy model, where a user belongs to exactly one tenant. However, modern SaaS applications rarely fit this mold. The reality is a complex web of permissions:

* A user can be a member of multiple organizations.

* In one organization they might be an 'owner' with full access, while in another they are only a 'viewer'.

* Permissions are hierarchical: access granted at the Organization level should cascade down to its Teams, Projects, and Tasks.

* A user might also have direct, granular access to a specific Project, bypassing the organizational hierarchy.

Attempting to model this complexity with a single app.current_tenant_id is untenable. It forces the application layer to constantly switch context or, worse, pull broad datasets and filter them in application memory, defeating the purpose of database-level security. The challenge is to make RLS policies aware of the user's entire permission graph for the duration of their session.

This article presents a production-grade pattern for implementing dynamic, hierarchical RLS. We will leverage PostgreSQL's session-level configuration parameters (GUCs), SECURITY DEFINER functions, and recursive Common Table Expressions (CTEs) to enforce complex rules directly within the database, ensuring security is both robust and centralized.


The Core Pattern: Session Context Injection

The fundamental shift is to move from a single tenant ID to a rich session context. This context, established by your application middleware at the beginning of each request or transaction, contains all the information the database needs to make authorization decisions. This context is set using set_config().
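
Concretely, the injection is a single set_config() call per request. Here is a minimal sketch with placeholder values (the exact shape of the JSON context is built by the middleware below):

sql
-- Third argument false: the setting persists for the rest of this session
-- (i.e., this physical connection). Passing true would instead limit it to
-- the current transaction.
SELECT set_config('app.user_context',
                  '{"userId": "00000000-0000-0000-0000-000000000001", "memberships": {}}',
                  false);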

Let's define our hierarchical schema:

sql
CREATE TABLE organizations (id UUID PRIMARY KEY);

CREATE TABLE teams (
    id UUID PRIMARY KEY,
    organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE
);

CREATE TABLE projects (
    id UUID PRIMARY KEY,
    team_id UUID NOT NULL REFERENCES teams(id) ON DELETE CASCADE
);

CREATE TABLE users (id UUID PRIMARY KEY);

-- A join table to model memberships and roles
CREATE TABLE memberships (
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    -- Can be an organization ID, team ID, etc.
    entity_id UUID NOT NULL,
    -- e.g., 'organization', 'team'
    entity_type TEXT NOT NULL,
    -- e.g., 'admin', 'member'
    role TEXT NOT NULL,
    PRIMARY KEY (user_id, entity_id)
);
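
The lookups and hierarchy traversals used throughout this article lean on a few indexes. A suggested set (an addition for illustration, not part of the schema above):

sql
-- memberships (user_id, entity_id) is already covered by its primary key.
-- Foreign key columns are not indexed automatically, so add these to support
-- the organization -> team -> project traversal:
CREATE INDEX IF NOT EXISTS idx_teams_organization_id ON teams (organization_id);
CREATE INDEX IF NOT EXISTS idx_projects_team_id ON projects (team_id);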

Instead of setting one ID, our application middleware will take the authenticated user, compute their full access scope, and inject it into the session. A robust approach is to pass a JSONB object representing the user's access claims.

Middleware Implementation (Conceptual Example in Node.js/Express)

Imagine a middleware that runs after authentication. It queries the memberships table to build a permissions object for the current user and sets it as a session variable.

javascript
// express-middleware.js
const { Pool } = require('pg');
const pool = new Pool();

async function setUserContext(req, res, next) {
    if (!req.user) {
        return next(); // Or handle unauthenticated request
    }

    const client = await pool.connect();
    req.dbClient = client; // Attach client to request for the route handler

    try {
        // Fetch all direct memberships for the user
        const membershipsResult = await client.query(
            'SELECT entity_id, entity_type, role FROM memberships WHERE user_id = $1',
            [req.user.id]
        );

        // Build a structured context object
        const userContext = {
            userId: req.user.id,
            memberships: membershipsResult.rows.reduce((acc, row) => {
                if (!acc[row.entity_type]) {
                    acc[row.entity_type] = [];
                }
                acc[row.entity_type].push({ id: row.entity_id, role: row.role });
                return acc;
            }, {})
        };

        // Set the context as a JSON string in a namespaced GUC.
        // The third argument 'false' makes the setting session-scoped: it persists on
        // this physical connection until it is overwritten or cleared. Passing 'true'
        // would instead scope it to the current transaction only.
        await client.query(`SELECT set_config('app.user_context', $1, false)`, [JSON.stringify(userContext)]);

        res.on('finish', () => {
            // Clear the context before handing the connection back to the pool so a
            // later request that reuses this connection cannot inherit these claims.
            client.query(`SELECT set_config('app.user_context', '', false)`)
                .catch(() => {})
                .finally(() => client.release());
        });

        next();
    } catch (err) {
        client.release();
        next(err);
    }
}

// Usage in Express app
app.use(authenticateUser); // Your JWT/session middleware
app.use(setUserContext);

Now, every query the route handler runs on that connection for the rest of the request will have access to this rich context via current_setting('app.user_context').
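
Reading it back from SQL is straightforward. Passing true as the second argument of current_setting() makes it return NULL instead of raising an error when the context has not been set:

sql
SELECT current_setting('app.user_context', true)::jsonb ->> 'userId'     AS user_id,
       current_setting('app.user_context', true)::jsonb -> 'memberships' AS memberships;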


Crafting Hierarchy-Aware RLS Policies with Functions

With the user context available in the session, we can now write intelligent RLS policies. Directly parsing JSON and running complex logic within the USING clause of a policy is cumbersome and inefficient. The standard practice is to encapsulate this logic within a function.

Crucially, this function must be STABLE (so its result can be cached within a query) and often SECURITY DEFINER (to allow it to query tables like memberships that the calling user may not have direct access to). Using SECURITY DEFINER has significant security implications that we will address later.

Let's create a function that determines if a user can access a given project. It needs to check for access via two paths:

  • Direct membership to the project's parent team.
  • Inherited access from the project's grandparent organization.

The `SECURITY DEFINER` Helper Function

sql
CREATE OR REPLACE FUNCTION can_user_access_project(project_id UUID) 
RETURNS BOOLEAN AS $$
DECLARE
    user_context JSONB;
    project_team_id UUID;
    project_org_id UUID;
BEGIN
    -- Safely get the context, returning false if not set or invalid
    BEGIN
        user_context := current_setting('app.user_context')::jsonb;
    EXCEPTION WHEN OTHERS THEN
        RETURN FALSE;
    END;

    -- Get the project's hierarchy
    SELECT p.team_id, t.organization_id 
    INTO project_team_id, project_org_id
    FROM projects p
    JOIN teams t ON p.team_id = t.id
    WHERE p.id = project_id;

    -- 1. Check for organization-level access (e.g., 'admin' role)
    IF jsonb_path_exists(user_context, '$.memberships.organization[*] ? (@.id == $org_id && @.role == "admin")', jsonb_build_object('org_id', project_org_id)) THEN
        RETURN TRUE;
    END IF;

    -- 2. Check for direct team-level access
    IF jsonb_path_exists(user_context, '$.memberships.team[*] ? (@.id == $team_id)', jsonb_build_object('team_id', project_team_id)) THEN
        RETURN TRUE;
    END IF;

    RETURN FALSE;
END;
$$ LANGUAGE plpgsql STABLE SECURITY DEFINER;

-- IMPORTANT: Set a safe search path for SECURITY DEFINER functions
-- This prevents a malicious user from creating a rogue function in their own schema
-- that could be executed with the definer's privileges.
ALTER FUNCTION can_user_access_project(UUID) SET search_path = pg_catalog, public;
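
A quick manual check from psql, using a hand-crafted context (placeholder UUIDs; assumes the schema above contains matching rows):

sql
SELECT set_config('app.user_context',
    '{"userId": "00000000-0000-0000-0000-000000000001",
      "memberships": {"organization": [{"id": "00000000-0000-0000-0000-00000000000a", "role": "admin"}]}}',
    false);

-- Any project under that organization should now report TRUE.
SELECT id, can_user_access_project(id) AS accessible FROM projects LIMIT 5;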

Applying the Function in an RLS Policy

Now the policy itself becomes clean and declarative:

sql
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;

-- Ensure RLS is also enforced for table owners
ALTER TABLE projects FORCE ROW LEVEL SECURITY;

CREATE POLICY select_project_based_on_hierarchy ON projects FOR SELECT
USING ( can_user_access_project(id) );

When a user runs SELECT * FROM projects, PostgreSQL will, for each row, invoke can_user_access_project() with the id of that row. If the function returns true, the row is included in the result set.


The Performance Trap and The Recursive CTE Solution

The previous pattern is functionally correct, but it hides a massive performance pitfall. The USING clause is evaluated for every single row in the projects table during a scan. Our can_user_access_project function, while STABLE, still executes a join to fetch the project's hierarchy. On a table with millions of projects, this per-row invocation will be disastrously slow.

We can prove this with EXPLAIN ANALYZE:

sql
-- Assuming a user has access to a few projects
EXPLAIN ANALYZE SELECT * FROM projects;

-- Likely output snippet
-- ...
-- -> Seq Scan on projects  (cost=0.00..123456.78 rows=500000 width=...)
--      Filter: can_user_access_project(id)
-- ...

The planner is forced into a sequential scan, applying our expensive filter to every row.

A More Performant Approach: Pre-calculating a Set of Accessible IDs

The performant solution is to calculate the complete set of accessible entity IDs once per query and use that set for filtering. Instead of a function that checks one ID at a time, we'll write a function that returns the entire set of accessible project IDs.

This is a perfect use case for a recursive CTE to traverse the hierarchy.

sql
CREATE OR REPLACE FUNCTION get_accessible_project_ids()
RETURNS TABLE(id UUID) AS $$
DECLARE
    user_context JSONB;
    ctx_user_id UUID;  -- prefixed to avoid ambiguity with memberships.user_id below
BEGIN
    BEGIN
        user_context := current_setting('app.user_context')::jsonb;
        ctx_user_id := (user_context->>'userId')::uuid;
    EXCEPTION WHEN OTHERS THEN
        -- Return empty set if context is missing/invalid
        RETURN;
    END;

    RETURN QUERY
    WITH RECURSIVE accessible_entities AS (
        -- Base case: Direct memberships of the user
        SELECT m.entity_id, m.entity_type
        FROM memberships m
        WHERE m.user_id = ctx_user_id

        UNION

        -- Recursive step: Traverse down the hierarchy
        -- From Organization -> Team
        SELECT t.id, 'team'::text
        FROM teams t
        JOIN accessible_entities ae ON t.organization_id = ae.entity_id
        WHERE ae.entity_type = 'organization'
        -- You could add role checks here, e.g., only cascade for 'admin' roles
    )
    -- Final projection: Select all projects belonging to the accessible entities
    SELECT p.id FROM projects p
    WHERE
        -- Projects in accessible teams
        p.team_id IN (SELECT entity_id FROM accessible_entities WHERE entity_type = 'team')
        OR
        -- Projects in teams that are under accessible organizations
        p.team_id IN (SELECT t.id FROM teams t WHERE t.organization_id IN (SELECT entity_id FROM accessible_entities WHERE entity_type = 'organization'));

END;
$$ LANGUAGE plpgsql STABLE SECURITY DEFINER;

ALTER FUNCTION get_accessible_project_ids() SET search_path = pg_catalog, public;
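
Before wiring it into a policy, the function can be exercised directly to preview exactly which rows the current context will be allowed to see:

sql
-- Assumes app.user_context has been injected as shown earlier.
SELECT p.*
FROM projects p
WHERE p.id IN (SELECT id FROM get_accessible_project_ids());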

Now, we rewrite our RLS policy to use this set-returning function. The key is to use an IN or = ANY subquery, which the PostgreSQL planner is much better at optimizing.

sql
-- Drop the old policy
DROP POLICY IF EXISTS select_project_based_on_hierarchy ON projects;

-- Create the new, performant policy
CREATE POLICY select_project_performant ON projects FOR SELECT
USING ( id IN (SELECT * FROM get_accessible_project_ids()) );

Let's re-examine the query plan:

sql
EXPLAIN ANALYZE SELECT * FROM projects;

-- New, improved output snippet
-- ...
-- -> Nested Loop
--    -> Function Scan on get_accessible_project_ids  (cost=... rows=100 ...)
--    -> Index Scan using projects_pkey on projects  (cost=... rows=1 ...)
--          Index Cond: (id = get_accessible_project_ids.id)
-- ...

This plan is vastly superior. PostgreSQL executes the get_accessible_project_ids function once, gets the small set of allowed IDs (e.g., 100 rows), and then performs a highly efficient Index Scan on the projects table to retrieve only those specific rows. This scales beautifully, even with billions of projects, as the work done is proportional to the number of projects a user has access to, not the total number of projects in the database.


Production Hardening and Edge Cases

Implementing this pattern in a real-world system requires addressing several critical edge cases.

1. `INSERT`, `UPDATE`, and `DELETE` Policies

RLS isn't just for SELECT. You must define policies for other DML operations. INSERT and UPDATE policies typically use a WITH CHECK clause to prevent users from creating or moving data into a hierarchy they cannot access.

sql
-- Let's define a more granular access function for writing
CREATE OR REPLACE FUNCTION can_user_write_to_team(team_id_to_check UUID)
RETURNS BOOLEAN AS $$
    -- True when the current user (from app.user_context) holds an 'admin' or
    -- 'member' role on the team itself, OR an 'admin' role on its parent organization.
    SELECT COALESCE(jsonb_path_exists(
               current_setting('app.user_context', true)::jsonb,
               '$.memberships.team[*] ? (@.id == $tid && (@.role == "admin" || @.role == "member"))',
               jsonb_build_object('tid', team_id_to_check)), false)
        OR COALESCE(jsonb_path_exists(
               current_setting('app.user_context', true)::jsonb,
               '$.memberships.organization[*] ? (@.id == $oid && @.role == "admin")',
               jsonb_build_object('oid', (SELECT organization_id FROM teams WHERE id = team_id_to_check))), false);
$$ LANGUAGE sql STABLE SECURITY DEFINER;

ALTER FUNCTION can_user_write_to_team(UUID) SET search_path = pg_catalog, public;

-- Policy for creating new projects
CREATE POLICY insert_project ON projects FOR INSERT
WITH CHECK ( can_user_write_to_team(team_id) );

-- Policy for updating existing projects
CREATE POLICY update_project ON projects FOR UPDATE
USING ( id IN (SELECT * FROM get_accessible_project_ids()) ) -- Can they see it?
WITH CHECK ( can_user_write_to_team(team_id) ); -- Is the new location valid?

The WITH CHECK clause for UPDATE is crucial. It prevents a user who has access to Project A in Team X from maliciously updating its team_id to Team Y, which they do not have access to.
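
The section title also covers DELETE. A minimal sketch of that policy, assuming deletion should require the same write-level access as moving a project into a team (tighten the check to match your own role model):

sql
CREATE POLICY delete_project ON projects FOR DELETE
USING ( can_user_write_to_team(team_id) );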

2. The Connection Pooling Conundrum (PgBouncer)

This is the most common and dangerous pitfall. Many production environments use transaction-level connection poolers like PgBouncer to manage connections efficiently. In transaction pooling mode, PgBouncer can hand your application a different physical database connection for each transaction. This breaks our pattern because set_config(..., ..., false) creates a session-scoped setting, and that "session" is a specific physical connection, not the logical session your application thinks it has.

If a later transaction for the same API request is routed to a different physical connection, app.user_context will not be set there, and all RLS policies will fail closed, denying access. Worse, unless the pooler resets session state between clients, a stale context from a previous request could still be sitting on the connection you receive.

Solution: You must configure your connection pooler for session-level pooling.

* PgBouncer: Set pool_mode = session. This dedicates a physical database connection to each client connection for as long as that client stays connected. When the client connection closes, PgBouncer runs its server_reset_query (DISCARD ALL by default) to clear session state and can then reassign the database connection.

* Application-side Poolers: Most modern application-level pools (like node-postgres's pool or HikariCP in Java) manage this correctly, as they grab a connection and hold it for the duration of a request's work.

The trade-off is that session pooling is less efficient at the database level, as it can hold connections open for longer. You must monitor your max_connections and tune your pool sizes accordingly. For RLS that relies on session state, this is non-negotiable.
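
A simple query to keep an eye on that headroom (a sketch; thresholds and alerting belong in your monitoring stack):

sql
-- Active backends versus the configured ceiling.
SELECT count(*) AS current_connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;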

3. Superuser and Internal Tool Access

Your internal support tools, administrative scripts, or data migration processes need a way to bypass RLS. The correct way to handle this is with the BYPASSRLS attribute.

sql
-- Create a specific role for administrative tasks
CREATE ROLE internal_tools_role;

-- Grant it the ability to bypass all RLS policies
ALTER ROLE internal_tools_role BYPASSRLS;

-- Grant this role to the user account used by your tools
GRANT internal_tools_role TO tool_user;

Never grant BYPASSRLS to your main application user. This privilege should be reserved for a small, trusted set of roles used for specific, audited operations.
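
It is also worth auditing periodically which roles can skip RLS, since superusers bypass it as well:

sql
-- This list should contain only superusers and your dedicated tooling roles.
SELECT rolname FROM pg_roles WHERE rolbypassrls OR rolsuper;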

4. Performance Tuning the Access Function

Our get_accessible_project_ids function is much better than the per-row check, but it can still be optimized.

* Indexing: The performance of the recursive CTE depends entirely on proper indexing. The memberships lookup needs a composite index on (user_id, entity_id), which the table's primary key already provides in our schema. The foreign key columns used in the traversal (teams.organization_id, projects.team_id) must be indexed explicitly, since PostgreSQL does not create indexes on foreign key columns automatically.

* Materialization: For extremely complex permission graphs, the CTE can become expensive to re-calculate on every query. An alternative, application-side caching strategy can be employed: the middleware calculates the list of accessible project IDs once upon user login (or first request), caches it (e.g., in Redis), and then simply passes the pre-computed list of IDs into the session GUC.

Example: Application-side pre-computation

* Middleware runs the recursive query once and gets a list of UUIDs.

* It sets this list as a simple comma-separated string:

javascript
const accessibleIds = ['uuid-1', 'uuid-2', ...];
await client.query(`SELECT set_config('app.accessible_project_ids_csv', $1, false)`, [accessibleIds.join(',')]);

* The RLS policy becomes trivial and lightning-fast:

sql
CREATE POLICY select_project_optimized ON projects FOR SELECT
USING ( id::text = ANY(string_to_array(current_setting('app.accessible_project_ids_csv', true), ',')) );

This moves complexity from the database to the application and introduces cache invalidation challenges (what happens when a user's permissions change?), but for read-heavy workloads, it can be the ultimate performance optimization.

Conclusion

By moving beyond simple tenant ID checks and embracing a session-level context, we can unlock the full potential of PostgreSQL's Row-Level Security for complex, hierarchical SaaS applications. This pattern centralizes authorization logic within the database, making it authoritative and preventing security bugs scattered across application code.

The key takeaways for a production-ready implementation are:

  • Inject a Rich Context: Use middleware to set a comprehensive user context (e.g., a JSON object) in a session GUC (app.user_context) at the start of each request.
  • Use Set-Returning Functions: Avoid per-row function calls in RLS policies. Instead, use a SECURITY DEFINER function with a recursive CTE to compute the entire set of accessible IDs once.
  • Optimize the Policy: Write policies that use id IN (SELECT ...) to allow the query planner to execute the function once and perform an efficient index scan.
  • Manage Connection Pooling: Ensure your connection pooler is in session mode to preserve the GUC across transactions within a single request.
  • Harden for Production: Implement WITH CHECK policies for INSERT/UPDATE, use the BYPASSRLS attribute for admin roles, and ensure all underlying tables and security functions are properly indexed and secured.

While more complex to set up than basic RLS, this dynamic, hierarchy-aware pattern provides a scalable and secure foundation for data isolation in sophisticated multi-tenant systems.
