Advanced PostgreSQL RLS for Hierarchical Multi-Tenant SaaS

20 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Illusion of Simplicity in Multi-Tenant Security

For many SaaS applications, multi-tenancy is a solved problem. A tenant_id column on every relevant table, a session variable like current_setting('app.tenant_id'), and a simple Row-Level Security (RLS) policy: USING (tenant_id = current_setting('app.tenant_id')::uuid). This pattern is effective, ubiquitous, and sufficient for flat tenancy models.

However, the moment your product's domain model evolves to support hierarchical relationships—Organizations containing Teams, which in turn contain Projects—this simple model shatters. A user's access is no longer defined by a single tenant_id. A user might be an owner of an Organization, granting them implicit access to all child Teams and Projects. Another user might be a member of a specific Team, granting them access only to that Team's Projects. A third might be invited directly to a single Project.

Attempting to solve this at the application layer leads to a nightmare of complex, error-prone authorization logic scattered across your codebase. Queries become bloated with permission-checking JOINs and WHERE clauses. The database, the ultimate arbiter of data, is relegated to a passive store, unaware of the security context it's meant to enforce. This is where we, as senior engineers, must leverage the full power of the database. This post dissects the implementation of a robust, performant, and centrally-managed hierarchical RLS system in PostgreSQL.

We will build a system that handles this complexity directly at the data layer, examining the performance trade-offs of different RLS implementations, from naive function calls to optimized subqueries and session-cached permission sets.


Setting the Stage: The Hierarchical Data Model

To ground our discussion, let's define a concrete schema representing a typical hierarchical SaaS product. An organization is the top-level tenant. It has teams, which have projects. users are granted access via a memberships table that links them to one of these entities with a specific role.

Here is the DDL for our core schema. Note the nullable parent relationships in teams and projects that form the hierarchy.

sql
-- Enable necessary extensions
CREATE EXTENSION IF NOT EXISTS "pgcrypto";

-- User and authentication table
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email TEXT NOT NULL UNIQUE,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Top-level tenant
CREATE TABLE organizations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Teams belong to an organization
CREATE TABLE teams (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
    name TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Projects can belong directly to an organization or to a team
CREATE TABLE projects (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
    team_id UUID REFERENCES teams(id) ON DELETE CASCADE, -- Nullable for org-level projects
    name TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- The data we want to protect
CREATE TABLE tasks (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    title TEXT NOT NULL,
    completed BOOLEAN NOT NULL DEFAULT FALSE,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Membership and roles table connecting users to the hierarchy
CREATE TYPE membership_role AS ENUM ('owner', 'admin', 'member');

CREATE TABLE memberships (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    organization_id UUID REFERENCES organizations(id) ON DELETE CASCADE,
    team_id UUID REFERENCES teams(id) ON DELETE CASCADE,
    project_id UUID REFERENCES projects(id) ON DELETE CASCADE,
    role membership_role NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    -- A user can be a member of an org, a team, or a project, but only one at a time per membership record.
    CONSTRAINT single_membership_target CHECK (
        (organization_id IS NOT NULL)::int +
        (team_id IS NOT NULL)::int +
        (project_id IS NOT NULL)::int = 1
    )
);

-- Indexes for performance
CREATE INDEX ON memberships(user_id);
CREATE INDEX ON memberships(organization_id);
CREATE INDEX ON memberships(team_id);
CREATE INDEX ON memberships(project_id);

The challenge is clear: a query for SELECT * FROM tasks must only return tasks belonging to projects the current user is authorized to see, based on their potentially indirect membership via a parent Team or Organization.

Propagating Application Context to the Database

RLS is useless without context. The database needs to know who is asking. Passing user IDs as arguments to every single function and parameter in every query is untenable. The standard, secure pattern is to use session-level configuration parameters.

In your application's middleware, for every authenticated request, you must set the current user's ID within the database transaction for that request. This isolates the context to the current session and transaction.

Here's a conceptual example using Node.js with the pg library:

javascript
// Conceptual middleware in a Node.js Express-like framework
async function setUserContext(req, res, next) {
    if (!req.user || !req.user.id) {
        // Unauthenticated request, proceed without setting context
        return next();
    }

    const dbClient = await pool.connect();
    req.dbClient = dbClient; // Attach client to request for use in route handlers

    try {
        // Use SET LOCAL to scope the setting to the current transaction only.
        // This is CRITICAL for connection pooling environments to prevent context leakage.
        await dbClient.query('BEGIN');
        await dbClient.query(`SET LOCAL app.current_user_id = '${req.user.id}';`);
        
        // Setting a role can also be useful for bypassing RLS for admins/services
        // await dbClient.query(`SET LOCAL ROLE a_specific_role;`);

        next();
    } catch (err) {
        await dbClient.query('ROLLBACK');
        dbClient.release();
        next(err);
    }
}

// In your route handler, you'd use req.dbClient
app.get('/tasks', setUserContext, async (req, res, next) => {
    try {
        const { rows } = await req.dbClient.query('SELECT * FROM tasks;');
        await req.dbClient.query('COMMIT');
        res.json(rows);
    } catch (err) {
        await req.dbClient.query('ROLLBACK');
        next(err);
    } finally {
        if (req.dbClient) {
            req.dbClient.release();
        }
    }
});

Critical Production Note: Using SET LOCAL is non-negotiable in an environment with connection pooling. SET (without LOCAL) would modify the session state permanently, causing catastrophic context leakage where one user's session could see another user's data. Always begin a transaction, use SET LOCAL, and then commit or rollback.

Naive RLS: The Inefficient Subquery Approach

Let's start by enabling RLS on our tasks table and creating our first, simple policy.

sql
-- First, enable RLS on the table.
ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;

-- A naive policy that checks project membership directly
CREATE POLICY select_tasks_if_direct_project_member ON tasks
FOR SELECT
USING (
    EXISTS (
        SELECT 1
        FROM memberships
        WHERE
            memberships.project_id = tasks.project_id AND
            memberships.user_id = current_setting('app.current_user_id', true)::uuid
    )
);

This policy works, but only for users directly assigned to a project. It completely ignores the hierarchy. An organization owner would see zero tasks. This is incorrect.

Advanced RLS for Hierarchical Access: The Recursive CTE

To correctly model the hierarchy, we need to check if a user has membership in the task's project, the project's team, or the project's organization. The most elegant way to express this in SQL is with a recursive Common Table Expression (CTE).

Let's build a policy for the projects table first, as access to tasks is derivative of access to projects.

sql
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;

CREATE POLICY select_projects_based_on_hierarchy ON projects
FOR SELECT
USING (
    EXISTS (
        WITH RECURSIVE entity_hierarchy AS (
            -- Base case: Start with the project itself
            SELECT id, team_id, organization_id FROM projects WHERE id = projects.id

            UNION ALL

            -- Recursive step: Traverse up to the team
            SELECT t.id, NULL::uuid, t.organization_id
            FROM teams t
            JOIN entity_hierarchy eh ON eh.team_id = t.id

            UNION ALL

            -- Recursive step: Traverse up to the organization
            SELECT o.id, NULL::uuid, NULL::uuid
            FROM organizations o
            JOIN entity_hierarchy eh ON eh.organization_id = o.id
        )
        SELECT 1
        FROM memberships m
        WHERE m.user_id = current_setting('app.current_user_id', true)::uuid
          AND (
            m.project_id IN (SELECT id FROM entity_hierarchy)
            OR m.team_id IN (SELECT id FROM entity_hierarchy)
            OR m.organization_id IN (SELECT id FROM entity_hierarchy)
          )
    )
);

This is much better. For each project row being considered, it constructs an in-memory hierarchy of that project and its parents, then checks if the current user has a membership record associated with any entity in that hierarchy.

Now we can create a very simple policy on tasks that leverages the check on projects:

sql
CREATE POLICY select_tasks_via_project_access ON tasks
FOR SELECT
USING (
    EXISTS (
        SELECT 1 FROM projects WHERE id = tasks.project_id
    )
);

Because RLS is enabled on projects, the SELECT 1 FROM projects WHERE id = tasks.project_id subquery is itself subject to the select_projects_based_on_hierarchy policy. If the user can't see the project, the EXISTS clause returns false, and the task row is filtered. This composition is a powerful feature of RLS.

Performance Deep Dive I: The Cost of Nested-Loop Execution

The recursive CTE is functionally correct, but we must analyze its performance. Let's populate our database with a significant amount of data and run EXPLAIN ANALYZE.

sql
-- Assumes you have a function to populate data with, e.g., 100 orgs, 1000 teams, 10000 projects.

-- Set the context for a specific user
BEGIN;
SET LOCAL app.current_user_id = 'the-uuid-of-a-user-in-a-large-org';

EXPLAIN ANALYZE SELECT count(*) FROM projects;

-- Example output might look like this:
-- Aggregate  (cost=123456.78..123456.79 rows=1 width=8) (actual time=5432.109..5432.110 rows=1 loops=1)
--   ->  Seq Scan on projects  (cost=0.00..111222.33 rows=12345) (actual time=0.123..5300.987 rows=500 loops=1)
--         Filter: check_policy(....) -- This is the RLS policy
-- Planning Time: 0.213 ms
-- Execution Time: 5432.150 ms

The execution time is abysmal. The query plan reveals the problem: PostgreSQL is performing a Seq Scan on the projects table. For every single row in the table, it executes the RLS policy. The policy's recursive CTE is being run tens of thousands of times. This is effectively a nested-loop join between the projects table and our permission logic, resulting in O(N*M) complexity, where N is the number of projects and M is the complexity of the permission check.

Optimization Strategy 1: The SECURITY DEFINER Helper Function

One common approach to simplify complex RLS logic and potentially improve performance is to encapsulate it in a function. This improves readability but can introduce new performance characteristics and security considerations.

sql
-- The SECURITY DEFINER clause is important. It means the function runs with the privileges
-- of the user who defined it, not the user who invokes it. This allows the function to query
-- tables like `memberships` even if the current user role doesn't have direct SELECT grants.
CREATE OR REPLACE FUNCTION check_project_access(p_project_id UUID, p_user_id UUID) 
RETURNS BOOLEAN AS $$
BEGIN
    RETURN EXISTS (
        WITH RECURSIVE hierarchy AS (
            SELECT id, team_id, organization_id FROM projects WHERE id = p_project_id
            UNION ALL
            SELECT t.id, NULL::uuid, t.organization_id FROM teams t JOIN hierarchy h ON h.team_id = t.id
            UNION ALL
            SELECT o.id, NULL::uuid, NULL::uuid FROM organizations o JOIN hierarchy h ON h.organization_id = o.id
        )
        SELECT 1
        FROM memberships m
        JOIN hierarchy h ON m.project_id = h.id OR m.team_id = h.id OR m.organization_id = h.id
        WHERE m.user_id = p_user_id
    );
END;
$$ LANGUAGE plpgsql STABLE SECURITY DEFINER;

-- Now, rewrite the policy to be much cleaner
DROP POLICY select_projects_based_on_hierarchy ON projects;

CREATE POLICY select_projects_based_on_hierarchy ON projects
FOR SELECT
USING (check_project_access(id, current_setting('app.current_user_id', true)::uuid));

Let's re-run EXPLAIN ANALYZE.

The result will likely be very similar, or even slightly worse due to function call overhead. The planner, by default, treats functions as black boxes and cannot inline the logic to create a more holistic query plan. While cleaner, this hasn't solved the fundamental performance issue.

Performance Deep Dive II: Pre-Caching Permissions

The root of the performance problem is re-calculating permissions for every row. For the lifetime of a single web request, a user's permissions are static. We can exploit this. The optimal strategy is to compute the user's entire set of accessible entities once at the beginning of the request and store it in a session variable. The RLS policies then become incredibly simple, high-speed checks against this pre-computed set.

Step 1: Create a function to calculate and cache all accessible IDs.

This function will find all organizations, teams, and projects a user is a member of (directly or indirectly) and store their UUIDs in a local session variable as a text representation of a PostgreSQL array.

sql
CREATE OR REPLACE FUNCTION cache_user_permissions() 
RETURNS void AS $$
DECLARE
    current_user_id UUID;
    accessible_project_ids UUID[];
BEGIN
    -- Get the user ID from the session context, exiting if not set.
    current_user_id := current_setting('app.current_user_id', true)::uuid;
    IF current_user_id IS NULL THEN
        RETURN;
    END IF;

    -- Use a CTE to find all projects accessible to the user.
    WITH RECURSIVE user_accessible_entities AS (
        -- Direct memberships
        SELECT organization_id, team_id, project_id FROM memberships WHERE user_id = current_user_id

        UNION ALL

        -- Children of accessible organizations (teams)
        SELECT NULL::uuid, t.id, NULL::uuid
        FROM teams t
        JOIN user_accessible_entities uae ON uae.organization_id = t.organization_id

        UNION ALL

        -- Children of accessible teams (projects)
        SELECT NULL::uuid, NULL::uuid, p.id
        FROM projects p
        JOIN user_accessible_entities uae ON uae.team_id = p.team_id

        UNION ALL
        
        -- Children of accessible organizations (projects)
        SELECT NULL::uuid, NULL::uuid, p.id
        FROM projects p
        JOIN user_accessible_entities uae ON uae.organization_id = p.organization_id AND p.team_id IS NULL
    )
    SELECT array_agg(DISTINCT project_id) INTO accessible_project_ids
    FROM user_accessible_entities
    WHERE project_id IS NOT NULL;
    
    -- Store the result in a session-local variable.
    -- We use `|| '{}'` to ensure we never set a NULL value, which `current_setting` dislikes.
    PERFORM set_config('app.accessible_project_ids', array_to_string(accessible_project_ids || '{}', ','), true);

END;
$$ LANGUAGE plpgsql VOLATILE SECURITY DEFINER;

Step 2: Update the application middleware to call this function.

javascript
// Conceptual middleware in a Node.js Express-like framework
async function setUserContextAndPermissions(req, res, next) {
    if (!req.user || !req.user.id) return next();

    const dbClient = await pool.connect();
    req.dbClient = dbClient;

    try {
        await dbClient.query('BEGIN');
        // Set the user ID first, as the caching function depends on it
        await dbClient.query(`SET LOCAL app.current_user_id = '${req.user.id}';`);
        // Now, call the function to compute and cache permissions for this transaction
        await dbClient.query('SELECT cache_user_permissions();');
        next();
    } catch (err) {
        // ... error handling ...
    }
}

Step 3: Drastically simplify the RLS policies.

The policies now just need to check if a resource's ID is present in the cached array. This is an extremely fast operation.

sql
-- Helper function to convert the comma-separated string back to a UUID array
CREATE OR REPLACE FUNCTION get_accessible_project_ids() 
RETURNS UUID[] AS $$
DECLARE
    ids_str TEXT;
BEGIN
    ids_str := current_setting('app.accessible_project_ids', true);
    IF ids_str = '' OR ids_str IS NULL THEN
        RETURN ARRAY[]::UUID[];
    END IF;
    RETURN string_to_array(ids_str, ',')::UUID[];
END;
$$ LANGUAGE plpgsql STABLE;

-- The new, hyper-performant project policy
DROP POLICY IF EXISTS select_projects_based_on_hierarchy ON projects;
CREATE POLICY select_projects_by_cached_ids ON projects
FOR SELECT
USING (id = ANY(get_accessible_project_ids()));

-- The tasks policy remains the same but is now indirectly much faster
DROP POLICY IF EXISTS select_tasks_via_project_access ON tasks;
CREATE POLICY select_tasks_via_project_access ON tasks
FOR SELECT
USING (EXISTS (SELECT 1 FROM projects WHERE id = tasks.project_id));

Now, let's run EXPLAIN ANALYZE on SELECT count(*) FROM projects; again.

text
-- New, optimized output
-- Aggregate  (cost=253.45..253.46 rows=1 width=8) (actual time=15.123..15.124 rows=1 loops=1)
--   ->  Index Only Scan using projects_pkey on projects  (cost=0.42..250.95 rows=500 width=0) (actual time=0.050..14.555 rows=500 loops=1)
--         Index Cond: (id = ANY ('{uuid1,uuid2,...}'::uuid[]))
--         Heap Fetches: 0
-- Planning Time: 0.345 ms
-- Execution Time: 15.180 ms

The difference is staggering: from over 5 seconds to just 15 milliseconds. The query plan has shifted from a full table scan with a complex filter to a highly efficient Index Only Scan using the primary key index to fetch the exact projects the user has access to. We've paid a small, one-time cost to call cache_user_permissions(), and in return, every subsequent query within the transaction is orders of magnitude faster.

Handling Edge Cases and Other Operations

`INSERT`, `UPDATE`, `DELETE` Policies

RLS isn't just for SELECT. You must define policies for modification commands to ensure data integrity.

* USING clause applies to rows that a command might affect (for SELECT, UPDATE, DELETE).

* WITH CHECK clause applies to new rows being created (INSERT) or rows resulting from an update (UPDATE). It ensures a user cannot create data they would then be unable to see.

sql
-- Policy for creating projects
CREATE POLICY insert_projects_in_accessible_entity ON projects
FOR INSERT
WITH CHECK (
    -- User must have access to the parent organization
    EXISTS (SELECT 1 FROM organizations WHERE id = organization_id)
    AND
    -- If a team is specified, user must also have access to that team
    (team_id IS NULL OR EXISTS (SELECT 1 FROM teams WHERE id = team_id))
);

-- Policy for modifying/deleting projects
CREATE POLICY modify_projects_if_accessible ON projects
FOR UPDATE, DELETE
USING (id = ANY(get_accessible_project_ids()));

The `BYPASSRLS` Super-user

Sometimes, you need a role for internal services, migrations, or super-user support that should see all data. Granting the BYPASSRLS attribute to a role achieves this.

sql
-- Create a role for internal services
CREATE ROLE internal_service LOGIN PASSWORD '...';

-- This role will now ignore all RLS policies
ALTER ROLE internal_service WITH BYPASSRLS;

Use this attribute with extreme caution. It is a powerful override that should only be granted to trusted, non-human roles.

RLS and Foreign Key Violations

A subtle but critical edge case: RLS can cause unexpected foreign key constraint violations. Imagine a user tries to insert a task with a project_id that exists but which they cannot see due to RLS. The database checks for the existence of the project row, RLS filters it out, the check fails, and a foreign key violation is returned. This is correct behavior, but it can be confusing to debug. Your application's error handling must be able to distinguish between a project_id that is truly invalid versus one that is simply inaccessible to the current user.

Conclusion: RLS as a Strategic Architectural Choice

Row-Level Security is one of PostgreSQL's most powerful features for building secure, multi-tenant applications. However, as we've seen, a naive implementation for a hierarchical model can lead to catastrophic performance problems.

The journey from a simple subquery to a recursive CTE, and finally to a pre-cached permission set, demonstrates the critical thinking required for production-grade database architecture. The final pattern—a one-time, upfront permission calculation per request, followed by hyper-fast RLS checks—provides the best of all worlds: centralized and unenforced security logic at the data layer, and high performance for the application's queries.

By deeply understanding query execution plans and the session context, you can move RLS from a potential bottleneck to a strategic advantage, ensuring your application's data is not just secure by policy, but secure by design.

Found this article helpful?

Share it with others who might benefit from it.

More Articles