Advanced RLS Policies in PostgreSQL for Hierarchical Multi-Tenancy

19 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

Beyond Flat Multi-Tenancy: The Hierarchical Security Challenge

In modern SaaS applications, multi-tenancy is a foundational architectural requirement. The most common implementation involves a simple tenant_id column on every relevant table, with Row-Level Security (RLS) policies enforcing isolation: USING (tenant_id = current_setting('app.tenant_id')). This pattern is effective for flat tenancy models where tenants are isolated peers.

However, this approach shatters when confronted with hierarchical tenancy structures. Consider a system where an organization has multiple teams, and each team has multiple projects. A user, who is a member of a specific team, might need read access to all projects within their parent organization, but write access only to projects within their immediate team. A simple tenant_id check is insufficient to model these cascading, granular permissions.

This article dissects a robust pattern for implementing RLS in such a hierarchical environment. We will not cover the basics of CREATE POLICY. Instead, we will focus on the advanced techniques required to make this model secure, performant, and maintainable in a production system. We'll build a solution from the ground up, starting with a secure context, implementing the core recursive logic, and then diving deep into performance optimization and edge case management.

Defining the Hierarchical Schema

Let's establish our data model. We'll have organizations, teams (which belong to an organization), and projects (which belong to a team). Users are linked directly to teams.

sql
-- Organizations are the top-level tenants.
CREATE TABLE organizations (
    id BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
    name TEXT NOT NULL
);

-- Teams belong to an organization.
CREATE TABLE teams (
    id BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
    organization_id BIGINT NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
    name TEXT NOT NULL
);

-- Projects belong to a team, and therefore transitively to an organization.
CREATE TABLE projects (
    id BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
    team_id BIGINT NOT NULL REFERENCES teams(id) ON DELETE CASCADE,
    name TEXT NOT NULL,
    -- We'll denormalize organization_id for easier policy creation.
    organization_id BIGINT NOT NULL REFERENCES organizations(id) ON DELETE CASCADE
);

-- Users are members of a team.
CREATE TABLE users (
    id BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
    team_id BIGINT NOT NULL REFERENCES teams(id) ON DELETE CASCADE,
    email TEXT NOT NULL UNIQUE
);

-- A simple table to demonstrate RLS on project-related data.
CREATE TABLE tasks (
    id BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
    project_id BIGINT NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    description TEXT NOT NULL,
    completed BOOLEAN NOT NULL DEFAULT false
);

-- Create indexes on foreign keys for performance.
CREATE INDEX ON teams (organization_id);
CREATE INDEX ON projects (team_id);
CREATE INDEX ON projects (organization_id);
CREATE INDEX ON users (team_id);
CREATE INDEX ON tasks (project_id);

Our goal is to create RLS policies that allow:

  • A user to see all projects and tasks within their entire organization.
  • A user to only create, update, or delete projects and tasks belonging to their specific team.
  • Section 1: Securely Setting User Context

    Before we can write policies, we need a secure and reliable way to inform PostgreSQL about the identity of the user executing a query. The standard mechanism for this is current_setting(), which reads transaction-scoped configuration variables. The key is to set these variables in a controlled manner.

    Never trust the client to set these values directly. A compromised API layer could allow a user to set app.user_id to another value and impersonate them. The context must be set by a trusted role, typically from a backend API middleware immediately after authentication.

    Let's create a dedicated schema and a SECURITY DEFINER function to encapsulate this logic. SECURITY DEFINER ensures the function executes with the privileges of the user who defined it (the owner), not the user who calls it.

    sql
    -- A private schema to hold our security helper functions.
    CREATE SCHEMA app_private;
    
    CREATE OR REPLACE FUNCTION app_private.set_current_user_context(user_id BIGINT) 
    RETURNS VOID AS $$
    DECLARE
        user_team_id BIGINT;
        user_org_id BIGINT;
    BEGIN
        -- Find the user's team and organization.
        SELECT u.team_id, t.organization_id 
        INTO user_team_id, user_org_id
        FROM users u
        JOIN teams t ON u.team_id = t.id
        WHERE u.id = user_id;
    
        IF NOT FOUND THEN
            RAISE EXCEPTION 'User not found: %', user_id;
        END IF;
    
        -- Set transaction-local settings. These are secure and isolated.
        -- The 'true' flag indicates that an error should not be thrown if the setting doesn't exist.
        PERFORM set_config('app.current_user_id', user_id::TEXT, true);
        PERFORM set_config('app.current_team_id', user_team_id::TEXT, true);
        PERFORM set_config('app.current_org_id', user_org_id::TEXT, true);
    END;
    $$ LANGUAGE plpgsql SECURITY DEFINER;
    
    -- Revoke execute from public to ensure only trusted roles can call this.
    REVOKE EXECUTE ON FUNCTION app_private.set_current_user_context(BIGINT) FROM PUBLIC;

    In your application's middleware (e.g., in a Node.js Express server), after validating a JWT, you would execute this function at the beginning of each request's database transaction:

    javascript
    // Example middleware in a Node.js application
    async function setUserContext(req, res, next) {
      if (req.user) { // Assuming req.user is populated from a validated JWT
        const client = await pool.connect();
        try {
          // BEGIN transaction implicitly handled by some libraries
          await client.query('SELECT app_private.set_current_user_context($1)', [req.user.id]);
          res.locals.dbClient = client; // Attach client to request for use in handlers
          next();
        } catch (err) {
          client.release();
          next(err);
        }
      } else {
        next();
      }
    }

    This pattern ensures that for the duration of the database transaction serving a single API request, PostgreSQL is aware of the authenticated user's full hierarchical context (user_id, team_id, org_id).

    Section 2: The Core Pattern: Helper Functions in Policies

    Embedding complex logic directly into every RLS policy is a maintenance nightmare. A far better approach is to encapsulate the permission logic within helper functions and call these functions from the policies. This keeps the policies clean and the logic centralized.

    Our primary challenge is determining if a given resource (like a project) is accessible to the current user. For SELECT grants, this means checking if the project's organization matches the user's organization. For INSERT/UPDATE/DELETE, it means checking if the project's team matches the user's team.

    Let's create these helper functions.

    sql
    -- A public schema for functions that can be safely used in RLS policies.
    CREATE SCHEMA app_public;
    
    -- Helper to get the current user ID, with casting.
    CREATE OR REPLACE FUNCTION app_public.current_user_id() RETURNS BIGINT AS $$
        SELECT nullif(current_setting('app.current_user_id', true), '')::BIGINT;
    $$ LANGUAGE sql STABLE;
    
    -- Helper to get the current team ID.
    CREATE OR REPLACE FUNCTION app_public.current_team_id() RETURNS BIGINT AS $$
        SELECT nullif(current_setting('app.current_team_id', true), '')::BIGINT;
    $$ LANGUAGE sql STABLE;
    
    -- Helper to get the current organization ID.
    CREATE OR REPLACE FUNCTION app_public.current_org_id() RETURNS BIGINT AS $$
        SELECT nullif(current_setting('app.current_org_id', true), '')::BIGINT;
    $$ LANGUAGE sql STABLE;
    
    -- The core logic for read access: Is the target entity within the user's organization?
    CREATE OR REPLACE FUNCTION app_public.is_in_my_organization(target_organization_id BIGINT) RETURNS BOOLEAN AS $$
        SELECT target_organization_id = app_public.current_org_id();
    $$ LANGUAGE sql STABLE;
    
    -- The core logic for write access: Is the target entity within the user's immediate team?
    CREATE OR REPLACE FUNCTION app_public.is_in_my_team(target_team_id BIGINT) RETURNS BOOLEAN AS $$
        SELECT target_team_id = app_public.current_team_id();
    $$ LANGUAGE sql STABLE;

    Why STABLE? We mark these functions as STABLE to inform the query planner that they return the same result for all rows within a single statement. This allows for potential optimizations, as the planner knows it doesn't need to re-evaluate the function for every single row being checked.

    Section 3: Implementing Granular RLS Policies

    With our context-setting mechanism and helper functions in place, we can now define the RLS policies. The key is to use different rules for different commands (SELECT vs. INSERT/UPDATE/DELETE).

    First, enable RLS on the tables.

    sql
    ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
    ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;

    Now, create the policies for the projects table.

    sql
    -- Policy for SELECT access: Users can see all projects in their organization.
    CREATE POLICY select_projects ON projects
        FOR SELECT
        USING (app_public.is_in_my_organization(organization_id));
    
    -- Policy for INSERT access: Users can only create projects for their own team.
    CREATE POLICY insert_projects ON projects
        FOR INSERT
        WITH CHECK (app_public.is_in_my_team(team_id));
    
    -- Policy for UPDATE access: Users can only update projects within their own team.
    CREATE POLICY update_projects ON projects
        FOR UPDATE
        USING (app_public.is_in_my_team(team_id));
    
    -- Policy for DELETE access: Users can only delete projects from their own team.
    CREATE POLICY delete_projects ON projects
        FOR DELETE
        USING (app_public.is_in_my_team(team_id));

    Notice the critical difference between USING and WITH CHECK:

    * USING (expression): This applies to existing rows. For SELECT, UPDATE, and DELETE, it determines which rows are visible or targetable by the command.

    * WITH CHECK (expression): This applies to new or updated rows. For INSERT and UPDATE, it ensures that the data being written conforms to the policy. An attempt to insert a row that fails the WITH CHECK will result in an error.

    Let's do the same for the tasks table. Since tasks don't have a direct organization_id, we must perform a JOIN within the policy. This is a common pattern, but it has performance implications we'll address later.

    sql
    CREATE POLICY select_tasks ON tasks
        FOR SELECT
        USING (
            EXISTS (
                SELECT 1
                FROM projects p
                WHERE p.id = tasks.project_id
                AND app_public.is_in_my_organization(p.organization_id)
            )
        );
    
    CREATE POLICY insert_tasks ON tasks
        FOR INSERT
        WITH CHECK (
            EXISTS (
                SELECT 1
                FROM projects p
                WHERE p.id = tasks.project_id
                AND app_public.is_in_my_team(p.team_id)
            )
        );
    
    -- For UPDATE and DELETE, we combine the logic.
    CREATE POLICY modify_tasks ON tasks
        FOR ALL -- Applies to UPDATE and DELETE, as well as SELECT/INSERT if others not defined
        USING (
            EXISTS (
                SELECT 1
                FROM projects p
                WHERE p.id = tasks.project_id
                AND app_public.is_in_my_team(p.team_id)
            )
        )
        WITH CHECK (
            EXISTS (
                SELECT 1
                FROM projects p
                WHERE p.id = tasks.project_id
                AND app_public.is_in_my_team(p.team_id)
            )
        );

    This setup works, but the EXISTS subquery in the tasks policy can be a performance drag. Every query on tasks now requires an additional index scan or seek on the projects table.

    Section 4: Performance Optimization for Deep Hierarchies

    The simple model above works well for shallow hierarchies. But what if our model was Organizations -> Business Units -> Departments -> Teams -> Projects? A simple JOIN becomes a chain of JOINs, and performance degrades rapidly. The real challenge in production RLS is not defining the logic, but ensuring it doesn't kill your database performance.

    Let's explore two advanced optimization patterns.

    Optimization 1: Recursive CTEs for Dynamic Hierarchy Traversal

    What if a manager at the organization level should have access, but also a specific team lead from a different branch of the hierarchy needs one-off access to a project? Our model is too rigid. A more flexible approach uses a proper hierarchy table and a recursive query to find all accessible nodes for a user.

    Let's adjust our schema to be more generic, using a parent-child relationship.

    sql
    -- A more generic hierarchy model
    CREATE TABLE tenants (
        id BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
        parent_id BIGINT REFERENCES tenants(id),
        name TEXT NOT NULL
    );
    CREATE INDEX ON tenants (parent_id);
    
    CREATE TABLE user_tenant_memberships (
        user_id BIGINT NOT NULL REFERENCES users(id),
        tenant_id BIGINT NOT NULL REFERENCES tenants(id),
        PRIMARY KEY (user_id, tenant_id)
    );
    
    -- Our core helper function, now using recursion
    CREATE OR REPLACE FUNCTION app_public.is_descendant_or_self(ancestor_id BIGINT, descendant_id BIGINT) 
    RETURNS BOOLEAN AS $$
    BEGIN
        -- Fast path for direct equality
        IF ancestor_id = descendant_id THEN
            RETURN TRUE;
        END IF;
    
        RETURN EXISTS (
            WITH RECURSIVE hierarchy AS (
                SELECT id FROM tenants WHERE id = descendant_id
                UNION ALL
                SELECT t.parent_id FROM tenants t JOIN hierarchy h ON t.id = h.id WHERE t.parent_id IS NOT NULL
            )
            SELECT 1 FROM hierarchy WHERE id = ancestor_id
        );
    END;
    $$ LANGUAGE plpgsql STABLE;

    Our RLS policy on a resource table would then look like this:

    sql
    -- Assuming 'resources' table has a 'tenant_id'
    CREATE POLICY select_resources ON resources
    FOR SELECT USING (
        EXISTS (
            SELECT 1
            FROM user_tenant_memberships utm
            WHERE utm.user_id = app_public.current_user_id()
            AND app_public.is_descendant_or_self(utm.tenant_id, resources.tenant_id)
        )
    );

    The Problem: Recursive CTEs are powerful but can be slow. For each row being checked by the RLS policy, PostgreSQL may have to execute this recursive query. On a large table, this is a performance disaster. The planner is often unable to optimize this pattern effectively.

    Optimization 2: Materialized Paths and Pre-calculated Permissions (The Production Pattern)

    For high-throughput systems, re-calculating the hierarchy on every read is unacceptable. The solution is to denormalize the hierarchy information in a way that allows for extremely fast lookups.

    Method A: Materialized Paths (ltree)

    PostgreSQL's ltree extension is purpose-built for this. We store the entire ancestral path for each node in the hierarchy.

    sql
    CREATE EXTENSION IF NOT EXISTS ltree;
    
    ALTER TABLE tenants ADD COLUMN path LTREE;
    
    -- We'd need a trigger to maintain this path on insert/update
    CREATE OR REPLACE FUNCTION update_tenant_path() RETURNS TRIGGER AS $$
    BEGIN
        IF NEW.parent_id IS NULL THEN
            NEW.path = NEW.id::text::ltree;
        ELSE
            SELECT path || NEW.id::text FROM tenants WHERE id = NEW.parent_id INTO NEW.path;
            IF NOT FOUND THEN
                RAISE EXCEPTION 'Invalid parent_id %', NEW.parent_id;
            END IF;
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    
    CREATE TRIGGER tenant_path_trigger
    BEFORE INSERT OR UPDATE ON tenants
    FOR EACH ROW EXECUTE FUNCTION update_tenant_path();
    
    -- Create a GiST index for fast path lookups
    CREATE INDEX tenant_path_idx ON tenants USING gist (path);

    Now, our permission check becomes a simple, indexable ltree operation:

    sql
    CREATE OR REPLACE FUNCTION app_public.is_descendant_or_self_ltree(ancestor_id BIGINT, descendant_id BIGINT)
    RETURNS BOOLEAN AS $$
    DECLARE
        ancestor_path LTREE;
        descendant_path LTREE;
    BEGIN
        SELECT path INTO ancestor_path FROM tenants WHERE id = ancestor_id;
        SELECT path INTO descendant_path FROM tenants WHERE id = descendant_id;
    
        -- The '<@' operator means 'is ancestor of or equal to'
        RETURN ancestor_path <@ descendant_path;
    END;
    $$ LANGUAGE plpgsql STABLE;

    This is dramatically faster than the recursive CTE.

    Method B: Pre-calculating Permission Sets

    For the absolute highest read performance, we can pre-calculate the entire set of accessible tenants for each user and store it in a dedicated table.

    sql
    CREATE TABLE user_accessible_tenants (
        user_id BIGINT NOT NULL,
        tenant_id BIGINT NOT NULL,
        PRIMARY KEY (user_id, tenant_id)
    );
    
    -- This table would be populated by a trigger on 'tenants' and 'user_tenant_memberships'
    -- The trigger would calculate all descendants of a user's assigned tenants and insert them.

    Our RLS policy then becomes blindingly fast:

    sql
    CREATE POLICY select_resources_fast ON resources
    FOR SELECT USING (
        EXISTS (
            SELECT 1
            FROM user_accessible_tenants uat
            WHERE uat.user_id = app_public.current_user_id()
            AND uat.tenant_id = resources.tenant_id
        )
    );

    This is just a simple JOIN against a pre-computed lookup table. The trade-off is increased write complexity and storage overhead. For read-heavy workloads, this is often the winning pattern.

    Section 5: Advanced Edge Cases and Production Considerations

    Implementing the core logic is only half the battle. Production systems present numerous edge cases.

    Super-Admin / System Access

    How do you bypass RLS for administrative tasks, migrations, or background jobs? There are two primary approaches:

  • The BYPASSRLS Attribute (The Big Hammer): You can grant a role the BYPASSRLS attribute (ALTER ROLE my_admin_role BYPASSRLS;). Any user connected with this role will ignore all RLS policies. This is simple but dangerous. A single compromised connection with this role exposes all data.
  • The session_replication_role (The Surgical Tool): A safer method for migrations or bulk updates is to set session_replication_role = 'replica'. This disables not only RLS but also triggers and foreign key checks. It's powerful but must be used with extreme care and reset to 'origin' immediately after the operation.
  • Dedicated Admin Policies (The Safest Pattern): A more robust approach is to build bypass logic into your policies. You can define a specific app.is_admin session variable.
  • sql
        -- In set_current_user_context, check if the user is an admin and set the flag
        -- PERFORM set_config('app.is_admin', 'true', true);
    
        -- Modify the policy
        CREATE POLICY select_projects ON projects FOR SELECT
        USING (
            (current_setting('app.is_admin', true) = 'true')
            OR
            (app_public.is_in_my_organization(organization_id))
        );

    This makes bypass an explicit and auditable part of the security model.

    Testing and Debugging RLS

    Debugging RLS can be maddening because it fails silently—queries simply return no rows. Here are some strategies:

    * Impersonation: Use SET ROLE to run queries as a less-privileged application user to test your policies. Wrap it in a transaction to isolate the test.

    sql
        BEGIN;
        -- Set the context for the user you want to test
        SELECT app_private.set_current_user_context(123);
        -- Impersonate the role your application uses
        SET ROLE my_application_role;
        
        -- Run your test query
        EXPLAIN ANALYZE SELECT * FROM tasks WHERE id = 456;
        -- This will show you the query plan with the RLS policy applied.
    
        ROLLBACK;

    * pgTAP: Use a testing framework like pgTAP to write automated tests for your security functions and policies. You can create specific users and tenants, impersonate them, and assert that SELECT queries return the expected number of rows.

    Performance Impact of `SECURITY DEFINER`

    Functions marked SECURITY DEFINER have a performance cost. PostgreSQL must perform additional security checks and cannot inline these functions into the calling query. This is why our pattern uses a single SECURITY DEFINER function to set the context at the start of the transaction, while the per-row helper functions (is_in_my_organization, etc.) are simple sql functions (or plpgsql without SECURITY DEFINER). This allows the query planner to inline the RLS logic directly into the query plan for better performance, a critical micro-optimization that has a large impact at scale.

    Conclusion: RLS as a Centralized Security Pillar

    Implementing Row-Level Security for hierarchical multi-tenancy is a significant step up from simple tenant isolation. It shifts complex authorization logic from the application layer directly into the database, creating a single, authoritative source of truth for data access.

    While the initial implementation is more complex, the benefits are substantial:

    * Security: Authorization rules are consistently enforced, regardless of whether data is accessed via a primary API, a reporting tool, or a direct database connection.

    * Maintainability: The permission logic is co-located with the data it protects, simplifying application code.

    * Performance: By leveraging advanced patterns like materialized paths or pre-calculated permission sets, RLS can be made to perform exceptionally well even on large, complex hierarchies.

    The key takeaway is to move beyond naive RLS implementations. A production-grade system requires a secure context-setting mechanism, centralized helper functions, and a proactive approach to performance optimization. By treating your RLS policies as a core part of your application's architecture, you can build a highly secure and scalable foundation for any multi-tenant application.

    Found this article helpful?

    Share it with others who might benefit from it.

    More Articles