Advanced RLS Policies for Hierarchical Multi-Tenancy in PostgreSQL
The Challenge: Beyond Flat Multi-Tenancy
In modern SaaS applications, simple multi-tenancy—isolating data using a single tenant_id column—is often insufficient. Real-world business logic demands hierarchical permissions. Consider a common structure: an Organization has multiple Teams, and each Team has multiple Users. A regular User can only see their own data. A Team Manager must see all data for their team. An Organization Admin needs access to all data across all teams within their organization. And finally, a Super Admin (your support staff) needs to bypass these rules entirely.
Implementing this logic at the application layer is fraught with peril. It's easy to forget a WHERE clause, leading to catastrophic data leaks. The logic becomes scattered across services, making it difficult to audit and maintain. This is where PostgreSQL's Row-Level Security (RLS) becomes a powerful tool, allowing us to enforce these complex access rules directly within the database, ensuring they are applied to every single query, regardless of its origin.
However, a naive implementation of hierarchical RLS can cripple your database performance. This article is not an introduction to RLS; it assumes you understand the basics. Instead, we will dissect the performance bottlenecks of common RLS patterns and construct a robust, production-grade solution using SECURITY DEFINER functions and session-level configuration parameters.
Schema Foundation for Hierarchical Tenancy
Let's establish our working schema. This structure represents a typical hierarchical model.
-- Enable the pgcrypto extension for UUIDs
CREATE EXTENSION IF NOT EXISTS "pgcrypto";
-- Organizations are the top-level tenants
CREATE TABLE organizations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL
);
-- Teams belong to an organization
CREATE TABLE teams (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
name TEXT NOT NULL
);
-- Users belong to a team and have a role
CREATE TYPE user_role AS ENUM ('member', 'manager', 'admin');
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
team_id UUID NOT NULL REFERENCES teams(id) ON DELETE CASCADE,
email TEXT NOT NULL UNIQUE,
role user_role NOT NULL DEFAULT 'member'
);
-- A sample data table we want to protect
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
owner_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
team_id UUID NOT NULL REFERENCES teams(id) ON DELETE CASCADE,
organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
name TEXT NOT NULL,
confidential_data TEXT
);
-- Create indexes crucial for performance
CREATE INDEX ON teams (organization_id);
CREATE INDEX ON users (team_id);
CREATE INDEX ON projects (owner_id);
CREATE INDEX ON projects (team_id);
CREATE INDEX ON projects (organization_id);
We've denormalized team_id and organization_id onto the projects table. While this adds some storage overhead and write complexity, it's a critical performance optimization that allows our RLS policies to avoid expensive joins.
The Naive RLS Approach and Its Performance Pitfalls
A common first attempt at implementing hierarchical RLS involves writing policies with subqueries or JOINs to determine the current user's permissions.
Let's define a policy for projects that enforces our business rules:
member: Can see projects they own.manager: Can see all projects within their team.admin: Can see all projects within their organization.-- !! WARNING: THIS IS THE INEFFICIENT, NAIVE APPROACH !!
-- First, let's create a role for our application to use
CREATE ROLE app_user NOLOGIN;
-- Enable RLS on the projects table
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
-- Create the policy
CREATE POLICY projects_hierarchical_access ON projects
FOR ALL
TO app_user
USING (
-- Check if the current user is the owner (for 'member' role)
owner_id = (SELECT id FROM users WHERE email = current_user) OR
-- Check if the user is a 'manager' and the project is in their team
EXISTS (
SELECT 1 FROM users
WHERE email = current_user
AND role = 'manager'
AND team_id = projects.team_id
) OR
-- Check if the user is an 'admin' and the project is in their organization
EXISTS (
SELECT 1 FROM users u
JOIN teams t ON u.team_id = t.id
WHERE u.email = current_user
AND u.role = 'admin'
AND t.organization_id = projects.organization_id
)
);
At first glance, this seems logical. It correctly expresses our rules. The problem is performance. PostgreSQL will append this USING clause to every single query on the projects table. Let's populate some data and see the impact.
-- Sample Data Generation
-- (In a real scenario, this would be millions of rows)
INSERT INTO organizations (id, name) VALUES ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 'Org A');
INSERT INTO teams (id, organization_id, name) VALUES ('b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 'Team Alpha');
INSERT INTO teams (id, organization_id, name) VALUES ('c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13', 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 'Team Beta');
INSERT INTO users (id, team_id, email, role) VALUES ('d3eebc99-9c0b-4ef8-bb6d-6bb9bd380a14', 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', '[email protected]', 'admin');
INSERT INTO users (id, team_id, email, role) VALUES ('e4eebc99-9c0b-4ef8-bb6d-6bb9bd380a15', 'c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13', '[email protected]', 'manager');
INSERT INTO users (id, team_id, email, role) VALUES ('f5eebc99-9c0b-4ef8-bb6d-6bb9bd380a16', 'c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13', '[email protected]', 'member');
-- Generate a large number of projects
INSERT INTO projects (owner_id, team_id, organization_id, name, confidential_data)
SELECT
'f5eebc99-9c0b-4ef8-bb6d-6bb9bd380a16', -- owned by [email protected]
'c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13', -- in Team Beta
'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', -- in Org A
'Project ' || g.i,
'secret data'
FROM generate_series(1, 1000000) g(i);
ANALYZE projects;
Now, let's analyze a query as the organization admin:
-- Simulate running a query as the admin user
SET ROLE '[email protected]';
EXPLAIN ANALYZE SELECT count(*) FROM projects;
The query plan will reveal the problem. For every single row in the projects table, PostgreSQL has to execute the subqueries in the policy. This leads to a nested loop where the inner side is a repeated lookup on the users and teams tables. The performance will be abysmal, especially as the projects table grows.
The Performant Solution: `SECURITY DEFINER` and Session Context
The key to performant RLS is to make the policy check as simple as possible. Ideally, it should be a direct comparison against an immutable value, allowing the query planner to use indexes effectively. We can achieve this by pre-calculating the user's permissions once per session and storing them in session-level configuration parameters (also known as GUCs - Grand Unified Configuration variables).
Here's the pattern:
- When a user authenticates, the application backend starts a new database session.
SECURITY DEFINER function to fetch the user's ID, role, team ID, and organization ID.rls.user_id, rls.role).current_setting() function.This approach is vastly more performant because the expensive permission lookup happens only once per session, not once per row.
Step 1: Create Secure Helper Functions
A SECURITY DEFINER function runs with the privileges of the user who defined it, not the user who invokes it. This is powerful but requires careful security considerations. We'll use it to allow a low-privilege application user to set session context based on a lookup they couldn't normally perform.
-- Create a privileged user to own the security functions
CREATE ROLE rls_owner NOLOGIN;
GRANT app_user TO rls_owner;
-- This function securely fetches user details and sets them in the session.
-- It's defined with SECURITY DEFINER to run as 'rls_owner'.
CREATE OR REPLACE FUNCTION set_user_context(user_email TEXT)
RETURNS void
LANGUAGE plpgsql
SECURITY DEFINER
-- IMPORTANT: Prevent users from manipulating the search path to hijack functions
SET search_path = public
AS $$
DECLARE
user_rec RECORD;
BEGIN
-- Find the user and their hierarchy
SELECT u.id, u.role, t.id as team_id, t.organization_id
INTO user_rec
FROM users u
JOIN teams t ON u.team_id = t.id
WHERE u.email = user_email;
IF FOUND THEN
-- Set the session variables. The 'true' flag makes them local to the current transaction.
-- For connection pools, you might set this to 'false' to persist for the session.
PERFORM set_config('rls.user_id', user_rec.id::text, true);
PERFORM set_config('rls.user_role', user_rec.role::text, true);
PERFORM set_config('rls.team_id', user_rec.team_id::text, true);
PERFORM set_config('rls.organization_id', user_rec.organization_id::text, true);
ELSE
-- Handle case where user is not found
RAISE EXCEPTION 'User not found: %', user_email;
END IF;
END;
$$;
-- Change ownership to our privileged role
ALTER FUNCTION set_user_context(TEXT) OWNER TO rls_owner;
-- The app_user role only needs to EXECUTE this function
GRANT EXECUTE ON FUNCTION set_user_context(TEXT) TO app_user;
-- Helper functions to easily access the context variables. This is optional
-- but makes policies more readable. They should be lightweight and IMMUTABLE.
CREATE OR REPLACE FUNCTION current_user_id() RETURNS UUID LANGUAGE sql STABLE AS $$ SELECT nullif(current_setting('rls.user_id', true), '')::uuid $$;
CREATE OR REPLACE FUNCTION current_user_role() RETURNS user_role LANGUAGE sql STABLE AS $$ SELECT nullif(current_setting('rls.user_role', true), '')::user_role $$;
CREATE OR REPLACE FUNCTION current_team_id() RETURNS UUID LANGUAGE sql STABLE AS $$ SELECT nullif(current_setting('rls.team_id', true), '')::uuid $$;
CREATE OR REPLACE FUNCTION current_organization_id() RETURNS UUID LANGUAGE sql STABLE AS $$ SELECT nullif(current_setting('rls.organization_id', true), '')::uuid $$;
Security Note: The SET search_path = public is critical. Without it, a malicious user could create a function named set_config in their own schema, and if their schema is earlier in the search_path, the SECURITY DEFINER function would execute their malicious code with elevated privileges.
Step 2: Rewrite the RLS Policy
Now we can replace our complex, slow policy with a highly efficient one that uses these new functions.
-- Drop the old, inefficient policy
DROP POLICY IF EXISTS projects_hierarchical_access ON projects;
-- Create the new, performant policy
CREATE POLICY projects_hierarchical_access_performant ON projects
FOR ALL
TO app_user
USING (
(
current_user_role() = 'admin' AND
organization_id = current_organization_id()
) OR
(
current_user_role() = 'manager' AND
team_id = current_team_id()
) OR
(
current_user_role() = 'member' AND
owner_id = current_user_id()
)
);
This policy is a thing of beauty. It contains no subqueries and no joins. It's a simple boolean check against values that are constant for the duration of the query. The PostgreSQL query planner can treat these current_setting() calls as immutable, allowing it to use indexes on organization_id, team_id, and owner_id with maximum efficiency.
Step 3: Application Integration
Your application's database connection logic now needs to call set_user_context immediately after acquiring a connection from the pool.
Here's a conceptual example in a Node.js application using pg:
const { Pool } = require('pg');
const pool = new Pool();
async function getClientForUser(userEmail) {
const client = await pool.connect();
try {
// Set the context for this session
await client.query('SELECT set_user_context($1)', [userEmail]);
// The role must be set for the RLS policy to apply
await client.query('SET ROLE app_user');
return client;
} catch (err) {
client.release(); // Release client on error
throw err;
}
}
// Usage:
async function getUserProjects(userEmail) {
const client = await getClientForUser(userEmail);
try {
const res = await client.query('SELECT id, name FROM projects LIMIT 10');
return res.rows;
} finally {
// IMPORTANT: Always release the client back to the pool
client.release();
}
}
Performance Analysis: The Proof
Let's rerun our EXPLAIN ANALYZE test with the new system. We'll simulate the application's behavior.
-- Start a new session context
RESET ROLE;
-- 1. Set the context for the admin user
SELECT set_user_context('[email protected]');
-- 2. Set the role to activate the policy
SET ROLE app_user;
-- 3. Run the query and analyze the plan
EXPLAIN ANALYZE SELECT count(*) FROM projects;
Expected Result Breakdown:
* Naive Policy: The plan would likely show a Seq Scan on projects. For each row, it would perform lookups on users and teams. The cost would be massive.
* Optimized Policy: The plan will be dramatically different. The planner sees organization_id = current_organization_id() as organization_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'. It can use the index on projects(organization_id) to perform a highly efficient Index-Only Scan or Bitmap Heap Scan. The execution time will be orders of magnitude faster.
The difference between milliseconds and multiple seconds (or even minutes) on large tables is the hallmark of this pattern.
Edge Cases and Production Considerations
Implementing this pattern in production requires handling several advanced scenarios.
1. Super-Admin Bypass
Your support or operations team may need to bypass RLS entirely. PostgreSQL has a built-in mechanism for this.
-- Create a super-admin role
CREATE ROLE super_admin BYPASSRLS NOLOGIN;
-- Grant it to your ops user
CREATE USER ops_user PASSWORD '...';
GRANT super_admin TO ops_user;
When a user connects as ops_user, their BYPASSRLS attribute means RLS policies are simply not applied to their queries. This is much cleaner than adding OR is_super_admin() logic to every policy.
2. Connection Pooling (The Big Gotcha)
This is the most critical operational challenge. Session-level context (set_config) interacts dangerously with transaction-level connection poolers like PgBouncer. In transaction pooling mode, a client might get a different database session for each transaction. This means the context you set in transaction 1 will be gone in transaction 2!
Solutions:
* Session Pooling: Configure your pooler (PgBouncer, Pgpool-II) to use session pooling mode. This dedicates a database connection to a client for the duration of the client's connection. This is the simplest solution but may require more database connections.
* Transaction-Local Context: In our set_user_context function, we used set_config(..., true). The true makes the setting local to the current transaction. This is safer for transaction-level pooling, but it means you must call set_user_context at the beginning of every single transaction. This can be managed with application-level middleware.
* Advisory Locks: A more advanced pattern involves using advisory locks. The application takes a specific lock for its user session, and the set_user_context function is only called if the lock is not already held for that connection. This can reduce the overhead of setting the context on every transaction but adds complexity.
For most applications, combining transaction-local context with middleware that runs set_user_context at the start of each transaction is the most robust approach.
3. Testing and Debugging
Testing RLS is notoriously difficult. You need to verify not only that users can see the data they're supposed to, but also that they cannot see data they aren't.
* Use SET ROLE: Your test suite should create a set of test users with different roles and in different teams/orgs. In your tests, you can impersonate these users to verify the policies are working as expected.
-- In your test setup
SELECT set_user_context('[email protected]');
SET ROLE app_user;
-- Now run assertions
-- e.g., Assert that `SELECT COUNT(*) FROM projects` returns the correct, small number.
-- Switch to another user
RESET ROLE;
SELECT set_user_context('[email protected]');
SET ROLE app_user;
-- Assert that `SELECT COUNT(*) FROM projects` returns a much larger number.
* Dedicated Test Schemas: Use a framework like pg_prove or your application's testing framework to run tests within transactions that are rolled back, ensuring a clean state for each test case.
4. Views and Security Barriers
When you create a view on a table with RLS, the policy can sometimes be subverted if the view's WHERE clause is evaluated before the RLS policy. This can leak information through side-channels (e.g., an error message revealing the existence of a row).
To prevent this, use the security_barrier attribute on views that access RLS-protected tables.
CREATE VIEW project_summaries WITH (security_barrier = true) AS
SELECT id, name, organization_id
FROM projects
WHERE name LIKE 'Critical%';
This forces the RLS policy on projects to be evaluated before the view's WHERE clause, preventing rows from being filtered out prematurely and then leaking information.
Conclusion
Row-Level Security is a transformative feature for building secure, multi-tenant applications in PostgreSQL. However, moving from simple, flat tenancy to complex hierarchical permissions requires a disciplined architectural approach. Naive RLS policies with subqueries and joins are a performance trap that will not scale.
The production-grade pattern is clear:
SECURITY DEFINER function to perform the complex, one-time lookup of a user's permissions and hierarchy.set_config().current_setting(), allowing the query planner to use indexes effectively.By adopting this pattern, you can build a data access layer that is not only robustly secure but also highly performant, keeping your authorization logic clean, centralized, and co-located with the data it protects.