Advanced PostgreSQL RLS for Hierarchical Multi-Tenant SaaS
The Scaling Challenge of RLS in Hierarchical Multi-Tenancy
Row-Level Security (RLS) in PostgreSQL is a powerful mechanism for enforcing data partitioning at the database layer, making it a natural fit for multi-tenant SaaS applications. The canonical example—a policy like USING (tenant_id = current_setting('app.tenant_id'))—is effective for flat tenancy models where a user belongs to a single, isolated tenant.
However, real-world SaaS products rarely have such simple data models. They often feature nested or hierarchical tenancy: a user might be an admin of an organization, granting them implicit access to all teams and projects within that organization, while also being a direct member of a specific team.
Implementing RLS for these models presents a significant architectural and performance challenge. Naive policies that use subqueries or JOINs to traverse the hierarchy on every data access are a recipe for catastrophic performance degradation. The query planner struggles to optimize these dynamic checks, leading to sequential scans and unscalable query times as the number of tenants and users grows.
This article presents a production-proven pattern for implementing highly performant RLS in a hierarchical multi-tenant environment. We will bypass expensive per-query hierarchy traversals by pre-calculating a user's access rights upon authentication, embedding them in a JWT, and using this information to populate a PostgreSQL session context. This approach transforms complex, dynamic RLS checks into simple, index-friendly lookups, ensuring both security and scalability.
We assume you are already familiar with PostgreSQL RLS fundamentals, multi-tenancy concepts, and JWT-based authentication.
Section 1: Defining the Hierarchical Data Model
To ground our discussion, let's define a concrete schema that represents a common SaaS hierarchy. A user's access is determined by their memberships in various organizations and teams, each with a specific role.
organizations: The top-level tenant.teams: Belong to an organization.projects: Belong to a team. This is the primary resource we want to protect.users: The actors in the system.memberships: A join table defining a user's role within an organization or a team.Here is the SQL DDL to create this structure:
-- Enable UUID generation
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Top-level tenants
CREATE TABLE organizations (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
name TEXT NOT NULL
);
-- Teams belong to organizations
CREATE TABLE teams (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
organization_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
name TEXT NOT NULL
);
CREATE INDEX ON teams(organization_id);
-- Projects are the primary resource we'll protect with RLS
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
team_id UUID NOT NULL REFERENCES teams(id) ON DELETE CASCADE,
name TEXT NOT NULL,
data JSONB
);
CREATE INDEX ON projects(team_id);
-- Users
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
email TEXT NOT NULL UNIQUE
);
-- Membership roles
CREATE TYPE membership_role AS ENUM ('org_admin', 'team_member');
-- Join table for user permissions
CREATE TABLE memberships (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
-- A membership can be at the organization or team level
organization_id UUID REFERENCES organizations(id) ON DELETE CASCADE,
team_id UUID REFERENCES teams(id) ON DELETE CASCADE,
role membership_role NOT NULL,
-- Ensure a membership is tied to either an org or a team, but not both
CONSTRAINT org_or_team_check CHECK (
(organization_id IS NOT NULL AND team_id IS NULL) OR
(organization_id IS NULL AND team_id IS NOT NULL)
)
);
CREATE INDEX ON memberships(user_id);
CREATE INDEX ON memberships(organization_id);
CREATE INDEX ON memberships(team_id);
Our access rules are:
team_member can access projects belonging to their specific team.org_admin can access all projects within any team under their organization.Section 2: The Naive RLS Implementation and Its Performance Pitfalls
With our schema defined, a first attempt at an RLS policy for the projects table might involve a complex subquery to check for permissions.
First, let's enable RLS on the projects table:
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
-- Prevent access by default
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
Now, let's create a role for our application users. This role will not have any superuser or bypass privileges.
CREATE ROLE app_user;
-- Grant necessary permissions to the role
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON projects, teams, organizations, users, memberships TO app_user;
Here is the naive RLS policy. It attempts to check both direct team membership and hierarchical organization membership within the policy's USING clause.
CREATE OR REPLACE FUNCTION get_current_user_id() RETURNS UUID AS $$
BEGIN
-- In a real app, this would be derived securely from the session context
-- We cast to UUID to ensure type safety.
RETURN current_setting('app.user_id')::uuid;
EXCEPTION
WHEN OTHERS THEN
RETURN NULL;
END;
$$ LANGUAGE plpgsql STABLE;
CREATE POLICY select_projects ON projects FOR SELECT TO app_user
USING (
team_id IN (
-- Case 1: Direct team membership
SELECT m.team_id
FROM memberships m
WHERE m.user_id = get_current_user_id()
AND m.role = 'team_member'
AND m.team_id IS NOT NULL
UNION
-- Case 2: Hierarchical org admin membership
SELECT t.id
FROM teams t
JOIN memberships m ON t.organization_id = m.organization_id
WHERE m.user_id = get_current_user_id()
AND m.role = 'org_admin'
AND m.organization_id IS NOT NULL
)
);
Why this is a performance disaster:
For every single query against the projects table (e.g., SELECT FROM projects WHERE name = '...'), PostgreSQL must execute the complex subquery inside the USING clause for each row it considers*. The query planner cannot effectively optimize this. It often results in nested loops and an inability to use indexes on the projects table efficiently.
Let's demonstrate with EXPLAIN ANALYZE. First, populate some data:
-- Create a user, org, teams, and projects
INSERT INTO users (id, email) VALUES ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', '[email protected]');
INSERT INTO organizations (id, name) VALUES ('b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'Org A');
INSERT INTO teams (id, organization_id, name) VALUES
('c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13', 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'Team A1'),
('d3eebc99-9c0b-4ef8-bb6d-6bb9bd380a14', 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'Team A2');
-- Make the user an admin of Org A
INSERT INTO memberships (user_id, organization_id, role) VALUES ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 'org_admin');
-- Create 10,000 projects spread across the two teams
INSERT INTO projects (team_id, name, data)
SELECT
CASE WHEN i % 2 = 0 THEN 'c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13'::uuid ELSE 'd3eebc99-9c0b-4ef8-bb6d-6bb9bd380a14'::uuid END,
'Project ' || i,
'{}'
FROM generate_series(1, 10000) s(i);
ANALYZE projects;
ANALYZE teams;
ANALYZE memberships;
Now, let's run a query as this user:
-- Simulate an application connection
SET ROLE app_user;
SET app.user_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
EXPLAIN ANALYZE SELECT count(*) FROM projects;
The query plan will be complex and slow, likely involving multiple scans of the memberships and teams tables for every row check on projects. The cost will scale poorly with the number of users, teams, and organizations.
Section 3: The Session Context Pattern with JWTs
The performant solution is to shift the complex work of calculating permissions to a single point in time: user authentication. We will compute the user's entire set of accessible entity IDs and store them in a stateless token (JWT). On each API request, we'll pass these IDs to PostgreSQL via session variables.
Step 1: The Permissions Calculation Query
Upon successful login, we need a single, efficient query to gather all team_ids the user can access, both directly and hierarchically. A recursive Common Table Expression (CTE) is not ideal here, but a UNION is perfect.
-- Query to get all accessible team IDs for a given user_id
SELECT team_id FROM (
-- Direct team memberships
SELECT m.team_id
FROM memberships m
WHERE m.user_id = $1 -- parameter for user_id
AND m.role = 'team_member'
AND m.team_id IS NOT NULL
UNION
-- Teams accessible via organization admin role
SELECT t.id AS team_id
FROM teams t
JOIN memberships m ON t.organization_id = m.organization_id
WHERE m.user_id = $1 -- parameter for user_id
AND m.role = 'org_admin'
AND m.organization_id IS NOT NULL
) AS accessible_teams;
This query is run once per session/login. It's far more efficient to run this single complex query than to embed it within an RLS policy.
Step 2: Embedding Permissions in the JWT
Your authentication service will execute the above query and embed the results into the JWT payload. This keeps your application stateless while providing the necessary context for the database.
Here's an example of a JWT payload:
{
"sub": "a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11",
"email": "[email protected]",
"exp": 1678886400,
"https://myapp.com/claims": {
"user_id": "a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11",
"accessible_teams": [
"c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13",
"d3eebc99-9c0b-4ef8-bb6d-6bb9bd380a14"
]
}
}
Step 3: Populating the PostgreSQL Session Context
In your application's API middleware, before executing any business logic, you must extract these claims from the validated JWT and set them as PostgreSQL session variables. It is critical to use SET LOCAL to ensure these variables are scoped only to the current transaction, preventing state leakage between requests in a connection pool.
Here is a Node.js (with node-postgres) example:
// Middleware in an Express.js or similar framework
async function setDbSessionContext(req, res, next) {
if (!req.user || !req.user.claims) {
return next(); // No user or claims, proceed without context
}
const { user_id, accessible_teams } = req.user.claims;
// Get a client from the pool
const client = await pool.connect();
req.dbClient = client; // Attach client to the request object
try {
// Start a transaction
await client.query('BEGIN');
// Use SET LOCAL for transaction-level scope
// Pass values as parameters to prevent SQL injection
await client.query(`SET LOCAL app.user_id = $1`, [user_id]);
// Convert array of UUIDs to a comma-separated string for PostgreSQL
const teamsString = accessible_teams.join(',');
await client.query(`SET LOCAL app.accessible_teams = $1`, [teamsString]);
next();
} catch (err) {
// If setting context fails, release the client and send an error
client.release();
res.status(500).send('Failed to set database session context');
}
}
// In your route handler
app.get('/projects', setDbSessionContext, async (req, res) => {
const client = req.dbClient;
try {
const result = await client.query('SELECT * FROM projects');
await client.query('COMMIT');
res.json(result.rows);
} catch (err) {
await client.query('ROLLBACK');
res.status(500).send('Error executing query');
} finally {
if (client) {
client.release();
}
}
});
Section 4: High-Performance RLS Policies with Session Context
Now we can rewrite our RLS policy to be incredibly simple and fast. It no longer contains any subqueries or joins.
-- Helper function to safely get the teams array from the session setting
CREATE OR REPLACE FUNCTION get_accessible_teams() RETURNS UUID[] AS $$
BEGIN
-- The 'true' flag in current_setting makes it return NULL if the setting is not found,
-- preventing an error.
RETURN string_to_array(current_setting('app.accessible_teams', true), ',')::uuid[];
EXCEPTION
-- If the setting is empty or invalid, return an empty array
WHEN OTHERS THEN
RETURN ARRAY[]::uuid[];
END;
$$ LANGUAGE plpgsql STABLE;
-- Drop the old, slow policy
DROP POLICY IF EXISTS select_projects ON projects;
-- Create the new, high-performance policy
CREATE POLICY select_projects_performant ON projects FOR SELECT TO app_user
USING (
team_id = ANY(get_accessible_teams())
);
-- It's good practice to create policies for other operations too
CREATE POLICY insert_projects_performant ON projects FOR INSERT TO app_user
WITH CHECK (
team_id = ANY(get_accessible_teams())
);
-- Similar policies for UPDATE and DELETE
CREATE POLICY update_projects_performant ON projects FOR UPDATE TO app_user
USING (
team_id = ANY(get_accessible_teams())
);
CREATE POLICY delete_projects_performant ON projects FOR DELETE TO app_user
USING (
team_id = ANY(get_accessible_teams())
);
The Performance Difference:
Let's re-run our EXPLAIN ANALYZE with the new setup:
-- Simulate an application connection
SET ROLE app_user;
-- Set the session context as the middleware would
SET app.user_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
SET app.accessible_teams = 'c2eebc99-9c0b-4ef8-bb6d-6bb9bd380a13,d3eebc99-9c0b-4ef8-bb6d-6bb9bd380a14';
EXPLAIN ANALYZE SELECT count(*) FROM projects;
The query plan will now be vastly simpler. PostgreSQL sees team_id = ANY(...) with a constant array. It can use the b-tree index on projects(team_id) to perform a highly efficient Bitmap Index Scan or similar plan. The query cost is no longer dependent on the complexity of your permission hierarchy; it only depends on the number of projects being accessed. This pattern scales beautifully.
Section 5: Edge Case - Handling "Super Admins" and Bypassing RLS
Your application will inevitably need a way to bypass RLS for administrative tasks, data migrations, or customer support tooling.
The BYPASSRLS attribute on a role is the correct tool for this.
CREATE ROLE internal_tool_role BYPASSRLS;
-- Grant this role only the permissions it needs
GRANT SELECT, UPDATE ON users TO internal_tool_role;
internal_tool_role, completely separate from the app_user pool.SET ROLE to the bypass role. -- The main application user
CREATE USER my_app_user WITH PASSWORD '...';
GRANT app_user TO my_app_user;
GRANT internal_tool_role TO my_app_user;
-- In a specific, secure part of your application code:
-- BEGIN;
-- SET LOCAL ROLE internal_tool_role;
-- -- Perform operation that needs to bypass RLS
-- SELECT * FROM projects WHERE ...; -- This will bypass all RLS policies
-- COMMIT; -- The role change is reverted at transaction end
This approach must be meticulously audited and restricted to prevent privilege escalation vulnerabilities.
Section 6: Edge Case - The "Stale JWT" Problem
A critical flaw in this pattern is that permissions are only calculated when the JWT is minted. If a user's permissions change (e.g., they are removed from a team), their existing JWT will contain stale accessible_teams data, granting them access they should no longer have until the token expires.
Here are several strategies to mitigate this, from simplest to most robust:
a. Add a versioning column to your users table (or a dedicated permissions table).
ALTER TABLE users ADD COLUMN permissions_version INTEGER NOT NULL DEFAULT 1;
b. Whenever a user's memberships or roles change, increment this version number in the same transaction.
-- When removing a user from a team
BEGIN;
DELETE FROM memberships WHERE user_id = '...' AND team_id = '...';
UPDATE users SET permissions_version = permissions_version + 1 WHERE id = '...';
COMMIT;
c. Include the permissions_version in the JWT claims when it's created.
"https://myapp.com/claims": {
"user_id": "...",
"accessible_teams": [...],
"permissions_version": 5
}
d. In your API middleware, after validating the JWT, perform a single, lightweight check against the database.
// In the setDbSessionContext middleware
const { user_id, permissions_version: jwt_version } = req.user.claims;
const { rows } = await client.query('SELECT permissions_version FROM users WHERE id = $1', [user_id]);
if (rows.length === 0 || rows[0].permissions_version !== jwt_version) {
// Mismatch! Permissions are stale.
client.release();
return res.status(401).send('Permissions have changed. Please log in again.');
}
// If versions match, proceed to set session context...
This pattern ensures that any permission change immediately invalidates all of a user's active sessions on their next request, forcing a re-authentication to get a fresh JWT with up-to-date claims.
Section 7: Integration with ORMs and Connection Pooling
The use of SET LOCAL is non-negotiable for this pattern's safety in a concurrent environment. This has a direct and critical implication for your connection pooler, especially external poolers like PgBouncer.
COMMIT or ROLLBACK. Another request could then receive that same connection with the previous request's app.* settings still active for a brief moment before the next transaction starts, leading to data leakage. SET LOCAL helps, but it's not a complete guarantee in all edge cases of this mode.SET LOCAL works as expected, as the session state is reliably contained. The downside is slightly lower performance, as connections are held for longer.node-postgres, HikariCP for Java, or SQLAlchemy's pool) effectively operate in a manner similar to session pooling. You get a connection, use it for a request (wrapped in a transaction), and then release it. The key is to ensure your SET LOCAL commands are executed at the beginning of the transaction for every request.Here is a more robust node-postgres example showing proper transaction and client lifecycle management:
// A utility function to wrap the entire request lifecycle
async function withDbClient(req, handler) {
const client = await pool.connect();
try {
await client.query('BEGIN');
if (req.user && req.user.claims) {
const { user_id, accessible_teams } = req.user.claims;
const teamsString = accessible_teams.join(',');
await client.query(`SET LOCAL app.user_id = $1`, [user_id]);
await client.query(`SET LOCAL app.accessible_teams = $1`, [teamsString]);
}
const result = await handler(client);
await client.query('COMMIT');
return result;
} catch (err) {
await client.query('ROLLBACK');
throw err; // Re-throw the error to be handled by an error middleware
} finally {
client.release();
}
}
// Usage in a route
app.get('/projects', async (req, res, next) => {
try {
const projects = await withDbClient(req, (client) => {
return client.query('SELECT id, name FROM projects');
});
res.json(projects.rows);
} catch (err) {
next(err); // Pass to error handler
}
});
Conclusion: A Scalable Architecture for Secure Data Access
By moving complex permission logic out of RLS policies and into a single, upfront authentication step, we transform an unscalable security model into a highly performant one. The session context pattern, powered by JWT claims, provides the database with all the information it needs to make simple, index-friendly access control decisions.
This architecture provides several key benefits:
While this pattern introduces new considerations, such as handling stale JWTs and ensuring correct connection pool configuration, the trade-offs are well worth the dramatic gains in performance and scalability for any non-trivial multi-tenant SaaS application built on PostgreSQL.