Postgres RLS with JWT Claims for Multi-Tenant SaaS Isolation
The Inevitable Flaw in Application-Layer Tenancy
For any non-trivial multi-tenant SaaS application, data isolation is not a feature; it's the bedrock of trust. The common approach—diligently adding a WHERE tenant_id = ? clause to every single database query—is a ticking time bomb. It relies on perfect, perpetual developer discipline. A single forgotten clause in a complex JOIN or a deeply nested subquery, and you've created a catastrophic data leak. This application-layer enforcement is fragile, difficult to audit, and creates a massive surface area for bugs.
As senior engineers, we must architect for resilience against human error. The principle of least privilege dictates that the application's database connection itself should be incapable of accessing data outside its designated tenant context. This is where we move enforcement from the fallible application layer to the authoritative database layer using PostgreSQL's Row-Level Security (RLS).
This article isn't an introduction to RLS. It assumes you know what it is. Instead, we will focus on a specific, production-proven pattern: dynamically driving RLS policies using claims extracted from a JSON Web Token (JWT). We will set up a secure bridge between our stateless application server and the PostgreSQL session to enforce tenancy transparently and immutably for every query executed within a request's lifecycle.
The RLS + JWT Pattern: A Database-Enforced Architecture
The core of this pattern is to make the database session tenant-aware. When a user makes an authenticated API request, their JWT contains claims identifying them, their role, and, crucially, their tenant. Our goal is to securely transmit these claims to Postgres so that RLS policies can use them as the basis for access control decisions.
Here's the high-level flow for a single API request:
SET commands that store the JWT claims as session-level configuration variables. For example: SET session.user.id = '...'; SET session.user.tenant_id = '...'; SET session.user.role = '...';WHERE tenant_id = ? clauses. The code becomes blissfully unaware of tenancy.SELECT, INSERT, UPDATE, or DELETE, PostgreSQL automatically and implicitly appends the conditions defined in our RLS policies, which read the session variables we just set.This architecture is powerful because it centralizes tenancy logic in one place—the database schema definition—making it auditable, robust, and nearly impossible for application-level code to bypass.
Core Implementation Walkthrough
Let's build a practical implementation using a Node.js backend with the pg library. We'll model a simple scenario: a projects table where each project belongs to a tenant.
1. Schema, Roles, and Table Setup
First, we establish our database objects. We'll create a dedicated, unprivileged role (app_user) that our application will use to connect. This role will not own tables but will be granted specific permissions.
-- Create a dedicated role for our application
-- NOLOGIN prevents direct connections, but it can be assumed by other roles.
-- We will connect as a superuser and SET ROLE to app_user.
-- Or, create it with a password and connect directly.
CREATE ROLE app_user NOLOGIN;
-- Create tenants and users tables for context
CREATE TABLE tenants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL
);
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
email TEXT NOT NULL UNIQUE,
role TEXT NOT NULL DEFAULT 'member' -- e.g., 'member', 'admin'
);
-- The core multi-tenant table
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
name TEXT NOT NULL,
created_by UUID NOT NULL REFERENCES users(id)
);
-- Grant base permissions to the application role
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON projects, users, tenants TO app_user;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO app_user;
-- CRITICAL: Enable RLS on the target table
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
-- IMPORTANT: By default, no rows are visible or modifiable. This is a secure default.
2. The Bridge: JWT Claims to Postgres Session Variables
Now for the critical link. We need a secure way for RLS policies to access the current user's claims. We can't trust the application to pass these as query parameters. Instead, we use current_setting() with a custom namespace (e.g., jwt.claims.).
To prevent tampering and provide a clean interface, we wrap current_setting() in SECURITY DEFINER functions. This is a crucial security step. A SECURITY DEFINER function executes with the privileges of the user who defined it (e.g., a superuser), not the user who calls it (app_user). This allows us to control access to session variables tightly.
-- Helper function to get the current tenant_id from the session.
-- The 'true' flag makes it permissive to missing settings (returns NULL).
CREATE OR REPLACE FUNCTION get_current_tenant_id() RETURNS UUID AS $$
BEGIN
RETURN current_setting('jwt.claims.tenant_id', true)::UUID;
EXCEPTION
-- Handle cases where the setting is not a valid UUID
WHEN invalid_text_representation THEN
RETURN NULL;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
-- Helper function to get the current user's role.
CREATE OR REPLACE FUNCTION get_current_user_role() RETURNS TEXT AS $$
SELECT current_setting('jwt.claims.role', true);
$$ LANGUAGE plpgsql SECURITY DEFINER;
-- Revoke default public execution rights and grant only to our app user
REVOKE EXECUTE ON FUNCTION get_current_tenant_id() FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION get_current_user_role() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION get_current_tenant_id() TO app_user;
GRANT EXECUTE ON FUNCTION get_current_user_role() TO app_user;
Security Note: Using SECURITY DEFINER is powerful and requires caution. Here, it's safe because the function's logic is static and doesn't execute arbitrary SQL. We also explicitly revoke public access to prevent any other database role from calling these functions.
3. Crafting the Security Barrier: RLS Policies
With our helper functions in place, defining the RLS policies becomes declarative and clean. We'll create a policy that ensures users can only see projects belonging to their own tenant.
-- First, drop any existing policy to ensure a clean state
DROP POLICY IF EXISTS projects_tenant_isolation ON projects;
-- Create the RLS policy
CREATE POLICY projects_tenant_isolation ON projects
AS PERMISSIVE -- This is the default, but explicit is good
FOR ALL -- Applies to SELECT, INSERT, UPDATE, DELETE
TO app_user -- Only applies to this role
USING ( tenant_id = get_current_tenant_id() )
WITH CHECK ( tenant_id = get_current_tenant_id() );
Let's break this down:
* FOR ALL: The policy applies to all command types.
* USING ( ... ): This clause is for SELECT, UPDATE, and DELETE. A row is only visible or modifiable if this expression returns true.
* WITH CHECK ( ... ): This clause is for INSERT and UPDATE. It prevents a user from creating a new row or updating an existing row to a state that would violate the USING clause. In this case, it prevents a user from tenant A from inserting a project with tenant_id of tenant B.
4. The Node.js Middleware: Tying It All Together
Finally, the application code. Here's an example of an Express.js middleware that orchestrates the connection checkout, session variable setting, and cleanup.
// lib/db.js - A simplified connection pool setup
const { Pool } = require('pg');
const pool = new Pool({
connectionString: process.env.DATABASE_URL,
});
module.exports = { pool };
// middleware/db-context.js
const { pool } = require('../lib/db');
// This middleware wraps each request in a DB transaction
// and sets the RLS context variables from the JWT.
const dbContext = async (req, res, next) => {
// Assume a previous middleware has validated the JWT and attached it to req.user
if (!req.user) {
// Or handle as an unauthenticated request
return next();
}
const { tenantId, userId, role } = req.user;
if (!tenantId) {
// Every authenticated user in a multi-tenant system MUST have a tenant.
return res.status(400).send({ error: 'Missing tenant identifier in token.' });
}
let client;
try {
client = await pool.connect();
} catch (err) {
console.error('Failed to get client from pool', err);
return res.status(503).send({ error: 'Service temporarily unavailable.' });
}
try {
// Start the transaction
await client.query('BEGIN');
// Set the session variables. Use pg-format for safe value escaping,
// although for these specific values it's less critical than for user input.
// IMPORTANT: These settings are local to the transaction.
await client.query(`SET LOCAL session.user.id = '${userId}';`);
await client.query(`SET LOCAL session.user.tenant_id = '${tenantId}';`);
await client.query(`SET LOCAL session.user.role = '${role}';`);
// Make the client available to downstream route handlers
req.dbClient = client;
// Wait for the route handler to finish
await new Promise(resolve => res.on('finish', resolve));
// Once the response is finished, commit the transaction
await client.query('COMMIT');
} catch (err) {
console.error('Error during transaction, rolling back', err);
if (client) {
try {
await client.query('ROLLBACK');
} catch (rollbackErr) {
console.error('Failed to rollback', rollbackErr);
}
}
// If the response hasn't been sent, send an error
if (!res.headersSent) {
res.status(500).send({ error: 'An internal server error occurred.' });
}
} finally {
if (client) {
client.release();
}
// Ensure next() is called even on error, if no response was sent.
// Though in this structure, we handle the response directly.
if (!res.headersSent) {
next();
}
}
};
// Example route handler
// GET /projects
app.get('/projects', dbContext, async (req, res) => {
// Notice the query has NO tenant_id filter. RLS handles it.
const { rows } = await req.dbClient.query('SELECT id, name FROM projects');
res.json(rows);
});
Key Implementation Details:
SET LOCAL: This is critical. SET LOCAL scopes the variable to the current transaction only*. When the transaction ends (COMMIT or ROLLBACK), the setting is reverted. This is the key to safely using a connection pool, preventing one user's session variables from leaking into another user's request.
* Error Handling & Cleanup: The finally block ensures client.release() is always called, returning the connection to the pool. The try/catch block ensures a ROLLBACK on any error.
Middleware Order: This dbContext middleware must run after your authentication middleware (which populates req.user) and before* your route handlers.
Advanced Scenarios: Beyond Simple Tenant Isolation
Real-world systems are more complex. Let's extend our policies.
1. Hierarchical Roles within a Tenant
What if a tenant has admin users who can see all projects, and member users who can only see projects they created? We can create multiple policies, and Postgres will combine them using OR (for PERMISSIVE policies, which is the default and most common type).
-- Policy for tenant admins: they can see all projects in their tenant.
CREATE POLICY projects_tenant_admin ON projects
FOR SELECT
USING (
get_current_user_role() = 'admin' AND tenant_id = get_current_tenant_id()
);
-- Policy for tenant members: they can only see projects they created within their tenant.
CREATE POLICY projects_tenant_member ON projects
FOR SELECT
USING (
get_current_user_role() = 'member' AND
tenant_id = get_current_tenant_id() AND
created_by = current_setting('jwt.claims.user_id', true)::UUID
);
-- The policy for writes can remain simple: you can only write within your tenant.
-- The application logic would handle the admin/member distinction for writes.
-- Or you could create separate policies for UPDATE/DELETE.
DROP POLICY IF EXISTS projects_tenant_isolation ON projects;
CREATE POLICY projects_write_isolation ON projects
FOR INSERT, UPDATE, DELETE
USING ( tenant_id = get_current_tenant_id() )
WITH CHECK ( tenant_id = get_current_tenant_id() );
Now, when a user with the 'admin' role runs a SELECT, the projects_tenant_admin policy matches. When a 'member' runs a SELECT, they must satisfy the projects_tenant_member policy. Postgres evaluates all permissive policies and if any of them returns true for a given row, the row is visible.
2. Handling Super-Admin / Support Access
Your support team might need to access data across all tenants. Hard-coding a bypass in the application is a security risk. A better way is to create a special role and a bypass policy.
-- Create a special role in the database for support staff
CREATE ROLE support_user NOLOGIN;
GRANT app_user to support_user; -- It inherits app_user's base permissions
-- Policy that bypasses tenant checks for the support_user role
CREATE POLICY super_admin_bypass ON projects
FOR ALL
TO support_user -- This policy ONLY applies to this role
USING (true); -- This user can see everything
In your application, you would have a separate, internal tool for the support team. When a support person logs in, the application connects to the database, and after connecting, it runs SET ROLE support_user;. All subsequent queries in that session will now run as support_user, activating the bypass policy. This is far more secure and auditable than application-level flags.
Performance Deep Dive: The Cost of Security
RLS is not free. Every query on an RLS-enabled table incurs the overhead of executing the policy functions. Poorly written policies or missing indexes can cripple performance.
1. Analyzing RLS Policy Impact with `EXPLAIN ANALYZE`
The most important tool is EXPLAIN ANALYZE. Let's see how RLS affects a query plan.
Imagine we have a projects table with millions of rows and an index on tenant_id.
CREATE INDEX idx_projects_tenant_id ON projects(tenant_id);
Now, let's analyze a query. First, we need to simulate the session context:
BEGIN;
SET LOCAL session.user.tenant_id = 'some-tenant-uuid';
EXPLAIN ANALYZE SELECT * FROM projects WHERE name LIKE 'Q4 Report%';
ROLLBACK;
The output will show a Filter or Recheck Cond step corresponding to the RLS policy. A good plan will look something like this:
Bitmap Heap Scan on projects (cost=12.44..152.91 rows=10 width=128) (actual time=0.08..0.15 rows=5 loops=1)
Recheck Cond: (name ~~ 'Q4 Report%')
Filter: (tenant_id = get_current_tenant_id())
-> Bitmap Index Scan on idx_projects_name (cost=0.00..12.44 rows=100 width=0) (actual time=0.05..0.05 rows=5 loops=1)
Index Cond: (name ~~ 'Q4 Report%')
Here, Postgres uses an index on name first and then applies the RLS Filter. This is okay, but not ideal. A great plan would incorporate the tenant_id check into the index scan itself.
2. Indexing Strategies for RLS
For RLS to be performant, the columns used in your USING clauses must be indexed. For our primary policy, (tenant_id = get_current_tenant_id()), the idx_projects_tenant_id is essential.
To help the planner, ensure your queries also include the tenant ID where possible, even though it's redundant from a security perspective. This can sometimes lead to better plans.
The ideal index for multi-tenant queries is often a composite index, starting with tenant_id.
CREATE INDEX idx_projects_tenant_name ON projects(tenant_id, name);
With this index, the query planner is much more likely to produce an optimal plan that first narrows down the search space to the correct tenant and then searches within that small subset, which is significantly faster.
3. The `current_setting` Overhead
The current_setting() function itself has some overhead. By default, Postgres may treat functions as volatile, meaning it re-evaluates them for every row. You can provide a hint to the planner by marking your helper functions as STABLE if their result is guaranteed not to change within a single query.
CREATE OR REPLACE FUNCTION get_current_tenant_id() RETURNS UUID AS $$
-- ... function body ...
$$ LANGUAGE plpgsql STABLE SECURITY DEFINER;
A STABLE function's result can be cached within a query. This is safe here because our SET LOCAL command ensures the setting is fixed for the duration of the transaction/query. This can provide a noticeable performance boost on tables with many rows.
Production Pitfalls and Edge Cases
1. Connection Pooling and State Leakage
This is the most dangerous pitfall. If you use SET instead of SET LOCAL, the setting will persist on the connection even after it's returned to the pool. The next request that picks up that connection will inherit the previous user's identity. Always use SET LOCAL or wrap the context-setting in a transaction. The transaction-based approach shown in the Node.js example is the most robust way to guarantee cleanup.
Another strategy is to issue a RESET ALL; or DISCARD ALL; command before returning a connection to the pool, but this is less efficient and reliable than the transactional guarantee.
2. Security Hardening: `search_path` and `FORCE RLS`
To prevent malicious users from creating functions or tables that could interfere with your RLS policies, it's good practice to lock down the search_path for your application user.
ALTER ROLE app_user SET search_path = public;
Furthermore, to ensure table owners cannot bypass RLS, you should enforce it for them as well.
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
This prevents the table owner, and by extension any SECURITY DEFINER function they might create, from bypassing the RLS policies you have defined.
3. Handling Null or Missing JWT Claims
What happens if the jwt.claims.tenant_id is not set? Our get_current_tenant_id() function returns NULL. The policy check tenant_id = NULL will evaluate to NULL (which is treated as false), and the user will see no data. This is a secure default. Your middleware should proactively check for the presence of a tenantId in the JWT and reject the request early, providing a clear error message rather than letting the user see an empty data set and get confused.
Conclusion
Implementing Row-Level Security driven by JWT claims is a significant architectural step up from application-layer tenancy enforcement. It co-locates security rules with the data itself, making your application more robust, secure, and auditable. While it introduces complexity in the form of policy management, session context handling, and performance tuning, the payoff is a system that is fundamentally more secure by design.
The key takeaways for a production-grade implementation are:
SECURITY DEFINER functions to expose session variables to policies.SET LOCAL within a transaction when using a connection pool.STABLE and create composite indexes starting with your tenancy key (tenant_id).EXPLAIN ANALYZE to ensure your RLS policies are not creating performance bottlenecks.By adopting this pattern, you move from a state of hoping developers remember the WHERE clause to a state of guaranteeing data isolation at the most fundamental layer of your stack.