PostgreSQL RLS with JWTs: A Deep Dive for Serverless Architectures
The Fragility of Application-Layer Tenancy
In any multi-tenant system, the cardinal sin is cross-tenant data access. The standard approach involves meticulously adding a WHERE tenant_id = :current_tenant_id clause to every single database query. While functional, this pattern is notoriously brittle. It relies on developer discipline, exhaustive code reviews, and ORM features that can sometimes be bypassed. A single developer mistake, a complex JOIN, or a raw SQL query can silently introduce a severe security vulnerability.
The principle of defense-in-depth suggests we shouldn't rely on a single layer of security. If the application layer is the primary gatekeeper, the database should be the impenetrable vault. This is where PostgreSQL's Row-Level Security (RLS) becomes a game-changer. By defining security policies directly on the database tables, we move the enforcement boundary from fallible application code to the database itself. An application query that forgets a WHERE clause will simply return zero rows, not data from another tenant.
However, integrating RLS into a modern serverless architecture using JWT for authentication introduces its own set of complex challenges. How do we securely inform PostgreSQL about the identity of the user making the request? How does this work with stateless serverless functions and connection poolers? What are the performance implications? This article dissects these advanced challenges and provides production-grade solutions.
The Core Architecture: JWT Claims to RLS Policies
Our target architecture looks like this:
Authorization: Bearer header.- API Gateway and an authorizer validate the JWT's signature and basic claims.
- The serverless function receives the request and the decoded JWT payload.
- Before executing any business logic, the function acquires a database connection and sets a session-specific configuration variable containing the JWT claims.
SELECT * FROM projects;) without any explicit WHERE tenant_id = ? clauses.projects table, it invokes the RLS policy.tenant_id and role.USING (tenant_id = 'current-tenant-id-from-session')) to the query's execution plan.- The query returns only the rows that the user is authorized to see.
- The connection is returned to the pool, and the session variable is reset.
This approach centralizes tenancy logic within the database schema, making the application code simpler and far more secure.
Deep Dive: Implementing the JWT-to-Session Bridge
The linchpin of this entire system is securely passing context from the stateless function to the stateful database session. We achieve this using runtime configuration parameters.
First, let's define our helper function and policies in SQL. We'll create a function to reliably extract claims, which is cleaner than parsing JSON in every policy.
-- A helper function to safely get a value from the JWT claims.
-- The `is_valid_json` check is a safeguard.
CREATE OR REPLACE FUNCTION get_jwt_claim(claim_name TEXT) RETURNS TEXT AS $$
DECLARE
claims_json JSONB;
BEGIN
-- `current_setting` is the key function to read our session variable.
-- The 't' flag makes it return NULL if the setting is not found, preventing an error.
SELECT current_setting('app.jwt.claims', 't') INTO claims_json;
IF claims_json IS NULL OR NOT jsonb_typeof(claims_json) = 'object' THEN
RETURN NULL;
END IF;
RETURN claims_json ->> claim_name;
EXCEPTION
WHEN invalid_text_representation THEN
RETURN NULL;
END;
$$ LANGUAGE plpgsql STABLE;
-- Example Tables
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
name TEXT NOT NULL
);
CREATE TABLE tasks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
project_id UUID REFERENCES projects(id),
tenant_id UUID NOT NULL, -- Denormalized for simpler RLS
title TEXT NOT NULL,
assignee_id UUID
);
-- Enable RLS on the tables
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;
-- Create the Policies
-- This policy ensures users can only see projects within their own tenant.
CREATE POLICY select_projects_for_tenant
ON projects FOR SELECT
USING (tenant_id::TEXT = get_jwt_claim('tenant_id'));
CREATE POLICY manage_projects_for_tenant
ON projects FOR ALL -- INSERT, UPDATE, DELETE
USING (tenant_id::TEXT = get_jwt_claim('tenant_id'))
WITH CHECK (tenant_id::TEXT = get_jwt_claim('tenant_id'));
-- This policy is slightly more complex, joining to the parent table.
-- Note: Denormalizing tenant_id on tasks is often a performance win for RLS.
CREATE POLICY select_tasks_for_tenant
ON tasks FOR SELECT
USING (tenant_id::TEXT = get_jwt_claim('tenant_id'));
CREATE POLICY manage_tasks_for_tenant
ON tasks FOR ALL
USING (tenant_id::TEXT = get_jwt_claim('tenant_id'))
WITH CHECK (tenant_id::TEXT = get_jwt_claim('tenant_id'));
The Application-Side Implementation (TypeScript/Node.js)
Now, let's look at the serverless function code. We'll use the pg (node-postgres) library. The crucial part is how we manage the connection and the session state.
A Naive (and Dangerous) Approach
// WARNING: DO NOT USE THIS IN PRODUCTION
import { Pool } from 'pg';
const pool = new Pool();
export async function getProjects(jwtClaims: object) {
const client = await pool.connect();
try {
// Set the session variable
await client.query(`SET LOCAL app.jwt.claims = '${JSON.stringify(jwtClaims)}'`);
// Business logic query - no WHERE clause needed!
const res = await client.query('SELECT * FROM projects');
return res.rows;
} finally {
client.release();
}
}
This code has a catastrophic flaw related to connection pooling, which we will address in the "Edge Cases" section. The SET LOCAL command is also important; it scopes the setting to the current transaction. If not in a transaction, it behaves like SET SESSION, lasting for the life of the session.
A Production-Grade Connection Manager
A robust solution abstracts this session management logic away from the business code. We can create a wrapper function that handles setting and, critically, resetting the state.
import { Pool, PoolClient } from 'pg';
// Initialize the pool once per container instance
const pool = new Pool({
max: 10, // Example configuration
idleTimeoutMillis: 30000,
});
interface JwtClaims {
sub: string;
tenant_id: string;
role: 'admin' | 'member';
// ... other claims
}
// This is our high-order function for safe, RLS-aware queries
export async function withRls<T>(
claims: JwtClaims,
callback: (client: PoolClient) => Promise<T>
): Promise<T> {
const client = await pool.connect();
try {
// Start a transaction. This is crucial.
await client.query('BEGIN');
// SET LOCAL ensures the setting is automatically discarded at the end of the transaction.
// This is the safest way to manage session state with a connection pool.
const claimsJson = JSON.stringify(claims);
await client.query({
text: `SET LOCAL app.jwt.claims = $1`,
values: [claimsJson]
});
// Execute the developer's business logic
const result = await callback(client);
// Commit the transaction
await client.query('COMMIT');
return result;
} catch (e) {
// If anything goes wrong, roll back
await client.query('ROLLBACK');
throw e;
} finally {
// Release the client back to the pool. The `SET LOCAL` is already gone.
client.release();
}
}
// --- Example Usage in a Serverless Handler ---
interface ApiEvent {
// Represents the event from API Gateway
requestContext: {
authorizer: {
claims: JwtClaims;
};
};
}
export async function handler(event: ApiEvent) {
const claims = event.requestContext.authorizer.claims;
try {
const projects = await withRls(claims, async (dbClient) => {
// This code is now clean and tenancy-unaware.
// The security is handled transparently by the `withRls` wrapper and the DB.
const res = await dbClient.query('SELECT id, name FROM projects');
return res.rows;
});
return {
statusCode: 200,
body: JSON.stringify(projects),
};
} catch (error) {
console.error('Failed to fetch projects:', error);
return {
statusCode: 500,
body: JSON.stringify({ message: 'Internal Server Error' }),
};
}
}
The key takeaway here is using BEGIN, SET LOCAL, and COMMIT/ROLLBACK. SET LOCAL scopes the variable's lifetime to the transaction. When the transaction ends, PostgreSQL automatically discards the setting. This is the most reliable way to prevent session state from leaking between different function invocations that might reuse the same database connection.
Performance Considerations and Optimization
RLS is not free. Every query against a protected table forces PostgreSQL to execute the policy function(s). Here's how to analyze and mitigate the performance impact.
1. The Cost of `current_setting()`
Calling current_setting() and the get_jwt_claim() function has a small but non-zero overhead. On a hot query path executed thousands of times per second, this can add up. The function is marked as STABLE, which means PostgreSQL knows it will return the same result for the duration of a statement, allowing for some optimization. However, the cost is still present.
2. RLS-Aware Indexing (The Most Critical Optimization)
This is the most important performance factor. Any columns used in your RLS USING clause must be indexed. In our example, tenant_id is the key column.
Let's analyze the difference with EXPLAIN ANALYZE.
Scenario: A projects table with 10 million rows, 100,000 tenants.
Query: SELECT * FROM projects WHERE name LIKE 'Q1 Report%';
Without an index on tenant_id:
EXPLAIN ANALYZE SELECT * FROM projects WHERE name LIKE 'Q1 Report%';
-- Likely Result:
-- Seq Scan on projects (cost=0.00..250000.00 rows=100 width=128) (actual time=0.01..5000.00 ms)
-- Filter: ((name ~~ 'Q1 Report%') AND (tenant_id = get_jwt_claim('tenant_id')::uuid))
-- Rows Removed by Filter: 9999900
PostgreSQL is forced to perform a full sequential scan on all 10 million rows. For each row, it applies the RLS policy. This is disastrous for performance.
WITH an index on tenant_id:
CREATE INDEX idx_projects_tenant_id ON projects(tenant_id);
EXPLAIN ANALYZE SELECT * FROM projects WHERE name LIKE 'Q1 Report%';
-- Likely Result:
-- Bitmap Heap Scan on projects (cost=10.00..500.00 rows=10 width=128) (actual time=0.5..1.0 ms)
-- Recheck Cond: (tenant_id = get_jwt_claim('tenant_id')::uuid)
-- Filter: (name ~~ 'Q1 Report%')
-- -> Bitmap Index Scan on idx_projects_tenant_id (cost=0.00..10.00 rows=100) (actual time=0.4..0.4 ms)
-- Index Cond: (tenant_id = get_jwt_claim('tenant_id')::uuid)
The query planner is now smart enough to see that the RLS policy constrains tenant_id. It uses the idx_projects_tenant_id index to first find the small subset of rows belonging to the current tenant (e.g., 100 rows) and then applies the additional name LIKE ... filter. This is orders of magnitude faster.
A composite index might be even better if queries often filter on other columns:
CREATE INDEX idx_projects_tenant_id_name ON projects(tenant_id, name text_pattern_ops);
Always use EXPLAIN ANALYZE on your application's queries to ensure your RLS policies are not preventing the query planner from using your indexes effectively.
Advanced Edge Cases and Production Hardening
Implementing the core logic is only half the battle. Production systems require handling numerous edge cases.
1. Connection Pooling: The Leaky Session Problem
What happens if we don't use the SET LOCAL in a transaction pattern? Consider this alternative:
// DANGEROUS PATTERN
const client = await pool.connect();
// Using SET SESSION, which lasts for the life of the connection
await client.query(`SET app.jwt.claims = '...'`);
// ... run query ...
client.release(); // The setting is NOT cleared!
If this connection is now given to a different serverless invocation for a different user, the old app.jwt.claims value will persist! The new user would see the previous user's data. This is a critical security failure.
The SET LOCAL within a transaction pattern is the best defense. An alternative, though less robust, is to explicitly RESET the variable in a finally block:
// A less ideal but workable pattern
const client = await pool.connect();
try {
await client.query(`SET app.jwt.claims = '...'`);
// ...
} finally {
await client.query('RESET app.jwt.claims');
client.release();
}
This is more brittle because if the RESET query fails or the process crashes before it runs, the connection could be returned to the pool in a tainted state. Stick to the transactional SET LOCAL pattern.
2. Superusers, Migrations, and Background Jobs
RLS policies do not apply to table owners or roles with the BYPASSRLS attribute. This is a double-edged sword.
BYPASSRLS. CREATE ROLE app_user LOGIN PASSWORD '...';
GRANT CONNECT ON DATABASE my_db TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
ALTER ROLE app_user NOBYPASSRLS;
- Option A (Impersonation): If the job is acting on behalf of a specific user/tenant, it should be passed the tenant_id and user_id. The job can then use the same withRls pattern to set the session context before executing its work.
- Option B (Privileged Access): If the job is an administrative task (e.g., data aggregation across all tenants), it should connect as a privileged user that bypasses RLS.
3. Administrative / Cross-Tenant Access
How does a customer support user or a system administrator view data for a specific tenant they don't belong to?
We can enhance our RLS policies to handle this. Let's introduce an admin role and an optional app.impersonated.tenant_id session variable.
-- A more advanced policy that allows admins to impersonate tenants
CREATE OR REPLACE POLICY select_projects_for_tenant_and_admin
ON projects FOR SELECT
USING (
-- A regular user can only access their own tenant
(tenant_id::TEXT = get_jwt_claim('tenant_id'))
OR
-- An admin can access an impersonated tenant
(
get_jwt_claim('role') = 'system_admin' AND
tenant_id::TEXT = current_setting('app.impersonated.tenant_id', true)
)
);
Your admin backend would then use a modified withRls function:
async function withAdminRls<T>(
adminClaims: JwtClaims, // Must have role: 'system_admin'
impersonatedTenantId: string,
callback: (client: PoolClient) => Promise<T>
): Promise<T> {
if (adminClaims.role !== 'system_admin') {
throw new Error('Forbidden: Only admins can impersonate');
}
const client = await pool.connect();
try {
await client.query('BEGIN');
// Set both the admin's identity AND the target tenant
await client.query(`SET LOCAL app.jwt.claims = '${JSON.stringify(adminClaims)}'`);
await client.query(`SET LOCAL app.impersonated.tenant_id = '${impersonatedTenantId}'`);
const result = await callback(client);
await client.query('COMMIT');
return result;
} catch (e) {
await client.query('ROLLBACK');
throw e;
} finally {
client.release();
}
}
This pattern provides a secure, auditable way to grant temporary, targeted access to privileged users without disabling the entire security model.
Conclusion: A New Baseline for Multi-Tenant Security
By moving tenancy enforcement from the application to the database layer with PostgreSQL RLS, we fundamentally harden the security posture of our multi-tenant applications. This is not a simple replacement for application-level checks but a powerful layer of defense-in-depth. It makes application code cleaner and eliminates an entire class of catastrophic data leak vulnerabilities.
The integration with JWTs in a serverless environment is nuanced but achievable with the right patterns. The key takeaways for a production-ready implementation are:
SET LOCAL within a transaction to pass JWT claims to the database session. This is the safest way to manage state with connection poolers.USING clauses. Use EXPLAIN ANALYZE to verify that your indexes are being used.NOBYPASSRLS) and a privileged role for migrations and administrative tasks.Adopting this architecture requires a shift in thinking, treating the database schema not just as a data store, but as an active participant in the application's security model. For senior engineers building mission-critical SaaS platforms, this pattern should be considered the new baseline for secure, scalable multi-tenant data isolation.