PostgreSQL RLS with JWTs for Granular Multi-Tenant Authorization
The Fragility of Application-Layer Tenant Isolation
In the world of multi-tenant SaaS development, preventing data leakage between tenants is not just a feature; it's a foundational requirement. The standard approach is deceptively simple: diligently add a WHERE tenant_id = ? clause to every single database query that accesses tenant-specific data. While functional, this pattern is fraught with peril and technical debt.
This application-layer enforcement creates a massive surface area for error. A single forgotten WHERE clause in a complex JOIN or a hastily written analytics query can lead to a catastrophic data breach. It relies entirely on developer discipline, which is notoriously fallible under pressure. Code reviews can catch some omissions, but not all. Furthermore, this pattern leads to boilerplate code, cluttering business logic with repetitive authorization checks.
More insidiously, it creates a false sense of security. If any part of your infrastructure allows for direct database connections (e.g., an internal admin tool, a data analytics platform, a compromised microservice), the application-layer protections are completely bypassed. The database itself has no intrinsic knowledge of tenant boundaries.
This is where we, as senior engineers, must advocate for a more robust architecture. The principle of least privilege dictates that we should enforce security at the lowest possible layer. For data, that layer is the database itself. By leveraging PostgreSQL's Row-Level Security (RLS), we can create a watertight security boundary that is enforced for every query, regardless of its origin.
This article details a production-grade pattern for implementing RLS in a multi-tenant environment, driven by claims extracted from a JSON Web Token (JWT). We will not be covering the basics of RLS or JWTs. Instead, we'll focus on the critical glue that binds them together, the advanced patterns for real-world complexity, and the performance implications you must consider before deploying to production.
The Core Pattern: Propagating JWT Claims to the PostgreSQL Session
The entire system hinges on a single, powerful mechanism: securely passing user and tenant context from the verified JWT to the active PostgreSQL session. This context is then used by RLS policies to filter data.
When a user makes an API request, they present a JWT. Our authentication middleware validates this token and extracts its claims, such as user_id, tenant_id, and role. The challenge is to make these claims available within the database session that serves the request.
The solution lies in PostgreSQL's runtime configuration parameters. We can define custom parameters and set their values for the duration of a session or, more importantly, a single transaction. The key functions are:
* SET LOCAL my.setting = 'value';: Sets a configuration parameter's value, but only for the current transaction. This is the cornerstone of connection pool safety. When the transaction commits or rolls back, the setting is reverted. This prevents a setting for User A from leaking into a subsequent query for User B that reuses the same database connection from the pool.
* current_setting('my.setting', true): Retrieves the value of the custom parameter. The second argument, true, indicates that it's okay if the setting doesn't exist, in which case it returns NULL without throwing an error.
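The transaction-scoped behavior is easy to verify from psql; the following is a minimal sketch, with an illustrative UUID:
BEGIN;
SET LOCAL jwt.claims.tenant_id = '11111111-1111-1111-1111-111111111111';
SELECT current_setting('jwt.claims.tenant_id', true);  -- returns the value inside the transaction
COMMIT;
-- After the transaction the value is gone; depending on the server version this
-- returns NULL or an empty string, but never the previous value.
SELECT current_setting('jwt.claims.tenant_id', true);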
Our strategy is as follows:
- An API request arrives with a JWT.
- Our authentication middleware validates the token and extracts its claims (tenant_id, user_id, role).
- For each request, we begin a new database transaction.
- We issue SET LOCAL commands to populate the session with the JWT claims.
- The business logic runs ordinary queries such as SELECT * FROM projects. The RLS policies call current_setting() to read these claims and append the necessary WHERE clauses invisibly.
- We commit or roll back the transaction. The session settings are automatically cleared, and the connection is safely returned to the pool.
This architecture is profoundly elegant. The application code is simplified, and the security rules are centralized, auditable, and impossible to bypass at the application layer.
Production Implementation Walkthrough
Let's build this system from the ground up. We'll use a simplified schema for a project management SaaS.
1. Database Schema and RLS Enablement
First, define the tables. Note that every tenant-isolated table must have a tenant_id column.
-- A central table for tenants
CREATE TABLE tenants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid()
-- other tenant-specific info like name, subscription_plan, etc.
);
-- Users belong to a tenant
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
email TEXT NOT NULL UNIQUE
-- other user info
);
-- Projects are the primary tenant-specific resource
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
owner_id UUID NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
name TEXT NOT NULL
);
-- Add indexes, critical for RLS performance
CREATE INDEX ON users(tenant_id);
CREATE INDEX ON projects(tenant_id);
CREATE INDEX ON projects(owner_id);
Now, we enable RLS on the projects table. It's crucial to use FORCE ROW LEVEL SECURITY to ensure that even the table owner is subject to the policies (superusers and roles with BYPASSRLS still bypass RLS entirely). This prevents accidental data access by the owning role during routine operations.
-- Enable RLS for the projects table
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
-- Force RLS even for the table owner. This is a critical production setting.
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
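To confirm both flags actually took effect (and to catch tables where one was forgotten), the catalog exposes them directly; a quick check, assuming your tenant tables live in the public schema:
-- relrowsecurity = RLS enabled, relforcerowsecurity = owners are subject to it too
SELECT relname, relrowsecurity, relforcerowsecurity
FROM pg_class
WHERE relnamespace = 'public'::regnamespace
AND relkind = 'r'  -- ordinary tables only
ORDER BY relname;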
2. Crafting the RLS Policies
With RLS enabled, all access to the projects table is denied by default. We must create policies to grant access. A policy is essentially a rule that defines which rows are visible or modifiable.
Let's create a comprehensive policy that governs all operations (SELECT, INSERT, UPDATE, DELETE). We'll use the custom setting jwt.claims.tenant_id which we'll populate from our application.
CREATE POLICY projects_isolation_policy ON projects
AS PERMISSIVE -- 'PERMISSIVE' means policies are combined with OR. 'RESTRICTIVE' is AND.
FOR ALL -- Applies to SELECT, INSERT, UPDATE, DELETE
TO authenticated_user -- Apply this policy only to a specific database role
USING (
-- The USING clause is evaluated for existing rows (for SELECT, UPDATE, DELETE).
-- It acts as the invisible WHERE clause.
tenant_id = current_setting('jwt.claims.tenant_id')::uuid
)
WITH CHECK (
-- The WITH CHECK clause is evaluated for new or updated rows (for INSERT, UPDATE).
-- It prevents a user from inserting/moving data into another tenant's scope.
tenant_id = current_setting('jwt.claims.tenant_id')::uuid
);
Key Implementation Details:
* AS PERMISSIVE: For most multi-tenancy scenarios, a single permissive policy is sufficient. If you have more complex logic (e.g., a base tenancy policy AND a restrictive archival policy), you might combine PERMISSIVE and RESTRICTIVE policies.
* TO authenticated_user: It's a best practice to grant table permissions (SELECT, INSERT, etc.) to a specific, non-login role (authenticated_user) and then GRANT authenticated_user to your actual application login role. The RLS policy is then attached to this abstract role. This decouples RLS from the specific connection role; a minimal sketch of this setup follows the list.
* Type Casting: current_setting() returns text. You must cast it to the appropriate type (e.g., ::uuid) for PostgreSQL to use indexes correctly and avoid type mismatch errors. Failure to do so can lead to disastrous full-table scans.
* USING vs. WITH CHECK: A common mistake is to only define a USING clause. This protects reads but would allow a user to INSERT a row with a different tenant_id or UPDATE a row to change its tenant_id, effectively moving data between tenants. The WITH CHECK clause prevents this by validating the data before the write operation is committed.
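Two details the bullets above leave implicit are worth making concrete. The grants below are a minimal sketch of the authenticated_user plumbing (the login role name app_user and its password are placeholders, not part of the original schema), followed by a quick psql smoke test of the USING / WITH CHECK behavior with illustrative UUIDs:
-- Minimal role plumbing (app_user and its password are placeholders)
CREATE ROLE authenticated_user NOLOGIN;
GRANT USAGE ON SCHEMA public TO authenticated_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON projects TO authenticated_user;
CREATE ROLE app_user WITH LOGIN PASSWORD 'change_me';
GRANT authenticated_user TO app_user;
-- Smoke test (connected as app_user):
BEGIN;
SET LOCAL ROLE authenticated_user;  -- optional if app_user inherits the role
SET LOCAL jwt.claims.tenant_id = '11111111-1111-1111-1111-111111111111';
SELECT count(*) FROM projects;      -- counts only this tenant's rows
INSERT INTO projects (tenant_id, owner_id, name)
VALUES ('22222222-2222-2222-2222-222222222222',  -- a different tenant
'33333333-3333-3333-3333-333333333333',
'smuggled project');                -- rejected by the WITH CHECK clause
ROLLBACK;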
3. The Application Layer: A Go Example
Now for the crucial link. Here is a worked example in Go using the pgx library, which has excellent support for PostgreSQL's features.
First, let's define a middleware for handling JWT validation and context propagation.
package main
import (
"context"
"fmt"
"net/http"
"strings"
"github.com/golang-jwt/jwt/v4"
)
// Define a custom context key to avoid collisions
type contextKey string
const ClaimsContextKey = contextKey("claims")
// UserClaims represents the JWT claims we care about
type UserClaims struct {
TenantID string `json:"tenant_id"`
UserID string `json:"user_id"`
Role string `json:"role"`
jwt.RegisteredClaims
}
// JWTMiddleware validates the token and injects claims into the request context
func JWTMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
authHeader := r.Header.Get("Authorization")
if !strings.HasPrefix(authHeader, "Bearer ") {
http.Error(w, "Missing or invalid authorization token", http.StatusUnauthorized)
return
}
tokenString := strings.TrimPrefix(authHeader, "Bearer ")
claims := &UserClaims{}
// In production, use a secure key and proper error handling
// This secret key is for demonstration only!
jwtKey := []byte("my_super_secret_key")
token, err := jwt.ParseWithClaims(tokenString, claims, func(token *jwt.Token) (interface{}, error) {
return jwtKey, nil
})
if err != nil || !token.Valid {
http.Error(w, "Invalid token", http.StatusUnauthorized)
return
}
// Token is valid. Store claims in the context for downstream handlers.
ctx := context.WithValue(r.Context(), ClaimsContextKey, claims)
next.ServeHTTP(w, r.WithContext(ctx))
})
}
Now, let's look at the database interaction. We'll create a ProjectStore that encapsulates data access. The key is the executeTxWithRLS function, which wraps every database operation.
package main
import (
"context"
"fmt"
"github.com/jackc/pgx/v5/pgxpool"
)
// Project represents our data model
type Project struct {
ID string `json:"id"`
TenantID string `json:"tenant_id"` // For demonstration; in real code, this would not be exposed
OwnerID string `json:"owner_id"`
Name string `json:"name"`
}
// ProjectStore handles database operations for projects
type ProjectStore struct {
pool *pgxpool.Pool
}
// This is the most important function in the pattern.
// It sets the RLS context within a transaction.
func (s *ProjectStore) executeTxWithRLS(ctx context.Context, fn func(tx pgx.Tx) error) error {
claims, ok := ctx.Value(ClaimsContextKey).(*UserClaims)
if !ok || claims == nil {
return fmt.Errorf("authorization claims not found in context")
}
tx, err := s.pool.Begin(ctx)
if err != nil {
return err
}
defer tx.Rollback(ctx) // Rollback is a no-op if the tx has been committed
// SET LOCAL semantics are critical for connection pool safety: these settings
// are automatically discarded when the transaction ends.
// set_config(setting, value, true) is the parameterizable equivalent of SET LOCAL,
// so the claim values travel as bind parameters instead of being interpolated
// into the SQL string.
batch := &pgx.Batch{}
batch.Queue("SELECT set_config('jwt.claims.tenant_id', $1, true)", claims.TenantID)
batch.Queue("SELECT set_config('jwt.claims.user_id', $1, true)", claims.UserID)
batch.Queue("SELECT set_config('jwt.claims.role', $1, true)", claims.Role)
br := tx.SendBatch(ctx, batch)
if err := br.Close(); err != nil {
return fmt.Errorf("failed to set RLS context: %w", err)
}
// Execute the actual business logic within the transaction
if err := fn(tx); err != nil {
return err
}
return tx.Commit(ctx)
}
// GetProjectsForUser demonstrates the simplicity of the business logic code.
// Notice the complete absence of `WHERE tenant_id = ?`.
func (s *ProjectStore) GetProjectsForUser(ctx context.Context) ([]Project, error) {
var projects []Project
queryFn := func(tx pgx.Tx) error {
rows, err := tx.Query(ctx, "SELECT id, tenant_id, owner_id, name FROM projects")
if err != nil {
return err
}
defer rows.Close()
for rows.Next() {
var p Project
if err := rows.Scan(&p.ID, &p.TenantID, &p.OwnerID, &p.Name); err != nil {
return err
}
projects = append(projects, p)
}
return rows.Err()
}
err := s.executeTxWithRLS(ctx, queryFn)
return projects, err
}
This pattern is powerful. The GetProjectsForUser function is clean and focused on business logic. All authorization is handled transparently by the executeTxWithRLS wrapper and the database itself. An engineer adding a new query function cannot forget the tenancy check, because it's not their responsibility.
Advanced Patterns and Edge Cases
Real-world systems are never this simple. Here’s how to handle more complex requirements.
1. Role-Based Access Control (RBAC)
What if a tenant has 'admin' users who can see all projects, and 'member' users who can only see projects they own? We can encode this logic directly into our RLS policy by using the jwt.claims.role and jwt.claims.user_id settings.
First, update the policy:
DROP POLICY projects_isolation_policy ON projects;
CREATE POLICY projects_rbac_policy ON projects
AS PERMISSIVE
FOR SELECT
TO authenticated_user
USING (
tenant_id = current_setting('jwt.claims.tenant_id')::uuid AND
(
-- An 'admin' can see all projects in their tenant
current_setting('jwt.claims.role') = 'admin'
OR
-- A non-admin can only see projects they own
owner_id = current_setting('jwt.claims.user_id')::uuid
)
);
-- Keep simpler policies for writes to ensure data stays in the right tenant.
-- A policy's FOR clause accepts a single command, so each write command gets
-- its own policy: INSERT takes only WITH CHECK, DELETE takes only USING.
CREATE POLICY projects_insert_policy ON projects
FOR INSERT TO authenticated_user
WITH CHECK (tenant_id = current_setting('jwt.claims.tenant_id')::uuid);
CREATE POLICY projects_update_policy ON projects
FOR UPDATE TO authenticated_user
USING (tenant_id = current_setting('jwt.claims.tenant_id')::uuid)
WITH CHECK (tenant_id = current_setting('jwt.claims.tenant_id')::uuid);
CREATE POLICY projects_delete_policy ON projects
FOR DELETE TO authenticated_user
USING (tenant_id = current_setting('jwt.claims.tenant_id')::uuid);
Now, our authorization logic is even richer. The application code in Go does not change at all. We simply need to ensure our JWT minting service includes the correct role claim. This is a huge win for separation of concerns.
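Before wiring the new claims through the application, the policies can be exercised straight from psql; a quick sketch with placeholder UUIDs:
BEGIN;
SET LOCAL jwt.claims.tenant_id = '11111111-1111-1111-1111-111111111111';
SET LOCAL jwt.claims.user_id = '44444444-4444-4444-4444-444444444444';
SET LOCAL jwt.claims.role = 'member';
SELECT count(*) FROM projects;  -- only projects owned by this user
SET LOCAL jwt.claims.role = 'admin';
SELECT count(*) FROM projects;  -- every project in the tenant
ROLLBACK;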
2. Handling Superusers and Internal Tools
Your DevOps team needs to run migrations. Your support team needs a tool to fix a customer's data. These operations must bypass RLS. Attempting to manage this by giving these tools a JWT with a special role is an anti-pattern; it mixes operational concerns with application logic.
The correct solution is at the PostgreSQL role level. We can create a specific role for administrative tasks and grant it the BYPASSRLS attribute.
-- Create a role for migrations and other administrative tasks
CREATE ROLE service_migrations WITH LOGIN PASSWORD 'a_very_strong_password';
-- This is the magic attribute
ALTER ROLE service_migrations BYPASSRLS;
-- Grant it necessary permissions, but be restrictive
GRANT CONNECT ON DATABASE my_app_db TO service_migrations;
GRANT USAGE ON SCHEMA public TO service_migrations;
GRANT ALL ON ALL TABLES IN SCHEMA public TO service_migrations;
When your migration tool or support script connects to the database, it should authenticate as the service_migrations role. For this role, RLS is completely disabled, and it can see all rows. The application's main connection pool should continue to use its own, non-BYPASSRLS role.
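Because BYPASSRLS silently disables every policy for that role, it is worth auditing periodically which roles carry it; one way to do so:
-- Superusers bypass RLS implicitly, so list them alongside BYPASSRLS roles
SELECT rolname, rolsuper, rolbypassrls
FROM pg_roles
WHERE rolbypassrls OR rolsuper
ORDER BY rolname;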
3. Cross-Tenant Access for System Admins
What about a system-level admin dashboard that needs to display data aggregated across all tenants? The SET LOCAL pattern is designed for a single tenant context and is ill-suited for this.
For this specific, high-privilege use case, we can employ a SECURITY DEFINER function. A SECURITY DEFINER function executes with the privileges of the role that owns the function, not the role that invokes it.
Warning: SECURITY DEFINER is extremely powerful and can be dangerous if not used correctly. Always set a secure search_path to prevent hijacking.
-- 1. Create a dedicated, non-login role that will own the function
CREATE ROLE cross_tenant_accessor;
GRANT USAGE ON SCHEMA public TO cross_tenant_accessor;
GRANT SELECT ON projects TO cross_tenant_accessor;
-- Because RLS is forced and defaults to deny, a role with no applicable policy
-- sees zero rows. Give this role an explicit, auditable "see everything" policy.
CREATE POLICY projects_cross_tenant_read ON projects
FOR SELECT TO cross_tenant_accessor
USING (true);
-- 2. Create the function owned by this role
CREATE OR REPLACE FUNCTION get_all_projects_for_system_admin()
RETURNS SETOF projects AS $$
BEGIN
-- This function runs with the privileges of `cross_tenant_accessor`,
-- whose policy above grants unrestricted SELECT on projects.
RETURN QUERY SELECT * FROM projects;
END;
$$ LANGUAGE plpgsql
SECURITY DEFINER
-- CRITICAL: Prevent search_path hijacking
SET search_path = public;
ALTER FUNCTION get_all_projects_for_system_admin() OWNER TO cross_tenant_accessor;
-- 3. Grant execute permission on this function to your system admin role
GRANT EXECUTE ON FUNCTION get_all_projects_for_system_admin() TO system_admin_role;
Your admin dashboard application can now connect as system_admin_role and simply call SELECT * FROM get_all_projects_for_system_admin(); to get the data it needs, without having to bypass RLS globally.
Performance Considerations and Benchmarking
RLS is not free. Every query against a protected table incurs the overhead of executing the policy functions. You must analyze and understand this cost.
1. The Cost of `current_setting()` and Policy Logic
The current_setting() function itself is highly optimized, but it's not a no-op. More importantly, the complexity of your policy matters. A simple equality check is cheap. A policy that involves subqueries or complex JOINs can add significant overhead to every single query.
Rule of Thumb: Keep your RLS policies as simple as possible. They should contain only the logic required for authorization. If you need complex business rules, they still belong in the application layer.
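As an illustration of what to avoid, the hypothetical policy below (not part of the schema above) pushes a per-user subquery into the USING clause; it expresses roughly the same tenancy rule but adds a subplan to every query against the table and is harder for the planner to optimize than a direct comparison against the session setting:
-- Illustrative anti-pattern only; do not create this alongside the real policy.
CREATE POLICY projects_slow_policy ON projects
FOR SELECT TO authenticated_user
USING (
tenant_id IN (
SELECT u.tenant_id FROM users u
WHERE u.id = current_setting('jwt.claims.user_id')::uuid
)
);
-- Prefer USING (tenant_id = current_setting('jwt.claims.tenant_id')::uuid),
-- which compares an indexed column directly against the session setting.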
2. Indexing is Non-Negotiable
An RLS policy effectively adds a WHERE clause to your query. If the columns used in your policy are not indexed, the database will be forced into a sequential scan. For a multi-tenant table with millions of rows, this is catastrophic.
Consider our policy: USING (tenant_id = current_setting('jwt.claims.tenant_id')::uuid).
Let's look at the EXPLAIN ANALYZE output for SELECT * FROM projects WHERE name = 'Some Project';
Without an index on tenant_id:
Seq Scan on projects (cost=0.00..6754.00 rows=1 width=128) (actual time=0.025..54.321 rows=1 loops=1)
Filter: ((name = 'Some Project'::text) AND (tenant_id = current_setting('jwt.claims.tenant_id')::uuid))
Rows Removed by Filter: 99999
Planning Time: 0.150 ms
Execution Time: 54.345 ms
The planner had to scan the entire table and then filter, which is very slow.
After CREATE INDEX ON projects(tenant_id);:
Bitmap Heap Scan on projects (cost=12.50..1524.00 rows=1 width=128) (actual time=0.045..0.046 rows=1 loops=1)
Recheck Cond: (tenant_id = current_setting('jwt.claims.tenant_id')::uuid)
Filter: (name = 'Some Project'::text)
-> Bitmap Index Scan on projects_tenant_id_idx (cost=0.00..12.50 rows=100 width=0) (actual time=0.030..0.030 rows=100 loops=1)
Index Cond: (tenant_id = current_setting('jwt.claims.tenant_id')::uuid)
Planning Time: 0.210 ms
Execution Time: 0.075 ms
This is orders of magnitude faster. The planner uses the tenant_id index to quickly narrow down the search space to only the relevant tenant's data before applying the other filters.
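If a particular filter such as name is on a hot path, a composite index that leads with the policy column can go further, letting a single index scan satisfy both the RLS predicate and the query's own filter; a hypothetical addition:
-- Leads with tenant_id so the RLS predicate and the name filter share one index scan
CREATE INDEX projects_tenant_name_idx ON projects (tenant_id, name);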
3. Connection Pooling Safety
We've mentioned it multiple times, but it bears repeating: always use SET LOCAL. If you use SET (which sets the parameter for the entire session), a connection returned to the pool will retain its settings. The next request that picks up that connection will inadvertently operate as the previous user until it sets its own context, leading to data leakage. Wrapping every unit of work in a transaction and using SET LOCAL is the only safe way to use this pattern with a connection pool.
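The failure mode is easy to reproduce in a single psql session; a minimal sketch (the UUID is illustrative):
BEGIN;
SET jwt.claims.tenant_id = '11111111-1111-1111-1111-111111111111';  -- note: not SET LOCAL
COMMIT;
-- The value survives the transaction. On a pooled connection it would travel
-- into the next request; SET LOCAL, by contrast, is reverted at COMMIT/ROLLBACK.
SELECT current_setting('jwt.claims.tenant_id', true);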
Production Readiness Checklist
Before deploying an RLS-based authorization system, validate your implementation against this checklist:
* Forced RLS: Have you run ALTER TABLE ... FORCE ROW LEVEL SECURITY on all protected tables?
* Transaction-Scoped Context: Does every unit of work run inside a transaction that uses SET LOCAL to define the RLS context?
* Privileged Role Hygiene: Do you use a dedicated role with BYPASSRLS for migrations and operational tasks? Is access to this role's credentials strictly controlled?
* Indexing: Are all columns referenced in your policies (especially tenant_id) properly indexed?
* Type Casting: Do you cast the text values returned by current_setting() to the correct column types within your policies?
* WITH CHECK Validation: Do all of your write policies (INSERT, UPDATE) include a WITH CHECK clause to prevent tenants from writing data into another tenant's scope?
By moving authorization from the application to the database, you create a system that is not only more secure and robust but also cleaner and easier to maintain. The initial setup is more complex than adding a simple WHERE clause, but the long-term architectural benefits for any serious multi-tenant application are immense.