Postgres RLS for Dynamic Data Masking in Multi-Tenant SaaS
The Fallacy of Application-Layer Data Masking
In a mature multi-tenant SaaS platform, simple row-level tenancy (WHERE tenant_id = ?) is table stakes. The real complexity emerges when different user roles within the same tenant require different views of the same data. A common example is Personally Identifiable Information (PII). A tenant's 'admin' user might need to see a customer's full email address and social security number, while a 'support' agent should only see a masked version (e.g., ***-**-1234).
The default approach for many teams is to implement this masking logic in the application layer. An API endpoint might fetch the full data record and then, based on the authenticated user's role, selectively mask fields before sending the JSON response. This pattern is fundamentally flawed: the raw, unmasked data still leaves the database and crosses the application tier on every request; the masking rules must be duplicated and remembered in every endpoint and serializer; and any new code path, ad-hoc query, or reporting tool that bypasses those endpoints bypasses the masking entirely.
True data security enforces policy at the closest possible point to the data itself: the database. This article details a robust, production-ready pattern using PostgreSQL's Row-Level Security (RLS) combined with SECURITY BARRIER views and custom functions to enforce both tenancy and dynamic, role-based data masking directly within the database engine.
The Core Architecture: A Multi-Layered Database Defense
Our goal is to build a system where a simple SELECT * FROM sensitive_data from the application will automatically return the correctly filtered and masked data based on the user's identity and role, which are established at the beginning of each transaction. We'll achieve this by composing several PostgreSQL features:
* Transaction-Scoped Context: SET LOCAL creates transaction-scoped settings (e.g., app.tenant_id, app.user_role) that hold the application user's context. This is crucial for compatibility with connection poolers.
* Masking Function: A STABLE PL/pgSQL function will contain the core masking logic, taking a user role and a data value as input and returning the appropriately masked version.
* SECURITY BARRIER View: This is the linchpin. We'll create a view on top of the base table. This view's SELECT list will apply the masking function to sensitive columns. Marking it as a SECURITY BARRIER is critical to prevent attackers from bypassing the RLS policy on the underlying table by using leaky functions in query conditions.
* View-Only Access: The application will only ever interact with the view, never the base table. The database becomes the single, authoritative source of truth for data access policy.
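To make the contract concrete, here is a minimal sketch of what every application request boils down to, using the objects defined in the steps below (the tenant UUID is illustrative):
BEGIN;
-- Context established by the backend for this transaction only
SET LOCAL app.tenant_id = '11111111-1111-1111-1111-111111111111';
SET LOCAL app.user_role = 'support';
-- The application queries only the masking view; tenancy filtering and
-- masking both happen inside the database
SELECT id, full_name, email, ssn FROM public.customers;
COMMIT;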
Step 1: Schema and Context Setup
Let's define a simplified schema for a multi-tenant CRM. We have tenants, users (with roles), and customers with sensitive PII.
-- Create a separate schema for private tables to enforce access control
CREATE SCHEMA IF NOT EXISTS private;
-- Tenants table
CREATE TABLE tenants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL
);
-- Users table with roles
CREATE TYPE user_role AS ENUM ('admin', 'support', 'viewer');
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
email TEXT NOT NULL,
role user_role NOT NULL DEFAULT 'viewer',
UNIQUE(tenant_id, email)
);
-- The base table with raw, sensitive data. This lives in the private schema.
CREATE TABLE private.customers (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
full_name TEXT NOT NULL,
email TEXT NOT NULL,
ssn TEXT NOT NULL -- Social Security Number
);
-- Enable RLS on the private table. FORCE makes the policies apply to the table
-- owner as well; this matters because the public view created later runs with
-- its owner's privileges, and without FORCE an owner-owned view would silently
-- bypass the tenancy policy.
ALTER TABLE private.customers ENABLE ROW LEVEL SECURITY;
ALTER TABLE private.customers FORCE ROW LEVEL SECURITY;
-- Create indexes for performance
CREATE INDEX ON users (tenant_id);
CREATE INDEX ON private.customers (tenant_id);
Setting the Application Context
Our RLS policies will depend on runtime parameters that define the current user's session. We'll use PostgreSQL's configuration parameters for this. Using current_setting() with a custom namespace (e.g., app.) is the standard practice.
-- A helper function to safely get the current tenant_id.
-- It raises an exception if the setting is missing or empty (a reused pooled
-- connection can report an empty string), preventing accidental cross-tenant access.
CREATE OR REPLACE FUNCTION get_current_tenant_id() RETURNS UUID AS $$
DECLARE
tenant TEXT := NULLIF(current_setting('app.tenant_id', true), '');
BEGIN
IF tenant IS NULL THEN
RAISE EXCEPTION 'app.tenant_id is not set';
END IF;
RETURN tenant::UUID;
END;
$$ LANGUAGE plpgsql STABLE;
-- A helper function to get the current user's role.
-- Falls back to 'viewer' (least privilege) if the setting is missing or empty;
-- note that current_setting(..., true) returns NULL rather than raising an error,
-- so an exception handler alone would not provide this fallback.
CREATE OR REPLACE FUNCTION get_current_user_role() RETURNS user_role AS $$
BEGIN
RETURN COALESCE(NULLIF(current_setting('app.user_role', true), ''), 'viewer')::user_role;
END;
$$ LANGUAGE plpgsql STABLE;
Key Design Choice: These functions are marked STABLE, meaning they return the same result for all rows within a single statement. This allows the query planner to optimize them more effectively than VOLATILE functions.
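A quick sanity check of the helpers (the UUID is illustrative):
BEGIN;
SET LOCAL app.tenant_id = '11111111-1111-1111-1111-111111111111';
SET LOCAL app.user_role = 'support';
SELECT get_current_tenant_id(), get_current_user_role(); -- returns the UUID and 'support'
COMMIT;
-- In a session with no context set:
-- SELECT get_current_tenant_id(); -- ERROR: app.tenant_id is not set
-- SELECT get_current_user_role(); -- returns 'viewer'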
Step 2: RLS Policy for Strict Tenancy
First, we apply a standard RLS policy to the private.customers table. This is our foundational security layer. Its only job is to ensure that queries can only ever see rows belonging to the current tenant.
-- Drop any existing policy to ensure a clean state
DROP POLICY IF EXISTS tenant_isolation ON private.customers;
-- This policy enforces that all SELECT, INSERT, UPDATE, DELETE operations
-- are constrained to the tenant_id set in the session context.
CREATE POLICY tenant_isolation ON private.customers
AS PERMISSIVE
FOR ALL
TO public
USING (tenant_id = get_current_tenant_id())
WITH CHECK (tenant_id = get_current_tenant_id());
With this policy in place, every query issued by the application is confined to the tenant whose ID the backend placed in app.tenant_id. Be clear about what this layer does and does not protect against: the setting itself is not privileged, so anyone who can run arbitrary SQL on the connection can change it. RLS tenancy is primarily a guarantee against application bugs, such as a forgotten WHERE tenant_id = ... clause, rather than against a fully compromised database session.
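A quick, illustrative smoke test, run as the role that owns the tables (which is still subject to RLS thanks to FORCE ROW LEVEL SECURITY):
INSERT INTO tenants (id, name)
VALUES ('11111111-1111-1111-1111-111111111111', 'Acme Corp');
BEGIN;
SET LOCAL app.tenant_id = '11111111-1111-1111-1111-111111111111';
INSERT INTO private.customers (tenant_id, full_name, email, ssn)
VALUES ('11111111-1111-1111-1111-111111111111', 'Jane Doe', '[email protected]', '123-45-6789');
SELECT count(*) FROM private.customers; -- only Acme Corp's rows are visible
COMMIT;
-- Without app.tenant_id set, the same SELECT fails:
-- ERROR: app.tenant_id is not set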
Step 3: The Dynamic Masking Function
Now we create the PL/pgSQL function that encapsulates our masking business logic. This centralizes the rules and makes them easily auditable and updatable.
CREATE OR REPLACE FUNCTION public.mask_pii(
role user_role,
value TEXT,
mask_type TEXT -- e.g., 'email', 'ssn'
) RETURNS TEXT AS $$
BEGIN
IF role = 'admin' THEN
-- Admins see the full, unmasked value
RETURN value;
ELSIF role = 'support' THEN
-- Support agents see a partially masked value
CASE mask_type
WHEN 'ssn' THEN
RETURN '***-**-' || right(value, 4);
WHEN 'email' THEN
-- Keep the first character of the local part and the full domain,
-- mask the rest of the local part: [email protected] -> j*****@example.com
RETURN left(split_part(value, '@', 1), 1)
|| repeat('*', greatest(length(split_part(value, '@', 1)) - 1, 1))
|| '@' || split_part(value, '@', 2);
ELSE
RETURN '********';
END CASE;
ELSE
-- All other roles (e.g., 'viewer') see a fully redacted value
RETURN 'REDACTED';
END IF;
END;
$$ LANGUAGE plpgsql STABLE;
Note on Security: This function would only need SECURITY DEFINER if it had to read tables the calling user cannot access. Here it is a pure function of its inputs, so SECURITY INVOKER (the default) is fine. Because it depends only on its arguments, it could even be declared IMMUTABLE; at a minimum, keep it STABLE (never VOLATILE) so the planner is free to optimize calls.
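A few direct calls make the masking behavior concrete:
SELECT public.mask_pii('admin', '123-45-6789', 'ssn');          -- 123-45-6789
SELECT public.mask_pii('support', '123-45-6789', 'ssn');        -- ***-**-6789
SELECT public.mask_pii('support', '[email protected]', 'email'); -- j*****@example.com
SELECT public.mask_pii('viewer', '[email protected]', 'email');  -- REDACTED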
Step 4: The `SECURITY BARRIER` View for Masking
RLS policies are for filtering rows. They cannot rewrite the values within a column for a SELECT statement; expecting them to is a common misconception. The correct pattern is to expose a view to the application that performs this transformation.
Using a SECURITY BARRIER view is non-negotiable here. It tells the planner that conditions supplied by the caller, which may invoke leaky, user-defined functions, must not be pushed down into the view and evaluated against the underlying rows ahead of the view's own processing. PostgreSQL already refuses to push non-LEAKPROOF functions below the RLS policy checks on the base table, so this is defense in depth, but it closes the door on planner-level information leaks through the view itself.
-- This view is the ONLY way the application should access customer data.
CREATE OR REPLACE VIEW public.customers AS
SELECT
id,
tenant_id,
full_name, -- Not considered PII in this example
-- Apply dynamic masking based on the current user's role
public.mask_pii(
get_current_user_role(),
email,
'email'
) AS email,
public.mask_pii(
get_current_user_role(),
ssn,
'ssn'
) AS ssn
FROM
private.customers;
-- This is the critical part for security.
ALTER VIEW public.customers SET (security_barrier = true);
-- Grant usage on the public schema and select on the view to application roles
GRANT USAGE ON SCHEMA public TO my_app_role;
GRANT SELECT, INSERT, UPDATE, DELETE ON public.customers TO my_app_role;
-- The application role gets no access at all to the private schema; it never
-- needs it, because the view runs with its owner's privileges.
REVOKE ALL ON SCHEMA private FROM my_app_role;
REVOKE ALL ON private.customers FROM my_app_role;
Now, our setup is complete. The application role my_app_role can query public.customers but has no access to private.customers. The combination of the RLS policy on the base table and the masking logic in the security_barrier view provides our multi-layered defense. One caveat on writes: because email and ssn are expressions in the view, those columns are read-only through it; INSERTs and UPDATEs via the view can only touch the simple columns, so writes that need to set PII require a separate, privileged path.
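Connected as my_app_role, the privilege separation is easy to verify (tenant UUID illustrative):
BEGIN;
SET LOCAL app.tenant_id = '11111111-1111-1111-1111-111111111111';
SET LOCAL app.user_role = 'support';
SELECT count(*) FROM public.customers; -- works: tenant-scoped, masked view
-- SELECT count(*) FROM private.customers; -- would fail: permission denied for schema private
COMMIT;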
Step 5: Production Application Integration (Node.js/Express Example)
Integrating this into a backend application requires setting the context variables for every database transaction. Here's a complete example using Node.js, Express, and the pg library.
First, a middleware to extract user context from a JWT and attach it to the request object.
// middleware/auth.js
const jwt = require('jsonwebtoken');
function authenticateToken(req, res, next) {
const authHeader = req.headers['authorization'];
const token = authHeader && authHeader.split(' ')[1];
if (token == null) return res.sendStatus(401);
jwt.verify(token, process.env.JWT_SECRET, (err, user) => {
if (err) return res.sendStatus(403);
// user object in JWT payload: { id: '...', tenantId: '...', role: '...' }
req.user = user;
next();
});
}
module.exports = authenticateToken;
Next, a database service that correctly wraps queries in a transaction and sets the RLS context.
// services/db.js
const { Pool } = require('pg');
const pool = new Pool({
// ... connection config
});
// A higher-order function to wrap a database operation with RLS context
async function withRlsContext(userContext, operation) {
if (!userContext || !userContext.tenantId || !userContext.role) {
throw new Error('User context for RLS is missing.');
}
const client = await pool.connect();
try {
await client.query('BEGIN');
// set_config(name, value, is_local = true) is the parameterizable equivalent of
// SET LOCAL: the settings are scoped to the current transaction, which is VITAL
// for connection pooling, and the values travel as bind parameters instead of
// being interpolated into the SQL string.
await client.query(
"SELECT set_config('app.tenant_id', $1, true), set_config('app.user_role', $2, true)",
[userContext.tenantId, userContext.role]
);
const result = await operation(client);
await client.query('COMMIT');
return result;
} catch (e) {
await client.query('ROLLBACK');
throw e;
} finally {
client.release();
}
}
// Example usage function
async function getCustomers(userContext) {
return withRlsContext(userContext, async (client) => {
// The application code is clean and unaware of RLS or masking.
// It just queries the public view.
const res = await client.query('SELECT id, full_name, email, ssn FROM public.customers LIMIT 10');
return res.rows;
});
}
module.exports = { getCustomers };
Finally, the Express route ties it all together.
// routes/customers.js
const express = require('express');
const router = express.Router();
const authenticateToken = require('../middleware/auth');
const { getCustomers } = require('../services/db');
router.get('/', authenticateToken, async (req, res) => {
try {
// req.user is populated by the auth middleware
const customers = await getCustomers(req.user);
res.json(customers);
} catch (error) {
console.error('Failed to get customers:', error);
res.status(500).send('Internal Server Error');
}
});
module.exports = router;
When this endpoint is called, the result will vary based on the JWT provided:
* JWT with role: 'admin': [ { "id": "...", "full_name": "Jane Doe", "email": "[email protected]", "ssn": "123-45-6789" } ]
* JWT with role: 'support': [ { "id": "...", "full_name": "Jane Doe", "email": "j*****@example.com", "ssn": "***-**-6789" } ]
* JWT with role: 'viewer': [ { "id": "...", "full_name": "Jane Doe", "email": "REDACTED", "ssn": "REDACTED" } ]
The application code remains blissfully ignorant of the underlying security complexity.
Performance Analysis and Edge Cases
This pattern is powerful but not without overhead. It's crucial to understand the performance implications.
Analyzing the Query Plan
Let's analyze a query as an admin user.
-- Simulate an admin session
BEGIN;
SET LOCAL app.tenant_id = 'your-tenant-id';
SET LOCAL app.user_role = 'admin';
EXPLAIN ANALYZE SELECT * FROM public.customers WHERE id = 'some-customer-id';
-- Query Plan output (simplified)
Index Scan using customers_pkey on customers c (cost=0.43..8.45 rows=1 width=108) (actual time=0.046..0.047 rows=1 loops=1)
Index Cond: (id = 'some-customer-id'::uuid)
Filter: (tenant_id = get_current_tenant_id()) -- RLS policy is applied!
Planning Time: 0.215 ms
Execution Time: 0.081 ms
COMMIT;
The key takeaway is that the RLS policy (Filter: (tenant_id = ...) ) is applied directly to the scan on the underlying table. The query planner is smart enough to push the predicate down. The overhead of the get_current_tenant_id() and mask_pii() functions is minimal for simple queries because they are STABLE and their results can be cached within the query.
However, for large table scans, the cost of executing the masking function for every single row can add up. If you have a workload that scans millions of rows to produce an aggregate report, the CPU cost of masking might become significant. In such cases, consider creating specific reporting roles and views that do not perform masking, or perform masking at the aggregation level.
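One way to do that, sketched under the assumption of a dedicated reporting_role (the role and view names here are illustrative, not part of the schema above):
-- Unmasked, tenant-scoped view for trusted reporting jobs. RLS on the base table
-- still confines it to the tenant in app.tenant_id, but no per-row masking
-- function is evaluated.
CREATE VIEW public.customers_reporting AS
SELECT id, tenant_id, full_name, email, ssn
FROM private.customers;
ALTER VIEW public.customers_reporting SET (security_barrier = true);
-- Only the reporting role may read it; my_app_role gets nothing.
GRANT SELECT ON public.customers_reporting TO reporting_role;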
Edge Case: The Connection Pooler Trap (PgBouncer)
This is the most common and dangerous pitfall when implementing context-based RLS. Any form of connection reuse means a physical connection that served one tenant will later serve another: the pg Pool in our Node.js example reuses connections, and PgBouncer multiplexes them even more aggressively. If the application establishes context with a plain, session-level SET, that state can survive on the reused connection and be inherited by the next request, potentially for a different tenant, completely breaking the security model. PgBouncer's transaction pooling mode is the most demanding case: a server connection can be handed to a different client after every single transaction, and the reset query is not run between transactions.
The safe pattern is therefore twofold: keep all context strictly transaction-scoped, and run PgBouncer in transaction pooling mode.
In pgbouncer.ini:
pool_mode = transaction
In transaction pooling mode, the server connection is returned to the pool after every transaction (COMMIT/ROLLBACK) and may immediately serve a different client. This is exactly why our Node.js code wraps every unit of work in BEGIN...COMMIT and sets context with SET LOCAL / set_config(..., true): transaction-scoped settings are discarded at COMMIT or ROLLBACK, so nothing can leak to the next request.
Edge Case: Superusers and Background Jobs
By default, superusers and any role with the BYPASSRLS attribute will ignore all RLS policies. This is necessary for database administration, but it's a security risk for application logic.
Rule: Your primary application role (my_app_role in our example) must NOT be a superuser and must NOT have BYPASSRLS.
So how do you run background jobs or administrative scripts that need broader access (e.g., a data migration script)?
* Dedicated Roles: Create a separate role (e.g., migration_runner) that is used only for these offline processes. This role can be granted BYPASSRLS; a minimal sketch follows the function example below.
* SECURITY DEFINER Functions: For controlled escalation, create a SECURITY DEFINER function owned by a privileged user. The application can call this function to perform a specific, pre-defined cross-tenant action. The function's body runs with the permissions of its owner, bypassing the caller's RLS constraints.
Example of a safe administrative function:
-- The admin_tasks schema keeps privileged helpers separate from public and private
CREATE SCHEMA IF NOT EXISTS admin_tasks;
-- This function is owned by a superuser or a role with BYPASSRLS
CREATE OR REPLACE FUNCTION admin_tasks.recalculate_all_customer_stats()
RETURNS void AS $$
BEGIN
-- This code will bypass the RLS policy on private.customers
-- because the function is a SECURITY DEFINER owned by a privileged role.
TRUNCATE TABLE public.customer_stats;
INSERT INTO public.customer_stats (tenant_id, customer_count)
SELECT tenant_id, count(*) FROM private.customers GROUP BY tenant_id;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER
-- Pinning search_path is standard hardening for any SECURITY DEFINER function
SET search_path = public, private, pg_temp;
-- Functions are executable by PUBLIC by default, so revoke that first,
-- then grant execute on just this one function to the app role
REVOKE EXECUTE ON FUNCTION admin_tasks.recalculate_all_customer_stats() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION admin_tasks.recalculate_all_customer_stats() TO my_app_role;
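For the dedicated-role approach, a minimal sketch might look like this (the role name and grants are illustrative, and BYPASSRLS can only be granted by a superuser):
-- Used only by offline migrations and batch jobs, never by the web application
CREATE ROLE migration_runner LOGIN BYPASSRLS;
GRANT USAGE ON SCHEMA private TO migration_runner;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA private TO migration_runner;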
Conclusion: Centralized, Bulletproof Data Governance
By combining RLS for tenancy with SECURITY BARRIER views for dynamic masking, we elevate security policy from a distributed, fragile application concern to a centralized, robust database guarantee. The application's contract becomes simple: establish user context at the start of a transaction, then execute standard SQL against a public-facing view. The database handles the rest.
This pattern provides:
* Defense in Depth: An attacker must not only compromise the application but also find a way to manipulate the PostgreSQL session context or exploit a flaw in the SECURITY BARRIER implementation.
* Reduced Application Complexity: Application developers can focus on business logic without constantly worrying about data filtering and masking.
* Auditability: The entire access policy is defined in one place (the SQL schema), making it easy to audit and verify.
While the initial setup is more complex than application-level logic, the long-term security and maintenance benefits for any non-trivial multi-tenant SaaS application are profound. This is the architecture that lets senior engineers sleep soundly, knowing their most sensitive data is protected by the database itself.