PostgreSQL RLS & pg_tle: Dynamic Data Masking in Multi-Tenant Systems
The Multi-Tenant Data Conundrum: Beyond Row Isolation
In any multi-tenant architecture, the cardinal rule is absolute data isolation. The standard tool for this in PostgreSQL is Row-Level Security (RLS), and for good reason. A simple policy can effectively prevent one tenant from ever seeing another's data. For senior engineers, this is foundational knowledge.
A typical baseline RLS implementation looks something like this:
-- Enable RLS on the table
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
-- A session variable to hold the current tenant's ID
-- This is typically set by a trusted middleware upon user authentication.
-- Example: SET app.current_tenant_id = 'tenant-abc-123';
-- The policy itself
CREATE POLICY tenant_isolation_policy ON invoices
FOR ALL
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid)
WITH CHECK (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
This works flawlessly for partitioning the table virtually by tenant_id
. But production systems are rarely this simple. The real challenge arises not between tenants, but within a single tenant. Consider a SaaS application with different user roles: tenant_admin
, support_agent
, and billing_clerk
. They all need access to the same users
table, but their view of the data should be drastically different:
* A tenant_admin
should see all user data for their tenant.
A support_agent
should see a user's name and a masked email (j
@example.com) but a fully redacted Personal Identification Number (PIN).
* A billing_clerk
might only need to see user IDs and names, with all other PII completely masked.
This is the problem of dynamic data masking: altering the representation of data within a column based on the accessor's privileges, not just filtering the visibility of the entire row. The naive approach is to handle this in the application layer. This is a pattern fraught with peril:
True security demands that this logic resides where the data lives: in the database. This post details a robust, performant, and maintainable pattern for achieving this using a powerful combination of RLS, security barrier views, and PostgreSQL's Trusted Language Extensions (pg_tle
).
The Shortcomings of Standard SQL in RLS Policies
Your first instinct might be to bake this logic into a complex view or a CASE
statement. Let's explore this path to understand its limitations. Imagine a users
table:
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
full_name TEXT NOT NULL,
email TEXT NOT NULL UNIQUE,
social_security_number TEXT NOT NULL -- Encrypted at rest, but needs masking in transit
);
We could create a view that attempts to perform the masking:
CREATE VIEW v_users AS
SELECT
id,
tenant_id,
full_name,
CASE current_setting('app.current_user_role', true)
WHEN 'tenant_admin' THEN email
WHEN 'support_agent' THEN regexp_replace(email, '(?<=.).(?=.*@)', '*', 'g')
ELSE '******** [REDACTED] ********'
END AS email,
CASE current_setting('app.current_user_role', true)
WHEN 'tenant_admin' THEN social_security_number
WHEN 'support_agent' THEN '***-**-XXXX'
ELSE '******** [REDACTED] ********'
END AS social_security_number
FROM users;
Then, we apply RLS to the underlying users
table to handle tenant isolation, and all application queries target v_users
. This works, but it quickly becomes a maintenance nightmare:
* Logic Duplication: The CASE
statement for roles is repeated for every sensitive column.
* Complexity: As new roles and masking rules are added, these CASE
statements become unreadable and error-prone.
* Lack of Reusability: If another table, say support_tickets
, also contains PII, you have to copy-paste this entire complex CASE
structure into a new view for that table.
* Performance Obfuscation: The query planner has to contend with these complex, volatile expressions on every query, which can sometimes lead to suboptimal plans.
What we need is a way to encapsulate this complex, stateful masking logic into a single, reusable, and secure function that can be deployed and versioned within the database itself. This is precisely the problem pg_tle
was designed to solve.
Enter `pg_tle`: Secure, In-Database Extensibility
PostgreSQL Trusted Language Extensions (pg_tle
) is a modern extension for managing other PostgreSQL extensions. Crucially, it allows you to package custom functions written in trusted languages (like PL/pgSQL, JavaScript via PL/v8, etc.) into versioned, manageable units that can be installed by a DBA without requiring superuser privileges or filesystem access for the installation script itself.
Think of it as an in-database package manager. It provides a secure sandbox for creating powerful, reusable logic that feels like a native database feature.
For our use case, pg_tle
allows us to:
- Write our complex role-based masking logic once in a PL/pgSQL function.
my_app_security
).- Install and upgrade this extension across our database fleet using standard SQL commands.
- Call this clean, simple function from our security views, dramatically simplifying their implementation.
This pattern separates the mechanism of security (the view) from the logic of security (the TLE function), leading to a vastly more maintainable and robust system.
Production Implementation: A Step-by-Step Guide
Let's build this system from the ground up. We'll set up a multi-tenant schema, create a custom masking TLE, and integrate it with a security barrier view and RLS policies.
Step 4.1: Schema Setup
First, we define our tables. We'll include a way to manage user roles within each tenant.
-- For setting transaction-local configuration variables without name clashes
-- This is a common pattern for passing application context to the database.
ALTER DATABASE my_database SET app.current_tenant_id = '';
ALTER DATABASE my_database SET app.current_user_id = '';
ALTER DATABASE my_database SET app.current_user_role = '';
CREATE TABLE tenants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL UNIQUE
);
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
full_name TEXT NOT NULL,
email TEXT NOT NULL,
social_security_number TEXT NOT NULL, -- Assume this is already encrypted at rest
CONSTRAINT users_tenant_email_unique UNIQUE (tenant_id, email)
);
CREATE TABLE roles (
id SERIAL PRIMARY KEY,
role_name TEXT NOT NULL UNIQUE -- e.g., 'tenant_admin', 'support_agent'
);
CREATE TABLE user_roles (
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
role_id INT NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
PRIMARY KEY (user_id, role_id)
);
-- Populate with some data
INSERT INTO tenants (id, name) VALUES
('11111111-1111-1111-1111-111111111111', 'Tenant A'),
('22222222-2222-2222-2222-222222222222', 'Tenant B');
INSERT INTO roles (role_name) VALUES ('tenant_admin'), ('support_agent'), ('billing_clerk');
INSERT INTO users (id, tenant_id, full_name, email, social_security_number) VALUES
('a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1', '11111111-1111-1111-1111-111111111111', 'Alice Admin', '[email protected]', '123-45-6789'),
('b2b2b2b2-b2b2-b2b2-b2b2-b2b2b2b2b2b2', '11111111-1111-1111-1111-111111111111', 'Bob Support', '[email protected]', '987-65-4321'),
('c3c3c3c3-c3c3-c3c3-c3c3-c3c3c3c3c3c3', '22222222-2222-2222-2222-222222222222', 'Charles Admin', '[email protected]', '555-44-3333');
INSERT INTO user_roles (user_id, role_id) VALUES
('a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1', (SELECT id FROM roles WHERE role_name = 'tenant_admin')),
('b2b2b2b2-b2b2-b2b2-b2b2-b2b2b2b2b2b2', (SELECT id FROM roles WHERE role_name = 'support_agent'));
Step 4.2: Installing and Configuring `pg_tle`
This step is typically done once by a database administrator. On managed services like Amazon RDS, it's often as simple as enabling it in the configuration.
-- This needs to be run by a privileged user
CREATE EXTENSION IF NOT EXISTS pg_tle;
-- Grant usage to our application user role
GRANT pg_tle_admin TO my_app_user;
Step 4.3: Creating the Custom Masking TLE
Now for the core of our solution. We'll write a PL/pgSQL function to handle all masking logic and package it into a TLE named dynamic_masking
.
-- Run this as the user with the pg_tle_admin role
SELECT pgtle.install_extension(
'dynamic_masking', -- Extension Name
'1.0', -- Version
'Dynamic PII Masking Utilities', -- Description
-- The SQL script to be executed when the extension is created
$$
-- We place our function inside a dedicated schema for organization
CREATE SCHEMA IF NOT EXISTS masking_utils;
CREATE OR REPLACE FUNCTION masking_utils.mask_value(p_role TEXT, p_value TEXT, p_field_type TEXT)
RETURNS TEXT AS
$function$
BEGIN
-- If no value, return null
IF p_value IS NULL THEN
RETURN NULL;
END IF;
-- Admins see everything
IF p_role = 'tenant_admin' THEN
RETURN p_value;
END IF;
-- Support Agent rules
IF p_role = 'support_agent' THEN
CASE p_field_type
WHEN 'email' THEN
-- Masks middle of email user part, e.g., 'bob.support@...' -> 'b*********t@...'
RETURN regexp_replace(p_value, '(?<=.).(?=.*@)', '*', 'g');
WHEN 'ssn' THEN
RETURN '***-**-' || substr(p_value, 8, 4);
ELSE
RETURN '******** [MASKED] ********';
END CASE;
END IF;
-- Default for any other role (e.g., 'billing_clerk') is full redaction
RETURN '******** [REDACTED] ********';
END;
$function$ LANGUAGE plpgsql IMMUTABLE;
-- Grant execute permission to all roles that will use this
GRANT EXECUTE ON FUNCTION masking_utils.mask_value(TEXT, TEXT, TEXT) TO public;
$$
);
-- Now, enable the extension in the current database
CREATE EXTENSION dynamic_masking;
Notice a few critical details here:
* Versioning: The extension is versioned (1.0
), allowing for controlled upgrades.
* Namespacing: The function is placed in its own schema (masking_utils
) to avoid conflicts.
IMMUTABLE
: We mark the function as IMMUTABLE
because, for the same inputs (role, value, type), it will always* produce the same output. This is a crucial performance hint for the query planner, allowing it to cache results and perform optimizations it couldn't with a VOLATILE
function.
Step 4.4: Integrating with RLS and a Security Barrier View
RLS policies are for filtering rows, not transforming column data. Attempting to transform data inside a USING
clause is an anti-pattern. The correct approach is to combine RLS on the base table with a security barrier view that performs the column-level transformations.
-- Step 1: Apply RLS to the base table for tenant isolation.
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
-- This policy ensures users can only ever see data from their own tenant.
CREATE POLICY user_tenant_isolation ON users
FOR SELECT
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
-- Step 2: Create the security barrier view that applies the masking.
-- The 'WITH (security_barrier = true)' is critical. It prevents the optimizer
-- from pushing down qualifiers into the view's query, which could leak data
-- from conditions in the outer query's WHERE clause before the RLS policy is applied.
CREATE OR REPLACE VIEW v_secure_users
WITH (security_barrier = true) AS
SELECT
id,
tenant_id,
full_name,
-- Here we call our clean, reusable TLE function!
masking_utils.mask_value(
current_setting('app.current_user_role', true),
email,
'email'
) AS email,
masking_utils.mask_value(
current_setting('app.current_user_role', true),
social_security_number,
'ssn'
) AS social_security_number
FROM users;
-- Grant access to the view, not the underlying table
GRANT SELECT ON v_secure_users TO my_app_user;
REVOKE SELECT ON users FROM my_app_user; -- Ensure they can't bypass the view
The application now only ever queries v_secure_users
. Let's see it in action.
-- Simulate a login for Alice, the Tenant A admin
BEGIN;
SET LOCAL app.current_tenant_id = '11111111-1111-1111-1111-111111111111';
SET LOCAL app.current_user_role = 'tenant_admin';
-- Alice sees everything within her tenant
SELECT * FROM v_secure_users;
-- Result:
-- id | tenant_id | full_name | email | social_security_number
-- --------------------------------------+--------------------------------------+-------------+-------------------------+------------------------
-- a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1 | 11111111-1111-1111-1111-111111111111 | Alice Admin | [email protected] | 123-45-6789
-- b2b2b2b2-b2b2-b2b2-b2b2-b2b2b2b2b2b2 | 11111111-1111-1111-1111-111111111111 | Bob Support | [email protected] | 987-65-4321
-- (She cannot see Charles from Tenant B due to the RLS policy on the base table)
COMMIT;
-- Now, simulate a login for Bob, the Tenant A support agent
BEGIN;
SET LOCAL app.current_tenant_id = '11111111-1111-1111-1111-111111111111';
SET LOCAL app.current_user_role = 'support_agent';
-- Bob sees masked data
SELECT * FROM v_secure_users;
-- Result:
-- id | tenant_id | full_name | email | social_security_number
-- --------------------------------------+--------------------------------------+-------------+-------------------------+------------------------
-- a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1 | 11111111-1111-1111-1111-111111111111 | Alice Admin | a***********[email protected] | ***-**-6789
-- b2b2b2b2-b2b2-b2b2-b2b2-b2b2b2b2b2b2 | 11111111-1111-1111-1111-111111111111 | Bob Support | b*********[email protected] | ***-**-4321
COMMIT;
This architecture is clean, secure, and maintainable. The RLS policy handles coarse-grained row filtering, and the security barrier view handles fine-grained column transformation by delegating the complex logic to our version-controlled pg_tle
function.
Advanced Scenarios and Edge Cases
A production system has more complexities. Let's address them.
Performance Considerations
The overhead of calling a PL/pgSQL function per row is non-zero. Let's analyze it. We'll use a larger dataset.
INSERT INTO users SELECT gen_random_uuid(), '11111111-1111-1111-1111-111111111111', 'User ' || g, g || '@example.com', '555-00-' || lpad(g::text, 4, '0') FROM generate_series(1, 100000) g;
ANALYZE users;
Query without masking (as admin):
BEGIN;
SET LOCAL app.current_tenant_id = '11111111-1111-1111-1111-111111111111';
SET LOCAL app.current_user_role = 'tenant_admin';
EXPLAIN ANALYZE SELECT * FROM v_secure_users WHERE full_name = 'User 50000';
COMMIT;
-- Result (yours may vary):
Index Scan using users_full_name_idx on users ... (cost=0.43..8.45 rows=1 width=100) (actual time=0.045..0.046 rows=1 loops=1)
Index Cond: (full_name = 'User 50000'::text)
Filter: (tenant_id = '11111111-1111-1111-1111-111111111111'::uuid)
Planning Time: 0.150 ms
Execution Time: 0.065 ms
Query with masking (as support agent):
BEGIN;
SET LOCAL app.current_tenant_id = '11111111-1111-1111-1111-111111111111';
SET LOCAL app.current_user_role = 'support_agent';
EXPLAIN ANALYZE SELECT * FROM v_secure_users WHERE full_name = 'User 50000';
COMMIT;
-- Result:
Index Scan using users_full_name_idx on users ... (cost=0.43..8.45 rows=1 width=100) (actual time=0.039..0.040 rows=1 loops=1)
Index Cond: (full_name = 'User 50000'::text)
Filter: (tenant_id = '11111111-1111-1111-1111-111111111111'::uuid)
Planning Time: 0.132 ms
Execution Time: 0.102 ms
The execution time increased from 0.065ms
to 0.102ms
. This is a measurable but often acceptable overhead for the security gain. For a full table scan, the difference would be more pronounced. The key is that because our function is IMMUTABLE
, the planner knows it's safe to evaluate. If the session variables were passed as arguments and the function were STABLE
, the planner could still perform significant optimizations.
Handling Writes (`INSERT`, `UPDATE`)
Our current setup is for SELECT
only. How does a support agent update a user's name without also having to provide a masked email and SSN? We need separate RLS policies for write operations.
-- A policy for UPDATE operations. It only checks tenant isolation.
-- The user provides the raw, unmasked data, and this policy allows it through.
CREATE POLICY user_tenant_update ON users
FOR UPDATE
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid)
WITH CHECK (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
-- A policy for INSERT operations.
CREATE POLICY user_tenant_insert ON users
FOR INSERT
WITH CHECK (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
-- Now, we need a way to write. We can't write to the view directly without
-- INSTEAD OF triggers. A simpler pattern is to grant write access on the base
-- table, protected by the write policies.
GRANT UPDATE(full_name), INSERT ON users TO my_app_user;
The flow is now:
v_secure_users
and gets masked data.UPDATE users SET full_name = 'New Name' WHERE id = ...
command. The user_tenant_update
RLS policy kicks in, verifies the tenant_id
, and allows the write to the raw table. The user never needs to handle or even see the unmasked sensitive data they aren't allowed to see.Lifecycle Management: Upgrading the Masking Logic
What if we need to change a masking rule? For example, we now want to mask the domain of support agent emails. Because we used pg_tle
, this is a controlled, versioned operation.
-- Create the script for version 1.1
SELECT pgtle.install_extension(
'dynamic_masking',
'1.1', -- New version number
'Dynamic PII Masking Utilities - Updated email rule',
$$
CREATE SCHEMA IF NOT EXISTS masking_utils;
-- This is the updated function logic
CREATE OR REPLACE FUNCTION masking_utils.mask_value(p_role TEXT, p_value TEXT, p_field_type TEXT)
RETURNS TEXT AS
$function$
BEGIN
IF p_value IS NULL THEN RETURN NULL; END IF;
IF p_role = 'tenant_admin' THEN RETURN p_value; END IF;
IF p_role = 'support_agent' THEN
CASE p_field_type
WHEN 'email' THEN
-- NEW LOGIC: Mask user part and domain part
RETURN regexp_replace(p_value, '(?<=.).(?=.*@)', '*', 'g')
|| '@' ||
regexp_replace(split_part(p_value, '@', 2), '(?<=.).', '*', 'g');
WHEN 'ssn' THEN
RETURN '***-**-' || substr(p_value, 8, 4);
ELSE
RETURN '******** [MASKED] ********';
END CASE;
END IF;
RETURN '******** [REDACTED] ********';
END;
$function$ LANGUAGE plpgsql IMMUTABLE;
GRANT EXECUTE ON FUNCTION masking_utils.mask_value(TEXT, TEXT, TEXT) TO public;
$$,
'1.0' -- Specify the version we are upgrading FROM
);
-- Now, run the upgrade in the database
ALTER EXTENSION dynamic_masking UPDATE TO '1.1';
All existing views and policies that use masking_utils.mask_value
will now automatically use the new logic on the next query. This provides a clean, transactional way to manage your security code's lifecycle, which is impossible with ad-hoc functions or view definitions scattered across your DDL.
Conclusion: A Tiered, Database-Enforced Security Model
By moving beyond basic RLS and embracing a tiered security model, we achieve a robust, maintainable, and highly secure architecture for multi-tenant data.
pg_tle
Trusted Language Extensions: The logic core. Encapsulates complex, reusable, and version-controlled masking logic into a secure, manageable package. This decouples the security rules from their enforcement, which is the hallmark of a mature system.This pattern centralizes security logic at the database layer, making it the single source of truth. It eliminates the risks of inconsistent or bypassable application-layer masking and provides a framework for managing the lifecycle of your security rules as your application evolves. For senior engineers building complex SaaS platforms, this database-centric approach to dynamic data masking isn't just a best practice—it's a necessity.