PostgreSQL RLS & pg_tle: Dynamic Data Masking in Multi-Tenant Systems

17 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Multi-Tenant Data Conundrum: Beyond Row Isolation

In any multi-tenant architecture, the cardinal rule is absolute data isolation. The standard tool for this in PostgreSQL is Row-Level Security (RLS), and for good reason. A simple policy can effectively prevent one tenant from ever seeing another's data. For senior engineers, this is foundational knowledge.

A typical baseline RLS implementation looks something like this:

sql
-- Enable RLS on the table
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;

-- A session variable to hold the current tenant's ID
-- This is typically set by a trusted middleware upon user authentication.
-- Example: SET app.current_tenant_id = 'tenant-abc-123';

-- The policy itself
CREATE POLICY tenant_isolation_policy ON invoices
FOR ALL
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid)
WITH CHECK (tenant_id = current_setting('app.current_tenant_id', true)::uuid);

This works flawlessly for partitioning the table virtually by tenant_id. But production systems are rarely this simple. The real challenge arises not between tenants, but within a single tenant. Consider a SaaS application with different user roles: tenant_admin, support_agent, and billing_clerk. They all need access to the same users table, but their view of the data should be drastically different:

* A tenant_admin should see all user data for their tenant.

A support_agent should see a user's name and a masked email (j@example.com) but a fully redacted Personal Identification Number (PIN).

* A billing_clerk might only need to see user IDs and names, with all other PII completely masked.

This is the problem of dynamic data masking: altering the representation of data within a column based on the accessor's privileges, not just filtering the visibility of the entire row. The naive approach is to handle this in the application layer. This is a pattern fraught with peril:

  • Brittleness: Every new service, API endpoint, or background job that accesses the data must correctly re-implement the masking logic.
  • Inconsistency: Different services may implement masking slightly differently, leading to data leakage or bugs.
  • Bypassable: A developer with direct database access or a new, unsecured internal service can easily bypass the application-layer rules and access raw PII.
  • True security demands that this logic resides where the data lives: in the database. This post details a robust, performant, and maintainable pattern for achieving this using a powerful combination of RLS, security barrier views, and PostgreSQL's Trusted Language Extensions (pg_tle).

    The Shortcomings of Standard SQL in RLS Policies

    Your first instinct might be to bake this logic into a complex view or a CASE statement. Let's explore this path to understand its limitations. Imagine a users table:

    sql
    CREATE TABLE users (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        tenant_id UUID NOT NULL,
        full_name TEXT NOT NULL,
        email TEXT NOT NULL UNIQUE,
        social_security_number TEXT NOT NULL -- Encrypted at rest, but needs masking in transit
    );

    We could create a view that attempts to perform the masking:

    sql
    CREATE VIEW v_users AS
    SELECT
        id,
        tenant_id,
        full_name,
        CASE current_setting('app.current_user_role', true)
            WHEN 'tenant_admin' THEN email
            WHEN 'support_agent' THEN regexp_replace(email, '(?<=.).(?=.*@)', '*', 'g')
            ELSE '******** [REDACTED] ********'
        END AS email,
        CASE current_setting('app.current_user_role', true)
            WHEN 'tenant_admin' THEN social_security_number
            WHEN 'support_agent' THEN '***-**-XXXX'
            ELSE '******** [REDACTED] ********'
        END AS social_security_number
    FROM users;

    Then, we apply RLS to the underlying users table to handle tenant isolation, and all application queries target v_users. This works, but it quickly becomes a maintenance nightmare:

    * Logic Duplication: The CASE statement for roles is repeated for every sensitive column.

    * Complexity: As new roles and masking rules are added, these CASE statements become unreadable and error-prone.

    * Lack of Reusability: If another table, say support_tickets, also contains PII, you have to copy-paste this entire complex CASE structure into a new view for that table.

    * Performance Obfuscation: The query planner has to contend with these complex, volatile expressions on every query, which can sometimes lead to suboptimal plans.

    What we need is a way to encapsulate this complex, stateful masking logic into a single, reusable, and secure function that can be deployed and versioned within the database itself. This is precisely the problem pg_tle was designed to solve.

    Enter `pg_tle`: Secure, In-Database Extensibility

    PostgreSQL Trusted Language Extensions (pg_tle) is a modern extension for managing other PostgreSQL extensions. Crucially, it allows you to package custom functions written in trusted languages (like PL/pgSQL, JavaScript via PL/v8, etc.) into versioned, manageable units that can be installed by a DBA without requiring superuser privileges or filesystem access for the installation script itself.

    Think of it as an in-database package manager. It provides a secure sandbox for creating powerful, reusable logic that feels like a native database feature.

    For our use case, pg_tle allows us to:

    • Write our complex role-based masking logic once in a PL/pgSQL function.
  • Package this function into a named, versioned extension (e.g., my_app_security).
    • Install and upgrade this extension across our database fleet using standard SQL commands.
    • Call this clean, simple function from our security views, dramatically simplifying their implementation.

    This pattern separates the mechanism of security (the view) from the logic of security (the TLE function), leading to a vastly more maintainable and robust system.

    Production Implementation: A Step-by-Step Guide

    Let's build this system from the ground up. We'll set up a multi-tenant schema, create a custom masking TLE, and integrate it with a security barrier view and RLS policies.

    Step 4.1: Schema Setup

    First, we define our tables. We'll include a way to manage user roles within each tenant.

    sql
    -- For setting transaction-local configuration variables without name clashes
    -- This is a common pattern for passing application context to the database.
    ALTER DATABASE my_database SET app.current_tenant_id = '';
    ALTER DATABASE my_database SET app.current_user_id = '';
    ALTER DATABASE my_database SET app.current_user_role = '';
    
    CREATE TABLE tenants (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        name TEXT NOT NULL UNIQUE
    );
    
    CREATE TABLE users (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        tenant_id UUID NOT NULL REFERENCES tenants(id),
        full_name TEXT NOT NULL,
        email TEXT NOT NULL,
        social_security_number TEXT NOT NULL, -- Assume this is already encrypted at rest
        CONSTRAINT users_tenant_email_unique UNIQUE (tenant_id, email)
    );
    
    CREATE TABLE roles (
        id SERIAL PRIMARY KEY,
        role_name TEXT NOT NULL UNIQUE -- e.g., 'tenant_admin', 'support_agent'
    );
    
    CREATE TABLE user_roles (
        user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
        role_id INT NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
        PRIMARY KEY (user_id, role_id)
    );
    
    -- Populate with some data
    INSERT INTO tenants (id, name) VALUES
    ('11111111-1111-1111-1111-111111111111', 'Tenant A'),
    ('22222222-2222-2222-2222-222222222222', 'Tenant B');
    
    INSERT INTO roles (role_name) VALUES ('tenant_admin'), ('support_agent'), ('billing_clerk');
    
    INSERT INTO users (id, tenant_id, full_name, email, social_security_number) VALUES
    ('a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1', '11111111-1111-1111-1111-111111111111', 'Alice Admin', '[email protected]', '123-45-6789'),
    ('b2b2b2b2-b2b2-b2b2-b2b2-b2b2b2b2b2b2', '11111111-1111-1111-1111-111111111111', 'Bob Support', '[email protected]', '987-65-4321'),
    ('c3c3c3c3-c3c3-c3c3-c3c3-c3c3c3c3c3c3', '22222222-2222-2222-2222-222222222222', 'Charles Admin', '[email protected]', '555-44-3333');
    
    INSERT INTO user_roles (user_id, role_id) VALUES
    ('a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1', (SELECT id FROM roles WHERE role_name = 'tenant_admin')),
    ('b2b2b2b2-b2b2-b2b2-b2b2-b2b2b2b2b2b2', (SELECT id FROM roles WHERE role_name = 'support_agent'));

    Step 4.2: Installing and Configuring `pg_tle`

    This step is typically done once by a database administrator. On managed services like Amazon RDS, it's often as simple as enabling it in the configuration.

    sql
    -- This needs to be run by a privileged user
    CREATE EXTENSION IF NOT EXISTS pg_tle;
    
    -- Grant usage to our application user role
    GRANT pg_tle_admin TO my_app_user;

    Step 4.3: Creating the Custom Masking TLE

    Now for the core of our solution. We'll write a PL/pgSQL function to handle all masking logic and package it into a TLE named dynamic_masking.

    sql
    -- Run this as the user with the pg_tle_admin role
    SELECT pgtle.install_extension(
      'dynamic_masking', -- Extension Name
      '1.0',             -- Version
      'Dynamic PII Masking Utilities', -- Description
      -- The SQL script to be executed when the extension is created
      $$
        -- We place our function inside a dedicated schema for organization
        CREATE SCHEMA IF NOT EXISTS masking_utils;
    
        CREATE OR REPLACE FUNCTION masking_utils.mask_value(p_role TEXT, p_value TEXT, p_field_type TEXT)
        RETURNS TEXT AS
        $function$
        BEGIN
          -- If no value, return null
          IF p_value IS NULL THEN
            RETURN NULL;
          END IF;
    
          -- Admins see everything
          IF p_role = 'tenant_admin' THEN
            RETURN p_value;
          END IF;
    
          -- Support Agent rules
          IF p_role = 'support_agent' THEN
            CASE p_field_type
              WHEN 'email' THEN
                -- Masks middle of email user part, e.g., 'bob.support@...' -> 'b*********t@...'
                RETURN regexp_replace(p_value, '(?<=.).(?=.*@)', '*', 'g');
              WHEN 'ssn' THEN
                RETURN '***-**-' || substr(p_value, 8, 4);
              ELSE
                RETURN '******** [MASKED] ********';
            END CASE;
          END IF;
    
          -- Default for any other role (e.g., 'billing_clerk') is full redaction
          RETURN '******** [REDACTED] ********';
        END;
        $function$ LANGUAGE plpgsql IMMUTABLE;
    
        -- Grant execute permission to all roles that will use this
        GRANT EXECUTE ON FUNCTION masking_utils.mask_value(TEXT, TEXT, TEXT) TO public;
      $$
    );
    
    -- Now, enable the extension in the current database
    CREATE EXTENSION dynamic_masking;

    Notice a few critical details here:

    * Versioning: The extension is versioned (1.0), allowing for controlled upgrades.

    * Namespacing: The function is placed in its own schema (masking_utils) to avoid conflicts.

    IMMUTABLE: We mark the function as IMMUTABLE because, for the same inputs (role, value, type), it will always* produce the same output. This is a crucial performance hint for the query planner, allowing it to cache results and perform optimizations it couldn't with a VOLATILE function.

    Step 4.4: Integrating with RLS and a Security Barrier View

    RLS policies are for filtering rows, not transforming column data. Attempting to transform data inside a USING clause is an anti-pattern. The correct approach is to combine RLS on the base table with a security barrier view that performs the column-level transformations.

    sql
    -- Step 1: Apply RLS to the base table for tenant isolation.
    ALTER TABLE users ENABLE ROW LEVEL SECURITY;
    
    -- This policy ensures users can only ever see data from their own tenant.
    CREATE POLICY user_tenant_isolation ON users
    FOR SELECT
    USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
    
    -- Step 2: Create the security barrier view that applies the masking.
    -- The 'WITH (security_barrier = true)' is critical. It prevents the optimizer
    -- from pushing down qualifiers into the view's query, which could leak data
    -- from conditions in the outer query's WHERE clause before the RLS policy is applied.
    CREATE OR REPLACE VIEW v_secure_users
    WITH (security_barrier = true) AS
    SELECT
        id,
        tenant_id,
        full_name,
        -- Here we call our clean, reusable TLE function!
        masking_utils.mask_value(
            current_setting('app.current_user_role', true),
            email,
            'email'
        ) AS email,
        masking_utils.mask_value(
            current_setting('app.current_user_role', true),
            social_security_number,
            'ssn'
        ) AS social_security_number
    FROM users;
    
    -- Grant access to the view, not the underlying table
    GRANT SELECT ON v_secure_users TO my_app_user;
    REVOKE SELECT ON users FROM my_app_user; -- Ensure they can't bypass the view

    The application now only ever queries v_secure_users. Let's see it in action.

    sql
    -- Simulate a login for Alice, the Tenant A admin
    BEGIN;
    SET LOCAL app.current_tenant_id = '11111111-1111-1111-1111-111111111111';
    SET LOCAL app.current_user_role = 'tenant_admin';
    
    -- Alice sees everything within her tenant
    SELECT * FROM v_secure_users;
    -- Result:
    -- id                                   | tenant_id                            | full_name   | email                   | social_security_number
    -- --------------------------------------+--------------------------------------+-------------+-------------------------+------------------------
    -- a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1 | 11111111-1111-1111-1111-111111111111 | Alice Admin | [email protected] | 123-45-6789
    -- b2b2b2b2-b2b2-b2b2-b2b2-b2b2b2b2b2b2 | 11111111-1111-1111-1111-111111111111 | Bob Support | [email protected] | 987-65-4321
    -- (She cannot see Charles from Tenant B due to the RLS policy on the base table)
    COMMIT;
    
    -- Now, simulate a login for Bob, the Tenant A support agent
    BEGIN;
    SET LOCAL app.current_tenant_id = '11111111-1111-1111-1111-111111111111';
    SET LOCAL app.current_user_role = 'support_agent';
    
    -- Bob sees masked data
    SELECT * FROM v_secure_users;
    -- Result:
    -- id                                   | tenant_id                            | full_name   | email                   | social_security_number
    -- --------------------------------------+--------------------------------------+-------------+-------------------------+------------------------
    -- a1a1a1a1-a1a1-a1a1-a1a1-a1a1a1a1a1a1 | 11111111-1111-1111-1111-111111111111 | Alice Admin | a***********[email protected] | ***-**-6789
    -- b2b2b2b2-b2b2-b2b2-b2b2-b2b2b2b2b2b2 | 11111111-1111-1111-1111-111111111111 | Bob Support | b*********[email protected] | ***-**-4321
    COMMIT;

    This architecture is clean, secure, and maintainable. The RLS policy handles coarse-grained row filtering, and the security barrier view handles fine-grained column transformation by delegating the complex logic to our version-controlled pg_tle function.

    Advanced Scenarios and Edge Cases

    A production system has more complexities. Let's address them.

    Performance Considerations

    The overhead of calling a PL/pgSQL function per row is non-zero. Let's analyze it. We'll use a larger dataset.

    sql
    INSERT INTO users SELECT gen_random_uuid(), '11111111-1111-1111-1111-111111111111', 'User ' || g, g || '@example.com', '555-00-' || lpad(g::text, 4, '0') FROM generate_series(1, 100000) g;
    ANALYZE users;

    Query without masking (as admin):

    sql
    BEGIN;
    SET LOCAL app.current_tenant_id = '11111111-1111-1111-1111-111111111111';
    SET LOCAL app.current_user_role = 'tenant_admin';
    EXPLAIN ANALYZE SELECT * FROM v_secure_users WHERE full_name = 'User 50000';
    COMMIT;
    text
    -- Result (yours may vary):
    Index Scan using users_full_name_idx on users ... (cost=0.43..8.45 rows=1 width=100) (actual time=0.045..0.046 rows=1 loops=1)
      Index Cond: (full_name = 'User 50000'::text)
      Filter: (tenant_id = '11111111-1111-1111-1111-111111111111'::uuid)
    Planning Time: 0.150 ms
    Execution Time: 0.065 ms

    Query with masking (as support agent):

    sql
    BEGIN;
    SET LOCAL app.current_tenant_id = '11111111-1111-1111-1111-111111111111';
    SET LOCAL app.current_user_role = 'support_agent';
    EXPLAIN ANALYZE SELECT * FROM v_secure_users WHERE full_name = 'User 50000';
    COMMIT;
    text
    -- Result:
    Index Scan using users_full_name_idx on users ... (cost=0.43..8.45 rows=1 width=100) (actual time=0.039..0.040 rows=1 loops=1)
      Index Cond: (full_name = 'User 50000'::text)
      Filter: (tenant_id = '11111111-1111-1111-1111-111111111111'::uuid)
    Planning Time: 0.132 ms
    Execution Time: 0.102 ms

    The execution time increased from 0.065ms to 0.102ms. This is a measurable but often acceptable overhead for the security gain. For a full table scan, the difference would be more pronounced. The key is that because our function is IMMUTABLE, the planner knows it's safe to evaluate. If the session variables were passed as arguments and the function were STABLE, the planner could still perform significant optimizations.

    Handling Writes (`INSERT`, `UPDATE`)

    Our current setup is for SELECT only. How does a support agent update a user's name without also having to provide a masked email and SSN? We need separate RLS policies for write operations.

    sql
    -- A policy for UPDATE operations. It only checks tenant isolation.
    -- The user provides the raw, unmasked data, and this policy allows it through.
    CREATE POLICY user_tenant_update ON users
    FOR UPDATE
    USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid)
    WITH CHECK (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
    
    -- A policy for INSERT operations.
    CREATE POLICY user_tenant_insert ON users
    FOR INSERT
    WITH CHECK (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
    
    -- Now, we need a way to write. We can't write to the view directly without
    -- INSTEAD OF triggers. A simpler pattern is to grant write access on the base
    -- table, protected by the write policies.
    GRANT UPDATE(full_name), INSERT ON users TO my_app_user;

    The flow is now:

  • Read: Application reads from v_secure_users and gets masked data.
  • Write: Application sends an UPDATE users SET full_name = 'New Name' WHERE id = ... command. The user_tenant_update RLS policy kicks in, verifies the tenant_id, and allows the write to the raw table. The user never needs to handle or even see the unmasked sensitive data they aren't allowed to see.
  • Lifecycle Management: Upgrading the Masking Logic

    What if we need to change a masking rule? For example, we now want to mask the domain of support agent emails. Because we used pg_tle, this is a controlled, versioned operation.

    sql
    -- Create the script for version 1.1
    SELECT pgtle.install_extension(
      'dynamic_masking',
      '1.1', -- New version number
      'Dynamic PII Masking Utilities - Updated email rule',
      $$
        CREATE SCHEMA IF NOT EXISTS masking_utils;
    
        -- This is the updated function logic
        CREATE OR REPLACE FUNCTION masking_utils.mask_value(p_role TEXT, p_value TEXT, p_field_type TEXT)
        RETURNS TEXT AS
        $function$
        BEGIN
          IF p_value IS NULL THEN RETURN NULL; END IF;
          IF p_role = 'tenant_admin' THEN RETURN p_value; END IF;
    
          IF p_role = 'support_agent' THEN
            CASE p_field_type
              WHEN 'email' THEN
                -- NEW LOGIC: Mask user part and domain part
                RETURN regexp_replace(p_value, '(?<=.).(?=.*@)', '*', 'g')
                       || '@' ||
                       regexp_replace(split_part(p_value, '@', 2), '(?<=.).', '*', 'g');
              WHEN 'ssn' THEN
                RETURN '***-**-' || substr(p_value, 8, 4);
              ELSE
                RETURN '******** [MASKED] ********';
            END CASE;
          END IF;
    
          RETURN '******** [REDACTED] ********';
        END;
        $function$ LANGUAGE plpgsql IMMUTABLE;
    
        GRANT EXECUTE ON FUNCTION masking_utils.mask_value(TEXT, TEXT, TEXT) TO public;
      $$,
      '1.0' -- Specify the version we are upgrading FROM
    );
    
    -- Now, run the upgrade in the database
    ALTER EXTENSION dynamic_masking UPDATE TO '1.1';

    All existing views and policies that use masking_utils.mask_value will now automatically use the new logic on the next query. This provides a clean, transactional way to manage your security code's lifecycle, which is impossible with ad-hoc functions or view definitions scattered across your DDL.

    Conclusion: A Tiered, Database-Enforced Security Model

    By moving beyond basic RLS and embracing a tiered security model, we achieve a robust, maintainable, and highly secure architecture for multi-tenant data.

  • Row-Level Security (RLS): The first layer of defense. It provides the coarse-grained, non-negotiable isolation between tenants on the base tables.
  • Security Barrier Views: The presentation layer. This is where column-level security is applied. It ensures that no matter what query is run against it, the data is transformed before any other operations, preventing leaks.
  • pg_tle Trusted Language Extensions: The logic core. Encapsulates complex, reusable, and version-controlled masking logic into a secure, manageable package. This decouples the security rules from their enforcement, which is the hallmark of a mature system.
  • This pattern centralizes security logic at the database layer, making it the single source of truth. It eliminates the risks of inconsistent or bypassable application-layer masking and provides a framework for managing the lifecycle of your security rules as your application evolves. For senior engineers building complex SaaS platforms, this database-centric approach to dynamic data masking isn't just a best practice—it's a necessity.

    Found this article helpful?

    Share it with others who might benefit from it.

    More Articles