Dynamic Partial Indexing for Hot Tenants in Multi-Tenant PostgreSQL
The Inevitable Bottleneck of Naive Multi-Tenancy Indexing
In any mature multi-tenant SaaS application running on PostgreSQL, the (tenant_id, ...) composite index is the workhorse. It's the first line of defense, ensuring that queries scoped to a single tenant are reasonably performant. However, as tables grow to hundreds of millions or billions of rows, and data distribution becomes heavily skewed, this pattern reveals its limitations. The Pareto principle is often in full effect: 80% of your data and activity comes from 20% (or less) of your tenants. We call these 'hot tenants'.
When a users or documents table is dominated by a few hyper-active tenants, the B-tree for your (tenant_id, column_name) index grows enormous, packed with entries from thousands of 'cold' tenants. While an index scan on (tenant_id, status) is far better than a sequential scan, the database still has to descend and scan a much larger, colder index structure, which means more I/O, worse cache locality, and non-trivial performance degradation for your most important customers.
This article presents a surgical, dynamic strategy to address this specific problem. We will not re-hash the basics of indexing. Instead, we will design and implement an automated system that identifies hot tenants and creates highly-efficient, specialized partial indexes for them, all without application downtime.
Baseline Analysis: The Composite Index Under Load
Let's establish a concrete scenario. Consider a typical analytics_events table in a SaaS product.
CREATE TABLE analytics_events (
id BIGSERIAL PRIMARY KEY,
tenant_id UUID NOT NULL,
event_type TEXT NOT NULL,
payload JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- The standard, go-to index. created_at is included so the common
-- "latest N events for a tenant" pattern can be served in index order.
CREATE INDEX idx_analytics_events_tenant_id_event_type
ON analytics_events (tenant_id, event_type, created_at);
Now, let's simulate a heavily skewed data distribution. We'll insert 10 million rows and make two tenants, tenant_A and tenant_B, responsible for 80% of the data between them, with the remaining 20% spread across a long tail of cold tenants.
-- Fictional UUIDs for our hot tenants
-- In a real scenario, these would be your actual tenant IDs.
-- '00000000-0000-0000-0000-00000000000a'
-- '00000000-0000-0000-0000-00000000000b'
-- Generate skewed data (conceptual script).
-- random() is computed once per row in the subquery; calling random() inside
-- each CASE branch would re-roll the dice per WHEN and distort the split.
-- gen_random_uuid() is built in since PostgreSQL 13 (earlier: pgcrypto).
INSERT INTO analytics_events (tenant_id, event_type, created_at)
SELECT
    CASE
        WHEN r < 0.4 THEN '00000000-0000-0000-0000-00000000000a'::uuid -- ~40%: tenant_A
        WHEN r < 0.8 THEN '00000000-0000-0000-0000-00000000000b'::uuid -- ~40%: tenant_B
        ELSE gen_random_uuid() -- remaining ~20%: a long tail of cold tenants
    END,
    'page_view',
    NOW() - (s * INTERVAL '1 second')
FROM (SELECT s, random() AS r FROM generate_series(1, 10000000) s) AS gen;
ANALYZE analytics_events;
Let's check the size of our table and index. The exact numbers will vary, but the proportions are what matter.
-- Total relation size (table + indexes + TOAST) might be ~1.5 GB
SELECT pg_size_pretty(pg_total_relation_size('analytics_events'));
-- Index size might be ~400 MB
SELECT pg_size_pretty(pg_relation_size('idx_analytics_events_tenant_id_event_type'));
The index is substantial because it must map every single row in the table. Now, let's query for one of our hot tenants.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, created_at
FROM analytics_events
WHERE tenant_id = '00000000-0000-0000-0000-00000000000a'
AND event_type = 'page_view'
ORDER BY created_at DESC
LIMIT 100;
Sample EXPLAIN Output (Composite Index):
Limit (cost=0.56..241.97 rows=100 width=16) (actual time=0.129..3.894 rows=100 loops=1)
Buffers: shared hit=2841
-> Index Scan Backward using idx_analytics_events_tenant_id_event_type on analytics_events ...
Index Cond: (tenant_id = '00000000-0000-0000-0000-00000000000a'::uuid AND event_type = 'page_view'::text)
Buffers: shared hit=2841
Planning Time: 0.158 ms
Execution Time: 3.921 ms
The planner correctly uses the index, and the execution time isn't terrible. But look at Buffers: shared hit=2841: nearly 3,000 8kB pages had to be touched in shared buffers (each of which could have been a disk read on a cold cache) to satisfy this query. This is the cost of descending and scanning a large B-tree that is mostly filled with entries irrelevant to this query.
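To make the index's physical shape visible (and not just its byte size), the pgstattuple extension offers pgstatindex(). A minimal sketch, assuming the extension is available on your instance:

-- Requires: CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT tree_level,                          -- B-tree depth from root to leaf
       internal_pages,
       leaf_pages,                          -- pages a scan may have to visit
       pg_size_pretty(index_size) AS size,
       avg_leaf_density                     -- % of leaf space actually in use
FROM pgstatindex('idx_analytics_events_tenant_id_event_type');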
The Partial Index Advantage: A Static Demonstration
A partial index is an index built on a subset of a table's rows, defined by a WHERE clause. It's smaller, more compact, and consequently, faster to scan. Let's create one specifically for our hot tenant.
CRITICAL NOTE: In production, always use CREATE INDEX CONCURRENTLY to avoid taking a write lock on your table. For this demonstration, we'll omit it for simplicity, but we will use it in our final automated solution.
-- created_at is included so ORDER BY created_at DESC needs no sort step
CREATE INDEX idx_events_hot_tenant_a_event_type
ON analytics_events (event_type, created_at)
WHERE tenant_id = '00000000-0000-0000-0000-00000000000a';
ANALYZE analytics_events;
Let's check the size of this new index. It will be a fraction of the composite index, perhaps around 150MB, as it only contains data for the ~4 million rows of tenant_A.
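Using the same size check as before:

SELECT pg_size_pretty(pg_relation_size('idx_events_hot_tenant_a_event_type'));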
Now, re-run the exact same query:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, created_at
FROM analytics_events
WHERE tenant_id = '00000000-0000-0000-0000-00000000000a'
AND event_type = 'page_view'
ORDER BY created_at DESC
LIMIT 100;
Sample EXPLAIN Output (Partial Index):
Limit (cost=0.43..93.21 rows=100 width=16) (actual time=0.035..0.451 rows=100 loops=1)
Buffers: shared hit=105
-> Index Scan Backward using idx_events_hot_tenant_a_event_type on analytics_events ...
Index Cond: (event_type = 'page_view'::text)
Buffers: shared hit=105
Planning Time: 0.213 ms
Execution Time: 0.478 ms
The difference is stark:
| Metric | Composite Index | Partial Index | Improvement |
|---|---|---|---|
| Execution Time | 3.921 ms | 0.478 ms | ~8.2x |
| Buffer Hits | 2841 | 105 | ~27x |
We achieved an ~8x speedup in execution time and a ~27x reduction in buffer reads. The query planner is smart enough to see that a smaller, more specific index exists that satisfies the WHERE clause and uses it. This is the power we want to harness, but manually creating these indexes is not a scalable solution.
The Core Strategy: A Dynamic Management System
Our goal is to automate the creation and eventual cleanup of these partial indexes for tenants that cross a certain activity or size threshold. This requires three components: Identification, Management, and Automation.
1. Identification: Finding Your Hot Tenants
First, we need a reliable way to define and find 'hot' tenants. The definition will vary by application—it could be based on row count, query frequency, or revenue. A simple and effective starting point is row count.
We can run a periodic query to find the top N tenants by data volume.
-- Find the top 5 tenants by row count in the analytics_events table
SELECT tenant_id, COUNT(*)
FROM analytics_events
GROUP BY tenant_id
ORDER BY COUNT(*) DESC
LIMIT 5;
This is a heavy query to run frequently on a massive table. For a more sophisticated, lower-overhead approach, you can leverage PostgreSQL's statistics extensions:
* pg_stat_statements: Keep in mind that this extension normalizes queries, replacing literals such as tenant_id values with placeholders ($1), so on its own it cannot tell you which tenants appear most often in WHERE clauses. It is still useful for spotting the hottest query shapes; to attribute traffic to specific tenants, pair it with statement logging (e.g., log_min_duration_statement plus a log analyzer) or application metrics.
* Application-Level Metrics: Your application itself is often the best source. You can maintain counters in a system like Redis for tenant activity and periodically flush the leaders to a database table for the index manager to consume (see the sketch after this list).
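As a sketch of that second option, here is a hypothetical sink table the application could flush its Redis counters into; the table name, columns, and hourly bucket granularity are all illustrative:

-- Hypothetical sink for application-side activity counters.
CREATE TABLE tenant_activity_rollup (
    tenant_id    UUID        NOT NULL,
    bucket_start TIMESTAMPTZ NOT NULL,   -- e.g., hourly buckets
    query_count  BIGINT      NOT NULL DEFAULT 0,
    PRIMARY KEY (tenant_id, bucket_start)
);

-- Periodic flush from the application ($1, $2 bound by the app):
INSERT INTO tenant_activity_rollup (tenant_id, bucket_start, query_count)
VALUES ($1, date_trunc('hour', NOW()), $2)
ON CONFLICT (tenant_id, bucket_start)
DO UPDATE SET query_count = tenant_activity_rollup.query_count + EXCLUDED.query_count;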
For our implementation, we will stick with the row count approach as it is self-contained within the database.
2. Management: A Schema for Tracking Dynamic Indexes
We cannot blindly create indexes. We need a manifest to track which indexes we've created, for which tenants, and on which tables. This prevents duplicate creation and provides a mechanism for cleanup.
CREATE TABLE system_managed_partial_indexes (
id SERIAL PRIMARY KEY,
schema_name TEXT NOT NULL,
table_name TEXT NOT NULL,
index_name TEXT NOT NULL UNIQUE,
index_definition TEXT NOT NULL,
tenant_id UUID NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
last_verified_at TIMESTAMPTZ,
decommissioned_at TIMESTAMPTZ,
-- A unique constraint to prevent creating the same logical index twice
UNIQUE(schema_name, table_name, tenant_id, index_definition)
);
* index_name: We will generate a unique, predictable name for each index.
* index_definition: Stores the core columns of the index (e.g., (event_type, created_at)).
* tenant_id: The tenant this index is for.
* decommissioned_at: A soft-delete flag for our cleanup process.
3. Automation: The PL/pgSQL Management Function
This is where we tie everything together. We'll create a stored procedure that performs the logic of identifying, creating, and logging the indexes. This function can be run periodically by a cron job or a database scheduler like pg_cron.
Here is a PL/pgSQL function to manage this process, designed to be idempotent and safe to re-run. One important wrinkle: CREATE INDEX CONCURRENTLY cannot run inside a transaction block, and every PL/pgSQL function executes inside one. The function therefore runs the DDL over a loopback dblink connection, which executes in its own session in autocommit mode (this requires the dblink extension and appropriate connection permissions).
-- One-time setup: dblink lets us run DDL from a separate autocommit session.
CREATE EXTENSION IF NOT EXISTS dblink;

CREATE OR REPLACE FUNCTION manage_hot_tenant_partial_indexes(
    p_schema_name      TEXT,
    p_table_name       TEXT,
    p_index_definition TEXT,   -- e.g., '(event_type, created_at)'
    p_tenant_threshold BIGINT, -- Minimum row count to be considered 'hot'
    p_max_hot_tenants  INT     -- Max number of tenants to manage
) RETURNS VOID AS $$
DECLARE
    hot_tenant   RECORD;
    v_find_sql   TEXT;
    v_index_name TEXT;
    v_sql        TEXT;
BEGIN
    -- The target table is only known at runtime, so the per-tenant count
    -- query is built with format(); %I gives safe identifier quoting.
    v_find_sql := format(
        'SELECT tenant_id, COUNT(*) AS row_ct
           FROM %I.%I
          GROUP BY tenant_id
         HAVING COUNT(*) >= $1
          ORDER BY row_ct DESC
          LIMIT $2',
        p_schema_name, p_table_name);

    FOR hot_tenant IN EXECUTE v_find_sql USING p_tenant_threshold, p_max_hot_tenants
    LOOP
        -- Skip tenants that already have an active managed index.
        CONTINUE WHEN EXISTS (
            SELECT 1
            FROM system_managed_partial_indexes smpi
            WHERE smpi.schema_name = p_schema_name
              AND smpi.table_name = p_table_name
              AND smpi.tenant_id = hot_tenant.tenant_id
              AND smpi.index_definition = p_index_definition
              AND smpi.decommissioned_at IS NULL
        );

        RAISE NOTICE 'Found hot tenant % for table %.% requiring an index.',
            hot_tenant.tenant_id, p_schema_name, p_table_name;

        -- Generate a unique and descriptive index name
        v_index_name := format('dyn_partial_idx_%s_%s_%s',
            p_table_name,
            -- Sanitize the index definition for use in an identifier
            translate(p_index_definition, '() ,', '___'),
            -- Use first 8 chars of tenant UUID for brevity
            left(hot_tenant.tenant_id::text, 8));
        -- PostgreSQL truncates identifiers at 63 bytes; trim explicitly so the
        -- name we log matches the name the server actually creates.
        v_index_name := left(v_index_name, 63);

        -- Construct the CREATE INDEX CONCURRENTLY statement.
        -- format() with %I and %L handles identifier and literal quoting.
        v_sql := format(
            'CREATE INDEX CONCURRENTLY %I ON %I.%I USING btree %s WHERE tenant_id = %L',
            v_index_name, p_schema_name, p_table_name,
            p_index_definition, hot_tenant.tenant_id);

        RAISE NOTICE 'Executing: %', v_sql;

        BEGIN
            -- CREATE INDEX CONCURRENTLY refuses to run inside a transaction
            -- block, and PL/pgSQL always executes in one, so run the DDL in a
            -- separate autocommit session via a loopback dblink connection.
            PERFORM dblink_exec(format('dbname=%s', current_database()), v_sql);

            -- If successful, log it to our management table.
            INSERT INTO system_managed_partial_indexes
                (schema_name, table_name, index_name, index_definition, tenant_id)
            VALUES
                (p_schema_name, p_table_name, v_index_name, p_index_definition, hot_tenant.tenant_id);

            RAISE NOTICE 'Successfully created and logged index %', v_index_name;
        EXCEPTION
            WHEN duplicate_table OR duplicate_object THEN
                RAISE WARNING 'Index % already exists. It may have been created outside this management script. Consider importing it.', v_index_name;
            WHEN others THEN
                RAISE WARNING 'Failed to create index for tenant %: % (a failed CONCURRENTLY build can leave an INVALID index behind)',
                    hot_tenant.tenant_id, SQLERRM;
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
To run this:
-- Create a partial index on (event_type, created_at) for any tenant with over
-- 2 million rows, but only manage up to 10 such tenants at a time.
SELECT manage_hot_tenant_partial_indexes(
p_schema_name := 'public',
p_table_name := 'analytics_events',
p_index_definition := '(event_type, created_at)',
p_tenant_threshold := 2000000,
p_max_hot_tenants := 10
);
This function is the heart of our system. It is safe to re-run, acts only on new hot tenants, and runs CREATE INDEX CONCURRENTLY from a separate dblink session so the application remains online during index creation.
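To run it on a schedule, pg_cron (mentioned earlier) is a natural fit. A sketch, assuming a recent pg_cron version that supports named jobs; the job name and cadence are illustrative:

-- Run the manager nightly; pg_cron executes the command in its own session.
SELECT cron.schedule(
    'manage-hot-tenant-indexes',  -- job name (illustrative)
    '15 3 * * *',                 -- every day at 03:15
    $$SELECT manage_hot_tenant_partial_indexes(
          'public', 'analytics_events', '(event_type, created_at)', 2000000, 10)$$
);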
Decommissioning: Handling Tenants That Cool Down
What happens when a hot tenant's activity drops? We need a process to clean up unused indexes to reclaim space and reduce write overhead.
An aggressive approach would be to drop indexes for any tenant that falls below the threshold. A more conservative, safer strategy is to drop only indexes that have been unused for a significant period. This requires more sophisticated usage tracking (e.g., from pg_stat_user_indexes, as sketched below).
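The manifest table makes such a check straightforward. The sketch below joins it against pg_stat_user_indexes to surface managed indexes that have never been scanned since the last statistics reset:

-- Managed indexes with zero scans since the last statistics reset.
-- Note: idx_scan counters reset with pg_stat_reset(), so interpret with care.
SELECT smpi.index_name, smpi.tenant_id, s.idx_scan
FROM system_managed_partial_indexes smpi
JOIN pg_stat_user_indexes s
  ON s.schemaname   = smpi.schema_name
 AND s.indexrelname = smpi.index_name
WHERE smpi.decommissioned_at IS NULL
  AND s.idx_scan = 0;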
A balanced approach is to keep an audit trail: drop the index, but record the decommissioning in the manifest rather than deleting the row, so tenants that heat up again are easy to spot. As with creation, DROP INDEX CONCURRENTLY cannot run inside a transaction block, so it too goes through a loopback dblink session.
CREATE OR REPLACE FUNCTION decommission_cold_tenant_indexes(
    p_schema_name      TEXT,
    p_table_name       TEXT,
    p_tenant_threshold BIGINT
) RETURNS VOID AS $$
DECLARE
    cold_tenant_index RECORD;
    v_row_ct          BIGINT;
    v_sql             TEXT;
BEGIN
    -- Walk the active managed indexes and re-check each tenant's row count.
    FOR cold_tenant_index IN
        SELECT smpi.id, smpi.index_name, smpi.tenant_id
        FROM system_managed_partial_indexes smpi
        WHERE smpi.schema_name = p_schema_name
          AND smpi.table_name = p_table_name
          AND smpi.decommissioned_at IS NULL
    LOOP
        -- The table name is dynamic, so the count uses EXECUTE ... USING.
        EXECUTE format('SELECT COUNT(*) FROM %I.%I WHERE tenant_id = $1',
                       p_schema_name, p_table_name)
           INTO v_row_ct
          USING cold_tenant_index.tenant_id;

        CONTINUE WHEN v_row_ct >= p_tenant_threshold;

        RAISE NOTICE 'Tenant % has cooled down. Decommissioning index %.',
            cold_tenant_index.tenant_id, cold_tenant_index.index_name;

        -- Like CREATE, DROP INDEX CONCURRENTLY cannot run inside a transaction
        -- block, so it goes through a loopback dblink session as well.
        v_sql := format('DROP INDEX CONCURRENTLY IF EXISTS %I.%I',
                        p_schema_name, cold_tenant_index.index_name);
        RAISE NOTICE 'Executing: %', v_sql;
        PERFORM dblink_exec(format('dbname=%s', current_database()), v_sql);

        -- Mark the manifest row only after the drop has actually succeeded.
        UPDATE system_managed_partial_indexes
        SET decommissioned_at = NOW()
        WHERE id = cold_tenant_index.id;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
This function can be run less frequently than the creation function (e.g., weekly vs. daily) to avoid thrashing—creating and dropping indexes for a tenant whose activity fluctuates around the threshold.
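Beyond cadence, a simple hysteresis guard also helps: decommission at a noticeably lower threshold than the one used for creation, so a tenant hovering near the creation threshold doesn't flip-flop. A sketch, again with pg_cron and illustrative numbers:

-- Weekly cleanup with a lower threshold than creation (2M to create, 1M to
-- drop), so borderline tenants don't oscillate between the two states.
SELECT cron.schedule(
    'decommission-cold-tenant-indexes',  -- job name (illustrative)
    '30 4 * * 0',                        -- Sundays at 04:30
    $$SELECT decommission_cold_tenant_indexes('public', 'analytics_events', 1000000)$$
);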
Advanced Considerations and Edge Cases
This strategy is powerful, but it's not a magic bullet. Senior engineers must consider the trade-offs.
* Write Performance Overhead: Every additional index incurs a penalty on INSERT, UPDATE, and DELETE operations for the affected rows. For a hot tenant, this means writes become slightly slower. You must benchmark this impact. If the table is extremely write-heavy (e.g., logging), the benefit to read queries may not justify the write penalty.
* Query Planner Complexity: With more available indexes, the query planner has more choices. In rare cases, it might make a suboptimal choice. Regularly monitor query performance for your hot tenants using pg_stat_statements to ensure the correct partial indexes are being used. Tools like pg_hint_plan are a last resort for forcing a plan in pathological cases.
* The CONCURRENTLY Caveat: CREATE INDEX CONCURRENTLY is not atomic. It can fail midway, leaving behind an invalid index. Your management script must be able to handle this, and you should have monitoring in place for invalid indexes (see the sketch after this list). Similarly, DROP INDEX CONCURRENTLY can also fail, requiring manual intervention.
* Autovacuum and Statistics: Plain column indexes can be used by the planner as soon as they are valid, but expression indexes need an ANALYZE before the planner has statistics on the indexed expression, and fresh table statistics help it cost all the alternatives correctly. A well-tuned autovacuum daemon (which also triggers auto-analyze) is essential for this strategy to work effectively.
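For the CONCURRENTLY caveat above, a minimal monitoring query for invalid indexes:

-- Invalid indexes left behind by failed CONCURRENTLY builds.
SELECT n.nspname AS schema_name,
       c.relname AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid;
-- Remediation: DROP INDEX CONCURRENTLY the invalid index and let the
-- manager recreate it on its next run.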
Alternative Strategy: When to Use Table Partitioning
Dynamic partial indexing is a surgical optimization. The heavyweight alternative is Table Partitioning. Using PARTITION BY LIST (tenant_id), you can create a separate physical table for each tenant (or groups of tenants).
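For concreteness, a minimal sketch of that layout (table and partition names are illustrative; the DEFAULT partition requires PostgreSQL 11+):

-- A list-partitioned variant of the events table, one partition per hot tenant.
CREATE TABLE analytics_events_partitioned (
    id BIGSERIAL,
    tenant_id UUID NOT NULL,
    event_type TEXT NOT NULL,
    payload JSONB,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY LIST (tenant_id);

CREATE TABLE analytics_events_tenant_a
    PARTITION OF analytics_events_partitioned
    FOR VALUES IN ('00000000-0000-0000-0000-00000000000a');

-- Everyone else lands in the default partition.
CREATE TABLE analytics_events_default
    PARTITION OF analytics_events_partitioned DEFAULT;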
| Feature | Dynamic Partial Indexing | Table Partitioning |
|---|---|---|
| Granularity | Optimizes queries for specific tenants within one large table. | Physically isolates data for each tenant. |
| Implementation | Can be added to an existing system with minimal disruption. | Major architectural change; best done from the start. |
| Primary Use Case | Read-heavy workloads with skewed data distribution. | Tenant data isolation, bulk deletion (DROP PARTITION). |
| Overhead | Moderate write overhead, small planner overhead. | High planner overhead, tricky foreign keys, DDL complexity. |
| Flexibility | Easy to add/remove optimizations for individual tenants. | Rigid structure. Adding a new tenant requires adding a partition. |
Heuristic: Use dynamic partial indexing when you need to surgically improve query performance for a subset of tenants on an existing, monolithic table. Use table partitioning when tenant data isolation is a core architectural requirement from day one, and you anticipate needing to manage data at the tenant level (e.g., tenant-specific backup/restore, easy data deletion for GDPR, etc.).
Conclusion
The standard (tenant_id, ...) composite index is the foundation of multi-tenant application performance, but it is a blunt instrument. For systems operating at scale, data skew is not an edge case; it is the norm. By implementing a robust, automated system for managing partial indexes, you can create a two-tiered performance model: a solid baseline for all customers provided by the composite index, and a supercharged, optimized experience for your largest and most active customers.
This dynamic partial indexing strategy is an advanced technique that requires careful monitoring and an understanding of its trade-offs. However, when applied correctly, it provides a powerful, targeted solution to one of the most common and challenging performance problems in multi-tenant architecture, allowing your database to scale gracefully with the success of your customers.