Postgres Time-Series Retention with pg_partman & Declarative Partitioning

Goh Ling Yong

The Inevitable Failure Mode of Monolithic Time-Series Tables

Your application, instrumented for observability, generates millions of metric data points per hour. Your metrics table, once manageable, is now approaching several terabytes. The DELETE FROM metrics WHERE event_timestamp < now() - interval '90 days' query, once a nightly routine, now runs for hours, churning through dead tuples and WAL while its long-running transaction prevents VACUUM from reclaiming space elsewhere, causing cascading performance degradation across your entire system. VACUUM struggles to keep up, leading to table and index bloat. This isn't a hypothetical scenario; it's the standard lifecycle endpoint for any successful system logging time-series data into a single, monolithic table. The fundamental problem is that deleting massive volumes of rows is one of the most expensive operations in an MVCC database like PostgreSQL: every row must be marked dead, every index entry visited, and the space reclaimed only later by VACUUM. The solution is to architect your system so you never have to.

This article details the production-ready pattern for managing large-scale time-series data in PostgreSQL by combining its powerful native declarative partitioning with the pg_partman extension for automation. We will not cover the basics of what partitioning is. We assume you know why you need it. Instead, we will focus on the practical implementation of a fully automated lifecycle system: creating new partitions for upcoming data, enforcing retention policies by dropping old partitions, and navigating the performance nuances and operational edge cases that arise in a production environment.

Why Native Declarative Partitioning Is Only Half the Solution

PostgreSQL 10 introduced declarative partitioning, refined substantially in subsequent releases, as a massive improvement over the old trigger-based inheritance model. You can define a parent table and a partitioning strategy with ease:

sql
CREATE TABLE metrics (
    metric_id BIGSERIAL,
    device_id UUID NOT NULL,
    event_timestamp TIMESTAMPTZ NOT NULL,
    metric_name TEXT NOT NULL,
    metric_value DOUBLE PRECISION NOT NULL,
    PRIMARY KEY (metric_id, event_timestamp)
)
PARTITION BY RANGE (event_timestamp);

-- Manually create a partition for the current month
CREATE TABLE metrics_p2024_07 PARTITION OF metrics
    FOR VALUES FROM ('2024-07-01 00:00:00+00') TO ('2024-08-01 00:00:00+00');

This is excellent for query performance. When you query with a WHERE clause on event_timestamp, the query planner performs partition pruning, scanning only the relevant child tables (partitions) and ignoring the rest. An EXPLAIN plan will clearly show this:

text
->  Append
    ->  Index Scan using metrics_p2024_07_pkey on metrics_p2024_07
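
For instance, a query like the following (a hypothetical range falling inside the July partition defined above) lets the planner scan only the partition whose range can contain matching rows:

sql
EXPLAIN
SELECT *
FROM metrics
WHERE event_timestamp >= '2024-07-10 00:00:00+00'
  AND event_timestamp <  '2024-07-11 00:00:00+00';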

However, this native functionality presents two significant operational gaps for a time-series workload:

  • No Automatic Partition Creation: When the clock ticks over to August 1st, any INSERT statement will fail with ERROR: no partition of relation "metrics" found for row. An operator must manually create the metrics_p2024_08 partition beforehand.
  • No Automatic Retention Management: When data from April needs to be purged, there's no built-in mechanism to automatically DROP the metrics_p2024_04 partition. This, too, requires manual intervention.

This is where pg_partman enters the architecture. It's a PostgreSQL extension that acts as a management layer on top of the native partitioning system, automating these exact lifecycle tasks.

    Production Implementation: `pg_partman` Configuration

    Let's build a robust, automated partitioning scheme for our metrics table. We'll aim for daily partitions and a 90-day retention policy.

    1. Initial Setup and Schema

    First, ensure the extension is installed and enabled in your database.

    sql
    -- This step might require OS-level installation of the pg_partman package
    -- (e.g., `apt-get install postgresql-15-partman`)
    CREATE EXTENSION IF NOT EXISTS pg_partman;

    Now we'll redefine our table. A critical schema decision here is the primary key: PostgreSQL requires that any primary key or unique constraint on a partitioned table include the partition key, so event_timestamp must be part of the composite primary key. Note that this means uniqueness is enforced only on (metric_id, event_timestamp), not on metric_id alone.

    sql
    -- Drop the old table if it exists
    DROP TABLE IF EXISTS metrics CASCADE;
    
    -- Create the parent, partitioned table
    CREATE TABLE metrics (
        metric_id BIGSERIAL NOT NULL,
        device_id UUID NOT NULL,
        event_timestamp TIMESTAMPTZ NOT NULL,
        metric_name TEXT NOT NULL,
        metric_value DOUBLE PRECISION NOT NULL,
        -- The partition key MUST be part of the primary key.
        PRIMARY KEY (metric_id, event_timestamp)
    )
    PARTITION BY RANGE (event_timestamp);
    
    -- Create an index that will be propagated to all child partitions.
    -- Placing event_timestamp first is optimal for time-range queries.
    CREATE INDEX idx_metrics_timestamp_device ON metrics (event_timestamp DESC, device_id);

    2. Configuring `pg_partman` for Lifecycle Management

    Instead of manually creating partitions, we'll register the table with pg_partman using its create_parent function. This function is the control center for your partitioning strategy.

    sql
    -- Complete pg_partman setup for the 'metrics' table
    DO $$
    BEGIN
        PERFORM partman.create_parent(
            p_parent_table := 'public.metrics',
            p_control := 'event_timestamp',
            p_interval := '1 day',
            p_type := 'native',
            p_premake := 4,
            p_start_partition := (now() - interval '3 days')::text
        );
    END;
    $$;

    Let's break down these critical parameters:

    * p_parent_table: The table to manage.

    * p_control: The column to partition by (our timestamp).

    * p_interval: The duration of each partition. We've chosen '1 day'. This could also be '1 hour', '1 week', '1 month', etc.

    * p_type: Must be 'native' to use PostgreSQL's declarative partitioning. (In the pg_partman 5.x series this parameter was removed, since native partitioning became the only supported mode; the examples here assume 4.x.)

    * p_premake: This is a crucial operational buffer. It tells pg_partman to always maintain a certain number of future-dated partitions. With p_premake := 4, partitions for the next 4 days always exist, so inserts for a new day have a target partition even if a maintenance run is delayed or missed. Without that buffer, rows for a missing day would land in the default partition that pg_partman creates (or fail outright if no default exists) and would have to be moved out later; see the check after this list.

    * p_start_partition: This is only necessary if you need to create partitions in the past, for example, during an initial migration. For a new system, you can often omit this. Here, we're creating partitions for the last 3 days to handle potentially delayed data.
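
    As a quick safety net, you can check whether anything has strayed into the default partition. This is a minimal sketch assuming pg_partman 4.x conventions, where the default child is named metrics_default and partition_data_proc() is available to move rows into proper partitions:

    sql
    -- Rows that arrive with no matching child partition land in the default partition.
    SELECT count(*) FROM metrics_default;

    -- If any rows are found, relocate them into the correct child partitions.
    CALL partman.partition_data_proc('public.metrics');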

    After running this, you can inspect your schema. You'll see that pg_partman has created several partitions (metrics_p2024_...) and populated its configuration table, partman.part_config.
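
    For example (column names as in the pg_partman 4.x series):

    sql
    -- List the child partitions pg_partman has created
    SELECT partition_schemaname, partition_tablename
    FROM partman.show_partitions('public.metrics');

    -- Review the stored configuration for this partition set
    SELECT parent_table, control, partition_interval, premake
    FROM partman.part_config
    WHERE parent_table = 'public.metrics';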

    3. Automating Maintenance with `run_maintenance_proc`

    pg_partman doesn't run as a background daemon. You must schedule its maintenance function to run periodically. This function is responsible for creating new partitions (respecting premake) and dropping old ones (based on retention rules we'll set next).

    The recommended way is via a scheduler like cron or pg_cron.

    Using pg_cron:

    sql
    -- Assuming pg_cron extension is installed and configured in postgresql.conf
    
    -- Schedule the maintenance to run every hour.
    -- A higher frequency is safe and recommended to handle any issues promptly.
    SELECT cron.schedule(
        'pgpartman-maintenance',
        '0 * * * *',  -- Every hour at minute 0
        $$CALL partman.run_maintenance_proc()$$ 
    );

    This single procedure call will inspect all tables registered in partman.part_config and perform the necessary operations. Running it hourly ensures that your premake buffer is always filled and retention policies are applied in a timely manner.
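
    As a quick sanity check (assuming a recent pg_cron version, where named jobs are stored in cron.job), you can confirm the job is registered and active:

    sql
    SELECT jobid, jobname, schedule, command, active
    FROM cron.job
    WHERE jobname = 'pgpartman-maintenance';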

    4. Implementing the 90-Day Retention Policy

    Now we configure the automatic deletion of old data. This is done by updating the partman.part_config entry for our table.

    sql
    UPDATE partman.part_config
    SET
        retention = '90 days',
        retention_keep_table = false,
        retention_keep_index = false
    WHERE parent_table = 'public.metrics';

    * retention: Sets the policy. Any child partition whose entire time range falls outside this interval will be targeted for removal on the next maintenance run.

    * retention_keep_table: When set to false, pg_partman will execute a DROP TABLE on the expired partition; the default is true, which only detaches it. Dropping a partition is a near-instantaneous metadata operation regardless of the partition's size; it briefly takes locks but never scans or rewrites the data. This is the core performance benefit.

    * retention_keep_index: Only relevant when the expired table is kept (detached) rather than dropped; set to false, it drops the indexes on the detached table. When the partition is dropped outright, its indexes are removed with it.
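
    A simple sanity check once maintenance has run is to look at the oldest surviving children; with daily partitions and a 90-day policy, the oldest partition should be roughly 90 days old:

    sql
    -- Oldest child partitions still attached to the parent
    SELECT partition_schemaname, partition_tablename
    FROM partman.show_partitions('public.metrics', 'ASC')
    LIMIT 5;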

    Advanced Strategy: DETACH for Cold Storage Archival

    In some systems, you can't simply drop old data. You may need to archive it to cheaper storage like Amazon S3. pg_partman supports this workflow by detaching partitions instead of dropping them.

    To implement this, you would change retention_keep_table to true. When the maintenance job runs, it executes ALTER TABLE metrics DETACH PARTITION on each expired child, and the old partition becomes a standalone, regular table. Your archival process can then dump the contents of that standalone table (e.g., using COPY ... TO ...) and, once the export has been verified, execute DROP TABLE on it. This gives you a safe window in which the data exists both in PostgreSQL and in your archive.
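
    A minimal sketch of the archival pass, assuming a detached daily partition whose (hypothetical) name is metrics_p2024_04_01 and a local staging path that a separate process ships to S3:

    sql
    -- Export the detached partition's rows to a staging file.
    -- (Server-side COPY TO a file requires superuser or the pg_write_server_files role.)
    COPY metrics_p2024_04_01 TO '/var/lib/postgresql/archive/metrics_p2024_04_01.csv'
        WITH (FORMAT CSV, HEADER);

    -- Once the upload to cold storage has been verified, remove the table.
    DROP TABLE metrics_p2024_04_01;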

    Advanced Scenarios and Edge Case Management

    A production system is never as clean as a simple setup. Here’s how to handle common complex scenarios.

    Edge Case 1: Backfilling Historical Data

    You need to import a year's worth of historical metrics. If you simply start inserting through the parent table, rows for dates with no matching child partition will either fail or accumulate in the default partition and have to be relocated afterwards, which is slow and inefficient for bulk loads. The correct, high-performance approach is to build the historical partitions as standalone tables, load data into them directly, and then attach them.

    Workflow for backfilling Q1 2023 data:

  • Temporarily pause pg_partman maintenance for this table, e.g. by setting automatic_maintenance = 'off' in partman.part_config, so that the retention policy cannot immediately drop the historical partitions you are about to attach; re-enable it once the backfill is complete.
  • Create the target tables with the exact same structure as the parent, but as standalone tables.

        sql
        -- Create a standalone table for January 2023 data
        CREATE TABLE metrics_backfill_2023_01 (
            LIKE metrics INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES
        );

  • Use COPY for a high-performance data load into the standalone table.

        sql
        COPY metrics_backfill_2023_01 FROM '/path/to/data/jan_2023.csv' WITH (FORMAT CSV);

  • Add the partition constraint to the table. This is a critical step: adding the CHECK constraint scans the standalone table to validate it, but it does so without touching the parent, and it allows the subsequent ATTACH to skip its own validation scan.

        sql
        ALTER TABLE metrics_backfill_2023_01 ADD CONSTRAINT check_partition
            CHECK (event_timestamp >= '2023-01-01 00:00:00+00' AND event_timestamp < '2023-02-01 00:00:00+00');

  • Attach the prepared table as a partition. Because the matching constraint already exists, this is a metadata-only operation and is effectively instantaneous.

        sql
        ALTER TABLE metrics ATTACH PARTITION metrics_backfill_2023_01
            FOR VALUES FROM ('2023-01-01 00:00:00+00') TO ('2023-02-01 00:00:00+00');

    This CREATE -> COPY -> ADD CONSTRAINT -> ATTACH pattern is vastly more performant for bulk loading than INSERTing into the parent table.

    Edge Case 2: Changing the Partitioning Interval

    Your system grows, and daily partitions are now too granular. You decide to switch to weekly partitions. This is a non-trivial migration that requires careful planning.

  • Stop Data Ingestion: You must prevent writes to the parent table during the migration.
  • Update pg_partman Configuration: Change the partition interval in partman.part_config (the column is partition_interval).

        sql
        UPDATE partman.part_config
        SET partition_interval = '1 week'
        WHERE parent_table = 'public.metrics';
  • Manually Run Maintenance: Run CALL partman.run_maintenance_proc() to create the new weekly-partitioned tables going forward.
  • Handle Existing Daily Partitions: You have a choice:

    * Leave them: pg_partman will continue to manage and eventually drop them according to the retention policy. Your queries will still work, but the planner will have to scan more partitions for a week-long query.

    * Consolidate them (Advanced): This involves creating a new weekly partition, moving the data from the seven daily partitions into it, and then dropping the old daily partitions. This is a complex data migration that should be scripted and tested thoroughly; a sketch follows below.
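
    The following is an illustrative sketch of consolidating one week of daily partitions, not a drop-in script; the partition names and target range are hypothetical, and it should be rehearsed on a copy of the data first:

    sql
    BEGIN;

    -- Detach the daily partitions covering the target week (repeat for all seven days).
    ALTER TABLE metrics DETACH PARTITION metrics_p2024_07_01;
    -- ... metrics_p2024_07_02 through metrics_p2024_07_07 ...

    -- Create a weekly partition spanning the now-vacant range.
    CREATE TABLE metrics_p2024_w27 PARTITION OF metrics
        FOR VALUES FROM ('2024-07-01 00:00:00+00') TO ('2024-07-08 00:00:00+00');

    -- Move the data across, then drop the old daily tables (repeat for all seven days).
    INSERT INTO metrics_p2024_w27 SELECT * FROM metrics_p2024_07_01;
    DROP TABLE metrics_p2024_07_01;

    COMMIT;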

    Edge Case 3: Monitoring and Failure Alerting

    What happens if run_maintenance_proc() fails, perhaps due to a permissions issue, disk space exhaustion, or a conflicting lock? If it fails silently, you might not notice until inserts start failing (or quietly piling up in the default partition) because a new partition was never created. Robust monitoring is essential.

    pg_partman does not maintain a dedicated error-log table of its own; detailed run logging is delegated to the optional pg_jobmon extension. If you schedule maintenance through pg_cron, the simplest signal is pg_cron's own run history in cron.job_run_details, which records the status and error message of every scheduled execution.

    You should have a separate monitoring process that checks this history.

    Example Monitoring Query:

    sql
    -- This query should be run by your monitoring system (e.g., Nagios, Datadog, Prometheus postgres_exporter).
    -- An alert should be triggered if it returns any rows.
    -- Assumes maintenance is scheduled via a recent pg_cron version, which records run history.
    SELECT
        start_time,
        status,
        return_message
    FROM cron.job_run_details
    WHERE command LIKE '%run_maintenance_proc%'
      AND status = 'failed'
      AND start_time > now() - interval '1 hour';

    Additionally, monitor the number of future partitions. If this number drops below your premake value, it's a sign that maintenance is not running correctly.

    sql
    -- Check if the number of premade (future) partitions is sufficient.
    -- Alert if this returns a value less than the configured premake (e.g., < 4).
    -- Daily child names (metrics_pYYYY_MM_DD) sort chronologically, so comparing
    -- against the partition name for now() (pg_partman 4.x functions) works.
    SELECT COUNT(*)
    FROM partman.show_partitions('public.metrics', 'DESC')
    WHERE partition_tablename > (
        SELECT partition_table
        FROM partman.show_partition_name('public.metrics', now()::text)
    );

    Performance Deep Dive: Indexes and Partition Pruning

    The entire benefit of partitioning hinges on the query planner's ability to prune partitions. This is directly tied to your indexing strategy.

    Consider these two queries:

    sql
    -- Query 1: Optimal - Pruning key is in WHERE clause
    EXPLAIN ANALYZE
    SELECT * FROM metrics
    WHERE event_timestamp >= '2024-07-15 00:00:00+00'
      AND event_timestamp <  '2024-07-16 00:00:00+00';
    
    -- Query 2: Sub-optimal - Pruning key is not constrained
    EXPLAIN ANALYZE
    SELECT * FROM metrics
    WHERE device_id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d';

    Query 1's plan will show it only scans a single partition (metrics_p2024_07_15). The query time is independent of how many total months or years of data you have.

    Query 2's plan, however, will show an Append node that iterates over every single partition, because the planner cannot rule any child out based on the WHERE clause. Worse, since device_id is not the leading column of idx_metrics_timestamp_device, each child is likely to fall back to a sequential scan rather than an efficient index lookup. The performance of this query will degrade linearly as you add more partitions.

    Key Takeaway: For high-performance queries, you must include a constraint on the partition key (event_timestamp) in your WHERE clauses whenever possible. If you frequently need to query by device_id without a time range, this partitioning strategy may not be sufficient on its own, and you might need to consider a complementary architecture, such as a separate table or materialized view that indexes devices and their most recent metric data.
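
    One hedged sketch of such a complementary structure is a periodically refreshed "latest metric per device" materialized view; the name, lookback window, and refresh cadence below are all illustrative:

    sql
    -- Latest reading per (device, metric) over a recent window.
    CREATE MATERIALIZED VIEW device_latest_metrics AS
    SELECT DISTINCT ON (device_id, metric_name)
        device_id,
        metric_name,
        event_timestamp,
        metric_value
    FROM metrics
    WHERE event_timestamp > now() - interval '1 day'  -- keeps the scan pruned to recent partitions
    ORDER BY device_id, metric_name, event_timestamp DESC;

    -- A unique index enables non-blocking refreshes and fast device lookups.
    CREATE UNIQUE INDEX ON device_latest_metrics (device_id, metric_name);

    -- Run periodically (e.g., via pg_cron):
    REFRESH MATERIALIZED VIEW CONCURRENTLY device_latest_metrics;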

    Conclusion: An Architectural Pattern for Scalability

    By layering the pg_partman extension on top of PostgreSQL's native declarative partitioning, you move from a manual, error-prone process to a fully automated, self-managing data lifecycle system. This architectural pattern is a prerequisite for scaling time-series workloads to the terabyte level and beyond.

    The core principle is the replacement of expensive, row-based DELETE operations with instantaneous, metadata-only DROP PARTITION commands. This eliminates table bloat, avoids long-running locks, and ensures predictable database performance, regardless of the total data volume.

    Effective implementation, however, requires moving beyond the basic setup. It demands a deep understanding of performance tuning (like premake), robust operational procedures for edge cases like backfilling and schema migrations, and comprehensive monitoring to ensure the automation is functioning as expected. This is not just a database feature; it's a foundational component of a scalable data platform.
