Bi-Temporal Modeling in Postgres with Range Types & Exclusion Constraints

15 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Bi-Temporal Challenge: Beyond Standard Versioning

As senior engineers, we're well-versed in versioning data. The most common pattern is System-Versioned Tables, often implemented as a Slowly Changing Dimension (SCD) Type 2. This uni-temporal model answers the question: "What did this record look like at time T?" It tracks the history of a record's state over its valid time—the time it was true in the real world.

However, a more complex and critical requirement exists in domains like finance, insurance, and legal systems: bi-temporality. A bi-temporal model must answer a fundamentally harder question: "What did we think this record looked like at time T?"

This introduces a second time dimension: transaction time. This is the period during which a given fact was recorded and considered current within the database itself.

  • Valid Time (valid_range): The time range when a fact is true in the real world. This is business-driven and can be in the past, present, or future. Example: A price change is effective from January 1st to June 30th.
  • Transaction Time (tx_range): The time range when a fact is stored in the database. The start is when the record is inserted, and the end is when it is superseded by a new version. This is system-driven and always moves forward. Example: We recorded this price change on December 15th.
Managing this bi-temporal state in application logic is a recipe for disaster. It's prone to race conditions, complex validation rules, and subtle bugs that corrupt your historical data. The only robust solution is to enforce these complex temporal constraints directly within the database. PostgreSQL, with its powerful range types and exclusion constraints, is uniquely equipped for this task.
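
As a concrete sketch, the two examples above can be written as range literals using PostgreSQL's tstzrange type (covered in detail below), with half-open [) bounds:

```sql
-- Valid time: the price is effective Jan 1 through Jun 30 (real-world truth)
-- Transaction time: we recorded the fact on Dec 15 and have not superseded it
SELECT
    tstzrange('2023-01-01 00:00+00', '2023-07-01 00:00+00', '[)') AS valid_range,
    tstzrange('2022-12-15 00:00+00', 'infinity',            '[)') AS tx_range;
```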

    This article will demonstrate how to build a production-grade, constraint-enforced bi-temporal model that makes data corruption virtually impossible.


    The Schema: From Brittle Timestamps to Resilient Ranges

    A naive approach might use four timestamp columns: valid_from, valid_to, tx_from, tx_to. This immediately presents problems:

  • Constraint Complexity: How do you enforce that for a given entity, no two records have overlapping valid and transaction times? This requires complex CHECK constraints or triggers that are difficult to write and maintain.
  • Querying Hell: Queries become a mess of BETWEEN clauses, handling NULL for open-ended times, and are notoriously difficult to get right and index effectively.
  • Data Integrity Gaps: Without a multi-dimensional constraint, it's trivial for concurrent transactions or application bugs to insert overlapping data, silently corrupting your audit trail.
The Superior Approach: `tstzrange` and `EXCLUDE USING GIST`

    We will leverage two advanced PostgreSQL features:

  • tstzrange: A native data type representing a range of timestamptz values. It has a lower and an upper bound and understands boundary inclusion/exclusion and infinity. (tsrange is the timezone-naive counterpart; we use tstzrange throughout.)
  • Exclusion Constraints (EXCLUDE): A generalization of UNIQUE constraints. They ensure that if any two rows are compared on the specified columns or expressions using the specified operators, at least one of these operator comparisons will return false or null. This is precisely what we need to prevent overlapping time ranges.
Let's design our schema for a products table where we need to track price history bi-temporally.

    First, we need the btree_gist extension, which lets B-tree operators such as equality (=) be used in a GiST index alongside the range operators our constraint requires.

    sql
    -- Enable the extension required for the exclusion constraint
    CREATE EXTENSION IF NOT EXISTS btree_gist;

    Now, the table definition. This is the core of our model.

    sql
    CREATE TABLE products (
        -- A surrogate primary key for the row itself
        id BIGSERIAL PRIMARY KEY,
    
        -- A stable business identifier for the product entity
        product_uuid UUID NOT NULL,
    
        -- Bi-temporal validity periods
        valid_range TSTZRANGE NOT NULL,
        tx_range TSTZRANGE NOT NULL,
    
        -- The actual data we are versioning
        price NUMERIC(10, 2) NOT NULL,
        currency CHAR(3) NOT NULL,
    
        -- The magic: A multi-dimensional constraint preventing overlaps.
        -- This ensures that for any given product_uuid, no two rows can have
        -- an overlapping valid_range AND an overlapping tx_range.
        EXCLUDE USING GIST (
            product_uuid WITH =,
            valid_range WITH &&,
            tx_range WITH &&
        )
    );
    
    -- Create a multi-column index for efficient querying
    CREATE INDEX products_temporal_idx ON products USING GIST (product_uuid, valid_range, tx_range);

    Let's break down the EXCLUDE constraint:

  • product_uuid WITH =: For any two rows being compared, if their product_uuid is the same...
  • valid_range WITH &&: ...and their valid_range overlaps (the && operator)...
  • tx_range WITH &&: ...and their tx_range also overlaps...
  • ...then the constraint is violated, and the operation is rejected.

    This single constraint provides the atomicity and consistency that would require pages of complex application logic. It is the bedrock of our bi-temporal model.
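
To see the constraint at work, insert two versions of the same product whose ranges overlap in both dimensions; the second insert is rejected. (The UUID is arbitrary, and the constraint name shown is the auto-generated one, which may differ in your database.)

```sql
INSERT INTO products (product_uuid, valid_range, tx_range, price, currency)
VALUES ('11111111-1111-1111-1111-111111111111',
        tstzrange('2023-01-01 00:00+00', 'infinity', '[)'),
        tstzrange('2023-01-01 00:00+00', 'infinity', '[)'),
        99.99, 'USD');

-- Overlaps the row above in BOTH valid_range and tx_range:
INSERT INTO products (product_uuid, valid_range, tx_range, price, currency)
VALUES ('11111111-1111-1111-1111-111111111111',
        tstzrange('2023-03-01 00:00+00', 'infinity', '[)'),
        tstzrange('2023-03-01 00:00+00', 'infinity', '[)'),
        89.99, 'USD');
-- ERROR:  conflicting key value violates exclusion constraint
--         "products_product_uuid_valid_range_tx_range_excl"
```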


    Implementing Write Operations: The Transactional Dance

    In a bi-temporal model, you never truly UPDATE or DELETE data. Every change is an INSERT of a new reality, coupled with the closing of a previous reality's transaction timeline. This requires careful transactional logic, best encapsulated in stored procedures.

    Let's define our conventions:

  • An open-ended range has the special timestamp 'infinity' as its upper bound, rather than a truly unbounded range (for which upper() would return NULL). This keeps checks like upper(tx_range) = 'infinity' simple.
  • All operations use statement_timestamp(), so every write inside a single function call shares one consistent timestamp.
1. Initial Product Creation

    This is the simplest operation. A new product is created with a price that is valid from now until forever, and this fact is recorded in the database from now until forever.

    sql
    CREATE OR REPLACE FUNCTION create_product(
        p_product_uuid UUID,
        p_price NUMERIC(10, 2),
        p_currency CHAR(3)
    ) RETURNS VOID AS $$
    DECLARE
        current_ts TIMESTAMPTZ := statement_timestamp();
    BEGIN
        INSERT INTO products (product_uuid, valid_range, tx_range, price, currency)
        VALUES (
            p_product_uuid,
            -- Valid in the real world from now until forever
            tstzrange(current_ts, 'infinity', '[)'),
            -- Recorded in the DB from now until forever
            tstzrange(current_ts, 'infinity', '[)'),
            p_price,
            p_currency
        );
    END;
    $$ LANGUAGE plpgsql;
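
A quick usage sketch with an arbitrary UUID; both ranges come back with 'infinity' as their upper bound:

```sql
SELECT create_product('22222222-2222-2222-2222-222222222222', 100.00, 'USD');

SELECT price, valid_range, tx_range
FROM products
WHERE product_uuid = '22222222-2222-2222-2222-222222222222';
```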

    2. The Bi-Temporal "Update": A Correction

    This is the most critical and complex operation. Imagine we quoted a price of $99.99, but we realize an hour later it should have been $95.99. This is a correction. The original fact was never true in the real world; we just recorded it incorrectly.

    Here's the process:

  1. Find the current active record for the product (the one with an open tx_range).
  2. "End" its transaction time by setting the upper bound of tx_range to the current timestamp. This preserves it as a historical record of what we used to think was true.
  3. INSERT a new record with the corrected data. The new record inherits the same valid_range as the record it corrects, but its tx_range starts from the current timestamp.

    sql
    CREATE OR REPLACE FUNCTION correct_product_price(
        p_product_uuid UUID,
        p_new_price NUMERIC(10, 2)
    ) RETURNS VOID AS $$
    DECLARE
        current_ts TIMESTAMPTZ := statement_timestamp();
        current_record products%ROWTYPE;
    BEGIN
        -- Step 1: Find the current active record and lock it
        SELECT * INTO current_record
        FROM products
        WHERE product_uuid = p_product_uuid
          AND upper(tx_range) = 'infinity'
        FOR UPDATE;
    
        IF NOT FOUND THEN
            RAISE EXCEPTION 'Product with UUID % not found or has no active record.', p_product_uuid;
        END IF;
    
        -- Step 2: End the transaction time of the old record
        UPDATE products
        SET tx_range = tstzrange(lower(tx_range), current_ts, '[)')
        WHERE id = current_record.id;
    
        -- Step 3: Insert the new, corrected record.
        -- Note: valid_range is carried over from the original record.
        INSERT INTO products (product_uuid, valid_range, tx_range, price, currency)
        VALUES (
            p_product_uuid,
            current_record.valid_range, -- Inherit the original valid_range
            tstzrange(current_ts, 'infinity', '[)'),
            p_new_price,
            current_record.currency
        );
    END;
    $$ LANGUAGE plpgsql;
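
Usage sketch, assuming a product with this arbitrary UUID was created earlier via create_product. After the correction the product carries two rows: the superseded belief with a closed tx_range, and the corrected one with an open tx_range.

```sql
SELECT correct_product_price('22222222-2222-2222-2222-222222222222', 95.99);

-- History of beliefs, oldest first:
SELECT price, valid_range, tx_range
FROM products
WHERE product_uuid = '22222222-2222-2222-2222-222222222222'
ORDER BY lower(tx_range);
-- Row 1: the original price, tx_range closed at the correction time
-- Row 2: the corrected price, same valid_range, tx_range still open
```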

    3. The Bi-Temporal "Update": A Real-World Change

    Now consider a scheduled price change. The price is $95.99 today, but starting next month, it will be $105.00. This is a change in the valid time, not a correction.

  1. Find the current active record.
  2. End its valid_range at the effective date. In our append-only model this takes two steps:
     a. End the tx_range of the current record.
     b. INSERT a new version of the old record with its valid_range closed at the effective date.
  3. INSERT a completely new record for the future price, with its valid_range starting from the effective date.

    sql
    CREATE OR REPLACE FUNCTION schedule_price_change(
        p_product_uuid UUID,
        p_new_price NUMERIC(10, 2),
        p_effective_date TIMESTAMPTZ
    ) RETURNS VOID AS $$
    DECLARE
        current_ts TIMESTAMPTZ := statement_timestamp();
        current_record products%ROWTYPE;
    BEGIN
        IF p_effective_date <= current_ts THEN
            RAISE EXCEPTION 'Effective date must be in the future.';
        END IF;
    
        -- Step 1: Find and lock the current active record
        SELECT * INTO current_record
        FROM products
        WHERE product_uuid = p_product_uuid
          AND upper(tx_range) = 'infinity'
          AND p_effective_date <@ valid_range -- Ensure the change occurs within the current validity
        FOR UPDATE;
    
        IF NOT FOUND THEN
            RAISE EXCEPTION 'Product with UUID % not found or has no valid record for the effective date.', p_product_uuid;
        END IF;
    
        -- Step 2a: End the transaction time of the current record
        UPDATE products
        SET tx_range = tstzrange(lower(tx_range), current_ts, '[)')
        WHERE id = current_record.id;
    
        -- Step 2b: Insert the 'terminated' version of the old record
        -- Its validity now ends at the new price's effective date.
        INSERT INTO products (product_uuid, valid_range, tx_range, price, currency)
        VALUES (
            p_product_uuid,
            tstzrange(lower(current_record.valid_range), p_effective_date, '[)'),
            tstzrange(current_ts, 'infinity', '[)'),
            current_record.price,
            current_record.currency
        );
    
        -- Step 3: Insert the new record for the future price
        INSERT INTO products (product_uuid, valid_range, tx_range, price, currency)
        VALUES (
            p_product_uuid,
            tstzrange(p_effective_date, 'infinity', '[)'),
            tstzrange(current_ts, 'infinity', '[)'),
            p_new_price,
            current_record.currency
        );
    END;
    $$ LANGUAGE plpgsql;
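
Usage sketch, again assuming an existing product with this arbitrary UUID; the effective date must lie in the future:

```sql
SELECT schedule_price_change('22222222-2222-2222-2222-222222222222', 105.00,
                             now() + interval '1 month');

-- Current knowledge (open tx_range) now holds two rows: the old price with a
-- closed valid_range, and the future price starting at the effective date.
SELECT price, valid_range
FROM products
WHERE product_uuid = '22222222-2222-2222-2222-222222222222'
  AND upper(tx_range) = 'infinity'
ORDER BY lower(valid_range);
```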

    This transactional logic is complex, but by encapsulating it within the database, we guarantee that no matter how the application calls it, the bi-temporal invariants are always maintained thanks to our EXCLUDE constraint.


    Querying Bi-Temporal Data: Unlocking History

    The real power of this model is revealed when we query it. Our GiST index is crucial for making these queries performant.

    Let's populate some data to demonstrate.

    sql
    -- A fixed UUID identifies our example product. Note that
    -- statement_timestamp() cannot be overridden, so the dates in the
    -- comments are illustrative: in a live session each call is stamped with
    -- its actual execution time. (For a truly reproducible script, you would
    -- add an explicit timestamp parameter to each function.) Run the calls
    -- as separate statements so each gets its own transaction timestamp.

    -- 2023-01-01: Product created with price $100
    SELECT create_product('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 100.00, 'USD');

    -- 2023-02-01: We realize the price should have been $95 all along (a correction)
    SELECT correct_product_price('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 95.00);

    -- 2023-03-01: We schedule a price increase to $110, effective 2023-04-01
    -- (the effective date must lie in the future relative to your clock)
    SELECT schedule_price_change('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 110.00,
                                 '2023-04-01 00:00:00 UTC');

    After these operations, our products table will contain several rows representing the full history. Now, let's ask some questions.

    Query 1: What is the current price? ("As-of-Now")

    This is the simplest query. We want the record that is currently valid in the real world and is the latest version in our database.

    sql
    SELECT product_uuid, price, currency
    FROM products
    WHERE now() <@ valid_range
      AND upper(tx_range) = 'infinity';

    Query 2: What was the price on 2023-01-15? ("As-of" Query)

    This is a standard uni-temporal query. We want the record that was valid on a specific date, according to our current knowledge.

    sql
    SELECT product_uuid, price, currency
    FROM products
    WHERE '2023-01-15'::timestamptz <@ valid_range
      AND upper(tx_range) = 'infinity';
    
    -- Result: $95.00. We are querying with today's knowledge, which includes the correction.

    Query 3: What did we THINK the price was on 2023-01-15, if we had asked on 2023-01-20? ("As-was-known" Query)

    This is the quintessential bi-temporal query. We are time-traveling back to a point in the database's history and asking a question about a point in real-world time.

    sql
    SELECT product_uuid, price, currency
    FROM products
    WHERE '2023-01-20'::timestamptz <@ tx_range -- The state of the DB on this date
      AND '2023-01-15'::timestamptz <@ valid_range; -- The real-world validity on this date
    
    -- Result: $100.00. On Jan 20th, we had not yet made the correction, so we still thought the price was $100.

    Query 4: What did we THINK the price was on 2023-01-15, if we had asked on 2023-02-15?

    Same question, but now we ask after the correction was made.

    sql
    SELECT product_uuid, price, currency
    FROM products
    WHERE '2023-02-15'::timestamptz <@ tx_range -- The state of the DB on this date
      AND '2023-01-15'::timestamptz <@ valid_range; -- The real-world validity on this date
    
    -- Result: $95.00. By Feb 15th, the correction was in the database.

    These queries demonstrate the power of the model. The GiST index on (product_uuid, valid_range, tx_range) is critical for ensuring these lookups are efficient, as it can search across all three dimensions simultaneously.
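
You can verify that a point-in-time lookup uses the index; the exact plan shape varies with table size and statistics, and a tiny table may still get a sequential scan:

```sql
EXPLAIN
SELECT price
FROM products
WHERE product_uuid = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'
  AND '2023-01-15'::timestamptz <@ valid_range
  AND '2023-01-20'::timestamptz <@ tx_range;
-- Expect an Index Scan (or Bitmap Index Scan) on products_temporal_idx once
-- the table is large enough for the planner to prefer it.
```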


    Performance and Edge Case Considerations

    This model is powerful but comes with trade-offs and requires careful management in production.

    Performance Implications

  • Write Amplification: Every logical UPDATE becomes at least one UPDATE and one INSERT. A change to the valid_range is an UPDATE and two INSERTs. This increases write load and can lead to table bloat faster than a traditional model. Aggressive autovacuum tuning is essential for these tables.
  • Index Size: GiST indexes are larger than B-Tree indexes. The exclusion constraint itself is implemented via a GiST index. Monitor index size and bloat.
  • Query Performance: While the GiST index is highly effective for point-in-time queries (<@ operator), complex analytical queries that scan large portions of history may require different strategies, such as materialized views or ETL processes into an analytical store.
  • Benchmarking is Key: Before deploying, benchmark your specific write and read patterns. For a write-heavy system, the overhead may be too high. For a read-heavy system requiring strong auditability, it's often a perfect fit.
Advanced Edge Cases

  • Backdating and Out-of-Order Data: What if you need to insert a fact that was valid in the past but you only learned about it today? Example: A supplier informs you on March 1st that a price was actually $80 for the period of Jan 15th to Jan 31st. This is extremely complex. It requires "splitting" the existing historical records during that period. The logic inside your PL/pgSQL function would need to find all affected historical records, end-date their transaction time, and insert a complex web of new records to patch the history. This is the ultimate test of your bi-temporal logic.
  • Schema Migrations: Adding a non-nullable column with a default to a bi-temporal table is a significant operation. You are not just altering a table; you are altering its entire history. You must decide if the new column's default value applies to all historical records or if they should be marked as NULL (if nullable). This often requires a carefully scripted data migration, not a simple ALTER TABLE.
  • Clock Skew: The model relies heavily on statement_timestamp(). In a distributed system with multiple application servers, clock skew can lead to transaction time inconsistencies. For high-stakes financial systems, it's often recommended to have a single, authoritative time source or to pass the transaction timestamp from a central service rather than relying on the database server's local clock in a multi-node cluster.
  • Timezone Hell: Always use TIMESTAMPTZ. Storing timestamps without timezone information in a temporal model will lead to incorrect query results and data corruption when data is written or read from different timezones.
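
The backdating case above can be sketched as a function that closes the transaction time of every current row overlapping the patch window, re-inserts the untouched remainders, and records the new fact. This is a hypothetical, simplified sketch: it assumes a single currency and does not validate that the patch window is fully covered by existing records.

```sql
CREATE OR REPLACE FUNCTION backdate_price(
    p_product_uuid UUID,
    p_price NUMERIC(10, 2),
    p_range TSTZRANGE  -- the real-world period the backdated fact covers
) RETURNS VOID AS $$
DECLARE
    ts TIMESTAMPTZ := statement_timestamp();
    r  products%ROWTYPE;
BEGIN
    FOR r IN
        SELECT * FROM products
        WHERE product_uuid = p_product_uuid
          AND upper(tx_range) = 'infinity'
          AND valid_range && p_range
        FOR UPDATE
    LOOP
        -- End the transaction time of the affected belief
        UPDATE products
        SET tx_range = tstzrange(lower(tx_range), ts, '[)')
        WHERE id = r.id;

        -- Re-insert the untouched left remainder, if any
        IF lower(r.valid_range) < lower(p_range) THEN
            INSERT INTO products (product_uuid, valid_range, tx_range, price, currency)
            VALUES (p_product_uuid,
                    tstzrange(lower(r.valid_range), lower(p_range), '[)'),
                    tstzrange(ts, 'infinity', '[)'),
                    r.price, r.currency);
        END IF;

        -- Re-insert the untouched right remainder, if any
        IF upper(r.valid_range) > upper(p_range) THEN
            INSERT INTO products (product_uuid, valid_range, tx_range, price, currency)
            VALUES (p_product_uuid,
                    tstzrange(upper(p_range), upper(r.valid_range), '[)'),
                    tstzrange(ts, 'infinity', '[)'),
                    r.price, r.currency);
        END IF;
    END LOOP;

    -- Finally, record the backdated fact itself
    INSERT INTO products (product_uuid, valid_range, tx_range, price, currency)
    VALUES (p_product_uuid, p_range, tstzrange(ts, 'infinity', '[)'), p_price,
            (SELECT currency FROM products
             WHERE product_uuid = p_product_uuid LIMIT 1));
END;
$$ LANGUAGE plpgsql;
```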
Conclusion: A Powerful Pattern for Critical Systems

    Implementing a bi-temporal data model is not a trivial undertaking. It introduces complexity in your write path and requires a shift in thinking about how data is queried. However, by leveraging PostgreSQL's native range types and exclusion constraints, you can offload the most dangerous and error-prone part of this model—the enforcement of temporal invariants—directly to the database.

    This creates a system that is not just auditable but provably correct. Your application code is simplified because it no longer needs to worry about preventing temporal overlaps; it can simply perform its business logic, and the database will guarantee the integrity of your historical record. For any system where the history of what was known and when is as important as the data itself, this pattern is the gold standard for building resilient, trustworthy software.
