Advanced Bi-temporal Modeling in Postgres with Range Types & Constraints

September 30, 2025

20 min read

Goh Ling Yong

Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

Beyond Simple Versioning: The Case for Bi-temporal Data

In systems engineering, tracking data changes is a solved problem, right? We have audit logs, updated_at timestamps, and perhaps versioning tables populated by triggers. These uni-temporal solutions track the state of data as it was recorded in the database over time. This is sufficient for many applications, but it fails catastrophically when the business logic requires distinguishing between when an event happened in the real world and when it was recorded in our system.

Consider a payroll system. An employee's salary is set to $80,000, effective January 1st. On February 15th, HR discovers a clerical error; the salary should have been $85,000, also effective January 1st. A simple UPDATE statement destroys information. We lose the fact that for 45 days, the system believed the salary was $80,000. This distinction is critical for auditing, dispute resolution, and accurate historical reporting.

This is the domain of bi-temporal modeling. We manage two distinct time axes:

Valid Time (valid_time): The time period during which a fact is true in the real world. In our example, the $85,000 salary is valid from January 1st onwards.

Transaction Time (transaction_time): The time period during which a fact is stored in the database. The incorrect $80,000 record was part of the database's state from its creation until February 15th. The corrected $85,000 record is part of the state from February 15th onwards.

Implementing this pattern from scratch can be a minefield of overlapping intervals, complex application logic, and subtle data integrity bugs. Fortunately, PostgreSQL provides a powerful, often underutilized, set of tools—namely range types and exclusion constraints—that allow us to enforce bi-temporal integrity directly at the database level.

This article is not an introduction. It's a production-focused guide to building a robust bi-temporal model in PostgreSQL, complete with schema design, advanced data manipulation functions, complex query patterns, and performance considerations.

Section 1: The Production-Grade Bi-temporal Schema

Our goal is to create a table that can immutably store the history of employee salaries, respecting both valid and transaction time. We will use tstzrange for our time axes, which represents a range of timestamp with time zone. The lower bound is inclusive, and the upper bound is exclusive, which is a perfect fit for time-based intervals.

First, let's enable the necessary extension:

sql

CREATE EXTENSION IF NOT EXISTS btree_gist;

Now, the core table structure. We will track salaries for employees.

sql

CREATE TABLE employee_salaries (
    -- A surrogate primary key for unique row identification
    id BIGSERIAL PRIMARY KEY,

    -- The business key identifying the employee
    employee_id INT NOT NULL,

    -- The actual data point we are versioning
    salary NUMERIC(10, 2) NOT NULL CHECK (salary > 0),

    -- VALID TIME: When this salary is effective in the real world
    valid_time TSTZRANGE NOT NULL,

    -- TRANSACTION TIME: When this record was considered current in the database
    transaction_time TSTZRANGE NOT NULL,

    -- This is the magic. It prevents data integrity violations.
    -- For a given employee, you cannot have two currently active records
    -- whose real-world validity periods (valid_time) overlap.
    CONSTRAINT no_overlapping_current_valid_time
    EXCLUDE USING GIST (employee_id WITH =, valid_time WITH &&)
    WHERE (UPPER(transaction_time) IS NULL OR UPPER(transaction_time) = 'infinity')
);

Deconstructing the Schema and Constraints

* id: A standard surrogate primary key. While our business logic will revolve around employee_id and the time ranges, having a simple, immutable row identifier is crucial for foreign key relationships and unambiguous row references.

* employee_id: The foreign key to an employees table. This is our partitioning key for the temporal data.

* valid_time: TSTZRANGE representing the real-world effective period. A range of ['2023-01-01', '2024-01-01') means the salary was valid from the start of Jan 1, 2023, up to, but not including, the start of Jan 1, 2024.

* transaction_time: TSTZRANGE representing the database record's lifespan. When we insert a new record, we'll typically set this to [now(), 'infinity'). When we supersede it, we'll update the upper bound to now().

* EXCLUDE USING GIST (...): This is the heart of our data integrity model. Let's break it down:

* EXCLUDE USING GIST: Specifies an exclusion constraint, which is only possible with index types that support the && (overlaps) operator, like GiST.

* employee_id WITH =: The constraint applies on a per-employee basis. The equality operator = checks that we are comparing rows for the same employee.

* valid_time WITH &&: This is the core check. It prevents any two constrained rows from having valid_time ranges that overlap (&& is the overlaps operator for range types).

WHERE (UPPER(transaction_time) IS NULL OR UPPER(transaction_time) = 'infinity'): This is the most critical and advanced part. The constraint only applies to currently active records. A record is considered active if its transaction time has not been closed (its upper bound is infinity). This allows us to have historical records with overlapping valid_time ranges because they represent different versions of history*. Without this WHERE clause, our bi-temporal model would be impossible to implement, as any correction would violate the constraint.

Indexing Strategy for Performance

Querying this table will be complex. We need indexes to support our access patterns. The EXCLUDE constraint already creates a GiST index, but we need more.

sql

-- The EXCLUDE constraint implicitly creates this index, but let's be explicit
-- for clarity. This is our primary tool for finding the current state for an employee.
CREATE INDEX idx_employee_salaries_current_valid
ON employee_salaries USING GIST (employee_id, valid_time)
WHERE (UPPER(transaction_time) IS NULL OR UPPER(transaction_time) = 'infinity');

-- An index to quickly find all versions of a fact for a given valid time period.
-- Crucial for auditing how our knowledge has changed.
CREATE INDEX idx_employee_salaries_valid_time_history
ON employee_salaries USING GIST (employee_id, valid_time, transaction_time);

-- A standard B-tree index for looking up all records for an employee, ordered by transaction start.
CREATE INDEX idx_employee_salaries_employee_id_transaction_start
ON employee_salaries (employee_id, LOWER(transaction_time));

This set of indexes provides good coverage for the types of queries we'll explore later.

Section 2: Advanced Data Manipulation via PL/pgSQL

Direct INSERT, UPDATE, and DELETE statements on this table are forbidden. All modifications must be managed through stored procedures to correctly maintain the bi-temporal invariants. An UPDATE is not a modification of a row; it is the end of one record's transaction time and the beginning of another's.

Initial Record Creation

This is the simplest operation: creating the first salary record for an employee.

sql

CREATE OR REPLACE FUNCTION create_initial_salary(
    p_employee_id INT,
    p_salary NUMERIC(10, 2),
    p_effective_from TIMESTAMPTZ
) RETURNS BIGINT AS $$
DECLARE
    new_salary_id BIGINT;
BEGIN
    INSERT INTO employee_salaries (employee_id, salary, valid_time, transaction_time)
    VALUES (
        p_employee_id,
        p_salary,
        tstzrange(p_effective_from, 'infinity', '[)'),
        tstzrange(NOW(), 'infinity', '[)')
    )
    RETURNING id INTO new_salary_id;

    RETURN new_salary_id;
END;
$$ LANGUAGE plpgsql;

Handling Standard Updates (Successor Records)

This function handles a standard salary change that takes effect from a certain date. It must find the current active record, end its validity, and insert a new one.

sql

CREATE OR REPLACE FUNCTION update_current_salary(
    p_employee_id INT,
    p_new_salary NUMERIC(10, 2),
    p_effective_from TIMESTAMPTZ
) RETURNS BIGINT AS $$
DECLARE
    current_record employee_salaries%ROWTYPE;
    new_salary_id BIGINT;
    transaction_now TIMESTAMPTZ := NOW();
BEGIN
    -- Find and lock the current active record for this employee that is valid now.
    -- The FOR UPDATE clause is critical to prevent race conditions.
    SELECT * INTO current_record
    FROM employee_salaries
    WHERE employee_id = p_employee_id
      AND UPPER(transaction_time) = 'infinity'
      AND valid_time @> p_effective_from
    FOR UPDATE;

    IF NOT FOUND THEN
        RAISE EXCEPTION 'No current salary record found for employee_id % to update at effective date %', p_employee_id, p_effective_from;
    END IF;

    -- Edge Case: Ensure the new effective date is not in the past relative to the current record's start.
    IF LOWER(current_record.valid_time) >= p_effective_from THEN
        RAISE EXCEPTION 'The new effective date (%) cannot be on or before the start of the current salary period (%). Use historical correction function instead.', p_effective_from, LOWER(current_record.valid_time);
    END IF;

    -- 1. End the old record's transaction time and valid time.
    UPDATE employee_salaries
    SET
        transaction_time = tstzrange(LOWER(transaction_time), transaction_now, '[)'),
        valid_time = tstzrange(LOWER(valid_time), p_effective_from, '[)')
    WHERE id = current_record.id;

    -- 2. Insert the new salary record, which is valid from the effective date onwards.
    INSERT INTO employee_salaries (employee_id, salary, valid_time, transaction_time)
    VALUES (
        p_employee_id,
        p_new_salary,
        tstzrange(p_effective_from, 'infinity', '[)'),
        tstzrange(transaction_now, 'infinity', '[)')
    )
    RETURNING id INTO new_salary_id;

    RETURN new_salary_id;
END;
$$ LANGUAGE plpgsql;

Key Points of update_current_salary:

* FOR UPDATE: This is non-negotiable. It places a row-level lock on the current salary record, preventing another transaction from attempting to update it simultaneously. Without this, you would have a classic race condition leading to data corruption.

* transaction_now variable: We capture NOW() once at the beginning of the function. Using NOW() multiple times within the same transaction is fine as it returns the transaction start time, but explicitly capturing it makes the logic clearer and safer, ensuring the end of the old record and the start of the new one are perfectly contiguous in transaction time.

* Immutability: We never truly UPDATE the salary value. We terminate the old record's time ranges and INSERT a new one.

The Bi-temporal Powerhouse: Correcting History

This is the most complex and powerful operation. We need to correct a historical record. For example, a salary that was thought to be $80k from Jan 1st was actually supposed to be $85k.

sql

CREATE OR REPLACE FUNCTION correct_salary_history(
    p_employee_id INT,
    p_corrected_salary NUMERIC(10, 2),
    p_correction_valid_from TIMESTAMPTZ,
    p_correction_valid_to TIMESTAMPTZ -- Use NULL for 'infinity'
) RETURNS SETOF BIGINT AS $$
DECLARE
    transaction_now TIMESTAMPTZ := NOW();
    correction_valid_range TSTZRANGE;
    affected_record employee_salaries%ROWTYPE;
    new_id BIGINT;
    return_ids BIGINT[] := ARRAY[]::BIGINT[];
BEGIN
    correction_valid_range := tstzrange(p_correction_valid_from, p_correction_valid_to, '[)');

    -- Find all currently active records that overlap with the correction period.
    -- This loop is necessary because a correction might span multiple historical periods.
    FOR affected_record IN
        SELECT * FROM employee_salaries
        WHERE employee_id = p_employee_id
          AND UPPER(transaction_time) = 'infinity'
          AND valid_time && correction_valid_range
        ORDER BY valid_time
        FOR UPDATE
    LOOP
        -- 1. Terminate the transaction time of the incorrect historical record.
        -- We are saying "As of `transaction_now`, this record is no longer what we believe to be true".
        UPDATE employee_salaries
        SET transaction_time = tstzrange(LOWER(transaction_time), transaction_now, '[)')
        WHERE id = affected_record.id;

        -- 2. Re-insert the portions of the old record that are NOT being corrected.
        -- This creates "islands" of truth around the corrected period.

        -- Part before the correction
        IF LOWER(affected_record.valid_time) < LOWER(correction_valid_range) THEN
            INSERT INTO employee_salaries (employee_id, salary, valid_time, transaction_time)
            VALUES (affected_record.employee_id, affected_record.salary,
                    tstzrange(LOWER(affected_record.valid_time), LOWER(correction_valid_range), '[)'),
                    tstzrange(transaction_now, 'infinity', '[)'))
            RETURNING id INTO new_id;
            return_ids := array_append(return_ids, new_id);
        END IF;

        -- Part after the correction
        IF UPPER(affected_record.valid_time) > UPPER(correction_valid_range) THEN
            INSERT INTO employee_salaries (employee_id, salary, valid_time, transaction_time)
            VALUES (affected_record.employee_id, affected_record.salary,
                    tstzrange(UPPER(correction_valid_range), UPPER(affected_record.valid_time), '[)'),
                    tstzrange(transaction_now, 'infinity', '[)'))
            RETURNING id INTO new_id;
            return_ids := array_append(return_ids, new_id);
        END IF;
    END LOOP;

    -- 3. Insert the new, corrected record for the specified validity period.
    INSERT INTO employee_salaries (employee_id, salary, valid_time, transaction_time)
    VALUES (p_employee_id, p_corrected_salary,
            correction_valid_range,
            tstzrange(transaction_now, 'infinity', '[)'))
    RETURNING id INTO new_id;
    return_ids := array_append(return_ids, new_id);
    
    -- Unnest the array of new IDs to return as a set
    RETURN QUERY SELECT unnest(return_ids);

END;
$$ LANGUAGE plpgsql;

This function is dense. It finds every currently-true record that overlaps with the correction period. For each one, it:

Terminates it in transaction time (retires it).

Re-creates the parts of that record's validity period that fall outside the correction window. This is crucial for preserving history accurately.

Finally, after processing all affected historical records, it inserts the single new corrected record.

This preserves the full audit trail. We can see the old incorrect records, when they were retired, and the new corrected records that replaced them.

Section 3: The Four Essential Bi-temporal Queries

Now that we have a robust way to store the data, how do we retrieve it? Querying bi-temporal data requires a shift in thinking. We are always querying based on one or two time dimensions.

Let's populate some data to make our queries concrete.

sql

-- Scenario Setup
-- Employee 101 starts on Jan 1, 2023 with $80k salary
SELECT create_initial_salary(101, 80000, '2023-01-01');

-- On July 1, 2023, they get a raise to $90k
-- We will need to set the transaction time manually for this example to be reproducible
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT update_current_salary(101, 90000, '2023-07-01');
-- Let's pretend this transaction happened on 2023-06-20
UPDATE employee_salaries SET transaction_time = tstzrange('2023-06-20', 'infinity', '[)') WHERE salary = 90000;
UPDATE employee_salaries SET transaction_time = tstzrange(lower(transaction_time), '2023-06-20', '[)') WHERE salary = 80000 AND employee_id = 101;
COMMIT;

-- On Feb 15, 2024, we discover the initial salary was wrong. It should have been $85k from Jan 1 to July 1.
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT correct_salary_history(101, 85000, '2023-01-01', '2023-07-01');
-- Let's pretend this transaction happened on 2024-02-15
UPDATE employee_salaries SET transaction_time = tstzrange('2024-02-15', 'infinity', '[)') WHERE salary = 85000;
UPDATE employee_salaries SET transaction_time = tstzrange(lower(transaction_time), '2024-02-15', '[)') WHERE salary = 80000 AND employee_id = 101;
COMMIT;

Query 1: As-Of Query (What is true now?)

This is the most common query. It asks for the state of the world according to the latest information in the database.

Question: What is the full salary history for employee 101 as we know it today?

sql

SELECT
    employee_id,
    salary,
    valid_time
FROM employee_salaries
WHERE employee_id = 101
  AND UPPER(transaction_time) = 'infinity'
ORDER BY valid_time;

Result:

employee_id	salary	valid_time
101	85000	["2023-01-01 00:00:00+00", "2023-07-01 00:00:00+00")
101	90000	["2023-07-01 00:00:00+00", "infinity")

This shows the corrected history. The incorrect $80,000 record is not present because its transaction_time has been closed.

Query 2: As-Was Query (What did we think was true?)

This is the quintessential audit query. It allows us to travel back in time and see the state of the database itself.

Question: What did we believe was employee 101's salary on January 30, 2024, for the date of March 15, 2023?

sql

SELECT
    employee_id,
    salary,
    valid_time,
    transaction_time
FROM employee_salaries
WHERE employee_id = 101
  -- The state of the database as it was on Jan 30, 2024
  AND transaction_time @> '2024-01-30'::timestamptz
  -- The real-world fact we are interested in (March 15, 2023)
  AND valid_time @> '2023-03-15'::timestamptz;

Result:

employee_id	salary	valid_time	transaction_time
101	80000	["2023-01-01 00:00:00+00", "2023-07-01 00:00:00+00")	["2023-01-01 ...", "2024-02-15 00:00:00+00")

This query correctly shows that on Jan 30, 2024 (before the correction on Feb 15), we believed the salary for that period was $80,000.

Query 3: Evolution of a Fact

This query shows how our knowledge about a specific period in time has changed.

Question: Show me the entire history of what we thought the salary was for March 15, 2023.

sql

SELECT
    salary,
    transaction_time
FROM employee_salaries
WHERE employee_id = 101
  AND valid_time @> '2023-03-15'::timestamptz
ORDER BY LOWER(transaction_time);

Result:

salary	transaction_time
80000	["2023-01-01 ...", "2024-02-15 00:00:00+00")
85000	["2024-02-15 00:00:00+00", "infinity")

This powerful audit trail shows that we initially recorded the salary as $80,000, and this belief was held until Feb 15, 2024, at which point it was corrected to $85,000, which is our current belief.

Query 4: Point-in-Time Reconstruction

This query reconstructs the salary history for an employee as it was known at a specific transaction time.

Question: Show me the full salary history for employee 101 as we knew it on Jan 30, 2024.

sql

WITH history_at_point_in_time AS (
    SELECT
        id,
        employee_id,
        salary,
        valid_time,
        transaction_time
    FROM employee_salaries
    WHERE employee_id = 101
      AND transaction_time @> '2024-01-30'::timestamptz
)
SELECT
    employee_id,
    salary,
    valid_time
FROM history_at_point_in_time
ORDER BY valid_time;

Result:

employee_id	salary	valid_time
101	80000	["2023-01-01 00:00:00+00", "2023-07-01 00:00:00+00")
101	90000	["2023-07-01 00:00:00+00", "infinity")

This correctly reconstructs the history before the correction was made, showing the $80,000 salary as the active record for the first half of the year.

Section 4: Performance, Edge Cases, and Production Hardening

While powerful, this pattern introduces complexity that requires careful management.

Performance and Partitioning

For tables with millions or billions of rows, performance will degrade. The GiST indexes are effective, but they are not as fast as B-tree indexes for all query types. The primary strategy for scaling is partitioning.

PostgreSQL's declarative partitioning is a perfect fit here. You would almost certainly partition the employee_salaries table by employee_id (either HASH or LIST partitioning).

sql

-- Example of partitioning by HASH
CREATE TABLE employee_salaries_partitioned (
    -- ... same columns as before ...
) PARTITION BY HASH (employee_id);

CREATE TABLE employee_salaries_p0 PARTITION OF employee_salaries_partitioned FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE employee_salaries_p1 PARTITION OF employee_salaries_partitioned FOR VALUES WITH (MODULUS 4, REMAINDER 1);
-- etc.

This ensures that any query for a specific employee only ever has to scan one small partition and its corresponding index, dramatically improving performance.

Timezone Management

We deliberately used TIMESTAMPTZ. This is non-negotiable. All timestamps are stored in UTC. Your application server, database connection, and any clients must have their timezones configured correctly. All input timestamps should be unambiguous. The AT TIME ZONE construct in PostgreSQL becomes your best friend for converting user-input local times into correct TIMESTAMPTZ values for querying.

Edge Case: Contiguous Periods

The update_current_salary function assumes a simple succession where one period ends and the next begins. What if there is a gap? For example, an employee leaves and is rehired later. The logic must be adapted to handle this. The update function would fail because it wouldn't find a record where valid_time @> p_effective_from. This is correct behavior; you would need a separate rehire_employee function that simply inserts a new record without trying to supersede a previous one.

Edge Case: Correction at the Boundary

What happens if a correction is made exactly at the boundary of two existing valid time periods? The logic in correct_salary_history is designed to handle this by splitting and re-inserting the non-affected parts. However, this requires rigorous testing. Off-by-one errors with range boundaries are a common source of bugs. The use of inclusive-lower [ and exclusive-upper ) is the standard, robust way to model this and avoid such errors.

Conclusion: A Powerful but Sharp Tool

Bi-temporal data modeling in PostgreSQL using range types and exclusion constraints is a formidable pattern for building systems with high-integrity, auditable data. It moves complex temporal consistency logic from the application layer—where it is difficult to enforce and prone to race conditions—into the database, which is designed to manage data integrity and concurrency.

This approach is not for every problem. It introduces complexity in data manipulation and requires a new way of thinking about querying. However, for financial systems, insurance, healthcare records, or any domain where the distinction between real-world truth and recorded information is paramount, it provides a level of correctness and auditability that is nearly impossible to achieve with traditional versioning schemes. By mastering these advanced PostgreSQL features, senior engineers can build truly robust and trustworthy systems.

Beyond Simple Versioning: The Case for Bi-temporal Data

Section 1: The Production-Grade Bi-temporal Schema

Deconstructing the Schema and Constraints

Indexing Strategy for Performance

Section 2: Advanced Data Manipulation via PL/pgSQL

Initial Record Creation

Handling Standard Updates (Successor Records)

The Bi-temporal Powerhouse: Correcting History

Section 3: The Four Essential Bi-temporal Queries

Query 1: As-Of Query (What is true *now*?)

Query 2: As-Was Query (What did we *think* was true?)

Query 3: Evolution of a Fact

Query 4: Point-in-Time Reconstruction

Section 4: Performance, Edge Cases, and Production Hardening

Performance and Partitioning

Timezone Management

Edge Case: Contiguous Periods

Edge Case: Correction at the Boundary

Conclusion: A Powerful but Sharp Tool

Found this article helpful?

Query 1: As-Of Query (What is true now?)

Query 2: As-Was Query (What did we think was true?)