PostgreSQL Bi-temporal Modeling with Range Types & Exclusion Constraints
The Flaw in Simple Temporal Models
As systems mature, the requirement to track not just the current state but the history of state becomes paramount. Senior engineers have all encountered the initial, seemingly logical approach: adding valid_from and valid_to columns to a table. This pattern, known as a uni-temporal model, attempts to capture when a fact was true in the real world (its valid time).
However, this model has a critical, often overlooked, flaw: it fails to record when the database knew about the fact. This second dimension of time, the transaction time, is what separates simple versioning from a true bi-temporal model.
A bi-temporal model answers two distinct questions:
1. What was true in the real world during a given period (the valid_time)?
2. What did the database believe to be true at a given moment (the transaction_time)?

The inability to answer the second question makes auditing, reproducing historical reports, and debugging data corruption nearly impossible. The naive valid_from/valid_to approach also forces complex, bug-prone application logic to manage temporal integrity—preventing overlapping or gapped time periods. This article presents a superior, database-centric solution leveraging advanced PostgreSQL features.
Our thesis is that by combining tstzrange types with GIST-powered EXCLUDE constraints, we can build a bi-temporal model where the database itself declaratively enforces temporal consistency, dramatically simplifying application code and increasing data integrity.
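Before diving into the schema, here is a brief preview of the two range operators everything below hinges on. These are plain SQL expressions you can try in any psql session:

```sql
-- Two ranges overlap (&&) when they share at least one instant
SELECT tstzrange('2023-01-01', '2023-06-01') && tstzrange('2023-05-01', '2023-09-01');  -- true

-- A range contains (@>) a timestamp that falls inside its bounds
SELECT tstzrange('2023-01-01', '2023-06-01') @> '2023-03-15 12:00:00+00'::timestamptz;  -- true

-- Adjacent ranges with the default [) bounds do NOT overlap:
-- the first ends exactly where the second begins
SELECT tstzrange('2023-01-01', '2023-06-01') && tstzrange('2023-06-01', '2023-09-01');  -- false
```

The adjacency behavior in the last expression is exactly what makes `[)` the natural convention for contiguous time periods.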
Core Implementation: The Bi-temporal Schema
We will model a common business scenario: tracking the price of a product over time. A price is valid for a certain period, and we need to track when we recorded that price information.
Let's define our two temporal dimensions:
- valid_time: A tstzrange (timestamp with time zone range) representing the interval during which the price is effective in the real world. An open-ended range signifies the price is valid indefinitely into the future.
- transaction_time: A tstzrange representing the interval during which this version of the fact was present in our database. An open-ended range signifies this is the current, active version of the fact.

First, ensure the necessary extension is enabled:
CREATE EXTENSION IF NOT EXISTS btree_gist;
Now, let's define the table. We'll use a bigserial as a surrogate primary key for internal references, but the logical primary key is a combination of the product's identity and its temporal validity.
-- Filename: 01_create_product_prices_table.sql
CREATE TABLE product_prices (
-- Surrogate key for stable foreign key references
id bigserial PRIMARY KEY,
-- Natural key component
product_id uuid NOT NULL,
-- The actual data being tracked
price numeric(10, 2) NOT NULL CHECK (price > 0),
-- BI-TEMPORAL AXES
-- When the price is valid in the real world
valid_time tstzrange NOT NULL,
-- When this fact was recorded and active in the database
transaction_time tstzrange NOT NULL,
-- Constraint to ensure no two valid_time ranges overlap for the same product
-- for the currently active transaction records.
CONSTRAINT no_overlapping_valid_time_for_current_transactions
EXCLUDE USING gist (product_id WITH =, valid_time WITH &&)
WHERE (transaction_time @> 'infinity'::timestamptz)
);
-- Index for efficient "As-Of" queries (current state at a point in time)
CREATE INDEX idx_product_prices_current_valid_time
ON product_prices USING gist (product_id, valid_time)
WHERE (transaction_time @> 'infinity'::timestamptz);
-- Index for efficient "As-Was" queries (historical state)
CREATE INDEX idx_product_prices_historical_transaction_time
ON product_prices USING gist (product_id, transaction_time, valid_time);
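Before moving on, a quick sanity check of the exclusion constraint, using a hypothetical throwaway product id. Note the active rows use an unbounded (NULL) upper bound on transaction_time so the constraint's WHERE predicate matches them:

```sql
-- First active price row: accepted
INSERT INTO product_prices (product_id, price, valid_time, transaction_time)
VALUES (
    '00000000-0000-0000-0000-000000000001',
    10.00,
    tstzrange('2024-01-01 00:00:00 UTC', 'infinity'),
    tstzrange(now(), NULL)  -- unbounded upper = currently active
);

-- Second active row with an overlapping valid_time for the same product:
-- rejected with "conflicting key value violates exclusion constraint"
INSERT INTO product_prices (product_id, price, valid_time, transaction_time)
VALUES (
    '00000000-0000-0000-0000-000000000001',
    12.00,
    tstzrange('2024-06-01 00:00:00 UTC', 'infinity'),
    tstzrange(now(), NULL)
);
```

The database, not the application, is now the arbiter of temporal consistency.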
Deconstructing the Schema
- tstzrange: This is the cornerstone. A native range type is far superior to two separate timestamptz columns. It understands concepts like overlap (&&), containment (@>), and adjacency, allowing for more intuitive queries and constraints.
- btree_gist Extension: This is required to allow a non-GIST-native type like uuid to be included in a GIST-based EXCLUDE constraint.
- EXCLUDE Constraint: This is the magic. It's a generalization of a UNIQUE constraint. Instead of checking for equality, it checks a specified operator—in our case, && (overlaps). This constraint declaratively tells PostgreSQL: "Do not allow any two rows where the product_id is the same (=) AND their valid_time ranges overlap (&&)".
- WHERE Clause on the Constraint: This is a critical, advanced detail. We only want to enforce the non-overlap rule for currently active records. Historical records (whose transaction_time is closed) may legitimately have overlapping valid_time ranges, because they represent different versions of history. The condition transaction_time @> 'infinity'::timestamptz isolates the constraint to rows that are considered the present truth. One subtlety: this predicate only matches ranges whose upper bound is unbounded (NULL); a range built with an explicit 'infinity' upper bound and the default exclusive bound does not contain the value 'infinity'.
- The Two Indexes: The first is a partial GIST index mirroring the EXCLUDE constraint, optimized for finding the current price at a specific point in time. The second is a comprehensive index for full bi-temporal queries, allowing the planner to efficiently search across both time dimensions.

Temporal Operations: Querying Time
With this schema, complex temporal queries become surprisingly declarative. We use the range containment operator @> to find the version of a record that was active at a specific point in time.
Let's populate some data. We'll assume a product a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d has an initial price.
INSERT INTO product_prices (product_id, price, valid_time, transaction_time)
VALUES (
'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d',
100.00,
-- Valid from Jan 1, 2023 onwards
tstzrange('2023-01-01 00:00:00 UTC', 'infinity'),
-- This record is current as of now. The upper bound must be NULL (unbounded):
-- tstzrange(now(), 'infinity') would NOT satisfy the @> 'infinity' predicate,
-- because the default [) bounds exclude the upper value.
tstzrange(now(), NULL)
);
The "As-Of" Query: Current State at a Point in Time
Question: What was the price for our product on June 15th, 2023, according to our current knowledge?
-- Filename: 02_as_of_query.sql
SELECT product_id, price
FROM product_prices
WHERE
product_id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d'
-- The valid_time range must contain our target date
AND valid_time @> '2023-06-15 00:00:00 UTC'::timestamptz
-- We only want the currently active version of this fact
AND transaction_time @> 'infinity'::timestamptz;
This query is efficient because it will use our partial GIST index idx_product_prices_current_valid_time.
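On a table of realistic size, you can confirm index usage by prefixing the as-of query with EXPLAIN (ANALYZE). On a near-empty development table the planner may legitimately choose a Seq Scan, so don't be alarmed during early testing:

```sql
-- Inspect the plan: on a large table, expect a Bitmap Index Scan
-- on idx_product_prices_current_valid_time
EXPLAIN (ANALYZE, BUFFERS)
SELECT product_id, price
FROM product_prices
WHERE product_id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d'
  AND valid_time @> '2023-06-15 00:00:00 UTC'::timestamptz
  AND transaction_time @> 'infinity'::timestamptz;
```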
The "As-Was" Query: A Look into the Past
This is the query that uni-temporal models cannot answer.
Scenario: Let's say on November 1st, 2023, we realize the price from January 1st was actually wrong. It should have been $95.00. We perform a correction (we'll see how to do this in the next section).
After the correction, the database contains both the old, incorrect record (with a closed transaction_time) and the new, correct record (with an open transaction_time).
Question: An auditor asks, "On October 30th, 2023, what would your system have reported as the price for January 15th, 2023?"
-- Filename: 03_as_was_query.sql
SELECT product_id, price
FROM product_prices
WHERE
product_id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d'
-- The valid_time range must contain the real-world date of interest
AND valid_time @> '2023-01-15 00:00:00 UTC'::timestamptz
-- The transaction_time range must contain the date we are querying from the past
AND transaction_time @> '2023-10-30 00:00:00 UTC'::timestamptz;
This query will correctly return $100.00, because on Oct 30th, that was the fact recorded in our system. The full GIST index idx_product_prices_historical_transaction_time is crucial for making this query performant on large datasets.
Handling Updates and Corrections: The Atomic Function
In a bi-temporal model, you almost never perform a SQL UPDATE or DELETE. These operations destroy history. Instead, every change is an INSERT of new versions and the closing of old transaction periods. This logic is complex and must be atomic. The best way to manage this in PostgreSQL is a well-designed PL/pgSQL function.
Scenario 1: A Standard Price Update
On December 1st, 2023, the price for our product will increase to $125.00.
Here's the logic we need to implement atomically:
1. Find the currently active record for the product (its transaction_time is open-ended/infinite).
2. Close that record's transaction_time by setting the upper bound to now().
3. Insert a new version of the old record, with its valid_time truncated to end at the change date (2023-12-01). The new record's transaction_time starts from now().
4. Insert the new price record, with its valid_time starting from the change date. Its transaction_time also starts from now().

This is too complex to do reliably from application code. Let's encapsulate it.
-- Filename: 04_bitemporal_update_function.sql
CREATE OR REPLACE FUNCTION update_product_price(
p_product_id uuid,
p_new_price numeric(10, 2),
p_effective_from timestamptz
)
RETURNS void LANGUAGE plpgsql AS $$
DECLARE
current_record product_prices;
transaction_ts timestamptz := now();
BEGIN
-- Use a FOR UPDATE clause to lock the row and prevent race conditions
SELECT * INTO current_record
FROM product_prices
WHERE product_id = p_product_id
AND transaction_time @> 'infinity'::timestamptz
AND valid_time @> p_effective_from
FOR UPDATE;
IF NOT FOUND THEN
RAISE EXCEPTION 'No current price record found for product % at effective date %',
p_product_id, p_effective_from;
END IF;
-- 1. Close the transaction time of the current record
UPDATE product_prices
SET transaction_time = tstzrange(lower(transaction_time), transaction_ts, '[)')
WHERE id = current_record.id;
-- 2. Re-insert the old record with its valid_time truncated
-- This preserves the history that, until transaction_ts, we believed
-- the old price was valid for its full original range.
IF lower(current_record.valid_time) < p_effective_from THEN
INSERT INTO product_prices (
product_id, price, valid_time, transaction_time
)
VALUES (
current_record.product_id,
current_record.price,
tstzrange(lower(current_record.valid_time), p_effective_from, '[)'),
tstzrange(transaction_ts, NULL)
);
END IF;
-- 3. Insert the new price record, valid from the effective date
INSERT INTO product_prices (
product_id, price, valid_time, transaction_time
)
VALUES (
p_product_id,
p_new_price,
tstzrange(p_effective_from, upper(current_record.valid_time), '[)'),
tstzrange(transaction_ts, NULL)
);
END;
$$;
Usage:
-- Execute the price change
SELECT update_product_price(
'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d',
125.00,
'2023-12-01 00:00:00 UTC'
);
After running this, an "As-Of" query for a date before Dec 1st will show the old price, and a query for a date after will show the new price. Crucially, the history of this change is now permanently recorded.
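A quick verification, reusing the as-of pattern from earlier. Assuming the sample data and the update above, probing a few weeks either side of the effective date should return the old and new price, respectively:

```sql
-- Before the change takes effect: expect 100.00
SELECT price FROM product_prices
WHERE product_id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d'
  AND valid_time @> '2023-11-15 00:00:00 UTC'::timestamptz
  AND transaction_time @> 'infinity'::timestamptz;

-- After the change takes effect: expect 125.00
SELECT price FROM product_prices
WHERE product_id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d'
  AND valid_time @> '2023-12-15 00:00:00 UTC'::timestamptz
  AND transaction_time @> 'infinity'::timestamptz;
```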
Scenario 2: A Historical Correction (Back-dating)
This is where bi-temporal models truly shine. Let's implement the correction we discussed earlier: on Nov 1st, 2023, we realize the price that started on Jan 1st, 2023 should have been $95.00, not $100.00.
This is not a simple update; it's a correction of a fact that was believed to be true. The logic is different:
1. Identify the active record(s) whose valid_time overlaps with the correction period.
2. Close the transaction_time for all these incorrect historical records.
3. Insert the corrected record (and any surviving portions of the old ones) with a transaction_time starting from now().

This logic can get extremely complex if a correction period spans multiple existing valid periods. A robust function would need to handle splitting, merging, and replacing multiple records. Here is a simplified function for correcting a single period:
-- Filename: 05_bitemporal_correction_function.sql
CREATE OR REPLACE FUNCTION correct_historical_product_price(
p_product_id uuid,
p_corrected_price numeric(10, 2),
p_correction_valid_time tstzrange
)
RETURNS void LANGUAGE plpgsql AS $$
DECLARE
record_to_correct product_prices;
transaction_ts timestamptz := now();
BEGIN
-- Find the active record that fully contains the correction period.
-- A production system would need to handle multiple overlapping records.
SELECT * INTO record_to_correct
FROM product_prices
WHERE product_id = p_product_id
AND transaction_time @> 'infinity'::timestamptz
AND valid_time @> p_correction_valid_time
FOR UPDATE;
IF NOT FOUND THEN
RAISE EXCEPTION 'No single active record fully contains the correction period for product %.', p_product_id;
END IF;
-- 1. Close the transaction for the incorrect record
UPDATE product_prices
SET transaction_time = tstzrange(lower(transaction_time), transaction_ts, '[)')
WHERE id = record_to_correct.id;
-- 2. Re-insert the portion of the old record before the correction (if any)
IF lower(record_to_correct.valid_time) < lower(p_correction_valid_time) THEN
INSERT INTO product_prices (product_id, price, valid_time, transaction_time)
VALUES (
record_to_correct.product_id, record_to_correct.price,
tstzrange(lower(record_to_correct.valid_time), lower(p_correction_valid_time), '[)'),
tstzrange(transaction_ts, NULL)
);
END IF;
-- 3. Insert the new, corrected record for the specified valid time
INSERT INTO product_prices (product_id, price, valid_time, transaction_time)
VALUES (
p_product_id, p_corrected_price,
p_correction_valid_time,
tstzrange(transaction_ts, NULL)
);
-- 4. Re-insert the portion of the old record after the correction (if any)
IF upper(record_to_correct.valid_time) > upper(p_correction_valid_time) THEN
INSERT INTO product_prices (product_id, price, valid_time, transaction_time)
VALUES (
record_to_correct.product_id, record_to_correct.price,
tstzrange(upper(p_correction_valid_time), upper(record_to_correct.valid_time), '[)'),
tstzrange(transaction_ts, NULL)
);
END IF;
END;
$$;
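For completeness, invoking the simplified correction function might look like this. Note that if the December price change above has already been applied, the active record valid from January 1st now ends at December 1st, so the correction period must fit inside that range:

```sql
-- Correct the Jan 1 - Dec 1, 2023 price from 100.00 to 95.00
SELECT correct_historical_product_price(
    'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d',
    95.00,
    tstzrange('2023-01-01 00:00:00 UTC', '2023-12-01 00:00:00 UTC')
);
```

After this call, an "As-Was" query dated before the correction still reports 100.00, while current as-of queries for that period report 95.00.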
Performance Considerations and Edge Cases
While powerful, this model requires careful management.
Performance
- EXPLAIN ANALYZE is Your Best Friend: Always verify your query plans. A Bitmap Heap Scan on a GIST index is expected. If you see a Seq Scan, your query is not using the index effectively.
- Partitioning: For very large datasets, consider partitioning product_prices by LIST (product_id) or RANGE (lower(valid_time)). This keeps indexes smaller and allows older, rarely-accessed partitions to be moved to slower, cheaper storage. Be aware that support for exclusion constraints on partitioned tables is limited, so validate this approach on your PostgreSQL version.
- Timezones: Prefer tstzrange over tsrange. Store all data in UTC and handle timezone conversions at the application/presentation layer. Inconsistent timezone handling is a primary source of bugs in temporal systems.
- Range Bounds: Be explicit about bound inclusivity ([) means inclusive-exclusive). The default [) is usually correct for time, representing a moment up to, but not including, the next. Consistency is key.
- Deletion: "Deleting" a product's price is never a DELETE FROM product_prices. It is a temporal update that closes the valid_time of its current price record. This preserves the fact that the product existed and had a price up to that point.

Conclusion
Implementing a bi-temporal data model is a significant architectural decision. The naive approach of using simple start and end date columns offloads immense complexity and responsibility onto the application layer, often resulting in subtle bugs, data integrity violations, and an inability to perform critical historical analysis.
By leveraging PostgreSQL's advanced, native features like tstzrange and EXCLUDE constraints, we can delegate the enforcement of temporal logic to the database itself. This declarative approach provides a robust, performant, and logically sound foundation. While the initial setup and the mutation logic encapsulated in database functions are more complex, the long-term benefits are profound: guaranteed data integrity, simplified application-level queries, and the powerful ability to query the state of your system not just "as-of" now, but "as-was" at any point in its history.