Production Bi-Temporal Modeling in Postgres with Range Types
The Shortcomings of Conventional History Tracking
As senior engineers, we've all implemented some form of history tracking. The most common approaches are simple audit logs or a Slowly Changing Dimension (SCD) Type 2 pattern, where we add start_date and end_date columns and mark old rows as inactive. While functional for basic auditing, these methods crumble under the weight of complex temporal queries and fail to robustly guarantee data integrity at the database level.
The core problem they fail to solve is the distinction between Valid Time and Transaction Time.
* Valid Time: The time period when a fact is true in the real world. For example, a product's price is $99.99 from January 1st to March 31st. This is a business-level concept.
* Transaction Time: The time period when a fact is stored in the database. This is a system-level concept, recording the state of our knowledge.
A system that tracks both is called a bi-temporal system. It can answer not only "What was the price on February 15th?" but also the far more complex question, "On January 20th, what did we think the price was for February 15th?". This is critical in financial reporting, insurance policy management, and any domain where retroactive changes and accurate historical auditing are legal or business necessities.
Traditional SCD patterns conflate these two time axes, leading to complex application logic, difficult-to-write queries, and data integrity vulnerabilities. This article presents a superior approach using modern PostgreSQL features to build a robust, performant, and declaratively correct bi-temporal model.
Core Schema: Leveraging `TSRANGETZ` for Temporal Axes
Our foundation will be PostgreSQL's powerful range types. Specifically, tsrangetz (timestamp with time zone range) is perfect for representing our two time axes. A range type is more than just two columns; it's a first-class data type with a rich set of operators (@>, &&, etc.) that are crucial for temporal logic and can be indexed efficiently with GIST.
Let's model a product_prices table. This table will store the price for a product, which is valid for a specific period and recorded in our system at a specific time.
-- Ensure the btree_gist extension is available for our exclusion constraint
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE product_prices (
-- A surrogate key for the row itself
price_id BIGSERIAL PRIMARY KEY,
-- The natural key for the entity we are tracking
product_id UUID NOT NULL,
-- The actual data being tracked over time
price NUMERIC(10, 2) NOT NULL CHECK (price > 0),
currency CHAR(3) NOT NULL,
-- BI-TEMPORAL COLUMNS
-- Valid Time: When the price is effective in the real world.
-- We use an inclusive-exclusive range: [start, end)
valid_time TSRANGETZ NOT NULL,
-- Transaction Time: When this record was considered current in the database.
-- An open-ended range signifies the 'current' version.
transaction_time TSRANGETZ NOT NULL,
-- We will add the crucial constraint in the next section
);
Key Design Choices:
* price_id: A surrogate primary key is essential. The natural key (product_id + valid_time) is not unique across transaction time, so we need an immutable identifier for each historical row version.
* tsrangetz: We use timestamptz ranges to eliminate any ambiguity related to timezones. All temporal data should be stored with timezone information.
* Range Bounds: It is a common convention to use inclusive-exclusive bounds ([)) for time ranges to prevent overlapping at the boundaries. [2023-01-01, 2023-02-01) and [2023-02-01, 2023-03-01) are contiguous and non-overlapping.
Declarative Integrity: The Exclusion Constraint
This is the cornerstone of our model's robustness. We need a rule: for any given product, there can be no two currently active price records whose validity periods overlap. A simple UNIQUE constraint cannot handle this logic.
This is a perfect use case for an EXCLUDE constraint, which is a generalization of UNIQUE. It allows you to specify that no two rows can have values that satisfy a certain operator—in our case, the overlap operator (&&).
ALTER TABLE product_prices
ADD CONSTRAINT product_prices_temporal_integrity
EXCLUDE USING gist (
product_id WITH =,
valid_time WITH &&
)
WHERE (transaction_time @> 'infinity'::timestamptz);
Let's dissect this powerful statement:
EXCLUDE USING gist: Exclusion constraints require a GIST or SP-GIST index to efficiently check for conflicts.product_id WITH =: The constraint is partitioned by product_id. Overlaps are only checked for rows with the same product ID. This is analogous to the columns in a UNIQUE constraint.valid_time WITH &&: This is the core logic. It uses the && (overlaps) operator for the tsrangetz type. The constraint will fail if a new or updated row's valid_time overlaps with an existing row's valid_time (for the same product_id).WHERE (transaction_time @> 'infinity'::timestamptz): This is the most critical and subtle part. This partial constraint applies only to rows that are currently active in the database. We define "current" as any row whose transaction_time range extends to infinity. This allows us to have overlapping valid_time ranges in our history (e.g., a corrected price record), but never in the current view of the data. This single clause is what elevates the model to true bi-temporality.Let's test it. First, insert a valid price for a product, effective for the entire year 2024.
INSERT INTO product_prices (
product_id, price, currency, valid_time, transaction_time
)
VALUES (
'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11',
199.99,
'USD',
tsrangetz('2024-01-01 00:00:00Z', '2025-01-01 00:00:00Z', '[)'),
tsrangetz(now(), 'infinity', '[)')
);
This succeeds. Now, let's try to insert another price for the same product that overlaps in valid_time and is also marked as current.
-- THIS WILL FAIL
INSERT INTO product_prices (
product_id, price, currency, valid_time, transaction_time
)
VALUES (
'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', -- Same product_id
249.99,
'USD',
tsrangetz('2024-06-01 00:00:00Z', '2024-09-01 00:00:00Z', '[)'), -- Overlapping valid_time
tsrangetz(now(), 'infinity', '[)') -- Also current
);
-- ERROR: conflicting key value violates exclusion constraint "product_prices_temporal_integrity"
-- DETAIL: Key (product_id, valid_time)=(a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11, ["2024-06-01 00:00:00+00","2024-09-01 00:00:00+00")) conflicts with existing key (product_id, valid_time)=(a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11, ["2024-01-01 00:00:00+00","2025-01-01 00:00:00+00")).
The database itself, not the application, has prevented a state of data ambiguity. This is a massive win for data integrity.
Automating History: The Bi-Temporal Trigger
We cannot trust application logic to correctly manage the transaction_time column. An UPDATE or DELETE on a bi-temporal table is not a direct UPDATE or DELETE operation. Instead, it's a process of "closing out" the current record and inserting a new one.
* On UPDATE: The transaction_time of the current row is updated to end at now(). A new row is inserted with the updated data, with a transaction_time starting at now() and extending to infinity.
* On DELETE: The transaction_time of the current row is simply updated to end at now(). No new row is inserted.
This logic is a perfect candidate for a BEFORE UPDATE OR DELETE trigger.
CREATE OR REPLACE FUNCTION bi_temporal_manage_history()
RETURNS TRIGGER AS $$
BEGIN
-- End the transaction_time of the old/current row
-- We capture the current system time ONCE for consistency
DECLARE
transaction_now timestamptz := now();
BEGIN
UPDATE product_prices
SET transaction_time = tsrangetz(lower(transaction_time), transaction_now, '[)')
WHERE price_id = OLD.price_id;
-- If the operation is an UPDATE, insert a new row representing the new state
IF (TG_OP = 'UPDATE') THEN
-- First, check if the tracked data has actually changed.
-- Avoid creating new history for 'no-op' updates.
IF (NEW.price, NEW.currency) IS DISTINCT FROM (OLD.price, OLD.currency) THEN
INSERT INTO product_prices (
product_id, price, currency, valid_time, transaction_time
) VALUES (
NEW.product_id,
NEW.price,
NEW.currency,
NEW.valid_time,
tsrangetz(transaction_now, 'infinity', '[)')
);
END IF;
END IF;
END;
-- Return NULL to prevent the original UPDATE/DELETE from happening
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER product_prices_history_trigger
BEFORE UPDATE OR DELETE ON product_prices
FOR EACH ROW
EXECUTE FUNCTION bi_temporal_manage_history();
Key Implementation Details:
UPDATE and an optional INSERT within a single transaction, ensuring atomicity.TG_OP: This special variable tells us whether the trigger was fired by an UPDATE or DELETE.IS DISTINCT FROM check is critical. It prevents the creation of redundant historical records if an UPDATE statement is issued but doesn't actually change any of the temporally-tracked values.RETURN NULL: This is the most important part of the trigger. By returning NULL from a BEFORE trigger, we cancel the original operation (UPDATE or DELETE). Our trigger function has already performed the logical equivalent of the operation, so the original one must be suppressed.Let's see it in action. First, we need a current record.
-- Let's start clean
TRUNCATE product_prices;
-- Insert a price for our product effective for Q1 2024
INSERT INTO product_prices (product_id, price, currency, valid_time, transaction_time)
VALUES ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 100.00, 'USD',
tsrangetz('2024-01-01Z', '2024-04-01Z', '[)'),
tsrangetz(now(), 'infinity', '[)'));
Now, let's issue a standard UPDATE statement to change the price.
UPDATE product_prices
SET price = 120.00
WHERE product_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'
AND transaction_time @> 'infinity'::timestamptz;
If we now query the entire table, we see the power of the trigger. The original row is not gone; its transaction_time is now closed, and a new row has been created.
SELECT
price_id,
price,
valid_time,
lower(transaction_time) as tx_start,
upper(transaction_time) as tx_end
FROM product_prices
WHERE product_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'
ORDER BY tx_start;
Result:
| price_id | price | valid_time | tx_start | tx_end |
|---|---|---|---|---|
| 1 | 100.00 | ["2024-01-01Z","2024-04-01Z") | 2023-10-27 10:00:00.000Z | 2023-10-27 10:05:00.000Z |
| 2 | 120.00 | ["2024-01-01Z","2024-04-01Z") | 2023-10-27 10:05:00.000Z | infinity |
Our UPDATE statement was seamlessly translated into a history-preserving operation.
Advanced Querying: Unlocking Temporal Insight
The entire purpose of this complex setup is to enable powerful and precise temporal queries. Let's create a more interesting dataset to demonstrate.
-- On 2024-01-10, we set the price for Feb to be $50
INSERT INTO product_prices (product_id, price, currency, valid_time, transaction_time)
VALUES ('b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12', 50.00, 'CAD',
tsrangetz('2024-02-01Z', '2024-03-01Z', '[)'),
tsrangetz('2024-01-10Z', 'infinity', '[)'));
-- On 2024-01-20, we realize there was a mistake. The price for Feb should have been $55.
-- This is a retroactive correction of valid time.
UPDATE product_prices
SET price = 55.00
WHERE product_id = 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12'
AND valid_time @> '2024-02-15Z'::timestamptz
AND transaction_time @> 'infinity'::timestamptz;
Our table now contains the history of this correction.
Query Pattern 1: Current State ("As-Is")
This is the simplest query: what is the price of the product right now, for a given effective date?
-- What is the price for Feb 15th, according to our current knowledge?
SELECT price, currency
FROM product_prices
WHERE product_id = 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12'
AND valid_time @> '2024-02-15Z'::timestamptz -- The date we care about in the real world
AND transaction_time @> 'infinity'::timestamptz; -- The current state in the DB
Result: 55.00
This is correct. Our most recent update set the price to $55.
Query Pattern 2: Point-in-Time State ("As-Of")
What was the price for a given effective date, according to our current knowledge at a specific point in the past? This is still a relatively simple query.
-- What was the price for Feb 15th, as of Jan 25th?
SELECT price, currency
FROM product_prices
WHERE product_id = 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12'
AND valid_time @> '2024-02-15Z'::timestamptz
AND transaction_time @> '2024-01-25Z'::timestamptz;
Result: 55.00
This is also correct. By Jan 25th, we had already made the correction on Jan 20th.
Query Pattern 3: Historical State ("As-Was")
This is the quintessential bi-temporal query that most systems cannot answer. On a past date, what did we believe the price would be for a future date?
-- On Jan 15th, what did we think the price for Feb 15th was?
SELECT price, currency
FROM product_prices
WHERE product_id = 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12'
AND valid_time @> '2024-02-15Z'::timestamptz -- The effective date we are interested in
AND transaction_time @> '2024-01-15Z'::timestamptz; -- The date of our knowledge
Result: 50.00
This is the magic. On January 15th, the correction had not yet been made. The database correctly reports the state of its knowledge at that specific point in transaction time. This ability to query both time axes independently is immensely powerful.
Production Edge Cases & Performance Tuning
While powerful, this model requires careful handling of edge cases and performance considerations.
1. Performance and Indexing
The GIST index for the exclusion constraint is mandatory. However, it won't be optimal for all query patterns. For our "As-Was" queries, we are filtering by product_id and performing range checks on both valid_time and transaction_time.
A multi-column B-Tree index is highly recommended to accelerate these lookups:
CREATE INDEX idx_product_prices_queries
ON product_prices (product_id, valid_time, transaction_time);
However, a standard B-Tree index on range types only indexes the bounds. For @> (contains) queries, the GIST index is often superior. Let's verify with EXPLAIN ANALYZE.
EXPLAIN ANALYZE
SELECT price FROM product_prices
WHERE product_id = 'b1eebc99-9c0b-4ef8-bb6d-6bb9bd380a12'
AND valid_time @> '2024-02-15Z'::timestamptz
AND transaction_time @> '2024-01-15Z'::timestamptz;
Running this will likely show Postgres choosing the GIST index (product_prices_temporal_integrity). This index can efficiently handle the multi-dimensional filtering on product_id and valid_time. Performance is generally excellent for these lookups, even on very large tables, because the GIST index can narrow down the search space effectively.
2. Handling `valid_time` Modifications
Our trigger handles data changes, but what if the validity period itself needs to change? For example, a promotion that was supposed to run for all of February is now extended to mid-March.
This is not a simple UPDATE. You cannot just UPDATE product_prices SET valid_time = '...'. This would violate the trigger's logic and the historical record. A valid_time change is a logical deletion of the old validity period and an insertion of a new one.
The correct procedure is:
DELETE the current record (which our trigger will turn into a history-preserving update).INSERT a new record with the new valid_time and the same price data.BEGIN;
-- Step 1: End the current record for the February price
DELETE FROM product_prices
WHERE product_id = '...'
AND valid_time = tsrangetz('2024-02-01Z', '2024-03-01Z', '[)')
AND transaction_time @> 'infinity'::timestamptz;
-- Step 2: Insert the new, corrected record with the extended validity
INSERT INTO product_prices (product_id, price, currency, valid_time, transaction_time)
VALUES ('...', 55.00, 'CAD',
tsrangetz('2024-02-01Z', '2024-03-15Z', '[)'),
tsrangetz(now(), 'infinity', '[)'));
COMMIT;
This should be encapsulated in a stored procedure (correct_price_validity(...)) to ensure it's always performed correctly and atomically.
3. Abstraction with Views
Exposing the raw product_prices table to application developers is an anti-pattern. They will be tempted to write simple UPDATE and DELETE queries that, while handled by the trigger, don't account for the complexity of valid_time changes. The schema is complex.
The best practice is to provide a simplified view that only shows the current state of the world.
CREATE OR REPLACE VIEW current_product_prices AS
SELECT
price_id,
product_id,
price,
currency,
valid_time
FROM product_prices
WHERE transaction_time @> 'infinity'::timestamptz;
Developers can now query current_product_prices as if it were a simple, non-temporal table. For writes, you can create an INSTEAD OF trigger on the view to translate simple INSERT, UPDATE, DELETE operations on the view into the more complex, history-preserving operations on the base table. This creates a powerful abstraction layer, hiding the bi-temporal complexity from the main application code.
Conclusion
Implementing a bi-temporal data model is a significant architectural decision. It introduces complexity in the data layer and requires a shift in how developers think about data modification. However, for systems where auditability and point-in-time correctness are non-negotiable, it is an unparalleled solution.
By leveraging PostgreSQL's advanced features—specifically tsrangetz range types, GIST indexing, and partial exclusion constraints—we can move the enforcement of complex temporal business rules from fragile application code directly into the database. This declarative approach provides a level of data integrity and query capability that traditional history-tracking methods simply cannot match. The result is a system that is not only correct today but can prove its correctness for any point in its own history.