Bi-temporal Data Modeling in PostgreSQL with Range Types & Triggers
The Challenge: Moving Beyond Uni-temporal History
For senior engineers building systems that require a complete and incorruptible audit trail—think finance, insurance, or healthcare—simple versioning or Slowly Changing Dimension (SCD) Type 2 patterns are often insufficient. These uni-temporal models track the evolution of data over a single time axis, typically when the change was recorded. They can answer "What did this record look like on date X?"
Bi-temporality introduces a second, critical time axis:
valid_period): The time period during which a fact is true in the real world. This is controlled by the business. For example, a product price is valid from January 1st to March 31st.sys_period): The time period during which a fact is stored in the database. This is controlled by the system. It represents the database's knowledge of the world.This distinction is crucial for answering complex historical questions like: "On May 1st, what did we believe the product price was for February 15th?" This query requires navigating both time axes. A backdated price correction made on April 30th would change the answer compared to a query run on April 29th.
Implementing this robustly at the application level is a minefield of race conditions, complex business logic, and potential data corruption. The correct approach is to enforce these temporal constraints directly within the database. PostgreSQL, with its powerful range types, exclusion constraints, and procedural language, provides a uniquely elegant and robust toolkit for this task.
This article will walk through a production-grade implementation of a bi-temporal data model, focusing on the database-level mechanics that guarantee integrity.
The Core Bi-temporal Schema
We'll model a products table where attributes like price and name can change over time. The key is to avoid separate start_date and end_date columns, which are notoriously difficult to query and constrain. Instead, we leverage PostgreSQL's tstzrange (timestamp with time zone range) type.
Table Definition
Our schema will track a product's state using two range columns. We'll also introduce a row_id as a primary key for the physical row and a product_id to identify the logical entity across its history.
-- Ensure the btree_gist extension is available for the exclusion constraint
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE products (
-- A unique identifier for the physical row itself
row_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-- The logical identifier for the product entity across all its versions
product_id UUID NOT NULL,
-- Product attributes that change over time
name TEXT NOT NULL,
price NUMERIC(10, 2) NOT NULL CHECK (price > 0),
-- BI-TEMPORAL COLUMNS
-- 1. Valid Time: When the fact was true in the real world.
valid_period TSTZRANGE NOT NULL,
-- 2. Transaction Time: When the database knew about this fact.
sys_period TSTZRANGE NOT NULL,
-- The magic: A constraint to prevent overlapping valid_period ranges
-- for the same product_id. This is the core integrity guarantee.
CONSTRAINT products_bi_temporal_integrity
EXCLUDE USING GIST (product_id WITH =, valid_period WITH &&)
);
Deconstructing the Schema
* product_id vs. row_id: product_id groups all historical versions of a single product. row_id is the immutable primary key for a specific version record. This distinction is vital for foreign key relationships, which should typically point to the logical product_id.
* tstzrange: This type represents a range of timestamps. A value like '[2023-01-01 00:00:00+00, 2023-04-01 00:00:00+00)' includes the start time but excludes the end time. An unbound (infinite) end is represented by NULL, signifying "until further notice."
* The Exclusion Constraint (EXCLUDE USING GIST): This is the cornerstone of our model's integrity. It's more powerful than a UNIQUE constraint. It asserts that for any two rows in the table, the condition (product_id_A = product_id_B) AND (valid_period_A && valid_period_B) cannot be true. The && operator is the "overlaps" operator for range types. This database-level rule makes it impossible to insert, for instance, two different prices for the same product during the same validity period.
Notice that sys_period is not part of the exclusion constraint. This is intentional. We will have overlapping transaction periods when we correct historical data, as we'll see later.
Indexing for Performance
The exclusion constraint automatically creates a GiST index to enforce itself. However, for querying performance, we need additional, carefully crafted indexes.
-- Index for fast lookups of a product's history
CREATE INDEX idx_products_product_id_sys_period ON products USING GIST (product_id, sys_period);
-- A composite index to accelerate queries that filter by product and a point in valid time.
-- Note the use of GIST because we are indexing a range type.
CREATE INDEX idx_products_product_id_valid_period ON products USING GIST (product_id, valid_period);
B-Tree indexes are not effective for range-based queries like "overlaps" (&&) or "contains" (@>). GiST (Generalized Search Tree) indexes are designed for this and are essential for making bi-temporal queries performant on large datasets.
Automating History with a Trigger Function
Manually managing the sys_period is tedious and error-prone. An UPDATE or DELETE on a bi-temporal table isn't a simple UPDATE or DELETE statement. It's a combination of closing the current record's sys_period and, for updates, inserting a new one.
This is a perfect use case for a PL/pgSQL trigger function.
The Trigger Logic
Our trigger will fire BEFORE UPDATE OR DELETE on the products table. It will intercept the operation and rewrite it to maintain the bi-temporal history.
UPDATE:* Check if the new data is actually different from the old data. If not, do nothing.
* Set the upper bound of the OLD row's sys_period to now() (or transaction_timestamp()), effectively closing it.
* INSERT a NEW row. This new row will have the updated data, the same valid_period as the original update intended, and a new sys_period of [now(), infinity).
* Return NULL to cancel the original UPDATE command, as we've replaced it with our UPDATE (of the old row) and INSERT (of the new row).
DELETE:* This is a "logical delete." We never truly delete data from an audited system.
* Set the upper bound of the OLD row's sys_period to now(), marking it as no longer known to the database.
* Return NULL to cancel the original DELETE command.
The Implementation
Here is the complete, production-ready trigger function and its attachment to the table.
CREATE OR REPLACE FUNCTION bi_temporal_management()
RETURNS TRIGGER AS $$
BEGIN
-- On UPDATE operations
IF (TG_OP = 'UPDATE') THEN
-- Prevent direct modification of the temporal columns
IF (NEW.valid_period <> OLD.valid_period OR NEW.sys_period <> OLD.sys_period) THEN
RAISE EXCEPTION 'Direct modification of temporal periods is not allowed.';
END IF;
-- If nothing meaningful changed, just return the old row and do nothing
-- This prevents creating new history records for no-op updates.
IF (ROW(NEW.product_id, NEW.name, NEW.price) IS NOT DISTINCT FROM ROW(OLD.product_id, OLD.name, OLD.price)) THEN
RETURN OLD;
END IF;
-- End the transaction time of the old row
UPDATE products
SET sys_period = tstzrange(lower(OLD.sys_period), transaction_timestamp(), '[)')
WHERE row_id = OLD.row_id;
-- Insert the new version of the row
INSERT INTO products (product_id, name, price, valid_period, sys_period)
VALUES (NEW.product_id, NEW.name, NEW.price, NEW.valid_period, tstzrange(transaction_timestamp(), NULL, '[)'));
-- Cancel the original UPDATE command
RETURN NULL;
-- On DELETE operations
ELSIF (TG_OP = 'DELETE') THEN
-- Perform a logical delete by ending the transaction time of the row
UPDATE products
SET sys_period = tstzrange(lower(OLD.sys_period), transaction_timestamp(), '[)')
WHERE row_id = OLD.row_id;
-- Cancel the original DELETE command
RETURN NULL;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Attach the trigger to the table
CREATE TRIGGER products_bi_temporal_trigger
BEFORE UPDATE OR DELETE ON products
FOR EACH ROW EXECUTE FUNCTION bi_temporal_management();
This trigger now completely abstracts the complexity of history management. Developers can issue standard INSERT, UPDATE, and DELETE statements, and the database will correctly maintain the bi-temporal audit trail.
Interacting with the Bi-temporal Table: A Practical Walkthrough
Let's see this model in action. We'll simulate a product's lifecycle, including a pricing update and a backdated correction.
1. Initial Product Launch
We launch a new product, "Quantum SSD", with a price of $299.99, valid starting today.
-- Transaction Time: 2024-01-15 10:00:00 UTC
INSERT INTO products (product_id, name, price, valid_period, sys_period)
VALUES (
'a1b2c3d4-e5f6-a7b8-c9d0-e1f2a3b4c5d6',
'Quantum SSD',
299.99,
tstzrange('2024-01-15 00:00:00+00', NULL, '[)'), -- Valid from now until further notice
tstzrange('2024-01-15 10:00:00+00', NULL, '[)') -- Known from now until further notice
);
Table State:
| product_id | name | price | valid_period | sys_period |
|---|---|---|---|---|
| a1b2... | Quantum SSD | 299.99 | [2024-01-15, infinity) | [2024-01-15 10:00, infinity) |
2. A Standard Price Update
On February 1st, we decide to increase the price to $329.99, effective immediately.
-- Transaction Time: 2024-02-01 12:30:00 UTC
UPDATE products
SET price = 329.99
WHERE product_id = 'a1b2c3d4-e5f6-a7b8-c9d0-e1f2a3b4c5d6'
AND upper(sys_period) IS NULL; -- Target the currently active record
Thanks to our trigger, this simple UPDATE results in two changes:
sys_period of the original row is closed.- A new row is inserted with the new price.
Table State:
| product_id | name | price | valid_period | sys_period |
|---|---|---|---|---|
| a1b2... | Quantum SSD | 299.99 | [2024-01-15, infinity) | [2024-01-15 10:00, 2024-02-01 12:30) |
| a1b2... | Quantum SSD | 329.99 | [2024-01-15, infinity) | [2024-02-01 12:30, infinity) |
Note that valid_period remains unchanged. The price change is valid from the product's launch; we've simply updated our knowledge of what that price is.
3. The Edge Case: A Backdated Correction
This is where bi-temporal models truly shine. On March 10th, we discover a data entry error. The price was supposed to be $319.99 from February 15th onwards, not $329.99.
We cannot simply UPDATE the current record. That would wrongly imply the price was $319.99 since January 15th. We need to perform a more surgical operation that reflects this historical correction.
Step 1: Terminate the incorrect future state.
Logically delete the record that contains the incorrect price ($329.99). Our trigger will handle this by closing its sys_period.
-- Transaction Time: 2024-03-10 09:00:00 UTC
DELETE FROM products
WHERE product_id = 'a1b2c3d4-e5f6-a7b8-c9d0-e1f2a3b4c5d6'
AND price = 329.99
AND upper(sys_period) IS NULL;
Step 2: Split the original period and insert the corrected price.
The period from Jan 15th to Feb 15th was correct at $299.99. We need to re-assert this and then insert the new, corrected price for the period from Feb 15th onwards.
-- Transaction Time is still 2024-03-10 09:00:00 UTC
-- Re-insert the correct history for the period BEFORE the correction date.
INSERT INTO products (product_id, name, price, valid_period, sys_period)
VALUES (
'a1b2c3d4-e5f6-a7b8-c9d0-e1f2a3b4c5d6',
'Quantum SSD',
299.99,
-- This was valid from Jan 15th up to (but not including) Feb 15th.
tstzrange('2024-01-15 00:00:00+00', '2024-02-15 00:00:00+00', '[)'),
-- This new fact is known from now on.
tstzrange('2024-03-10 09:00:00+00', NULL, '[)')
);
-- Insert the corrected price for the period AFTER the correction date.
INSERT INTO products (product_id, name, price, valid_period, sys_period)
VALUES (
'a1b2c3d4-e5f6-a7b8-c9d0-e1f2a3b4c5d6',
'Quantum SSD',
319.99, -- The corrected price
-- This is valid from Feb 15th onwards.
tstzrange('2024-02-15 00:00:00+00', NULL, '[)'),
-- This new fact is also known from now on.
tstzrange('2024-03-10 09:00:00+00', NULL, '[)')
);
The exclusion constraint (products_bi_temporal_integrity) ensures these new valid_period ranges do not overlap for the same product_id, guaranteeing data integrity even during complex corrections.
Final Table State:
| product_id | name | price | valid_period | sys_period |
|---|
| a1b2... | Quantum SSD | 299.99 | [2024-01-15, inf) | [2024-01-15 10:00, 2024-02-01 12:30) | Superseded
| a1b2... | Quantum SSD | 329.99 | [2024-01-15, inf) | [2024-02-01 12:30, 2024-03-10 09:00) | Corrected (Logically Deleted)
| a1b2... | Quantum SSD | 299.99 | [2024-01-15, 2024-02-15) | [2024-03-10 09:00, inf) | New Corrected History
| a1b2... | Quantum SSD | 319.99 | [2024-02-15, inf) | [2024-03-10 09:00, inf) | New Corrected History
Our table now contains a complete, auditable history of every state and every change to our knowledge of that state.
The Payoff: Powerful Point-in-Time Querying
With our data structured this way, we can now answer those difficult historical questions with surprising ease using the range containment operator @>.
Query 1: What is the current price? ("As of now, as at now")
This is the simplest query: find the record that is currently valid and currently known.
SELECT price
FROM products
WHERE product_id = 'a1b2c3d4-e5f6-a7b8-c9d0-e1f2a3b4c5d6'
AND valid_period @> now() -- The validity period contains the current time
AND sys_period @> now(); -- The system period contains the current time
-- Result: 319.99
Query 2: What was the price on February 20th? ("As of" query)
This asks for the price that was valid on a specific date, according to our current knowledge.
SELECT price
FROM products
WHERE product_id = 'a1b2c3d4-e5f6-a7b8-c9d0-e1f2a3b4c5d6'
AND valid_period @> '2024-02-20'::timestamptz
AND sys_period @> now();
-- Result: 319.99 (our corrected price)
Query 3: On March 1st, what did we think the price was for February 20th? (Bi-temporal query)
This is the killer query. It travels back on both time axes. We are asking what the database believed (transaction time) about a real-world state (valid time).
SELECT price
FROM products
WHERE product_id = 'a1b2c3d4-e5f6-a7b8-c9d0-e1f2a3b4c5d6'
AND valid_period @> '2024-02-20'::timestamptz -- The state we care about
AND sys_period @> '2024-03-01'::timestamptz; -- The point in time of our knowledge
-- Result: 329.99
This is the correct answer. On March 1st, before the correction on March 10th, the database's active record for that validity period was the one with the price of $329.99.
Conclusion: Database-Enforced Truth
By leveraging PostgreSQL's advanced features—range types, GiST-indexed exclusion constraints, and trigger functions—we have built a bi-temporal data model that is not only powerful but also fundamentally safe. The integrity of the historical data is not left to application-level logic; it is enforced by the database at the transactional level.
This pattern provides complete auditability, enabling systems to answer not just "what is" but also "what was" and "what did we think it was?" While the initial setup is more complex than a standard table, the long-term benefits in data integrity, simplified application logic, and powerful analytical capabilities are immense for any system where history and auditability are paramount.