CockroachDB Geo-Partitioning for GDPR and Low-Latency APIs
The Multi-Region Conundrum: Latency vs. Compliance
In a globally distributed system, the laws of physics and the laws of nations are often in direct conflict. Your users in Sydney expect sub-50ms API response times, which necessitates infrastructure in the APAC region. Simultaneously, your users in Frankfurt are protected by the General Data Protection Regulation (GDPR), which mandates that their personal data remain within EU data centers. A naive, single-region deployment in us-east-1 is a non-starter: it delivers poor performance to a global user base and violates EU data-protection law.
The challenge is to build a single, logical database that can physically constrain data to specific geographies while still allowing for efficient cross-geo queries when necessary. This is where traditional database replication strategies fall short. Active-passive failover doesn't solve the latency problem for your active region's distant users, and multi-primary replication often runs afoul of data residency requirements by placing a full copy of the data in every region.
This is the precise problem space that CockroachDB's geo-partitioning features are designed to solve. This article is not an introduction; it is a deep dive for architects and senior engineers who already understand the fundamentals of distributed SQL. We will focus on the nuanced implementation details, production-ready schema patterns, query optimization strategies, and the operational edge cases you will inevitably face when deploying a geo-partitioned database at scale.
Our working scenario will be a multi-tenant SaaS application with a global user base spread across three primary regions: Europe (eu-west-1), North America (us-east-1), and Asia-Pacific (ap-southeast-1). Our goals are:
* Low latency: requests from users in each region are served by application and database nodes in that same region.
* GDPR compliance: personal data belonging to EU tenants must physically reside in, and never be replicated outside of, the eu-west-1 region.
* A single logical database: one schema and one SQL interface, with no per-region silos to operate.
Deep Dive: Geo-Partitioning Architecture Internals
Before we design our schema, we must understand the primitives CockroachDB provides. Geo-partitioning is built upon the database's core key-value storage layer, where data is sharded into ranges. Partitioning provides a mechanism to instruct the cluster's leaseholders—the replicas responsible for coordinating reads and writes for a given range—to remain on nodes with specific locality tags.
First, you must start your CockroachDB nodes with locality flags. For example, a node in an AWS Ireland data center would be started like this:
cockroach start --locality=region=eu-west-1,zone=eu-west-1a ...
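Once a node is running, you can confirm the locality your SQL session's gateway node reports with a built-in statement:
-- Returns the --locality flags of the node this session is connected to
SHOW LOCALITY;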
This locality metadata is fundamental. The next step is to define these regions at the database level:
-- This command makes the database "region-aware"
ALTER DATABASE my_saas SET PRIMARY REGION "us-east-1";
ALTER DATABASE my_saas ADD REGION "eu-west-1";
ALTER DATABASE my_saas ADD REGION "ap-southeast-1";
The PRIMARY REGION is significant. It's the default region for data that isn't explicitly partitioned and for certain database-level metadata. For a new tenant signing up without a specified region, their data would land here.
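You can sanity-check the region configuration at any time:
SHOW REGIONS FROM DATABASE my_saas;
-- Lists each region attached to the database and flags which one is
-- primary (the exact output shape is version-dependent)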
Partitioning Strategies: `PARTITION BY LIST` vs. `REGIONAL BY ROW`
CockroachDB offers two primary models for geo-partitioning data:
1. PARTITION BY LIST (explicit partitioning): This gives you fine-grained control. You define partitions and explicitly map column values to them. It is the most powerful and flexible option, especially for complex compliance scenarios.
2. REGIONAL BY ROW: This is a more automated approach. CockroachDB adds a hidden crdb_region column to your table and automatically creates and manages partitions based on its values. It is simpler to set up but offers less control than explicit partitioning.
We will focus primarily on PARTITION BY LIST, as it provides the explicitness required for strict GDPR compliance and complex multi-tenant architectures. For contrast, the sketch below shows what the automated approach looks like.
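A minimal sketch of the REGIONAL BY ROW alternative; the table is hypothetical and not used in the rest of this article. It requires the multi-region database configured above:
-- CockroachDB adds a hidden crdb_region column and manages the
-- partitions itself; rows can be re-homed by updating that column
CREATE TABLE sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL
) LOCALITY REGIONAL BY ROW;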
Production Implementation: A Multi-Tenant SaaS Schema
Let's design the core tables for our SaaS application. We'll have tenants, users, and documents.
Step 1: Defining the Regions and Types
First, we create an ENUM to represent our business regions. This adds a layer of application-level validation and makes the schema more readable.
-- Define our supported regions as a type for data integrity
CREATE TYPE region AS ENUM ('eu', 'us', 'apac');
-- The physical regions were already added at the database level above; the
-- enum values ('eu', 'us', 'apac') are mapped to them via the per-partition
-- zone configurations below. PLACEMENT RESTRICTED additionally forbids
-- replica placement outside a row's home region for regional tables:
ALTER DATABASE my_saas PLACEMENT RESTRICTED;
Step 2: The `tenants` Table
This table will store information about each customer organization. The region column is the partition key.
CREATE TABLE tenants (
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    name STRING NOT NULL,
    region region NOT NULL,
    plan_id UUID, -- references the global plans table introduced later
    created_at TIMESTAMPTZ DEFAULT now(),
    PRIMARY KEY (region, id), -- the partition column must prefix the primary key
    UNIQUE (id) -- lets child tables reference tenants(id) without the region column
)
PARTITION BY LIST (region) (
    PARTITION eu VALUES IN ('eu'),
    PARTITION us VALUES IN ('us'),
    PARTITION apac VALUES IN ('apac')
);
-- Pin each logical partition to a physical region
ALTER PARTITION eu OF TABLE tenants CONFIGURE ZONE USING
constraints = '[+region=eu-west-1]';
ALTER PARTITION us OF TABLE tenants CONFIGURE ZONE USING
constraints = '[+region=us-east-1]';
ALTER PARTITION apac OF TABLE tenants CONFIGURE ZONE USING
constraints = '[+region=ap-southeast-1]';
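You can confirm that each partition picked up its values and zone constraints:
SHOW PARTITIONS FROM TABLE tenants;
-- One row per partition (eu, us, apac), each listing its VALUES IN
-- clause and the constraints applied via CONFIGURE ZONE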
Here's what we've done:
* PARTITION BY LIST (region) tells CockroachDB to shard this table based on the region column's value.
* Three partitions are defined, one per business region: eu, us, and apac.
* The ALTER PARTITION ... CONFIGURE ZONE commands are the critical link. They instruct CockroachDB that all data (and its leaseholders) for the eu partition must reside on nodes with the +region=eu-west-1 locality tag.
Step 3: The `users` and `documents` Tables
Users belong to tenants, and documents belong to users. A common and highly effective pattern is to cascade the partitioning key down through the relationships.
CREATE TABLE users (
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    email STRING NOT NULL,
    region region NOT NULL, -- Denormalized for partitioning and query optimization
    created_at TIMESTAMPTZ DEFAULT now(),
    PRIMARY KEY (region, id), -- Partition column first, as in tenants
    UNIQUE (id), -- Referenced by documents(author_id)
    UNIQUE (tenant_id, email),
    INDEX (region, tenant_id) -- Secondary indexes also lead with the partition key
)
PARTITION BY LIST (region) (
    PARTITION eu VALUES IN ('eu'),
    PARTITION us VALUES IN ('us'),
    PARTITION apac VALUES IN ('apac')
);
-- Pin partitions to regions, same as for tenants
ALTER PARTITION eu OF TABLE users CONFIGURE ZONE USING constraints = '[+region=eu-west-1]';
ALTER PARTITION us OF TABLE users CONFIGURE ZONE USING constraints = '[+region=us-east-1]';
ALTER PARTITION apac OF TABLE users CONFIGURE ZONE USING constraints = '[+region=ap-southeast-1]';
CREATE TABLE documents (
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    author_id UUID NOT NULL REFERENCES users(id),
    tenant_id UUID NOT NULL, -- Denormalized for partitioning
    region region NOT NULL, -- Denormalized for partitioning
    title STRING NOT NULL,
    content JSONB,
    created_at TIMESTAMPTZ DEFAULT now(),
    PRIMARY KEY (region, id),
    INDEX (region, tenant_id, created_at DESC)
)
PARTITION BY LIST (region) (
    PARTITION eu VALUES IN ('eu'),
    PARTITION us VALUES IN ('us'),
    PARTITION apac VALUES IN ('apac')
);
-- Pin partitions to regions
ALTER PARTITION eu OF TABLE documents CONFIGURE ZONE USING constraints = '[+region=eu-west-1]';
ALTER PARTITION us OF TABLE documents CONFIGURE ZONE USING constraints = '[+region=us-east-1]';
ALTER PARTITION apac OF TABLE documents CONFIGURE ZONE USING constraints = '[+region=ap-southeast-1]';
Key Architectural Decision: We are denormalizing the region column into users and documents. While this violates strict normalization principles, it is essential for performance in a distributed system. It allows the query planner to determine which partition to target without performing a join against the tenants table first. The cost of a small amount of redundant storage is negligible compared to the cost of a cross-region join.
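One way to keep the denormalized column trustworthy is to copy it from the parent row at write time. A minimal sketch, assuming the tenant's region is already known to the request (the email and the '...' id are placeholders):
-- Copy the tenant's region into the new user row so the denormalized
-- value cannot drift from the parent
INSERT INTO users (tenant_id, email, region)
SELECT t.id, 'new.user@example.com', t.region
FROM tenants AS t
WHERE t.region = 'eu' AND t.id = '...';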
Query Optimization Patterns for Geo-Distributed Data
With the schema in place, writing performant queries is paramount. A query that is fast in a single-node Postgres instance can be disastrously slow in a geo-distributed CockroachDB cluster if it's not partition-aware.
Our application servers should be deployed in each region (eu-west-1, us-east-1, ap-southeast-1), following the "Follow-the-Workload" pattern. An API request from a user in Europe should be handled by an application server in eu-west-1. This server will then include the tenant's region in its queries.
The Anti-Pattern: Partition-Unaware Queries
Consider a request to fetch a user by their ID, originating from the eu-west-1 application server. A naive query might look like this:
-- ANTI-PATTERN: This query does not include the partition key.
SELECT * FROM users WHERE id = '...';
Let's analyze what happens. The gateway node that receives this query has no idea where the user with this ID lives. It must perform a scatter-gather operation: it sends a request to at least one node in every partition (eu, us, and apac) to find the data. This involves multiple cross-continent network round trips.
EXPLAIN ANALYZE SELECT * FROM users WHERE id = '...';
-- Simplified output will show:
-- • Spans: /eu/users/..., /us/users/..., /apac/users/...
-- • Latency: >200ms (due to cross-region hops)
The Optimized Pattern: Partition-Aware Queries
The application layer must have access to the current user's or tenant's region. This is typically stored in a JWT, session, or fetched from a local cache at the start of a request. The optimized query includes the partition key in the WHERE clause.
-- OPTIMIZED: The query planner knows exactly where to go.
SELECT * FROM users WHERE region = 'eu' AND id = '...';
When the gateway node in eu-west-1 receives this query, the planner sees region = 'eu'. It knows this maps to the eu partition, which is pinned to the eu-west-1 region. The query is routed directly to a local node, avoiding any cross-continent network latency.
EXPLAIN ANALYZE SELECT * FROM users WHERE region = 'eu' AND id = '...';
-- Simplified output will show:
-- • Spans: /eu/users/... (ONLY ONE PARTITION SCANNED)
-- • Latency: <10ms (a local read)
This difference between >200ms and <10ms is the entire business case for geo-partitioning. Your application logic must be designed to leverage it.
Advanced Scenario: Cross-Partition Joins
What if an admin needs to run a report joining a partitioned table with a non-partitioned, global table? Let's introduce a plans table.
-- A GLOBAL table: replicated to all regions for fast local reads.
CREATE TABLE plans (
    id UUID PRIMARY KEY,
    name STRING NOT NULL
) LOCALITY GLOBAL;
The GLOBAL locality tells CockroachDB to place replicas of plans in every region and to serve consistent reads locally from any of them. Writes, however, are slower, as they must coordinate across all regions. (Approximating this with zone configs and lease_preferences does not achieve the same effect: each range has exactly one leaseholder, so at most one region at a time gets local strongly consistent reads.)
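A quick way to confirm the table's locality:
SHOW CREATE TABLE plans;
-- The reported DDL includes the LOCALITY GLOBAL clause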
Now, a query to get all tenants in the EU and their plan names:
SELECT t.name, p.name
FROM tenants AS t
JOIN plans AS p ON t.plan_id = p.id
WHERE t.region = 'eu';
This query is efficient. The planner will:
1. Use the t.region = 'eu' predicate to perform a fast, local scan on the tenants table's eu partition.
2. Join against the plans table. Because plans is a GLOBAL table, its rows are readable from local replicas in eu-west-1, so the join is also a fast, local read.
An inefficient join, by contrast, would join two different partitioned tables on a key that is not the partition key, forcing a distributed join and massive data shuffling across regions.
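If you want to verify the plan for the efficient query above, EXPLAIN makes the partition pruning visible; the annotation below is illustrative:
EXPLAIN SELECT t.name, p.name
FROM tenants AS t
JOIN plans AS p ON t.plan_id = p.id
WHERE t.region = 'eu';
-- Expect a scan whose spans cover only the eu partition of tenants,
-- feeding a join against the locally readable plans table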
Edge Cases and Failure Scenarios in Production
Real-world systems are messy. Here's how to handle the inevitable complexities.
Edge Case 1: Moving a Tenant Between Regions
A US-based company is acquired by an EU-based one. You are now legally required to migrate all of their data from the us partition to the eu partition.
This is a multi-step, delicate operation that must be performed with zero downtime.
1. Update the region value on all of the tenant's rows in a single transaction. This transaction will be expensive, as changing the partition key is effectively a delete in one region and an insert in another.
BEGIN;
UPDATE tenants SET region = 'eu' WHERE id = '...' AND region = 'us';
UPDATE users SET region = 'eu' WHERE tenant_id = '...' AND region = 'us';
UPDATE documents SET region = 'eu' WHERE tenant_id = '...' AND region = 'us';
COMMIT;
2. Once the region values are updated, CockroachDB's replication layer handles the rest automatically. The data for the updated rows is moved from ranges associated with the us partition to ranges associated with the eu partition. This rebalancing happens in the background and is transparent to the application; you can monitor its progress in the DB Console or with SHOW RANGES FROM TABLE users.
3. When rebalancing completes, all of the tenant's data is pinned to eu-west-1 and served locally.
Edge Case 2: The High Cost of Cross-Region Transactions
Imagine a feature where two users from different regions (us and eu) can collaborate on a document. An action requires atomically updating the document (in one region) and a notification record for the other user (in the other region).
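-- Assumes a notifications table partitioned by region, like the other tables here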
BEGIN;
-- Write to a document in the US partition
UPDATE documents SET content = '...' WHERE id = 'doc_123' AND region = 'us';
-- Write to a user's notification feed in the EU partition
INSERT INTO notifications (user_id, message, region) VALUES ('user_abc', '...', 'eu');
COMMIT;
This transaction will be extremely slow. To guarantee atomicity across geographic regions, CockroachDB must use a Two-Phase Commit (2PC) protocol. The transaction coordinator has to communicate back and forth with the leaseholders for the involved ranges in both us-east-1 and eu-west-1. The latency of this transaction will be bound by the speed of light between the US and Europe, resulting in commit latencies of 100-200ms or more.
Solution: Avoid cross-region synchronous transactions whenever possible. Redesign the application logic to be asynchronous.
* Better Pattern: The primary action (updating the document) is performed in a fast, single-region transaction. That same transaction also writes an event to an outbox table (or a real message queue like Kafka). A separate asynchronous worker processes this queue and creates the notification for the other user; see the sketch after this list.
* This trades strict, instantaneous consistency for high performance and fault tolerance, a common and necessary trade-off in distributed systems.
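Here is a minimal sketch of that outbox pattern under this article's schema; the table name and payload shape are hypothetical:
-- Hypothetical outbox table, partitioned like every other table so the
-- producing transaction never leaves its home region (pin its partitions
-- with ALTER PARTITION ... CONFIGURE ZONE as before)
CREATE TABLE outbox_events (
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    region region NOT NULL,
    payload JSONB NOT NULL,
    processed BOOL NOT NULL DEFAULT false,
    created_at TIMESTAMPTZ DEFAULT now(),
    PRIMARY KEY (region, id)
)
PARTITION BY LIST (region) (
    PARTITION eu VALUES IN ('eu'),
    PARTITION us VALUES IN ('us'),
    PARTITION apac VALUES IN ('apac')
);
-- The document update and the event enqueue commit atomically in ONE region
BEGIN;
UPDATE documents SET content = '...' WHERE region = 'us' AND id = 'doc_123';
INSERT INTO outbox_events (region, payload)
    VALUES ('us', '{"notify_user": "user_abc", "notify_region": "eu"}');
COMMIT;
-- An asynchronous worker polls its local partition and delivers the
-- notification; the cross-region work now happens outside the hot path
SELECT id, payload FROM outbox_events
WHERE region = 'us' AND processed = false
ORDER BY created_at
LIMIT 100;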
Edge Case 3: Regional Outages
What happens if the entire us-east-1 region goes offline?
This depends on your cluster's survival goal. If you configured your database for REGION survival (ALTER DATABASE my_saas SURVIVE REGION FAILURE), your cluster will remain online.
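For reference, the survival goal is a single database-level statement (it requires at least three database regions, which we have):
-- Tolerate the loss of an entire region, at the cost of higher replication
ALTER DATABASE my_saas SURVIVE REGION FAILURE;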
* Replication: For this to work, no single region may hold a quorum of any range's replicas. With a replication factor of 3, that means at most one replica per region: the us partition keeps its leaseholder in us-east-1 via lease preferences, with the other two replicas in eu-west-1 and ap-southeast-1. (SURVIVE REGION FAILURE achieves the same property by raising the replication factor to 5 and capping voting replicas per region.) Note the compliance tension: the strict '[+region=eu-west-1]' constraint on the eu partition pins every replica inside the EU, deliberately trading away region survival for that data, because strict domiciling and cross-region replicas are mutually exclusive.
* Failover: When us-east-1 goes down, the cluster detects the loss of the leaseholders. For the ranges belonging to the us partition, one of the surviving replicas in another region acquires the lease after a short timeout (typically on the order of seconds).
* Performance Impact: The database is still available, but performance for US tenants degrades significantly. Their application servers in us-east-1 are down, so traffic is rerouted to eu-west-1. Database queries for their data now have to cross the Atlantic to the new leaseholder. Reads and writes that were <10ms are now >100ms. The system is available but slow, which is the correct behavior for a disaster recovery scenario.
Conclusion: A Strategic Architectural Choice
CockroachDB's geo-partitioning is a powerful and elegant solution to the conflicting demands of data residency and low-latency performance for global applications. It is not, however, a magic bullet. Implementing it successfully requires a fundamental shift in mindset from traditional, single-center database design.
As senior engineers and architects, our key takeaways are:
* Make the partition key a first-class schema citizen: denormalize it into every partitioned table and lead the primary key with it, so the planner can prune partitions without cross-region joins.
* Write partition-aware queries: always include the partition key in WHERE clauses to enable fast, local queries. This must be a core consideration in your API and data access layers.
* Keep transactions single-region: redesign cross-region workflows as asynchronous patterns rather than paying two-phase-commit latency across oceans.
* Plan for the operational edge cases: tenant migrations between regions, survival goals, and the inherent trade-off between strict data domiciling and region-failure survival.
By internalizing these principles and moving beyond high-level concepts to the detailed implementation patterns discussed here, you can build truly global, compliant, and high-performance systems that meet the stringent demands of modern software engineering.