CockroachDB Geo-Partitioning for GDPR and Low-Latency APIs

Goh Ling Yong

The Multi-Region Conundrum: Latency vs. Compliance

In a globally distributed system, the laws of physics and the laws of nations are often in direct conflict. Your users in Sydney expect sub-50ms API response times, which necessitates infrastructure in the APAC region. Simultaneously, your users in Frankfurt are protected by the General Data Protection Regulation (GDPR), which mandates that their personal data remains within EU data centers. A naive, single-region deployment in us-east-1 is a non-starter, delivering poor performance to a global user base and violating international law.

The challenge is to build a single, logical database that can physically constrain data to specific geographies while still allowing for efficient cross-geo queries when necessary. This is where traditional database replication strategies fall short. Active-passive failover doesn't solve the latency problem for your active region's distant users, and multi-primary replication often runs afoul of data residency requirements by placing a full copy of the data in every region.

This is the precise problem space that CockroachDB's geo-partitioning features are designed to solve. This article is not an introduction; it is a deep dive for architects and senior engineers who already understand the fundamentals of distributed SQL. We will focus on the nuanced implementation details, production-ready schema patterns, query optimization strategies, and the operational edge cases you will inevitably face when deploying a geo-partitioned database at scale.

Our working scenario will be a multi-tenant SaaS application with a global user base spread across three primary regions: Europe (eu-west-1), North America (us-east-1), and Asia-Pacific (ap-southeast-1). Our goals are:

  • GDPR Compliance: User and user-generated data for EU tenants must reside on nodes within the eu-west-1 region.
  • Low-Latency Reads/Writes: When a user in any region accesses their own data, the database operations should be served by nodes within their local region to minimize network latency.
  • Global Data Coherency: The system must operate as a single logical database, supporting cross-region queries and transactions, albeit with an understanding of the performance implications.

    Deep Dive: Geo-Partitioning Architecture Internals

    Before we design our schema, we must understand the primitives CockroachDB provides. Geo-partitioning is built upon the database's core key-value storage layer, where data is sharded into ranges. Partitioning, combined with replication zone configurations, instructs the cluster to keep a partition's replicas, including its leaseholders (the replicas responsible for coordinating reads and writes for a given range), on nodes with specific locality tags.

    First, you must start your CockroachDB nodes with locality flags. For example, a node in an AWS Ireland data center would be started like this:

    bash
    cockroach start --locality=region=eu-west-1,zone=eu-west-1a ...

    This locality metadata is fundamental. The next step is to define these regions at the database level:

    sql
    -- This command makes the database "region-aware"
    ALTER DATABASE my_saas SET PRIMARY REGION "us-east-1";
    ALTER DATABASE my_saas ADD REGION "eu-west-1";
    ALTER DATABASE my_saas ADD REGION "ap-southeast-1";

    The PRIMARY REGION is significant. It's the default region for data that isn't explicitly partitioned and for certain database-level metadata. For a new tenant signing up without a specified region, their data would land here.
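
    You can confirm which regions the cluster knows about and which are attached to the database with a quick check from any SQL shell:

    sql
    -- Regions the nodes have advertised via --locality
    SHOW REGIONS FROM CLUSTER;

    -- Regions attached to the database, including the primary region
    SHOW REGIONS FROM DATABASE my_saas;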

    Partitioning Strategies: `PARTITION BY LIST` vs. `REGIONAL BY ROW`

    CockroachDB offers two primary models for geo-partitioning data:

  • PARTITION BY LIST (Explicit Partitioning): This gives you fine-grained control. You define partitions and explicitly map column values to them. This is the most powerful and flexible option, especially for complex compliance scenarios.
  • REGIONAL BY ROW: This is a more automated approach. You add a hidden crdb_region column to your table, and CockroachDB automatically creates and manages partitions based on the values in this column. It's simpler to set up but offers less control than explicit partitioning.

    We will focus primarily on PARTITION BY LIST, as it provides the explicitness required for strict GDPR compliance and complex multi-tenant architectures.
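
    For comparison, the automated approach looks like the following; a minimal sketch with a hypothetical table name, relying on the hidden crdb_region column that CockroachDB manages for you:

    sql
    -- REGIONAL BY ROW: CockroachDB adds a hidden crdb_region column,
    -- creates one partition per database region, and homes each row in the
    -- region stored in that column (the gateway region by default).
    CREATE TABLE user_sessions (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        user_id UUID NOT NULL,
        created_at TIMESTAMPTZ DEFAULT now()
    ) LOCALITY REGIONAL BY ROW;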

    Production Implementation: A Multi-Tenant SaaS Schema

    Let's design the core tables for our SaaS application. We'll have tenants, users, and documents.

    Step 1: Defining the Regions and Types

    First, we create an ENUM to represent our business regions. This adds a layer of application-level validation and makes the schema more readable.

    sql
    -- Define our supported regions as a type for data integrity
    CREATE TYPE region AS ENUM ('eu', 'us', 'apac');
    
    -- The mapping from these enum values to physical regions happens below,
    -- via partitions and zone configurations. Optionally, also prevent
    -- replicas of region-homed data from being placed outside their home region:
    ALTER DATABASE my_saas PLACEMENT RESTRICTED;

    Step 2: The `tenants` Table

    This table will store information about each customer organization. The region column is the partition key.

    sql
    CREATE TABLE tenants (
        id UUID NOT NULL DEFAULT gen_random_uuid(),
        name STRING NOT NULL,
        region region NOT NULL,
        plan_id UUID, -- References the global plans table introduced later
        created_at TIMESTAMPTZ DEFAULT now(),
        -- The partition columns must be a prefix of the primary key,
        -- so region leads the composite primary key.
        PRIMARY KEY (region, id)
    )
    PARTITION BY LIST (region) (
        PARTITION eu VALUES IN ('eu'),
        PARTITION us VALUES IN ('us'),
        PARTITION apac VALUES IN ('apac')
    );
    
    -- Pin each logical partition to a physical region
    ALTER PARTITION eu OF TABLE tenants CONFIGURE ZONE USING
        constraints = '[+region=eu-west-1]';
    ALTER PARTITION us OF TABLE tenants CONFIGURE ZONE USING
        constraints = '[+region=us-east-1]';
    ALTER PARTITION apac OF TABLE tenants CONFIGURE ZONE USING
        constraints = '[+region=ap-southeast-1]';

    Here's what we've done:

  • PARTITION BY LIST (region) tells CockroachDB to shard this table based on the region column's value. This is also why region leads the composite primary key: the partition columns must be a prefix of the index being partitioned.
  • We defined three partitions: eu, us, and apac.
  • The ALTER PARTITION ... CONFIGURE ZONE commands are the critical link. They instruct CockroachDB that all data (and its leaseholders) for the eu partition must reside on nodes with the +region=eu-west-1 locality tag.
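
    You can verify that the partitions exist and that the zone constraints were applied:

    sql
    -- Lists each partition, its values, and the zone config attached to it
    SHOW PARTITIONS FROM TABLE tenants;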

    Step 3: The `users` and `documents` Tables

    Users belong to tenants, and documents belong to users. A common and highly effective pattern is to cascade the partitioning key down through the relationships.

    sql
    CREATE TABLE users (
        id UUID NOT NULL DEFAULT gen_random_uuid(),
        tenant_id UUID NOT NULL,
        email STRING NOT NULL,
        region region NOT NULL, -- Denormalized for partitioning and query optimization
        created_at TIMESTAMPTZ DEFAULT now(),
        PRIMARY KEY (region, id), -- Partition key first, as in tenants
        FOREIGN KEY (region, tenant_id) REFERENCES tenants (region, id),
        UNIQUE (tenant_id, email),
        -- Secondary indexes are not partitioned automatically; partition and pin
        -- them separately if their contents (like email) must also stay in-region.
        INDEX (region, tenant_id) STORING (email)
    )
    PARTITION BY LIST (region) (
        PARTITION eu VALUES IN ('eu'),
        PARTITION us VALUES IN ('us'),
        PARTITION apac VALUES IN ('apac')
    );
    
    -- Pin partitions to regions, same as for tenants
    ALTER PARTITION eu OF TABLE users CONFIGURE ZONE USING constraints = '[+region=eu-west-1]';
    ALTER PARTITION us OF TABLE users CONFIGURE ZONE USING constraints = '[+region=us-east-1]';
    ALTER PARTITION apac OF TABLE users CONFIGURE ZONE USING constraints = '[+region=ap-southeast-1]';
    
    CREATE TABLE documents (
        id UUID NOT NULL DEFAULT gen_random_uuid(),
        author_id UUID NOT NULL,
        tenant_id UUID NOT NULL, -- Denormalized for partitioning
        region region NOT NULL,  -- Denormalized for partitioning
        title STRING NOT NULL,
        content JSONB,
        created_at TIMESTAMPTZ DEFAULT now(),
        PRIMARY KEY (region, id),
        FOREIGN KEY (region, author_id) REFERENCES users (region, id),
        FOREIGN KEY (region, tenant_id) REFERENCES tenants (region, id),
        INDEX (region, tenant_id, created_at DESC)
    )
    PARTITION BY LIST (region) (
        PARTITION eu VALUES IN ('eu'),
        PARTITION us VALUES IN ('us'),
        PARTITION apac VALUES IN ('apac')
    );
    
    -- Pin partitions to regions
    ALTER PARTITION eu OF TABLE documents CONFIGURE ZONE USING constraints = '[+region=eu-west-1]';
    ALTER PARTITION us OF TABLE documents CONFIGURE ZONE USING constraints = '[+region=us-east-1]';
    ALTER PARTITION apac OF TABLE documents CONFIGURE ZONE USING constraints = '[+region=ap-southeast-1]';

    Key Architectural Decision: We are denormalizing the region column into users and documents. While this violates strict normalization principles, it is essential for performance in a distributed system. It allows the query planner to determine which partition to target without performing a join against the tenants table first. The cost of a small amount of redundant storage is negligible compared to the cost of a cross-region join.
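
    One way to keep the denormalized column from drifting is to copy it from the tenant row at write time; a minimal sketch, assuming the application already knows the tenant's region (the email value here is purely illustrative):

    sql
    -- Create a user in the same region as its tenant, in one statement.
    INSERT INTO users (tenant_id, email, region)
    SELECT t.id, 'new.user@example.com', t.region
    FROM tenants AS t
    WHERE t.region = 'eu' AND t.id = '...';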

    Query Optimization Patterns for Geo-Distributed Data

    With the schema in place, writing performant queries is paramount. A query that is fast in a single-node Postgres instance can be disastrously slow in a geo-distributed CockroachDB cluster if it's not partition-aware.

    Our application servers should be deployed in each region (eu-west-1, us-east-1, ap-southeast-1), following the "Follow-the-Workload" pattern. An API request from a user in Europe should be handled by an application server in eu-west-1. This server will then include the tenant's region in its queries.

    The Anti-Pattern: Partition-Unaware Queries

    Consider a request to fetch a user by their ID, originating from the eu-west-1 application server. A naive query might look like this:

    sql
    -- ANTI-PATTERN: This query does not include the partition key.
    SELECT * FROM users WHERE id = '...';

    Let's analyze what happens. The gateway node that receives this query has no idea where the user with this ID lives. It must perform a scatter-gather operation: it sends a request to at least one node in every partition (eu, us, and apac) to find the data. This involves multiple cross-continent network round trips.

    sql
    EXPLAIN ANALYZE SELECT * FROM users WHERE id = '...';
    
    -- Simplified output will show:
    --   Spans: /eu/users/..., /us/users/..., /apac/users/...
    --   Latency: >200ms (due to cross-region hops)

    The Optimized Pattern: Partition-Aware Queries

    The application layer must have access to the current user's or tenant's region. This is typically stored in a JWT, session, or fetched from a local cache at the start of a request. The optimized query includes the partition key in the WHERE clause.

    sql
    -- OPTIMIZED: The query planner knows exactly where to go.
    SELECT * FROM users WHERE region = 'eu' AND id = '...';

    When the gateway node in eu-west-1 receives this query, the planner sees region = 'eu'. It knows this maps to the eu partition, which is pinned to the eu-west-1 region. The query is routed directly to a local node, avoiding any cross-continent network latency.

    sql
    EXPLAIN ANALYZE SELECT * FROM users WHERE region = 'eu' AND id = '...';
    
    -- Simplified output will show:
    --   Spans: /eu/users/... (ONLY ONE PARTITION SCANNED)
    --   Latency: <10ms (a local read)

    This difference between >200ms and <10ms is the entire business case for geo-partitioning. Your application logic must be designed to leverage it.
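
    The same rule applies to every partitioned table. For example, a tenant dashboard listing recent documents should lead with the partition key so it is satisfied entirely by the local eu partition via the (region, tenant_id, created_at DESC) index:

    sql
    -- Served locally: partition key first, then the tenant, ordered by recency.
    SELECT id, title, created_at
    FROM documents
    WHERE region = 'eu' AND tenant_id = '...'
    ORDER BY created_at DESC
    LIMIT 20;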

    Advanced Scenario: Cross-Partition Joins

    What if an admin needs to run a report joining a partitioned table with a non-partitioned, global table? Let's introduce a plans table.

    sql
    -- A GLOBAL table: read-optimized replicas in every region for fast local reads.
    CREATE TABLE plans (
        id UUID PRIMARY KEY,
        name STRING NOT NULL
    );

    -- Uses the database regions added earlier; reads are served locally in every
    -- region, while writes pay a higher cross-region coordination cost.
    ALTER TABLE plans SET LOCALITY GLOBAL;

    The GLOBAL locality places read-optimized replicas in every region and serves consistent, present-time reads from the local replica, so reads of plans are fast from anywhere. Writes, however, are slower, because they must be coordinated across regions so that every region can keep serving consistent local reads.

    Now, a query to get all tenants in the EU and their plan names:

    sql
    SELECT t.name, p.name
    FROM tenants AS t
    JOIN plans AS p ON t.plan_id = p.id
    WHERE t.region = 'eu';

    This query is efficient. The planner will:

  • Use t.region = 'eu' to perform a fast, local scan on the tenants table's eu partition.
  • For each tenant found, it will perform a lookup on the plans table. Because plans is a GLOBAL table with a local replica in eu-west-1, this join is also a fast, local read.

    An inefficient join, by contrast, would be one that joins two different partitioned tables on a key that is not the partition key, forcing a distributed join and massive data shuffling across regions.
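
    A quick way to confirm the plan stays local is to inspect it; what to look for is noted in the comments rather than exact output, which varies by version:

    sql
    EXPLAIN SELECT t.name, p.name
    FROM tenants AS t
    JOIN plans AS p ON t.plan_id = p.id
    WHERE t.region = 'eu';
    -- Expect a scan restricted to the span of the tenants eu partition and a
    -- lookup or hash join into plans, with no spans in other partitions.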

    Edge Cases and Failure Scenarios in Production

    Real-world systems are messy. Here's how to handle the inevitable complexities.

    Edge Case 1: Moving a Tenant Between Regions

    A US-based company is acquired by an EU-based one. You are now legally required to migrate all of their data from the us partition to the eu partition.

    This is a multi-step, delicate operation that must be performed with zero downtime.

  • Put the Tenant in Maintenance Mode (Application Layer): Temporarily block writes for this specific tenant to prevent data inconsistencies during the move.
  • Update the Partition Key: The core operation is a simple UPDATE. This transaction will be expensive as it involves a delete in one region and an insert in another.
        sql
        BEGIN;
        UPDATE tenants SET region = 'eu' WHERE id = '...' AND region = 'us';
        UPDATE users SET region = 'eu' WHERE tenant_id = '...' AND region = 'us';
        UPDATE documents SET region = 'eu' WHERE tenant_id = '...' AND region = 'us';
        COMMIT;
  • How CockroachDB Handles the Move: When the region value is updated, the rows are rewritten into the key span of the eu partition, whose zone constraints require replicas on eu-west-1 nodes. The replication layer then moves the affected ranges' replicas in the background (rebalancing), transparently to the application; you can monitor progress in the DB Console.
  • Verification: After the transaction commits and you've allowed time for rebalancing, you can verify the data's location using SHOW RANGES (see the example after this list).
  • Disable Maintenance Mode: Once verified, re-enable writes for the tenant. Their subsequent requests will be routed to eu-west-1 and served locally.
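
    As mentioned in the verification step, you can confirm where the migrated tenant's data now lives:

    sql
    -- The replica_localities column should now show eu-west-1 nodes
    -- for the ranges holding this tenant's rows.
    SHOW RANGES FROM TABLE tenants;

    -- Or locate the range for the specific row by its primary key (region, id):
    SHOW RANGE FROM TABLE tenants FOR ROW ('eu', '...');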

    Edge Case 2: The High Cost of Cross-Region Transactions

    Imagine a feature where two users from different regions (us and eu) can collaborate on a document. An action requires atomically updating the document (in one region) and a notification record for the other user (in the other region).

    sql
    BEGIN;
    -- Write to a document in the US partition
    UPDATE documents SET content = '...' WHERE id = 'doc_123' AND region = 'us';
    -- Write to a user's notification feed in the EU partition
    INSERT INTO notifications (user_id, message, region) VALUES ('user_abc', '...', 'eu');
    COMMIT;

    This transaction will be extremely slow. To guarantee atomicity across geographic regions, CockroachDB must use a Two-Phase Commit (2PC) protocol. The transaction coordinator has to communicate back and forth with the leaseholders for the involved ranges in both us-east-1 and eu-west-1. The latency of this transaction will be bound by the speed of light between the US and Europe, resulting in commit latencies of 100-200ms or more.

    Solution: Avoid cross-region synchronous transactions whenever possible. Redesign the application logic to be asynchronous.

  • Better Pattern: The primary action (updating the document) is performed in a fast, single-region transaction. That same transaction also writes an event to an outbound queue table (or a real message queue like Kafka). A separate asynchronous worker processes this queue and creates the notification for the other user; a minimal sketch follows.

    This trades strict, instantaneous consistency for high performance and fault tolerance, a common and necessary trade-off in distributed systems.
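
    A minimal sketch of that pattern, assuming a hypothetical outbox_events table partitioned like the rest of the schema (per-partition zone configs omitted for brevity):

    sql
    CREATE TABLE outbox_events (
        id UUID NOT NULL DEFAULT gen_random_uuid(),
        region region NOT NULL,
        payload JSONB NOT NULL,
        created_at TIMESTAMPTZ DEFAULT now(),
        PRIMARY KEY (region, id)
    )
    PARTITION BY LIST (region) (
        PARTITION eu VALUES IN ('eu'),
        PARTITION us VALUES IN ('us'),
        PARTITION apac VALUES IN ('apac')
    );

    -- Fast, single-region transaction: the document update and the event
    -- both land in the us partition, so there is no cross-region coordination.
    -- An async worker later reads outbox_events and creates the EU notification.
    BEGIN;
    UPDATE documents SET title = 'Revised draft' WHERE region = 'us' AND id = '...';
    INSERT INTO outbox_events (region, payload)
    VALUES ('us', '{"type": "document.updated", "notify_user": "user_abc"}');
    COMMIT;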

    Edge Case 3: Regional Outages

    What happens if the entire us-east-1 region goes offline?

    This depends on your cluster's survival goal. If you configured your database for REGION survival (ALTER DATABASE my_saas SURVIVE REGION FAILURE), your cluster will remain online.
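
    For reference, the survival goal is set and inspected like this:

    sql
    ALTER DATABASE my_saas SURVIVE REGION FAILURE;

    -- Confirm the current survival goal
    SHOW SURVIVAL GOAL FROM DATABASE my_saas;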

  • Replication: Surviving a region failure requires that a majority of each range's replicas remain after the loss of any single region. With SURVIVE REGION FAILURE, CockroachDB raises the replication factor (to 5 by default) and spreads replicas so that no one region holds a quorum. Note the tension with strict pinning: a partition whose zone constraints place every replica in its home region can only survive zone failures, not the loss of that region, so for GDPR-pinned data you must choose between strict residency and regional survivability.

  • Failover: When us-east-1 goes down, the cluster detects the loss of its leaseholders. For ranges with surviving replicas in other regions, one of those replicas acquires the lease after a short timeout (typically ~10 seconds), and the data becomes available again.

  • Performance Impact: The database is still available, but performance for US tenants degrades significantly. Their application servers in us-east-1 are down, so traffic is rerouted to eu-west-1. Database queries for their data now have to cross the Atlantic to the new leaseholder. Reads and writes that were <10ms are now >100ms. The system is available but slow, which is the correct behavior for a disaster recovery scenario.

    Conclusion: A Strategic Architectural Choice

    CockroachDB's geo-partitioning is a powerful and elegant solution to the conflicting demands of data residency and low-latency performance for global applications. It is not, however, a magic bullet. Implementing it successfully requires a fundamental shift in mindset from traditional, single-center database design.

    As senior engineers and architects, our key takeaways are:

  • Schema Design is Paramount: Denormalizing the partitioning key down through related tables is a necessary pattern for performance. The schema must be designed from day one with data locality in mind.
  • The Application Must Be Partition-Aware: The burden of performance rests heavily on the application's ability to supply the partition key in WHERE clauses to enable fast, local queries. This must be a core consideration in your API and data access layers.
  • Embrace Asynchronicity: Avoid synchronous, cross-region transactions. Design application workflows around eventual consistency and asynchronous communication to maintain performance and system resilience.
  • Plan for Operations: Understand the operational playbook for complex tasks like tenant migration and the performance implications of regional failures. These are not just database concerns; they are product and system architecture concerns.

    By internalizing these principles and moving beyond high-level concepts to the detailed implementation patterns discussed here, you can build truly global, compliant, and high-performance systems that meet the stringent demands of modern software engineering.
