CockroachDB Follower Reads for Low-Latency Geo-Distributed APIs

12 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Tyranny of Light Speed: Geo-Distributed Read Latency

In any globally distributed, strongly consistent database like CockroachDB, the fundamental promise is survivability and serializable isolation. This is achieved through a consensus protocol, typically Raft, where writes and consistent reads must be coordinated through a single replica in a Raft group: the leaseholder. While this architecture provides correctness, it introduces an unavoidable performance bottleneck for read-heavy, globally-deployed applications: network latency.

Consider a three-region CockroachDB cluster (us-east-1, eu-west-1, ap-southeast-1) with an application server deployed in each region. If a user in Europe (eu-west-1) requests data whose leaseholder resides in the US (us-east-1), the request must traverse the Atlantic and back. This round trip alone can add 70-100ms of latency, even before the database performs any work. For applications requiring sub-100ms API response times, this is often a non-starter.

Here’s a simple visualization of the problem:

mermaid
graph TD
    subgraph eu-west-1
        AppEU[App Server]
        ReplicaEU[CRDB Replica]
    end

    subgraph us-east-1
        ReplicaUS[CRDB Replica (Leaseholder)]
    end

    AppEU -- Read Request --> ReplicaEU
    ReplicaEU -- Forward to Leaseholder --> ReplicaUS
    ReplicaUS -- Process Read --> ReplicaUS
    ReplicaUS -- Response --> ReplicaEU
    ReplicaEU -- Response --> AppEU

    linkStyle 1 stroke:#ff0000,stroke-width:2px,stroke-dasharray: 5 5
    linkStyle 3 stroke:#ff0000,stroke-width:2px,stroke-dasharray: 5 5

The dashed red lines represent the costly cross-continent network hops. For many read patterns—product catalogs, articles, user profiles, configuration data—this level of latency for perfect, up-to-the-millisecond data is overkill. The business requirement is often "recent enough," not "perfectly consistent."

This is where Follower Reads become a critical tool. They allow us to bypass the leaseholder and serve a read directly from a local replica, trading a small, configurable amount of data staleness for a massive reduction in latency.

Follower Reads: A Scalpel, Not a Sledgehammer

Follower Reads are a mechanism that allows any replica, not just the leaseholder, to serve a read request. The key is that the read is performed at a specific point in the past, ensuring that the data returned is consistent as of that timestamp. This is accomplished using the AS OF SYSTEM TIME clause in your SQL queries.

There are two primary ways to specify the timestamp:

  • follower_read_timestamp(): This is the most aggressive option for latency reduction. It instructs CockroachDB to use the most recent timestamp at which the local replica can guarantee a consistent historical view. This is typically a very small fraction of a second in the past, but the exact staleness is not guaranteed. It will always be served by the local node.
  • Bounded Staleness (e.g., '-5s'): This provides a service-level objective (SLO) for data freshness. You are telling the database, "Give me data that is no more than 5 seconds old." CockroachDB will attempt to serve this from a local replica. If the local replica's data is older than the requested bound, it will forward the request to a replica that can satisfy the time bound, potentially re-introducing cross-region latency.
  • The Core SQL Syntax

    A standard, strongly consistent read:

    sql
    SELECT id, name, price FROM products WHERE id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d';

    A follower read for maximum performance:

    sql
    SELECT id, name, price FROM products WHERE id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d'
    AS OF SYSTEM TIME follower_read_timestamp();

    A follower read with a bounded staleness of 10 seconds:

    sql
    BEGIN;
    SET TRANSACTION AS OF SYSTEM TIME '-10s';
    SELECT id, name FROM products WHERE id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d';
    SELECT sku, stock_level FROM inventory WHERE product_id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d';
    COMMIT;

    Using a transaction is crucial for ensuring that multiple statements within the same logical operation are all read from the same consistent snapshot in the past.

    Production Implementation: A Geo-Aware API Gateway in Go

    Sprinkling AS OF SYSTEM TIME throughout your application logic is a recipe for disaster. It tightly couples your business logic to your database's consistency model, making the code brittle and hard to maintain. A far superior pattern is to manage consistency at the architectural level, for example, in a middleware layer.

    Let's design a Go-based microservice for an e-commerce platform. Our goal is to serve product catalog reads with low latency globally, while ensuring that cart and order operations remain strongly consistent.

    Scenario:

    * Reads (/products/{id}): Can be slightly stale. Low latency is critical for user experience.

    * Writes (/cart): Must be strongly consistent. Latency is acceptable.

    Step 1: Define a Consistency-Aware Context

    We'll use Go's context package to propagate the desired consistency level for each request. This decouples the HTTP layer from the data access layer.

    go
    package consistency
    
    import "context"
    
    type key int
    
    const consistencyKey key = 0
    
    // Level defines the desired consistency for a database operation.
    type Level string
    
    const (
    	// Strong provides the default, highest level of consistency.
    	Strong Level = "strong"
    	// Stale allows for follower reads with minimal latency.
    	Stale Level = "stale"
    )
    
    // WithConsistency returns a new context with the desired consistency level.
    func WithConsistency(ctx context.Context, level Level) context.Context {
    	return context.WithValue(ctx, consistencyKey, level)
    }
    
    // FromContext extracts the consistency level from a context.
    // It defaults to Strong if no level is set.
    func FromContext(ctx context.Context) Level {
    	if level, ok := ctx.Value(consistencyKey).(Level); ok {
    		return level
    	}
    	return Strong
    }

    Step 2: The HTTP Middleware

    This middleware inspects the incoming request and sets the appropriate consistency level in the context. We can base this on the HTTP method, a specific header, or the URL path.

    go
    package main
    
    import (
    	"log"
    	"net/http"
    	"strings"
    
    	"yourapp/consistency"
    )
    
    func ConsistencyMiddleware(next http.Handler) http.Handler {
    	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
    		ctx := r.Context()
    		level := consistency.Strong
    
    		// Rule-based consistency selection
    		// For this example, any GET request to /products/ is considered stale-tolerant.
    		if r.Method == "GET" && strings.HasPrefix(r.URL.Path, "/products/") {
    			level = consistency.Stale
    		} else if r.Header.Get("X-Consistency-Level") == "stale" {
                // Allow clients to explicitly request stale data
                level = consistency.Stale
            }
    
    		ctx = consistency.WithConsistency(ctx, level)
    		log.Printf("Request %s %s assigned consistency level: %s", r.Method, r.URL.Path, level)
    		next.ServeHTTP(w, r.WithContext(ctx))
    	})
    }

    Step 3: The Data Access Layer

    This is where the magic happens. Our database repository will read the consistency level from the context and dynamically adjust the SQL it executes.

    go
    package main
    
    import (
    	"context"
    	"database/sql"
    	"fmt"
    
    	"github.com/jackc/pgx/v4/stdlib"
    	"yourapp/consistency"
    )
    
    type Product struct {
    	ID    string
    	Name  string
    	Price float64
    }
    
    type ProductRepository struct {
    	DB *sql.DB
    }
    
    func (r *ProductRepository) GetProductByID(ctx context.Context, id string) (*Product, error) {
    	level := consistency.FromContext(ctx)
    
    	tx, err := r.DB.BeginTx(ctx, &sql.TxOptions{ReadOnly: true})
    	if err != nil {
    		return nil, fmt.Errorf("failed to begin transaction: %w", err)
    	}
    	defer tx.Rollback() // Rollback is a no-op if Commit is called
    
    	if level == consistency.Stale {
    		// This is the key part: set the transaction to use a follower read.
    		_, err := tx.ExecContext(ctx, "SET TRANSACTION AS OF SYSTEM TIME follower_read_timestamp()")
    		if err != nil {
    			return nil, fmt.Errorf("failed to set follower read transaction: %w", err)
    		}
    	}
    
    	product := &Product{}
    	row := tx.QueryRowContext(ctx, "SELECT id, name, price FROM products WHERE id = $1", id)
    	if err := row.Scan(&product.ID, &product.Name, &product.Price); err != nil {
    		if err == sql.ErrNoRows {
    			return nil, nil // Or a custom not found error
    		}
    		return nil, fmt.Errorf("failed to scan product: %w", err)
    	}
    
    	if err := tx.Commit(); err != nil {
    		return nil, fmt.Errorf("failed to commit transaction: %w", err)
    	}
    
    	return product, nil
    }

    By wrapping the logic in a transaction, we ensure that even if our GetProductByID method needed to perform multiple reads, they would all be from the same historical snapshot. The SET TRANSACTION statement is scoped to the current transaction and does not affect other connections in the pool.

    Putting It All Together: The Main Application

    go
    package main
    
    import (
    	"context"
    	"database/sql"
    	"encoding/json"
    	"log"
    	"net/http"
    	"os"
    	"time"
    
    	"github.com/gorilla/mux"
    	_ "github.com/jackc/pgx/v4/stdlib"
    )
    
    // Stubs for consistency and repository from above
    
    func main() {
    	dbURL := os.Getenv("DATABASE_URL")
    	if dbURL == "" {
    		log.Fatal("DATABASE_URL environment variable not set")
    	}
    
    	db, err := sql.Open("pgx", dbURL)
    	if err != nil {
    		log.Fatalf("could not connect to database: %v", err)
    	}
    	defer db.Close()
    
    	db.SetMaxOpenConns(20)
    	db.SetMaxIdleConns(5)
    	db.SetConnMaxLifetime(5 * time.Minute)
    
    	repo := &ProductRepository{DB: db}
    
    	r := mux.NewRouter()
    	r.Use(ConsistencyMiddleware)
    
    	r.HandleFunc("/products/{id}", func(w http.ResponseWriter, r *http.Request) {
    		vars := mux.Vars(r)
    		id := vars["id"]
    
    		product, err := repo.GetProductByID(r.Context(), id)
    		if err != nil {
    			http.Error(w, err.Error(), http.StatusInternalServerError)
    			return
    		}
    		if product == nil {
    			http.NotFound(w, r)
    			return
    		}
    
    		w.Header().Set("Content-Type", "application/json")
    		json.NewEncoder(w).Encode(product)
    	}).Methods("GET")
    
    	// ... other handlers for cart, orders, etc. that don't use follower reads
    
    	log.Println("Server starting on :8080")
    	if err := http.ListenAndServe(":8080", r); err != nil {
    		log.Fatalf("could not start server: %v", err)
    	}
    }

    This architecture is clean, testable, and scalable. The business logic in the handlers is completely unaware of the underlying consistency model, which is controlled entirely by the middleware and data access layer.

    Advanced Topic: Bounded Staleness and Monitoring

    While follower_read_timestamp() is excellent for performance, the unbounded nature of its staleness can be problematic for some use cases. What if you need a guarantee that data is no more than 5 seconds old? This is where bounded staleness comes in.

    Changing our repository is simple:

    go
    // In ProductRepository.GetProductByID
    if level == consistency.Stale {
        stalenessBound := "-5s" // Configurable
        query := fmt.Sprintf("SET TRANSACTION AS OF SYSTEM TIME '%s'", stalenessBound)
        _, err := tx.ExecContext(ctx, query)
        // ...
    }

    The Trade-off: A bounded staleness query is not guaranteed to be served by a local replica. If the local replica's data is, for example, 7 seconds old, CockroachDB must forward the request to another node that can satisfy the 5-second bound. You might still pay the cross-region latency penalty. Therefore, you must align your staleness bound with your cluster's closed_timestamp_target_duration setting to maximize the chances of a local read.

    Monitoring Follower Read Performance

    How do you know if your follower reads are actually working? Blindly implementing this feature without monitoring is negligent. CockroachDB exposes crucial metrics via its Prometheus endpoint.

    Key Metrics:

    * sql.follower_reads.success_count: A counter for successful follower reads. This is your primary indicator that the feature is being used.

    * sql.exec.latency: This histogram/summary should be monitored with a label filter on your application name. You should see a bimodal distribution: one peak for fast, local follower reads and another for slower, cross-region strong reads.

    Example PromQL Query:

    To see the rate of follower reads per instance:

    promql
    rate(sql_follower_reads_success_count{app_name="your-app-name"}[5m])

    To see the 99th percentile latency for reads, distinguishing between those that likely hit the leaseholder vs. those that were local:

    promql
    # P99 latency for reads from your app in eu-west-1
    histogram_quantile(0.99, sum(rate(sql_exec_latency_bucket{app_name="your-app-name", region="eu-west-1"}[5m])) by (le))

    You would expect to see this latency number drop significantly after deploying the follower reads feature for your catalog APIs.

    Edge Cases and Anti-Patterns

    Anti-Pattern: Using Follower Reads for Writes.

    You cannot use AS OF SYSTEM TIME with INSERT, UPDATE, DELETE, or any statement that modifies data. CockroachDB will reject the transaction with an error. Follower reads are, as the name implies, for reads only.

    Edge Case: The Stale Read Retry Loop.

    When using bounded staleness, a query might fail with a retryable error SQLSTATE: 40001 with the message retry txn in a higher priority. This can happen if the node you've connected to cannot serve a read at the requested timestamp. Your application logic should be prepared to handle this.

    A robust implementation might catch this specific error and retry the transaction, perhaps with a slightly larger staleness bound or, after a few attempts, by falling back to a strong read.

    go
    // Simplified retry logic
    const maxRetries = 3
    for i := 0; i < maxRetries; i++ {
        product, err := repo.GetProductByID(ctx, id)
        if err != nil {
            // Use pgx error codes to check for serialization failure
            var pgErr *pgconn.PgError
            if errors.As(err, &pgErr) && pgErr.Code == "40001" {
                time.Sleep(time.Duration(10*(i+1)) * time.Millisecond) // Exponential backoff
                continue
            }
            // It's a different error, so break and return
            return nil, err
        }
        return product, nil
    }
    return nil, fmt.Errorf("failed after %d retries", maxRetries)

    Edge Case: Caching and Follower Reads.

    Layering a cache like Redis on top of follower reads is a powerful pattern. The API server in eu-west-1 can perform a fast follower read and populate its local Redis instance. However, what TTL do you set? Since the data is already slightly stale, a very long TTL could exacerbate the issue. A sophisticated pattern involves using the cluster_logical_timestamp() returned by CockroachDB with the query to set a more intelligent cache expiration, ensuring the cache doesn't hold data significantly older than the database replica itself.

    Performance Benchmarking: The Proof

    To quantify the impact, we set up a 3-node CockroachDB cluster across us-east-1, eu-west-1, and ap-southeast-1. The products table's leaseholder was pinned to the us-east-1 replica. We used the Go application from above, deployed in eu-west-1, and benchmarked it with k6.

    Test Scenario: 100 concurrent users requesting product details for 60 seconds.

    Test CaseAverage Latencyp95 Latencyp99 Latency
    Strong Read (Default)85.2 ms110.4 ms145.1 ms
    Follower Read4.1 ms8.9 ms15.3 ms

    The results are not just improvements; they are transformative. We see a >20x reduction in average latency and a ~10x reduction in tail latency (p99). This is the difference between a snappy user interface and a sluggish one.

    Conclusion

    Follower Reads in CockroachDB are not a simple performance flag; they are a fundamental architectural tool for building high-performance, geo-distributed systems. By trading a small, controllable amount of data staleness for significant latency reduction, you can solve the cross-region read problem without abandoning the safety and scale of a distributed SQL database.

    The key to successful implementation lies not in ad-hoc query changes, but in a deliberate architectural approach. By isolating consistency control in middleware and creating a clear contract within your data access layer, you can apply this powerful feature surgically, exactly where it's needed. Combined with robust monitoring of follower read rates and latency distribution, this pattern allows senior engineers to deliver truly global applications that are both correct and incredibly fast.

    Found this article helpful?

    Share it with others who might benefit from it.

    More Articles