CockroachDB Follower Reads for Low-Latency Geo-Distributed APIs
The Tyranny of Light Speed: Geo-Distributed Read Latency
In any globally distributed, strongly consistent database like CockroachDB, the fundamental promise is survivability and serializable isolation. This is achieved through a consensus protocol, typically Raft, where writes and consistent reads must be coordinated through a single replica in a Raft group: the leaseholder. While this architecture provides correctness, it introduces an unavoidable performance bottleneck for read-heavy, globally-deployed applications: network latency.
Consider a three-region CockroachDB cluster (us-east-1, eu-west-1, ap-southeast-1) with an application server deployed in each region. If a user in Europe (eu-west-1) requests data whose leaseholder resides in the US (us-east-1), the request must traverse the Atlantic and back. This round trip alone can add 70-100ms of latency, even before the database performs any work. For applications requiring sub-100ms API response times, this is often a non-starter.
Here’s a simple visualization of the problem:
graph TD
subgraph eu-west-1
AppEU[App Server]
ReplicaEU[CRDB Replica]
end
subgraph us-east-1
ReplicaUS[CRDB Replica (Leaseholder)]
end
AppEU -- Read Request --> ReplicaEU
ReplicaEU -- Forward to Leaseholder --> ReplicaUS
ReplicaUS -- Process Read --> ReplicaUS
ReplicaUS -- Response --> ReplicaEU
ReplicaEU -- Response --> AppEU
linkStyle 1 stroke:#ff0000,stroke-width:2px,stroke-dasharray: 5 5
linkStyle 3 stroke:#ff0000,stroke-width:2px,stroke-dasharray: 5 5
The dashed red lines represent the costly cross-continent network hops. For many read patterns—product catalogs, articles, user profiles, configuration data—this level of latency for perfect, up-to-the-millisecond data is overkill. The business requirement is often "recent enough," not "perfectly consistent."
This is where Follower Reads become a critical tool. They allow us to bypass the leaseholder and serve a read directly from a local replica, trading a small, configurable amount of data staleness for a massive reduction in latency.
Follower Reads: A Scalpel, Not a Sledgehammer
Follower Reads are a mechanism that allows any replica, not just the leaseholder, to serve a read request. The key is that the read is performed at a specific point in the past, ensuring that the data returned is consistent as of that timestamp. This is accomplished using the AS OF SYSTEM TIME clause in your SQL queries.
There are two primary ways to specify the timestamp:
follower_read_timestamp(): This is the most aggressive option for latency reduction. It instructs CockroachDB to use the most recent timestamp at which the local replica can guarantee a consistent historical view. This is typically a very small fraction of a second in the past, but the exact staleness is not guaranteed. It will always be served by the local node.'-5s'): This provides a service-level objective (SLO) for data freshness. You are telling the database, "Give me data that is no more than 5 seconds old." CockroachDB will attempt to serve this from a local replica. If the local replica's data is older than the requested bound, it will forward the request to a replica that can satisfy the time bound, potentially re-introducing cross-region latency.The Core SQL Syntax
A standard, strongly consistent read:
SELECT id, name, price FROM products WHERE id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d';
A follower read for maximum performance:
SELECT id, name, price FROM products WHERE id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d'
AS OF SYSTEM TIME follower_read_timestamp();
A follower read with a bounded staleness of 10 seconds:
BEGIN;
SET TRANSACTION AS OF SYSTEM TIME '-10s';
SELECT id, name FROM products WHERE id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d';
SELECT sku, stock_level FROM inventory WHERE product_id = 'a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d';
COMMIT;
Using a transaction is crucial for ensuring that multiple statements within the same logical operation are all read from the same consistent snapshot in the past.
Production Implementation: A Geo-Aware API Gateway in Go
Sprinkling AS OF SYSTEM TIME throughout your application logic is a recipe for disaster. It tightly couples your business logic to your database's consistency model, making the code brittle and hard to maintain. A far superior pattern is to manage consistency at the architectural level, for example, in a middleware layer.
Let's design a Go-based microservice for an e-commerce platform. Our goal is to serve product catalog reads with low latency globally, while ensuring that cart and order operations remain strongly consistent.
Scenario:
* Reads (/products/{id}): Can be slightly stale. Low latency is critical for user experience.
* Writes (/cart): Must be strongly consistent. Latency is acceptable.
Step 1: Define a Consistency-Aware Context
We'll use Go's context package to propagate the desired consistency level for each request. This decouples the HTTP layer from the data access layer.
package consistency
import "context"
type key int
const consistencyKey key = 0
// Level defines the desired consistency for a database operation.
type Level string
const (
// Strong provides the default, highest level of consistency.
Strong Level = "strong"
// Stale allows for follower reads with minimal latency.
Stale Level = "stale"
)
// WithConsistency returns a new context with the desired consistency level.
func WithConsistency(ctx context.Context, level Level) context.Context {
return context.WithValue(ctx, consistencyKey, level)
}
// FromContext extracts the consistency level from a context.
// It defaults to Strong if no level is set.
func FromContext(ctx context.Context) Level {
if level, ok := ctx.Value(consistencyKey).(Level); ok {
return level
}
return Strong
}
Step 2: The HTTP Middleware
This middleware inspects the incoming request and sets the appropriate consistency level in the context. We can base this on the HTTP method, a specific header, or the URL path.
package main
import (
"log"
"net/http"
"strings"
"yourapp/consistency"
)
func ConsistencyMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
level := consistency.Strong
// Rule-based consistency selection
// For this example, any GET request to /products/ is considered stale-tolerant.
if r.Method == "GET" && strings.HasPrefix(r.URL.Path, "/products/") {
level = consistency.Stale
} else if r.Header.Get("X-Consistency-Level") == "stale" {
// Allow clients to explicitly request stale data
level = consistency.Stale
}
ctx = consistency.WithConsistency(ctx, level)
log.Printf("Request %s %s assigned consistency level: %s", r.Method, r.URL.Path, level)
next.ServeHTTP(w, r.WithContext(ctx))
})
}
Step 3: The Data Access Layer
This is where the magic happens. Our database repository will read the consistency level from the context and dynamically adjust the SQL it executes.
package main
import (
"context"
"database/sql"
"fmt"
"github.com/jackc/pgx/v4/stdlib"
"yourapp/consistency"
)
type Product struct {
ID string
Name string
Price float64
}
type ProductRepository struct {
DB *sql.DB
}
func (r *ProductRepository) GetProductByID(ctx context.Context, id string) (*Product, error) {
level := consistency.FromContext(ctx)
tx, err := r.DB.BeginTx(ctx, &sql.TxOptions{ReadOnly: true})
if err != nil {
return nil, fmt.Errorf("failed to begin transaction: %w", err)
}
defer tx.Rollback() // Rollback is a no-op if Commit is called
if level == consistency.Stale {
// This is the key part: set the transaction to use a follower read.
_, err := tx.ExecContext(ctx, "SET TRANSACTION AS OF SYSTEM TIME follower_read_timestamp()")
if err != nil {
return nil, fmt.Errorf("failed to set follower read transaction: %w", err)
}
}
product := &Product{}
row := tx.QueryRowContext(ctx, "SELECT id, name, price FROM products WHERE id = $1", id)
if err := row.Scan(&product.ID, &product.Name, &product.Price); err != nil {
if err == sql.ErrNoRows {
return nil, nil // Or a custom not found error
}
return nil, fmt.Errorf("failed to scan product: %w", err)
}
if err := tx.Commit(); err != nil {
return nil, fmt.Errorf("failed to commit transaction: %w", err)
}
return product, nil
}
By wrapping the logic in a transaction, we ensure that even if our GetProductByID method needed to perform multiple reads, they would all be from the same historical snapshot. The SET TRANSACTION statement is scoped to the current transaction and does not affect other connections in the pool.
Putting It All Together: The Main Application
package main
import (
"context"
"database/sql"
"encoding/json"
"log"
"net/http"
"os"
"time"
"github.com/gorilla/mux"
_ "github.com/jackc/pgx/v4/stdlib"
)
// Stubs for consistency and repository from above
func main() {
dbURL := os.Getenv("DATABASE_URL")
if dbURL == "" {
log.Fatal("DATABASE_URL environment variable not set")
}
db, err := sql.Open("pgx", dbURL)
if err != nil {
log.Fatalf("could not connect to database: %v", err)
}
defer db.Close()
db.SetMaxOpenConns(20)
db.SetMaxIdleConns(5)
db.SetConnMaxLifetime(5 * time.Minute)
repo := &ProductRepository{DB: db}
r := mux.NewRouter()
r.Use(ConsistencyMiddleware)
r.HandleFunc("/products/{id}", func(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
id := vars["id"]
product, err := repo.GetProductByID(r.Context(), id)
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
if product == nil {
http.NotFound(w, r)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(product)
}).Methods("GET")
// ... other handlers for cart, orders, etc. that don't use follower reads
log.Println("Server starting on :8080")
if err := http.ListenAndServe(":8080", r); err != nil {
log.Fatalf("could not start server: %v", err)
}
}
This architecture is clean, testable, and scalable. The business logic in the handlers is completely unaware of the underlying consistency model, which is controlled entirely by the middleware and data access layer.
Advanced Topic: Bounded Staleness and Monitoring
While follower_read_timestamp() is excellent for performance, the unbounded nature of its staleness can be problematic for some use cases. What if you need a guarantee that data is no more than 5 seconds old? This is where bounded staleness comes in.
Changing our repository is simple:
// In ProductRepository.GetProductByID
if level == consistency.Stale {
stalenessBound := "-5s" // Configurable
query := fmt.Sprintf("SET TRANSACTION AS OF SYSTEM TIME '%s'", stalenessBound)
_, err := tx.ExecContext(ctx, query)
// ...
}
The Trade-off: A bounded staleness query is not guaranteed to be served by a local replica. If the local replica's data is, for example, 7 seconds old, CockroachDB must forward the request to another node that can satisfy the 5-second bound. You might still pay the cross-region latency penalty. Therefore, you must align your staleness bound with your cluster's closed_timestamp_target_duration setting to maximize the chances of a local read.
Monitoring Follower Read Performance
How do you know if your follower reads are actually working? Blindly implementing this feature without monitoring is negligent. CockroachDB exposes crucial metrics via its Prometheus endpoint.
Key Metrics:
* sql.follower_reads.success_count: A counter for successful follower reads. This is your primary indicator that the feature is being used.
* sql.exec.latency: This histogram/summary should be monitored with a label filter on your application name. You should see a bimodal distribution: one peak for fast, local follower reads and another for slower, cross-region strong reads.
Example PromQL Query:
To see the rate of follower reads per instance:
rate(sql_follower_reads_success_count{app_name="your-app-name"}[5m])
To see the 99th percentile latency for reads, distinguishing between those that likely hit the leaseholder vs. those that were local:
# P99 latency for reads from your app in eu-west-1
histogram_quantile(0.99, sum(rate(sql_exec_latency_bucket{app_name="your-app-name", region="eu-west-1"}[5m])) by (le))
You would expect to see this latency number drop significantly after deploying the follower reads feature for your catalog APIs.
Edge Cases and Anti-Patterns
Anti-Pattern: Using Follower Reads for Writes.
You cannot use AS OF SYSTEM TIME with INSERT, UPDATE, DELETE, or any statement that modifies data. CockroachDB will reject the transaction with an error. Follower reads are, as the name implies, for reads only.
Edge Case: The Stale Read Retry Loop.
When using bounded staleness, a query might fail with a retryable error SQLSTATE: 40001 with the message retry txn in a higher priority. This can happen if the node you've connected to cannot serve a read at the requested timestamp. Your application logic should be prepared to handle this.
A robust implementation might catch this specific error and retry the transaction, perhaps with a slightly larger staleness bound or, after a few attempts, by falling back to a strong read.
// Simplified retry logic
const maxRetries = 3
for i := 0; i < maxRetries; i++ {
product, err := repo.GetProductByID(ctx, id)
if err != nil {
// Use pgx error codes to check for serialization failure
var pgErr *pgconn.PgError
if errors.As(err, &pgErr) && pgErr.Code == "40001" {
time.Sleep(time.Duration(10*(i+1)) * time.Millisecond) // Exponential backoff
continue
}
// It's a different error, so break and return
return nil, err
}
return product, nil
}
return nil, fmt.Errorf("failed after %d retries", maxRetries)
Edge Case: Caching and Follower Reads.
Layering a cache like Redis on top of follower reads is a powerful pattern. The API server in eu-west-1 can perform a fast follower read and populate its local Redis instance. However, what TTL do you set? Since the data is already slightly stale, a very long TTL could exacerbate the issue. A sophisticated pattern involves using the cluster_logical_timestamp() returned by CockroachDB with the query to set a more intelligent cache expiration, ensuring the cache doesn't hold data significantly older than the database replica itself.
Performance Benchmarking: The Proof
To quantify the impact, we set up a 3-node CockroachDB cluster across us-east-1, eu-west-1, and ap-southeast-1. The products table's leaseholder was pinned to the us-east-1 replica. We used the Go application from above, deployed in eu-west-1, and benchmarked it with k6.
Test Scenario: 100 concurrent users requesting product details for 60 seconds.
| Test Case | Average Latency | p95 Latency | p99 Latency |
|---|---|---|---|
| Strong Read (Default) | 85.2 ms | 110.4 ms | 145.1 ms |
| Follower Read | 4.1 ms | 8.9 ms | 15.3 ms |
The results are not just improvements; they are transformative. We see a >20x reduction in average latency and a ~10x reduction in tail latency (p99). This is the difference between a snappy user interface and a sluggish one.
Conclusion
Follower Reads in CockroachDB are not a simple performance flag; they are a fundamental architectural tool for building high-performance, geo-distributed systems. By trading a small, controllable amount of data staleness for significant latency reduction, you can solve the cross-region read problem without abandoning the safety and scale of a distributed SQL database.
The key to successful implementation lies not in ad-hoc query changes, but in a deliberate architectural approach. By isolating consistency control in middleware and creating a clear contract within your data access layer, you can apply this powerful feature surgically, exactly where it's needed. Combined with robust monitoring of follower read rates and latency distribution, this pattern allows senior engineers to deliver truly global applications that are both correct and incredibly fast.