Tuning Sidecarless Service Meshes with Cilium and eBPF

Goh Ling Yong

The Inherent Overhead of the Sidecar Pattern

For years, the sidecar proxy model, epitomized by Istio's use of Envoy, has been the de facto standard for implementing service meshes in Kubernetes. It's a powerful pattern that decouples application logic from network concerns like mTLS, observability, and traffic management. However, for senior engineers operating at scale, the performance and resource tax of this model is a well-understood and often frustrating reality.

Every pod requires its own dedicated proxy instance. This leads to:

  • Resource Bloat: The aggregate CPU and memory consumption of thousands of Envoy sidecars across a large cluster becomes a significant operational cost. Resource requests and limits for sidecars complicate pod scheduling and can lead to resource fragmentation.
  • Latency Amplification: Every network call from a service (service-a to service-b) is hijacked and traverses the user-space networking stacks of two separate proxies. The path looks like this: service-a -> outbound Envoy (localhost:15001) -> source node kernel stack -> destination node kernel stack -> inbound Envoy (localhost:15006) -> service-b. These extra trips through user space add measurable latency to every single request, which is particularly detrimental for latency-sensitive applications.
  • Complex Traffic Path: The traffic redirection is typically handled by iptables rules injected into the pod's network namespace. While functional, iptables is a chain-based system that can become complex and inefficient to traverse at scale, and debugging network connectivity involves untangling these intricate rule sets.
The industry's search for a more efficient alternative has led to the rise of sidecarless service meshes, a paradigm fundamentally enabled by eBPF (extended Berkeley Packet Filter).

    This article is not an introduction to eBPF. It assumes you understand its core concept: running sandboxed programs in the Linux kernel. We will dive directly into how Cilium leverages eBPF to build a high-performance, sidecarless service mesh, focusing on the specific implementation details, performance tuning, and advanced multi-cluster patterns that matter in production.

    Kernel-Level Magic: eBPF for Traffic Interception

    Instead of a proxy per pod, Cilium deploys a single agent (cilium-agent) per node as a DaemonSet. This agent is responsible for managing the eBPF programs that handle networking, observability, and security for all pods on that node. The key performance gain comes from how and where traffic is intercepted.
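
You can verify this deployment model directly. A quick sketch, assuming a default install where the agent DaemonSet is named cilium in the kube-system namespace:

bash
# One cilium-agent pod should be scheduled per node.
kubectl -n kube-system get daemonset cilium
kubectl -n kube-system get pods -l k8s-app=cilium -o wide

# The Cilium CLI (if installed) summarizes which datapath features are active.
cilium status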

    Traditional sidecars use iptables to redirect a pod's traffic to the proxy listening on localhost. Cilium's eBPF approach is far more direct and efficient.

    Socket-Level Interception with `sock_ops`

    For TCP traffic, Cilium attaches eBPF programs to control groups (cgroups) and uses sock_ops hooks. These hooks trigger on socket operations, such as connect(), sendmsg(), and recvmsg(). When an application in a pod attempts to establish a connection:

  • The connect() system call is made.
  • The eBPF program attached to the pod's cgroup is executed within the kernel context.
  • This program has access to the socket's metadata, including the destination IP and port.
  • Cilium's eBPF logic performs a map lookup (an efficient key-value store in the kernel) to identify the destination as a Kubernetes service.
  • It then performs service-to-endpoint translation directly in the kernel, choosing a backing pod IP.
  • Crucially, the eBPF program can then transparently redirect this connection to the destination pod's IP without the packet ever leaving the kernel to be processed by a user-space proxy.

Here is a conceptual (and simplified) snippet of what such an eBPF program in C might look like:

c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

#define AF_INET 2

// Simplified structure for service mapping
struct service_key {
    __u32 ip;
    __u16 port;
};

struct service_endpoint {
    __u32 backend_ip;
    __u16 backend_port;
};

// BPF map to store Service IP -> Backend Pod IP
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, struct service_key);
    __type(value, struct service_endpoint);
} service_map SEC(".maps");

// Sockhash holding the sockets selected for redirection, keyed by backend.
// A companion sk_msg program attached to this map would complete the
// redirection of payload data between the two sockets.
struct {
    __uint(type, BPF_MAP_TYPE_SOCKHASH);
    __uint(max_entries, 1024);
    __type(key, struct service_endpoint);
    __type(value, __u32);
} sock_ops_map SEC(".maps");

SEC("sockops")
int bpf_redir_connect(struct bpf_sock_ops *sk_ops) {
    // Only handle IPv4 connections initiated by the pod (the connect() path),
    // once the outbound connection is established.
    if (sk_ops->family != AF_INET ||
        sk_ops->op != BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB) {
        return 0;
    }

    struct service_key key = {};
    key.ip = sk_ops->remote_ip4;
    // Byte-order handling of remote_port varies across kernel versions;
    // treat this as illustrative.
    key.port = bpf_ntohs((__u16)sk_ops->remote_port);

    // Look up if the destination is a known service
    struct service_endpoint *endpoint = bpf_map_lookup_elem(&service_map, &key);

    if (endpoint) {
        // Destination is a service: remember this socket for redirection.
        bpf_printk("Redirecting connection from %x to %x:%d",
                   key.ip, endpoint->backend_ip, endpoint->backend_port);

        // The core redirection logic: insert the socket into the sockhash.
        int ret = bpf_sock_hash_update(sk_ops, &sock_ops_map, endpoint, BPF_ANY);
        if (ret != 0) {
            bpf_printk("Failed to update sock_ops_map: %d", ret);
        }
    }

    return 0;
}

char _license[] SEC("license") = "GPL";

    This direct kernel-level redirection is the source of the primary latency reduction. We entirely avoid the two user-space hops and associated context switching inherent in the sidecar model.
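
If you want to see this interception in place on a node, bpftool can enumerate the eBPF programs attached to the cgroup hierarchy. A rough sketch; program names and cgroup paths vary by Cilium version and distribution:

bash
# Programs attached to cgroups (sock_ops, connect4/connect6 hooks, etc.).
sudo bpftool cgroup tree /sys/fs/cgroup

# All loaded eBPF programs; the socket-level load-balancing programs show up here.
sudo bpftool prog show | grep -i sock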

    Advanced Pattern: High-Performance Multi-Cluster Routing

    The benefits of eBPF become even more pronounced in multi-cluster topologies. Managing service discovery, routing, and security policies across geographically distributed Kubernetes clusters is a significant challenge. Cilium's Cluster Mesh addresses this with a focus on performance and operational simplicity.

    Architecture of Cilium Cluster Mesh

    Cluster Mesh connects multiple clusters into a single networking and policy domain. Each cluster runs its own cilium-agent and clustermesh-apiserver. The agents in one cluster learn about the services and identities in other clusters, populating their local eBPF maps with this information.

    When you declare a service as global, its endpoints from all clusters are shared. Let's consider a practical scenario.

    Scenario: A frontend service in cluster-us-west-1 needs to call a user-db service that has pods running in both us-west-1 and us-east-1 for high availability.

    Step 1: Enable Cluster Mesh and Define the Global Service

    Your Helm values for Cilium deployment in us-west-1 would include:

    yaml
    # values-us-west-1.yaml
    cluster:
      name: us-west-1
      id: 1 # Must be unique per cluster
    
    clustermesh:
      apiserver:
        # TLS certs for inter-cluster communication
        # ... configuration for certs ...
    
  # Point to the other cluster's clustermesh-apiserver
  config:
    clusters:
      - name: us-east-1
        address: ... # clustermesh-apiserver endpoint (e.g. LoadBalancer) of us-east-1

    You would apply a reciprocal configuration in us-east-1.
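
If you prefer the Cilium CLI to raw Helm values, the equivalent workflow looks roughly like this (a sketch; it assumes a recent cilium CLI and kubectl contexts named after the clusters):

bash
# Enable the clustermesh-apiserver in each cluster.
cilium clustermesh enable --context us-west-1
cilium clustermesh enable --context us-east-1

# Connect the two clusters; this exchanges certificates and endpoints in both directions.
cilium clustermesh connect --context us-west-1 --destination-context us-east-1

# Wait for the control planes to see each other.
cilium clustermesh status --context us-west-1 --wait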

    Next, you annotate the user-db service in both clusters to make it global:

    yaml
    # user-db-service.yaml (applied to both clusters)
    apiVersion: v1
    kind: Service
    metadata:
      name: user-db
      namespace: backend
      annotations:
        # This is the key annotation
        io.cilium/global-service: "true"
    spec:
      type: ClusterIP
      ports:
      - port: 5432
        protocol: TCP
      selector:
        app: user-db

    Step 2: The eBPF Routing Logic in Action

    Now, when a frontend pod in us-west-1 makes a call to user-db.backend.svc.cluster.local:

  • The connect() call is intercepted by the Cilium eBPF program on the frontend pod's node.
  • The eBPF program performs a lookup in its service map. It sees that user-db is a global service.
  • The map contains endpoints for user-db from both us-west-1 and us-east-1.
  • Cilium's load-balancing logic will, by default, spread requests across healthy endpoints in all clusters; with local service affinity configured (the io.cilium/service-affinity: "local" annotation), it prefers endpoints in us-west-1 and only fails over to us-east-1 when the local endpoints become unhealthy.
  • Let's assume it chooses a remote endpoint in us-east-1. The eBPF program on the source node in us-west-1 encapsulates the original packet into a tunnel protocol (such as Geneve or VXLAN) and sends it to the node hosting the chosen backend pod in us-east-1.
  • The receiving node in us-east-1 decapsulates the packet, and its eBPF program delivers it directly to the destination user-db pod.

This entire cross-cluster routing decision and encapsulation happens in the kernel, avoiding the multiple layers of ingress gateways, complex mTLS certificate federation, and user-space hops that plague traditional multi-cluster service mesh implementations.
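
To confirm that the remote endpoints were actually learned, inspect the agent's view of the service. A sketch, assuming the default cilium DaemonSet; in recent releases the in-agent CLI is invoked as cilium-dbg, in older ones as cilium:

bash
# Cross-cluster control-plane health.
cilium clustermesh status --context us-west-1

# The agent's service table should list user-db backends from both clusters.
kubectl -n kube-system exec ds/cilium -- cilium-dbg service list
kubectl -n kube-system exec ds/cilium -- cilium-dbg bpf lb list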

    Quantifying Performance: A Reproducible Benchmark

    Talk is cheap. Let's define a rigorous benchmark to measure the latency and resource overhead.

    Testbed Setup:

    * Clusters: Two GKE clusters (e.g., n2-standard-4 nodes) in different regions.

    * Tooling: fortio for load generation and latency measurement, Prometheus for resource metrics.

    * Services: A simple client and server application (e.g., fortio-server).
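
A minimal way to stand up these workloads is sketched below, using the public fortio/fortio image; the names are illustrative:

bash
# Server: fortio's built-in HTTP echo server listens on 8080.
kubectl create deployment fortio-server --image=fortio/fortio -- fortio server
kubectl expose deployment fortio-server --port=8080 --target-port=8080

# Client: an idle pod we can exec into to drive load.
kubectl run fortio-client --image=fortio/fortio --command -- sleep infinity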

    Three Scenarios to Compare:

  • Baseline: Kubernetes networking with Calico CNI. No service mesh.
  • Sidecar Mesh: Istio 1.18 in its default configuration (Envoy sidecars injected).
  • Sidecarless Mesh: Cilium 1.14 with CNI and Service Mesh enabled.

Benchmark Test: Intra-Cluster Latency

    Deploy the fortio-server and fortio-client in the same cluster. From the client pod, run:

    bash
    fortio load -qps 1000 -t 120s -c 64 http://fortio-server:8080/echo

    Expected Results (Sample Data):

| Scenario             | p50 Latency | p99 Latency | Client CPU (avg) | Server Pod CPU (avg)    | Notes                                                            |
|----------------------|-------------|-------------|------------------|-------------------------|------------------------------------------------------------------|
| Baseline (Calico)    | 0.4 ms      | 1.1 ms      | 0.2 cores        | 0.2 cores               | The ground truth of network performance.                         |
| Istio (Sidecar)      | 2.1 ms      | 5.8 ms      | 0.2 cores        | 0.2 (app) + 0.3 (envoy) | Notice the added latency and the CPU cost of the Envoy proxy.    |
| Cilium (Sidecarless) | 0.6 ms      | 1.5 ms      | 0.2 cores        | 0.2 cores               | Latency is very close to baseline. No per-pod resource overhead. |

    Analysis:

    The results clearly show the "sidecar tax." Istio adds over 4ms to the p99 latency, a 400%+ increase over the baseline. Cilium adds only 0.4ms, demonstrating the efficiency of in-kernel processing. Furthermore, the CPU cost in the Istio model is per-pod. For a node with 30 pods, this amounts to 30 * 0.3 = 9 extra cores of CPU dedicated just to running proxies, whereas the Cilium agent's CPU usage is relatively fixed per-node.

    Production Edge Cases and Considerations

    While the performance is compelling, adopting an eBPF-based mesh requires understanding its unique operational characteristics.

    1. The Kernel Version Contract

    eBPF is not a static technology; its capabilities evolve with the Linux kernel. This is a critical production constraint.

    * Baseline: Cilium generally requires Linux kernel 4.19+ for most of its core functionality.

    * Advanced Features: More advanced features, like some socket-level load balancing or efficient session affinity, may require kernel 5.10+.

    Production Implication: You cannot treat your node OS and kernel version as an afterthought. Your infrastructure team must have a process for managing and validating kernel versions across the fleet. A heterogeneous cluster with nodes running wildly different kernel versions can lead to inconsistent behavior and difficult-to-debug issues. Always consult the Cilium documentation for the feature-to-kernel-version mapping.
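
A fleet-wide check is worth baking into your rollout process. The node fields below are standard kubelet-reported info; the second command assumes a recent Cilium release where the in-agent CLI is cilium-dbg:

bash
# Report the kernel version every node is actually running.
kubectl get nodes -o custom-columns=NAME:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion,OS:.status.nodeInfo.osImage

# Cilium's own status output shows which kernel-dependent datapath features it enabled.
kubectl -n kube-system exec ds/cilium -- cilium-dbg status --verbose | grep -iE 'kube-proxy|host routing|masquerading'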

    2. The Observability Paradigm Shift

    Envoy provides extremely rich, standardized L7 metrics (HTTP status codes, request durations, gRPC status, etc.) out of the box. How does a sidecarless model compete?

    Cilium's answer is Hubble. Hubble leverages eBPF to capture network flow data with very low overhead. It can provide:

    * L3/L4 Visibility: See all network flows between pods, services, and external endpoints.

    * Policy Verdicts: See which network policies are allowing or denying traffic, which is invaluable for debugging.

    * Service Map: Automatically generate a visual map of service dependencies.

    However, for deep L7 metrics (e.g., http_requests_total with labels for path and method), the story is more nuanced. Cilium can surface L7 data for protocols like HTTP, gRPC, and Kafka, but it does so by selectively handing those flows to a node-local proxy rather than parsing them in eBPF, and its protocol coverage is not as exhaustive as Envoy's filter ecosystem. The metrics are exposed via Hubble and can be scraped by Prometheus.

    Production Implication: You might need to adjust your observability strategy. Rely on Hubble for network-level visibility and policy enforcement, and ensure your applications export their own high-cardinality L7 metrics (a best practice anyway). This combination provides comprehensive observability without the performance penalty of universal L7 proxying.
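
In practice this means reaching for the Hubble CLI for flow-level questions. A sketch, assuming Hubble and its relay are enabled and the hubble CLI can reach them:

bash
# Live view of flows to the user-db pods, including policy verdicts.
hubble observe --namespace backend --to-pod backend/user-db --follow

# Only flows that were dropped by policy, anywhere in the cluster.
hubble observe --verdict DROPPED --follow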

    3. The Hybrid Approach for L7 Policies

    What if you absolutely need fine-grained L7 traffic management, like path-based routing (/api/v1 -> service-v1, /api/v2 -> service-v2) or gRPC method-based authorization?

    Cilium handles this with a hybrid model. It can use eBPF to redirect traffic for specific services to a node-local Envoy proxy only when an L7 policy is applied.

    Here's a CiliumNetworkPolicy example:

    yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: api-l7-routing
    spec:
      endpointSelector:
        matchLabels:
          app: api-gateway
      egress:
      - toPorts:
        - ports:
          - port: "8080"
            protocol: TCP
          rules:
            http:
            - method: "GET"
              path: "/public/.*"
            - method: "POST"
              path: "/data/ingest"

    When this policy is applied, Cilium's eBPF programs on the node are updated. Traffic leaving the api-gateway pods toward port 8080 is redirected to an Envoy proxy managed by the Cilium agent, where the HTTP rules are enforced. All other traffic on the node continues to bypass the proxy, flowing directly via the kernel.
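
One way to confirm the L7 path is engaged: once the policy is applied, HTTP-level detail for exactly that traffic starts showing up in Hubble. A sketch; the default/ namespace prefix is illustrative:

bash
# HTTP method, path, and verdict for flows leaving the api-gateway pods.
hubble observe --from-pod default/api-gateway --protocol http --follow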

    Production Implication: This gives you the best of both worlds. You pay the performance price of a proxy only for the traffic that requires deep L7 inspection. This is a surgical approach, contrasting sharply with the all-or-nothing proxy injection of the traditional sidecar model.

    Conclusion: A New Performance Frontier

    eBPF-based, sidecarless service meshes are not a silver bullet, but they represent a fundamental architectural shift that directly addresses the performance and resource overhead of the sidecar pattern. For senior engineers building and operating large-scale, latency-sensitive distributed systems, the trade-offs are compelling.

    By moving routing, load balancing, and policy enforcement from user-space proxies into the Linux kernel, Cilium and similar technologies offer near-baseline network performance. The cost is a tighter coupling with the kernel version and a shift in observability strategy. However, with hybrid models for L7 policy and powerful tools like Hubble, the sidecarless approach provides a flexible and highly performant foundation for the next generation of cloud-native infrastructure.
