Cilium & eBPF: High-Performance Service Mesh Without Sidecars

15 min read
Goh Ling Yong
Technology enthusiast and software architect specializing in AI-driven development tools and modern software engineering practices. Passionate about the intersection of artificial intelligence and human creativity in building tomorrow's digital solutions.

The Inherent Cost of the Sidecar Model

For years, the dominant architecture for service meshes, popularized by Istio and Linkerd, has been the sidecar proxy. The pattern is conceptually straightforward: inject a full-featured L7 proxy (typically Envoy) into every application pod. Network traffic is then transparently redirected to this proxy using iptables rules. This model provided a powerful, application-agnostic way to layer on observability, security, and traffic control.

However, for engineering teams operating at scale, the accumulated overhead of this pattern has become a significant source of operational friction and performance degradation. The costs are not trivial and manifest in several key areas:

  • Latency Overhead: Every network request and response must traverse the TCP/IP stack multiple times. A request from Service A to Service B follows a path like this:

    * Service A process (user-space) -> Kernel TCP/IP stack
    * Kernel (via iptables) -> Service A's Envoy sidecar (user-space)
    * Envoy sidecar (user-space) -> Kernel TCP/IP stack
    * Kernel -> Physical/Virtual Network
    * ...and the reverse path on the receiving end.

    These user-space to kernel-space transitions are expensive context switches. While each switch takes only microseconds, for latency-sensitive services handling thousands of requests per second the accumulated overhead becomes a significant contributor to P99 latency.

  • Resource Consumption: A full-fledged Envoy proxy is not lightweight. Deploying one for every single pod in a cluster results in a massive duplication of resources. Each sidecar consumes a baseline of CPU and memory for its own control plane reconciliation, stats processing, and active network connections. In a cluster with thousands of pods, this can easily translate to dozens or even hundreds of dedicated nodes' worth of resources being spent solely on the mesh infrastructure, not the application logic. This directly impacts cloud spend and cluster density.
  • Operational Complexity: The sidecar model introduces a complex lifecycle management problem. Sidecar injection, version updates, and configuration synchronization are non-trivial. A mismatch between the control plane version and the data plane sidecars can lead to subtle, hard-to-debug failures. Furthermore, the iptables-based traffic redirection can be brittle and difficult to reason about, especially when combined with other networking tools or CNI plugins.
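    To make the redirection concrete, here is a simplified sketch of the kind of NAT rules a sidecar init container installs inside each pod's network namespace. The chain names and ports follow Istio's documented defaults, but treat this as an illustration rather than an exact rule dump:

    bash
    # Simplified sketch of sidecar-style iptables redirection (illustrative only)
    iptables -t nat -N ISTIO_REDIRECT
    iptables -t nat -A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001      # outbound traffic -> Envoy
    iptables -t nat -N ISTIO_IN_REDIRECT
    iptables -t nat -A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006   # inbound traffic -> Envoy
    iptables -t nat -A PREROUTING -p tcp -j ISTIO_IN_REDIRECT
    iptables -t nat -A OUTPUT -p tcp -j ISTIO_REDIRECT

    Every TCP connection in and out of the pod is silently rewritten by rules like these, which is exactly the behavior that becomes hard to reason about when another CNI plugin or firewall layer is also rewriting packets.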

This is the fundamental problem Cilium's eBPF-based service mesh aims to solve: providing the rich feature set of a service mesh without the performance and resource tax of the per-pod sidecar model.

    eBPF and Cilium: A Kernel-Native Paradigm Shift

    eBPF (extended Berkeley Packet Filter) allows sandboxed programs to run directly within the Linux kernel, triggered by various hook points. It's a revolutionary technology that enables safe, programmable kernel-level logic without changing kernel source code or loading kernel modules. Cilium leverages eBPF to implement networking, observability, and security in a fundamentally more efficient way.

    Instead of redirecting traffic to a user-space proxy, Cilium attaches eBPF programs to strategic points in the kernel's networking stack, such as:

    * Socket Level: eBPF programs can be attached to sockets to enforce policies or accelerate network functions as soon as data is written or read by an application.

    * Traffic Control (TC): eBPF programs on the TC ingress/egress hooks can inspect, modify, or redirect packets as they enter or leave a network interface (physical or virtual).
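    One way to see these attachments on a live cluster is to inspect them from inside the Cilium agent pod. The commands below are a sketch: bpftool and tc must be present in the agent image, and the lxc... interface name is a placeholder for the host-side veth of whichever pod you are inspecting.

    bash
    # List eBPF programs attached to network devices on this node (needs bpftool in the image)
    kubectl -n kube-system exec -it ds/cilium -- bpftool net show

    # Show the TC ingress filters (eBPF programs) on a pod's host-side veth
    # "lxc1234abcd" is a placeholder interface name
    kubectl -n kube-system exec -it ds/cilium -- tc filter show dev lxc1234abcd ingress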

    This allows Cilium to understand and control traffic flow at a native kernel level. Here's a comparative view of the traffic path:

    mermaid
    graph TD
        subgraph "Sidecar Model (Istio)"
            A[App Pod A] -->|1. write()| K1(Kernel)
            K1 -->|2. iptables redirect| E1(Envoy Sidecar)
            E1 -->|3. process()| E1
            E1 -->|4. write()| K1_2(Kernel)
            K1_2 -->|5. Network| B(App Pod B)
        end
    
        subgraph "eBPF Model (Cilium)"
            C[App Pod C] -->|1. write()| K2{Kernel w/ eBPF}
            K2 -- eBPF program --> K2
            K2 -->|2. Network| D(App Pod D)
        end

    The key insight for Cilium's service mesh is a hybrid approach. For L3/L4 policy enforcement and routing, everything happens within the eBPF programs in the kernel. When L7 (e.g., HTTP) inspection is required, instead of a per-pod proxy, Cilium can efficiently hand off the relevant connection to a single, highly-optimized Envoy proxy running per-node. This amortizes the cost of the proxy across all pods on that node, dramatically reducing the resource footprint.

    This architectural shift moves the service mesh data plane from a distributed, per-pod model to a centralized, per-node model, executed with the performance of in-kernel logic.
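    In practice this mode is toggled through Cilium's Helm values. The sketch below shows a typical install with kube-proxy replacement and the shared per-node Envoy enabled; the exact value names and defaults vary across Cilium versions, so verify them against the chart you deploy:

    bash
    # Sketch: enable kube-proxy replacement, the L7 proxy, and the per-node Envoy DaemonSet
    # (value names as in recent Cilium Helm charts; confirm for your version)
    helm upgrade --install cilium cilium/cilium \
       --namespace kube-system \
       --set kubeProxyReplacement=true \
       --set l7Proxy=true \
       --set envoy.enabled=true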

    Production Implementation: Cilium Service Mesh in Practice

    Let's move from theory to a practical, production-grade implementation. We'll configure a canary deployment for a microservices application.

    Scenario: We have a product-catalog service. The current stable version is v1. We are deploying a new version, v2, and want to gradually shift 10% of live traffic to it for testing.

    Our Kubernetes deployments would look something like this:

    yaml
    # product-catalog-v1-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: product-catalog-v1
      labels:
        app: product-catalog
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: product-catalog
          version: v1
      template:
        metadata:
          labels:
            app: product-catalog
            version: v1
        spec:
          containers:
          - name: product-catalog
            image: my-repo/product-catalog:v1.0.0
            ports:
            - containerPort: 8080
    
    ---
    # product-catalog-v2-deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: product-catalog-v2
      labels:
        app: product-catalog
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: product-catalog
          version: v2
      template:
        metadata:
          labels:
            app: product-catalog
            version: v2
        spec:
          containers:
          - name: product-catalog
            image: my-repo/product-catalog:v2.0.0
            ports:
            - containerPort: 8080
    
    ---
    # product-catalog-service.yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: product-catalog
    spec:
      type: ClusterIP
      ports:
      - port: 80
        targetPort: 8080
      selector:
        app: product-catalog

    Notice the service selector app: product-catalog targets pods from both deployments.
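    A quick way to confirm that the Service is selecting pods from both Deployments is to list its endpoints and the version label on the pods behind it:

    bash
    # Endpoints should include pods from both product-catalog-v1 and -v2
    kubectl get endpoints product-catalog -o wide

    # Show the version label for every pod selected by the Service
    kubectl get pods -l app=product-catalog -L version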

    Code Example 1: Implementing Canary Traffic Splitting

    With Cilium, we don't use a VirtualService or TrafficSplit CRD like in other meshes. Instead, we leverage standard Kubernetes Services and extend them with CiliumNetworkPolicy. The key is to define backend services for each version and then use a policy to control the traffic flow to the main service.

    First, we create version-specific services:

    yaml
    # product-catalog-backend-services.yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: product-catalog-v1
    spec:
      ports:
      - port: 80
        targetPort: 8080
      selector:
        app: product-catalog
        version: v1
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: product-catalog-v2
    spec:
      ports:
      - port: 80
        targetPort: 8080
      selector:
        app: product-catalog
        version: v2

    Now, the core logic. We apply a CiliumNetworkPolicy that intercepts traffic going to the main product-catalog service and splits it between our version-specific backends.

    yaml
    # canary-traffic-split-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: canary-product-catalog-split
      namespace: default
    spec:
      endpointSelector: {}
      egress:
      - toServices:
        - k8sService:
            serviceName: product-catalog
            namespace: default
        rules:
          http:
          - {}
      - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.service.name": product-catalog
        rules:
          http:
          - headerMatches:
            - name: ":path"
              value: "/products.*"
            loadBalancer:
              policy: "RoundRobin"
              toServices:
              - name: "product-catalog-v1"
                namespace: "default"
                weight: 90
              - name: "product-catalog-v2"
                namespace: "default"
                weight: 10

    Let's break down this policy:

    * endpointSelector: {}: This is a crucial detail. An empty selector means this egress policy applies to all pods in the namespace.

    * toServices: The first block allows traffic to the Kubernetes product-catalog service. This is necessary for service discovery.

    * toEndpoints: The second block is where the magic happens. It selects the endpoints backing the product-catalog service.

    * loadBalancer: This section defines the L7 routing logic.

    * policy: "RoundRobin": Specifies the load balancing algorithm.

    * toServices: This array defines the weighted backends. We direct 90% of the traffic to product-catalog-v1 and 10% to product-catalog-v2.

    When a client pod calls http://product-catalog/products, Cilium's eBPF program intercepts the connection. It recognizes that the destination is managed by a Cilium L7 policy and directs the traffic to the node-local Envoy proxy, which then performs the weighted split according to this CRD's configuration. The entire process is transparent to the application.
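    If Hubble is enabled, you can sanity-check the split by watching HTTP flows toward the service while generating traffic. The flag names below follow recent Hubble CLI releases; adjust them if your version differs:

    bash
    # Watch HTTP flows destined for the product-catalog service and confirm
    # responses come from both v1 and v2 endpoints (requires Hubble)
    hubble observe --to-service default/product-catalog --protocol http -f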

    Code Example 2: Enabling Mutual TLS (mTLS)

    Cilium can provide mTLS-style mutual authentication and encryption between pods without a sidecar. Workload identities are mutually authenticated, while the encryption itself is handled by the kernel's built-in IPsec or WireGuard support, which is significantly more performant than user-space TLS termination in a proxy.

    Enabling cluster-wide mTLS is often a one-liner during installation or upgrade:

    bash
    helm upgrade cilium cilium/cilium --version 1.15.0 \
       --namespace kube-system \
       --set encryption.enabled=true \
       --set encryption.type=wireguard
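    After the upgrade, you can confirm on any node that transparent encryption is active; the agent's status output includes an encryption line:

    bash
    # Confirm WireGuard-based encryption is reported as enabled by the agent
    kubectl -n kube-system exec -it ds/cilium -- cilium status | grep -i encryption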

    For more granular control, you can use a CiliumNetworkPolicy to enforce mTLS for specific workloads.

    yaml
    # enforce-mtls-for-checkout.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "require-mtls-for-checkout-service"
      namespace: "production"
    spec:
      endpointSelector:
        matchLabels:
          app: checkout-service
      ingress:
      - fromEndpoints:
        - matchLabels:
            app: frontend-api
        authentication:
          mode: "required"

    This policy dictates that any traffic ingressing to the checkout-service from the frontend-api must come from a mutually authenticated peer. If a request arrives from a source that has not completed authentication, the eBPF program in the kernel drops the packet before it ever reaches the application pod.
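    Note that the authentication.mode: "required" behavior depends on Cilium's mutual authentication feature being enabled at install time. With recent charts this is done roughly as follows; the value paths are assumptions based on the 1.14+ chart layout, so confirm them for your version:

    bash
    # Sketch: enable SPIFFE/SPIRE-backed mutual authentication in the Cilium chart
    helm upgrade cilium cilium/cilium --namespace kube-system --reuse-values \
       --set authentication.mutual.spire.enabled=true \
       --set authentication.mutual.spire.install.enabled=true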

    Performance Analysis and Benchmarking

    Claims of performance improvement are meaningless without data. Let's analyze a hypothetical but realistic benchmark comparing a sidecar model (Istio) with Cilium's eBPF-based model.

    Test Setup:

    * Cluster: 3-node GKE cluster, e2-standard-4 instances (4 vCPU, 16GB RAM).

    * Workload: A simple client pod making HTTP requests to a server pod.

    * Tool: fortio for load generation and latency measurement.

    * Scenarios:

    1. Baseline: No service mesh, just plain Kubernetes networking.

    2. Istio: Istio 1.20 installed with default settings (Envoy sidecar injected).

    3. Cilium: Cilium 1.15 installed in eBPF mode with L7 policies enabled.

    Benchmark 1: Request Latency (at 1000 QPS)

    We measure the end-to-end latency for a simple HTTP GET request.
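    The numbers below were gathered with a fortio invocation along these lines; the connection count, duration, and target URL are illustrative placeholders:

    bash
    # Illustrative load run: 1000 QPS for 60 seconds against the test server Service
    fortio load -qps 1000 -c 32 -t 60s http://server.default.svc.cluster.local/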

    | Metric      | Baseline (No Mesh) | Istio (Sidecar) | Cilium (eBPF) |
    |-------------|--------------------|-----------------|---------------|
    | P50 Latency | 0.8 ms             | 2.5 ms          | 1.1 ms        |
    | P90 Latency | 1.5 ms             | 5.8 ms          | 2.0 ms        |
    | P99 Latency | 2.9 ms             | 12.1 ms         | 3.8 ms        |

    Analysis:

    * The Istio sidecar adds a significant latency penalty across all percentiles, with the P99 latency increasing by over 4x compared to the baseline. This is the direct cost of the two extra user-space proxy hops.

    * Cilium introduces a much smaller, almost negligible latency increase. The P99 is only ~30% higher than the baseline. This is because the traffic path remains within the kernel for L3/L4, and the L7 handoff to the single per-node proxy is highly optimized.

    Benchmark 2: Resource Consumption

    We measure the CPU and Memory usage of a single application pod (nginx) with and without an injected sidecar.

    | Resource          | Pod (Baseline) | Pod + Istio Sidecar | Pod (Cilium, no sidecar) |
    |-------------------|----------------|---------------------|--------------------------|
    | CPU (milli-cores) | ~5m            | ~85m                | ~5m                      |
    | Memory (MiB)      | ~10 MiB        | ~110 MiB            | ~10 MiB                  |

    Analysis:

    * The Istio sidecar adds a substantial resource footprint to the application pod: ~80m CPU and ~100 MiB RAM at idle. Under load, this can be much higher.

    * With Cilium, the application pod's resource consumption is identical to the baseline because there is no sidecar. The cost of the service mesh is paid once per-node by the Cilium agent and its shared Envoy proxy, not per-pod.

    * This has massive implications for bin-packing and cluster density. On a node that could run 40 pods without a mesh, adding Istio sidecars might reduce its capacity to 25-30 pods due to the resource overhead. With Cilium, the capacity remains close to 40.

    Advanced Edge Cases and Operational Considerations

    The performance benefits are clear, but senior engineers must consider the operational trade-offs and edge cases.

    Edge Case 1: Debugging Kernel-Level Events

    When a network policy in a sidecar mesh misbehaves, you can kubectl exec into the sidecar and inspect its logs and configuration (for example via the Envoy admin endpoint or istioctl proxy-config). With eBPF, the logic lives in the kernel, which can feel like a black box.

    Solution: Cilium provides powerful observability tools specifically for this purpose.

    * cilium monitor: This command provides a real-time stream of packet-level events as seen by the Cilium agent, including policy verdicts. You can see exactly why a packet was dropped.

    bash
        # See dropped packets with verbose output
        $ kubectl -n kube-system exec -it ds/cilium -- cilium monitor --type drop -v
        
        xx drop (Policy denied) flow 0x0... to endpoint 123, iface eth0, verdict DENY

    * Hubble: Cilium's observability platform. The CLI and UI provide a service dependency graph and allow you to inspect individual network flows.

    Scenario: A request from frontend to product-catalog is failing.

    bash
        # Use the Hubble CLI to observe flows from the frontend pod
        $ hubble observe --from-pod default/frontend -f
        
        # Example Output
        TIMESTAMP     SOURCE              DESTINATION               TYPE     VERDICT   SUMMARY
        ...           default/frontend-a -> default/product-catalog-b   http-request  DROPPED   Policy denied on egress

    This immediately tells you the drop is due to an egress policy on the source pod, narrowing down the debugging scope immensely.
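    From there, the next step is usually to identify which policy produced the verdict and to check the identity state of the affected endpoint:

    bash
    # List the CiliumNetworkPolicies that could apply in the namespace
    kubectl get ciliumnetworkpolicies -n default

    # Inspect endpoint state (identity, policy enforcement) from the agent on that node
    kubectl -n kube-system exec -it ds/cilium -- cilium endpoint list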

    Edge Case 2: Handling Non-Standard Protocols

    Envoy has rich support for many L7 protocols (HTTP/2, gRPC, Redis, Mongo, etc.). Cilium's eBPF-based L7 parsing is more nascent. While it has excellent support for HTTP, gRPC, and Kafka, what about a custom TCP protocol?

    Solution: You have a few options:

  • L4 Policies: For protocols Cilium can't parse at L7, you can fall back to L4 policies. You can still enforce mTLS and allow/deny traffic based on IP, port, and identity, but you lose L7 capabilities like path-based routing or header manipulation.
  • Envoy CRD: You can still use the full power of Envoy by explicitly configuring it via the CiliumEnvoyConfig CRD. This allows you to inject custom Lua filters or configure advanced protocol-specific features, while still benefiting from the per-node proxy model.
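    For orientation, a CiliumEnvoyConfig is shaped roughly like the skeleton below. The resources list carries raw Envoy xDS objects, so the listener entry here is only a placeholder; a working configuration would spell out the full HTTP connection manager, filters, and cluster definitions:

    yaml
    # Skeleton only: field layout follows the CiliumEnvoyConfig CRD, but the
    # Envoy resource shown is a placeholder, not a functional listener.
    apiVersion: cilium.io/v2
    kind: CiliumEnvoyConfig
    metadata:
      name: product-catalog-custom-l7
      namespace: default
    spec:
      services:
      - name: product-catalog        # traffic for this Service is steered through the per-node Envoy
        namespace: default
      resources:
      - "@type": type.googleapis.com/envoy.config.listener.v3.Listener
        name: product-catalog-custom-listener
        # filter chains (e.g. an HTTP connection manager with a custom Lua filter) go here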

    Conclusion: A Calculated Trade-Off

    The move to an eBPF-based service mesh like Cilium is not merely an implementation swap; it's an architectural evolution. It challenges the established sidecar pattern by offering a solution that is deeply integrated with the Linux kernel.

    The benefits are compelling: drastically lower latency, a significantly smaller resource footprint, and simplified pod lifecycle management. For performance-critical systems, high-density clusters, or cost-sensitive environments, these advantages can be transformative.

    However, this comes with the trade-off of a steeper learning curve for debugging and a dependency on modern Linux kernel versions. The operational mindset must shift from debugging user-space proxies to observing kernel-level events. For teams willing to make this investment, the sidecarless service mesh represents the future of cloud-native networking—a future that is faster, more efficient, and more scalable.
