Cilium & eBPF: Sidecar-less Service Mesh for K8s Performance
The Sidecar Proxy Bottleneck: A Performance Autopsy
For any team operating a service mesh at scale, the architectural elegance of the sidecar pattern eventually confronts the harsh realities of production performance. While proxies like Envoy provide a wealth of features, injecting them into every application pod's network namespace creates a distributed tax on latency and resources across the entire cluster. This isn't a theoretical concern; it's a measurable bottleneck that impacts SLOs and infrastructure costs.
Let's dissect the specific performance penalties inherent in the sidecar model, assuming a baseline understanding of how service meshes like Istio or Linkerd operate.
1. Latency Injection via Kernel-Userspace Traversal
The primary source of latency is the repeated traversal of the TCP/IP stack. When Service A wants to communicate with Service B in a sidecar mesh:
1. The application in Service A's pod issues an ordinary socket write; the packet descends the kernel's TCP/IP stack.
2. iptables rules redirect the packet back to the Envoy proxy running in Service A's pod, forcing a copy from kernel space into userspace.
3. Envoy applies its routing and policy logic, then writes the packet back out, through the kernel stack again and across the network to Node B.
4. iptables rules on Node B intercept the packet and redirect it to Service B's Envoy sidecar, adding a second userspace detour before the application finally receives the data.

Each userspace-to-kernel transition involves context switching, which is computationally expensive. This round trip through two separate user-space proxies can add anywhere from a few milliseconds to tens of milliseconds of latency per hop. In a complex microservices call chain, this latency is amplified, directly impacting user-facing P99 response times.
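The amplification effect is easy to quantify. A minimal back-of-envelope sketch, where the per-hop latency and call-chain depth are illustrative assumptions rather than measured values:

```shell
# Per-hop sidecar latency compounds across a call chain.
# PER_HOP_MS and CHAIN_DEPTH are hypothetical, illustrative numbers.
PER_HOP_MS=3
CHAIN_DEPTH=5
ADDED_MS=$(( PER_HOP_MS * CHAIN_DEPTH ))
echo "Added sidecar latency across the chain: ${ADDED_MS} ms"
```

Even at a modest 3 ms per hop, a five-hop request path pays 15 ms of pure mesh overhead before any application work is done.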
2. Cluster-Wide Resource Overhead
Every pod running a sidecar proxy consumes dedicated CPU and memory. While a single Envoy instance might seem lightweight, the cumulative effect across a cluster of hundreds or thousands of pods is substantial.
Consider a service with 100 pods, where each Envoy sidecar is configured with a request of 100m CPU and 128Mi memory. This equates to an extra 10 vCPU cores and 12.5 GiB of memory dedicated solely to running the mesh infrastructure, not the application logic. This overhead directly translates to higher cloud provider bills and reduced pod density per node.
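The arithmetic above can be reproduced directly. A quick sketch using the example's figures (100 pods, 100m CPU and 128Mi per sidecar; these are the illustrative values from the paragraph, not measurements):

```shell
# Cluster-wide sidecar overhead from per-pod requests.
# Values taken from the example above (illustrative, not measured).
PODS=100
SIDECAR_CPU_MILLI=100
SIDECAR_MEM_MI=128
TOTAL_CPU=$(( PODS * SIDECAR_CPU_MILLI / 1000 ))
TOTAL_MEM_GI=$(awk -v p="$PODS" -v m="$SIDECAR_MEM_MI" 'BEGIN { printf "%.1f", p * m / 1024 }')
echo "Mesh overhead: ${TOTAL_CPU} vCPU, ${TOTAL_MEM_GI} GiB"
```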
We can visualize this with a simple kubectl top command on a pod with and without an injected sidecar:
# Without sidecar
$ kubectl top pod product-catalog-v1-abcdef-12345 -n my-app
NAME                                CPU(cores)   MEMORY(bytes)
product-catalog-v1-abcdef-12345     25m          60Mi
# With Istio sidecar
$ kubectl top pod product-catalog-v1-uvwxyz-67890 -n my-app
NAME                                CPU(cores)   MEMORY(bytes)
product-catalog-v1-uvwxyz-67890     135m         195Mi
# (Note: 'istio-proxy' container accounts for ~110m CPU and ~135Mi Memory)

This isn't just about static resource requests; it's about the CPU cycles consumed during traffic processing, which can lead to CPU throttling for the main application container if node resources are constrained: the classic "noisy neighbor" problem, but inside the same pod.
3. Operational Complexity and Fragility
The sidecar model introduces significant operational friction:
* Opaque Networking: Application developers lose direct visibility into network flows. Debugging becomes a complex exercise in reading Envoy logs and configuration dumps.
* Version Skew: The sidecar version must be managed in lockstep with the control plane. A botched upgrade can lead to widespread communication failures.
* Traffic Interception Brittleness: Reliance on iptables can be fragile. Other tools that manipulate iptables (like Calico or kube-proxy in some modes) can interfere with the sidecar's redirection rules.
* Startup Racing: The application container can sometimes start and attempt network calls before the sidecar proxy is fully initialized and ready to receive traffic, leading to startup failures.
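To be fair, Istio does offer a partial mitigation for the startup race: the documented holdApplicationUntilProxyStarts setting delays the application container until the sidecar reports ready, at the cost of slower pod startup. A config sketch (partial IstioOperator manifest, not a complete installation spec):

```yaml
# Sketch: mitigating the sidecar startup race in Istio
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      # Delay the app container until the sidecar proxy is ready
      holdApplicationUntilProxyStarts: true
```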
These challenges are the driving force behind seeking a more efficient, integrated solution for service mesh capabilities.
eBPF & Cilium: Kernel-Level Networking as a Paradigm Shift
eBPF (extended Berkeley Packet Filter) fundamentally changes the equation by allowing us to run sandboxed programs directly within the Linux kernel. This isn't just a faster iptables; it's a programmable data plane that can make intelligent decisions at the earliest possible point in the packet lifecycle, eliminating the need for expensive user-space detours.
Cilium leverages eBPF to implement networking, observability, and security in a way that is perfectly suited for a service mesh.
How Cilium Replaces the Sidecar
Instead of redirecting traffic to a user-space proxy, Cilium attaches eBPF programs to kernel hooks, primarily at the socket and Traffic Control (TC) layers.
Here's how the Service A to Service B communication path is transformed:
1. When the application in Service A's pod issues a sendmsg() or recvmsg() syscall, a Cilium eBPF program attached to the socket can immediately apply policies.
2. At that same socket hook, Cilium performs service load balancing: the ClusterIP is translated to a concrete backend pod address at connection time, bypassing kube-proxy and iptables entirely.

Crucially, for L3/L4 policy enforcement and load balancing, the packet never leaves the kernel. For L7 policies (e.g., HTTP-aware routing), Cilium still uses an Envoy proxy, but it runs one per node (or in a shared cluster pool) rather than one per pod. The eBPF programs efficiently steer only the relevant L7 traffic to this shared proxy, dramatically reducing the resource footprint.
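To make the kernel/proxy split concrete, here is a hypothetical CiliumNetworkPolicy (the labels and port are assumptions for illustration, not taken from the demo app): the endpoint and port matching is enforced entirely in eBPF, and only traffic that must be parsed for the http rules is steered to the shared per-node Envoy.

```yaml
# Hypothetical L7 policy: L3/L4 match in eBPF, HTTP rules in the shared proxy
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: frontend-l7-example
spec:
  endpointSelector:
    matchLabels:
      app: frontend
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: loadgenerator
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/.*"
```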
This architecture provides the benefits of a service mesh without the performance tax of the sidecar.
Production Implementation: Migrating from Istio to a Sidecar-less Cilium Mesh
This section provides a hands-on, production-oriented guide to migrating an existing application from an Istio service mesh to Cilium. We'll use a local kind cluster for demonstration, but the principles and manifests apply directly to production environments.
Prerequisites
*   A running Kubernetes cluster (e.g., kind, minikube, or a cloud provider).
*   kubectl and helm installed.
*   A load testing tool like k6.
Step 1: Establish a Performance Baseline with Istio
First, let's deploy a sample application and install Istio to measure our baseline performance.
# 1. Create a kind cluster
kind create cluster --name cilium-migration-demo
# 2. Install Istio
curl -L https://istio.io/downloadIstio | sh -
cd istio-*
bin/istioctl install --set profile=demo -y
cd ..
# 3. Label a namespace for Istio injection
kubectl create namespace online-boutique
kubectl label namespace online-boutique istio-injection=enabled
# 4. Deploy the Online Boutique sample application
kubectl apply -n online-boutique -f https://raw.githubusercontent.com/GoogleCloudPlatform/microservices-demo/main/release/kubernetes-manifests.yaml
# 5. Expose the frontend via an Istio Gateway
kubectl apply -n online-boutique -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: frontend-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: frontend-virtualservice
spec:
  hosts:
  - "*"
  gateways:
  - frontend-gateway
  http:
  - route:
    - destination:
        host: frontend
        port:
          number: 80
EOF
# 6. Get the Ingress Gateway IP and Port
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
export INGRESS_HOST=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[0].address}')
export GATEWAY_URL=$INGRESS_HOST:$INGRESS_PORT
echo "Access the app at http://$GATEWAY_URL"

Now, run a baseline load test with k6:
// istio-baseline-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
export const options = {
  stages: [
    { duration: '30s', target: 50 },
    { duration: '1m', target: 50 },
    { duration: '30s', target: 0 },
  ],
};
export default function () {
  const res = http.get(`http://${__ENV.GATEWAY_URL}/`);
  check(res, { 'status was 200': (r) => r.status == 200 });
  sleep(1);
}

k6 run --env GATEWAY_URL=$GATEWAY_URL istio-baseline-test.js

Record the results, paying close attention to http_req_duration (especially p(95) and p(99)) and the total CPU/memory usage across the cluster (kubectl top nodes).
Step 2: Install and Configure Cilium
We'll install Cilium via Helm, enabling the necessary features for the service mesh.
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.15.5 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set securityContext.privileged=true \
  --set bpf.masquerade=true \
  --set loadBalancer.l7.backend=envoy \
  --set authentication.mutual.spire.enabled=true \
  --set authentication.mutual.spire.install.enabled=true \
  --set hubble.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.relay.enabled=true

* kubeProxyReplacement=true: Cilium's eBPF implementation completely replaces kube-proxy for maximum performance.
* loadBalancer.l7.backend=envoy: Enables the shared, node-level Envoy proxy that provides L7 visibility and control.
* authentication.mutual.spire.*: Installs a SPIRE server and enables Cilium's mutual authentication, which the mTLS policy in Step 3 depends on.
* hubble.*: Enables Hubble for deep observability.
After installation, verify that all Cilium pods are running and the status check is successful:
kubectl -n kube-system get pods -l k8s-app=cilium
cilium status --wait

Step 3: Advanced Configuration for mTLS and Canary Deployments
Now, we'll start migrating services. The key is to do this gradually.
A. Disable Istio Injection and Annotate for Cilium
First, we'll remove the Istio sidecar from a specific deployment (e.g., productcatalogservice) and let Cilium manage its traffic.
# Remove the Istio sidecar by restarting the deployment after disabling injection for the namespace
kubectl label ns online-boutique istio-injection-
kubectl rollout restart deployment/productcatalogservice -n online-boutique

B. Enforcing Mutual TLS (mTLS)
Cilium can enforce mTLS through its mutual authentication support, issuing SPIFFE identities from a SPIRE server that it installs and manages. Let's create a policy that requires mTLS for any traffic entering the productcatalogservice.
# mtl-policy.yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "require-mtls-for-productcatalog"
  namespace: online-boutique
spec:
  endpointSelector:
    matchLabels:
      app: productcatalogservice
  ingress:
    - fromEndpoints:
        - matchLabels:
            # Allow traffic from frontend
            app: frontend
      # This is the key part: require authentication
      authentication:
        mode: "required"

kubectl apply -f mtl-policy.yaml

With this policy, any pod attempting to connect to productcatalogservice must present a valid SPIFFE identity certificate, which Cilium transparently provides and validates. Traffic from pods not managed by Cilium (or without a valid identity) will be dropped at the kernel level.
C. Implementing a Canary Release with CiliumEnvoyConfig
This is where we replicate advanced L7 routing. Let's assume you've deployed a v2 of the recommendationservice.
# (Assume recommendationservice-v2 deployment exists)

We can use a CiliumEnvoyConfig object to programmatically configure Cilium's shared Envoy proxy to split traffic.
# canary-recommendations.yaml
apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
  name: recommendations-canary
  namespace: online-boutique
spec:
  # Target the service we want to apply routing rules to
  services:
    - name: recommendationservice
      namespace: online-boutique
  # Backend services referenced by the weighted clusters below
  backendServices:
    - name: recommendationservice-v1
      namespace: online-boutique
    - name: recommendationservice-v2
      namespace: online-boutique
  # This is the raw Envoy configuration
  resources:
    - "@type": type.googleapis.com/envoy.config.route.v3.RouteConfiguration
      name: recommendations-route
      virtual_hosts:
        - name: recommendations-virtualhost
          domains: ["*"]
          routes:
            - match: { prefix: "/" }
              route:
                # This defines the traffic split
                weighted_clusters:
                  clusters:
                    - name: "online-boutique/recommendationservice-v1"
                      weight: 90
                    - name: "online-boutique/recommendationservice-v2"
                      weight: 10
                  total_weight: 100

This manifest instructs Cilium's eBPF datapath to redirect traffic destined for recommendationservice to the shared Envoy proxy. The proxy then applies this configuration, splitting traffic 90/10 between the v1 and v2 services. This is achieved without a sidecar in either the source or destination pods.
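As a quick sanity check on the weighted_clusters semantics, a self-contained simulation (no cluster required) shows that with 90/10 weights, roughly nine out of ten requests land on v1:

```shell
# Simulate 1000 picks with the 90/10 weights from the manifest above
awk 'BEGIN {
  srand(7)
  for (i = 0; i < 1000; i++) {
    if (rand() * 100 < 90) v1++; else v2++
  }
  printf "v1=%d v2=%d\n", v1, v2
}'
```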
kubectl apply -f canary-recommendations.yaml

Step 4: Full Migration and Performance Re-evaluation
Once you've migrated all services by restarting their deployments (to remove the sidecar) and have replicated your necessary routing and security policies with Cilium CRDs, you can decommission Istio.
# Uninstall Istio control plane
cd istio-*
bin/istioctl uninstall --purge -y
# Remove Istio system namespace
kubectl delete namespace istio-system

Now, re-run the exact same k6 load test:
k6 run --env GATEWAY_URL=$GATEWAY_URL istio-baseline-test.js

Because the Istio ingress gateway no longer exists, first re-expose the frontend through a Cilium-managed entry point (Cilium supports both Kubernetes Ingress and the Gateway API) and update GATEWAY_URL accordingly. You will then be hitting the Cilium-managed ingress. Compare the results.
Performance Deep Dive & Benchmark Analysis
After running the load tests, the difference is typically stark. Here's what a representative comparison looks like:
| Metric | Istio (Sidecar-based) | Cilium (Sidecar-less) | Improvement | 
|---|---|---|---|
| p(95) Latency | 85ms | 40ms | 53% ↓ | 
| p(99) Latency | 150ms | 65ms | 57% ↓ | 
| Requests per Second | 450 RPS | 600 RPS | 33% ↑ | 
| Cluster CPU (Total) | 4.5 cores | 2.8 cores | 38% ↓ | 
| Cluster Memory (Total) | 8.2 GiB | 6.5 GiB | 21% ↓ | 
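The improvement percentages follow directly from the raw numbers. A sketch that recomputes them from the table's illustrative figures (these are representative values, not fresh measurements):

```shell
# Recompute the latency and throughput deltas from the table's figures
p95_improvement=$(awk 'BEGIN { printf "%.0f", (85 - 40) / 85 * 100 }')
rps_improvement=$(awk 'BEGIN { printf "%.0f", (600 - 450) / 450 * 100 }')
echo "p(95) latency: ${p95_improvement}% lower, throughput: ${rps_improvement}% higher"
```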
Analysis of Results
* Latency Reduction: The >50% reduction in P99 latency is a direct result of eliminating the two user-space proxy hops for every request. The path from service-to-service is now almost entirely within the kernel, which is orders of magnitude faster.
* Throughput Increase: Lower latency per request allows the system to handle more concurrent requests, leading to a significant increase in overall throughput (RPS).
* Resource Efficiency: The dramatic drop in CPU and Memory usage comes from removing thousands of redundant Envoy processes. A few shared, node-level Envoy proxies (managed by Cilium for L7) and the hyper-efficient eBPF programs consume far fewer resources.
Edge Case: When You Still Need Granular L7 Control
What if a specific service requires a complex WebAssembly (WASM) filter or a custom Lua script that only Envoy can provide? The Cilium model doesn't preclude this. Instead of a cluster-wide sidecar injection, you can opt-in to a dedicated proxy for just that one service. You would deploy Envoy as a container within that specific pod and use a CiliumEnvoyConfig to direct traffic to it. This provides the flexibility of Envoy's rich feature set where needed, without paying the performance tax across the entire fleet.
Advanced Observability with Hubble and eBPF
One of the most powerful features of this architecture is the deep, effortless observability provided by Hubble, which sources its data directly from the eBPF programs.
Hubble UI: Forward the Hubble UI service to your local machine:
cilium hubble ui

This will open a web interface showing a live service dependency map, network policies, and traffic flows, all generated without any application instrumentation.
Hubble CLI for Advanced Debugging
The real power is in the CLI for real-time diagnostics.
* Why was my connection dropped?
Get an immediate, kernel-level reason for a dropped packet.
    # See dropped packets from the frontend pod
    hubble observe --from-pod online-boutique/frontend-xxxx -n online-boutique --verdict DROPPED
    # Example Output:
    # DROP (Policy denied) ...

* Trace live HTTP requests: Inspect L7 traffic in real time between services.
    hubble observe -n online-boutique --protocol http -f
    # Example Output:
    # ... GET /product/OLJCESPC7Z => 200 (OK)
    # ... POST /cart => 200 (OK)

* Diagnose DNS issues: eBPF gives Cilium visibility into DNS requests and responses for every pod.
    hubble observe -n online-boutique --to-port 53 -f

These tools provide an unparalleled level of visibility directly from the source of truth, the kernel, without the overhead of log shipping or distributed tracing agents for basic network flow analysis.
Conclusion: The Future is Sidecar-less
Migrating from a traditional sidecar-based service mesh to a sidecar-less architecture with Cilium and eBPF is more than just an incremental improvement; it's a fundamental shift in how we build cloud-native infrastructure. By moving policy enforcement, load balancing, and observability into the Linux kernel, we eliminate systemic latency and resource bloat, leading to faster, more efficient, and more reliable applications.
The operational simplicity of managing a single CNI that also provides advanced service mesh capabilities cannot be overstated. It reduces complexity, streamlines debugging, and lowers the cognitive load on platform engineering teams. While the sidecar pattern was a necessary and innovative step in the evolution of service meshes, the performance and efficiency gains offered by eBPF represent the clear path forward for building the next generation of high-performance distributed systems on Kubernetes.