eBPF-Powered Observability: Low-Overhead Tracing in K8s with Cilium
The Performance Bottleneck of Sidecar Observability
In modern microservices architectures running on Kubernetes, observability is non-negotiable. However, the de facto standard—the sidecar proxy model popularized by service meshes like Istio—comes with a well-documented performance cost. Every packet originating from or destined for your application pod must traverse a user-space proxy. This round trip involves multiple context switches between user space and kernel space, memory copy operations, and the inherent processing latency of the proxy itself. For high-throughput, low-latency services, this can add milliseconds to your p99 latency and significantly increase CPU and memory footprints across the cluster.
The fundamental issue is the data path. A typical sidecar flow looks like this:
Application -> Pod Network Namespace (veth) -> Host Network Namespace -> Sidecar Proxy (User Space) -> Host Network Namespace -> Destination
This model, while powerful for traffic management and policy enforcement, is suboptimal for pure observability. We are paying a continuous performance tax for data that is often just being observed, not mutated.
eBPF (extended Berkeley Packet Filter) offers a paradigm shift. By attaching small, sandboxed programs to various hook points within the Linux kernel, we can achieve similar visibility directly at the source, eliminating the user-space detour. Cilium leverages eBPF to create a networking, security, and observability data plane that operates almost entirely within the kernel.
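For a concrete sense of what "programs attached to kernel hooks" means, you can inspect them on a node once Cilium is installed (next section). A sketch, assuming root access on the node, bpftool installed, and an illustrative pod interface name:
# List eBPF programs attached to the TC hooks of a pod's veth interface
# (lxc* interface names are created by Cilium; pick a real one from `ip link`)
sudo bpftool net show dev lxc1234abcd
# List all eBPF programs currently loaded on the node
sudo bpftool prog show | head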
This article will demonstrate how to harness this power, focusing on practical, production-level techniques for deep system tracing with minimal overhead.
Section 1: Architecting for Kernel-Level Visibility
Before diving into commands, it's crucial to understand the architectural differences. Cilium's observability tool, Hubble, uses eBPF programs attached to kernel hooks like Traffic Control (TC) and socket operations to capture network flow data.
- Traffic Control (TC) hooks: an eBPF program attached to the pod's network interface (the host side of the veth pair) can inspect every packet. It can see source/destination IP, port, and TCP flags. This is done before the packet is even handed to the pod's network stack, making it incredibly efficient.
- Socket and uprobe hooks: eBPF programs can also attach to the read/write functions of common TLS libraries (e.g., OpenSSL, GnuTLS). This allows inspection of the unencrypted data just before it is encrypted by the application's TLS library or just after it is decrypted. This provides L7 visibility (HTTP paths, gRPC methods, Kafka topics) without the overhead and complexity of certificate management and TLS termination in a sidecar.
Production-Grade Cilium & Hubble Configuration
We assume a running Kubernetes cluster. A default Cilium installation is insufficient; we need to enable Hubble with its UI and metrics endpoints. Below is a production-oriented Helm configuration snippet.
# values-production.yaml
# Replace kube-proxy entirely for maximum performance
# This lets Cilium manage all service routing via eBPF maps.
kubeProxyReplacement: strict
# Socket-level load balancing for services accessed from the host namespace
hostServices:
enabled: true
# Enable BPF masquerading for traffic leaving the cluster
bpf:
masquerade: true
# Hubble Configuration
hubble:
enabled: true
# Deploy Hubble Relay for cluster-wide flow aggregation
relay:
enabled: true
  # Tune the Hubble event buffer for high-traffic clusters (hubble.eventBufferCapacity)
  # Default is 4095; increase if you see dropped flows (the value must be 2^n - 1)
  eventBufferCapacity: "8191"
# Deploy the UI for visual inspection
ui:
enabled: true
# Enable metrics for Prometheus integration
metrics:
enabled:
- "dns"
- "drop"
- "tcp"
- "flow"
- "port-distribution"
- "icmp"
- "http"
# Prometheus Integration
prometheus:
enabled: true
serviceMonitor:
enabled: true # For Prometheus Operator
# Operator Configuration
operator:
prometheus:
enabled: true
serviceMonitor:
enabled: true
To apply this configuration:
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.12.5 \
--namespace kube-system \
-f values-production.yaml
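Once the release has rolled out, verify that kube-proxy replacement and Hubble actually came up as configured. A quick check against the agent DaemonSet:
# Wait for the agents, then confirm datapath mode and Hubble status
kubectl -n kube-system rollout status ds/cilium
kubectl -n kube-system exec ds/cilium -- cilium status | grep -E 'KubeProxyReplacement|Hubble'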
Key Production Considerations from this configuration:
- kubeProxyReplacement: strict: This is a critical performance optimization. It removes iptables from the service routing path entirely. Cilium uses eBPF hash maps to perform NAT for Kubernetes Services, which is significantly faster and more scalable than sequential iptables rule processing.
- hubble.relay.enabled: true: In a multi-node cluster, the Hubble daemon on each node only sees flows on that node. Hubble Relay aggregates these flows, providing a single API endpoint for cluster-wide observability. Without it, you'd have to query each node's agent individually.
Section 2: Advanced Flow Tracing with the Hubble CLI
The Hubble CLI is your primary tool for real-time debugging. Let's move beyond hubble observe and into complex scenarios.
First, let's set up a sample application. We'll use a paymentservice and a currencyservice.
# sample-app.yaml
apiVersion: v1
kind: Namespace
metadata:
name: hipstershop
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: paymentservice
namespace: hipstershop
labels:
app: paymentservice
spec:
replicas: 1
selector:
matchLabels:
app: paymentservice
template:
metadata:
labels:
app: paymentservice
spec:
containers:
- name: server
image: gcr.io/google-samples/microservices-demo/paymentservice:v0.3.8
ports:
- containerPort: 50051
---
apiVersion: v1
kind: Service
metadata:
name: paymentservice
namespace: hipstershop
spec:
type: ClusterIP
selector:
app: paymentservice
ports:
- port: 50051
targetPort: 50051
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: currencyservice
namespace: hipstershop
labels:
app: currencyservice
spec:
replicas: 1
selector:
matchLabels:
app: currencyservice
template:
metadata:
labels:
app: currencyservice
spec:
containers:
- name: server
image: gcr.io/google-samples/microservices-demo/currencyservice:v0.3.8
ports:
- containerPort: 7000
---
apiVersion: v1
kind: Service
metadata:
name: currencyservice
namespace: hipstershop
spec:
type: ClusterIP
selector:
app: currencyservice
ports:
- port: 7000
targetPort: 7000
Apply it: kubectl apply -f sample-app.yaml
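Before moving on, make sure both deployments are actually ready; a quick check:
kubectl -n hipstershop rollout status deploy/paymentservice
kubectl -n hipstershop rollout status deploy/currencyservice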
Scenario 1: Debugging Dropped Packets due to a Network Policy
Let's create a restrictive CiliumNetworkPolicy that denies traffic.
# deny-policy.yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "deny-all-ingress"
namespace: hipstershop
spec:
endpointSelector:
matchLabels:
app: paymentservice
ingress: [] # Empty ingress means deny all
Apply it: kubectl apply -f deny-policy.yaml
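You can confirm that Cilium has accepted the policy and is now enforcing ingress on the paymentservice endpoint; the CiliumEndpoint listing includes policy enforcement columns:
kubectl get cnp -n hipstershop
kubectl get cep -n hipstershop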
Now, if we try to connect from currencyservice to paymentservice, it will fail. From inside the pod all you see is a connection that hangs and eventually times out; how do we work out why at the network layer?
# Exec into the currencyservice pod
CURRENCY_POD=$(kubectl get pods -n hipstershop -l app=currencyservice -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it -n hipstershop $CURRENCY_POD -- /bin/sh
# Inside the pod, try to connect (this will hang)
apk add --no-cache curl
curl -v paymentservice:50051
Now, from another terminal, use Hubble to see exactly why it's failing.
# Port-forward the hubble-relay service
kubectl port-forward -n kube-system svc/hubble-relay 4245:80 &
# Use hubble observe to find the drop
hubble observe --namespace hipstershop --verdict DROPPED --to-pod paymentservice -f
Expected Output & Analysis:
TIME SOURCE -> DESTINATION VERDICT REASON
Oct 26 12:35:10.123 hipstershop/currencyservice-5f... (10.0.1.45) -> hipstershop/paymentservice-6c... (10.0.1.99:50051) DROPPED Policy denied
The output is unambiguous. We see:
- VERDICT: DROPPED
- REASON: Policy denied
This confirms a network policy is the culprit. The key performance insight here is that this filtering and logging happened entirely in the kernel. The packet was dropped by the eBPF program on the destination pod's network interface; it never even reached the pod's network stack, let alone an iptables chain. This is extremely efficient for enforcing firewall rules.
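If you prefer to watch drops from the datapath's own point of view rather than through Hubble Relay, the agent's low-level monitor shows the same events. A sketch, assuming the agent pod selected below runs on the node hosting paymentservice:
# Stream drop events directly from the eBPF datapath on that node
CILIUM_POD=$(kubectl get pods -n kube-system -l k8s-app=cilium -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it -n kube-system $CILIUM_POD -- cilium monitor --type drop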
Scenario 2: Identity-Based vs. IP-Based Filtering
Cilium assigns a security identity (a numeric ID) to each endpoint based on its labels. Policies are then enforced based on these identities, not ephemeral pod IPs. This is a more robust and scalable model.
Let's find the identity of our pods:
# Get the Cilium Endpoint for the currency service
CILIUM_EP=$(kubectl get cep -n hipstershop -l app=currencyservice -o jsonpath='{.items[0].metadata.name}')
# Describe the endpoint to get its identity
kubectl describe cep -n hipstershop $CILIUM_EP
# Look for a line like: Identity: ID=43128, Labels: [k8s:app=currencyservice, ...]
Let's say the identity is 43128. We can now use this for highly specific tracing.
# Trace all traffic originating from any pod with this identity
hubble observe --from-identity 43128
# This is far more powerful than IP-based filtering, as pods can be rescheduled and get new IPs,
# but their label-based identity remains the same.
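Identities can also be inspected cluster-wide, which helps when correlating Hubble output with policies. Two options, assuming the default CRD-based identity allocation:
# Identities are stored as cluster-scoped Kubernetes objects
kubectl get ciliumidentities
# Or ask an agent for its view, including the labels behind each numeric ID
CILIUM_POD=$(kubectl get pods -n kube-system -l k8s-app=cilium -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n kube-system $CILIUM_POD -- cilium identity list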
Section 3: L7 Observability without Sidecars (HTTP/gRPC)
This is where Cilium + eBPF truly outshines traditional methods for pure observability.
First, let's fix our network policy to allow traffic.
# allow-policy.yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "allow-currency-to-payment"
namespace: hipstershop
spec:
endpointSelector:
matchLabels:
app: paymentservice
ingress:
- fromEndpoints:
- matchLabels:
app: currencyservice
Delete the old policy and apply the new one:
kubectl delete cnp -n hipstershop deny-all-ingress
kubectl apply -f allow-policy.yaml
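With the allow policy in place, the same filter from Scenario 1 should now show FORWARDED verdicts instead of drops:
hubble observe --namespace hipstershop --to-pod paymentservice --verdict FORWARDED --last 20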
Our sample app uses gRPC. Let's generate some traffic.
# Deploy the rest of the microservices demo into the same namespace; it includes
# a loadgenerator that continuously drives traffic through the services
kubectl apply -n hipstershop -f https://raw.githubusercontent.com/GoogleCloudPlatform/microservices-demo/main/release/kubernetes-manifests.yaml
Now, let's inspect the gRPC calls between services.
# Observe L7 traffic to the paymentservice (gRPC calls appear as HTTP/2 flows in Hubble)
hubble observe --namespace hipstershop --protocol http --to-pod paymentservice -f
Expected Output & Analysis:
TIME SOURCE -> DESTINATION TYPE VERDICT
Oct 26 12:45:20.555 hipstershop/checkoutservice-7d... -> hipstershop/paymentservice-6c... (50051) gRPC FORWARDED (gRPC) {call:"hipstershop.PaymentService/Charge", authority:":authority: paymentservice:50051"}
Notice the rich L7 information: the gRPC service and method, hipstershop.PaymentService/Charge. Cilium has parsed the HTTP/2 request carrying the gRPC call and extracted the service and method name, with no per-pod sidecar and no application code changes. One caveat: L7 visibility is not on by default. It must be enabled for the flows you care about, either through an L7-aware CiliumNetworkPolicy or (in Cilium 1.12) a pod annotation, and Cilium then transparently diverts only those flows through a node-local proxy embedded in the agent rather than a per-pod sidecar.
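If no L7 flows show up at all, a minimal sketch of switching visibility on using Cilium 1.12's proxy-visibility pod annotation (the direction/port/protocol tuple below is specific to this demo and is only an illustration; an L7-aware CiliumNetworkPolicy achieves the same):
# Enable L7 visibility (HTTP/2, which carries gRPC) for ingress traffic to
# port 50051 on the paymentservice pods. Annotation format:
# <Direction/Port/L4-Protocol/L7-Protocol>
kubectl annotate pods -n hipstershop -l app=paymentservice \
  policy.cilium.io/proxy-visibility="<Ingress/50051/TCP/HTTP>" --overwrite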
Edge Case: Statically Compiled Go Binaries
A common edge case for L7 parsing is with statically compiled Go applications that don't use the system's shared C libraries for TLS (like OpenSSL). By default, Cilium's probes attach uprobes to symbols in these shared libraries. If your Go app is built like this, Cilium's automatic L7 parsing might fail.
Solution: You need to provide user-space probing information via Cilium's configuration or annotations on the pod. This tells the Cilium agent where to find the TLS function symbols within the statically linked binary. This is an advanced procedure and requires analyzing the binary with tools like nm to find the correct function symbols and offsets.
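A hedged sketch of that symbol hunt for a Go binary (the binary path is illustrative, and a stripped binary will have no symbols left to find):
# Look for the standard library's TLS read/write symbols in a static Go binary
go tool nm ./my-static-go-app | grep -E 'crypto/tls\.\(\*Conn\)\.(Read|Write)'
# Alternatively, with binutils:
nm ./my-static-go-app | grep 'crypto/tls'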
Example annotation (conceptual):
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-static-go-app
annotations:
# This is a conceptual example; the exact annotation may vary
"cilium.io/tls-probe.go-crypto/tls": "/path/to/binary:main.FunctionName"
spec:
# ...
Section 4: Integrating with Prometheus for Long-Term Analysis
Real-time CLI observation is for debugging. For monitoring and alerting, we need to integrate with a TSDB like Prometheus.
Our values-production.yaml already enabled the metrics endpoints and created a ServiceMonitor. If you have the Prometheus Operator installed, it will automatically scrape Cilium and Hubble.
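Before building dashboards, confirm the endpoints are actually serving metrics. A quick check; the Hubble metrics port depends on the chart version, so adjust as needed:
# Port-forward one agent pod and inspect the Hubble metrics endpoint
CILIUM_POD=$(kubectl get pods -n kube-system -l k8s-app=cilium -o jsonpath='{.items[0].metadata.name}')
HUBBLE_METRICS_PORT=9965   # use 9091 if your chart exposes the older default
kubectl -n kube-system port-forward $CILIUM_POD $HUBBLE_METRICS_PORT:$HUBBLE_METRICS_PORT &
curl -s localhost:$HUBBLE_METRICS_PORT/metrics | grep '^hubble_' | head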
Let's explore some powerful PromQL queries you can build.
1. HTTP Latency Golden Signals (without a sidecar)
Hubble can export HTTP request metrics, including a latency histogram. You can calculate p99 latency between two services.
# p99 latency for HTTP GET requests from frontend to productcatalogservice
# (labels such as source_app/destination_app are only present if the corresponding
# context options are configured for Hubble's http metrics)
histogram_quantile(0.99,
  sum(rate(hubble_http_request_duration_seconds_bucket{
    namespace="hipstershop",
    source_app="frontend",
    destination_app="productcatalogservice",
    method="GET"
  }[5m])) by (le, source_app, destination_app)
)
2. Network Policy Drop Rate by Reason
This is invaluable for security monitoring. You can alert if a specific application suddenly starts seeing a high rate of policy denials.
# Rate of dropped packets to the paymentservice, broken down by drop reason
sum(rate(hubble_drop_total{
namespace="hipstershop",
destination_app="paymentservice"
}[5m])) by (reason)
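Building on that query, a sketch of an alerting expression; the threshold of one drop per second is an arbitrary illustration, tune it to your traffic:
# Fire when the paymentservice sees sustained policy-related packet drops
sum(rate(hubble_drop_total{
  namespace="hipstershop",
  destination_app="paymentservice"
}[5m])) > 1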
3. DNS Resolution Failures per Source App
Cilium can also parse DNS requests/responses, giving you insight into service discovery issues.
# Rate of DNS queries with RCODE != NoError, indicating an error
sum(rate(hubble_dns_responses_total{
namespace="hipstershop",
rcode!="NoError"
}[5m])) by (source_app)
These metrics provide a comprehensive view of your network's health and security posture, all sourced directly from the kernel with minimal overhead.
Section 5: Advanced Troubleshooting with `cilium policy trace`
Sometimes, hubble observe tells you a packet was dropped, but your network policies are complex, and you don't know which rule is the cause. The cilium policy trace command is a simulator that lets you determine the policy verdict for a hypothetical packet.
Let's go back to our deny-all-ingress scenario (re-apply deny-policy.yaml if you deleted it in Section 3). We know traffic from currencyservice to paymentservice is being dropped. Let's prove it with the tracer.
First, we need the security identities of the source and destination endpoints.
# Get the source (currencyservice) security identity
SOURCE_IDENTITY=$(kubectl get cep -n hipstershop -l app=currencyservice -o jsonpath='{.items[0].status.identity.id}')
# Get the destination (paymentservice) security identity
DEST_IDENTITY=$(kubectl get cep -n hipstershop -l app=paymentservice -o jsonpath='{.items[0].status.identity.id}')
Now, run the trace from one of the Cilium agent pods. Find a cilium pod on the same node as the destination pod for the most accurate trace.
# Find a cilium agent pod
CILIUM_POD=$(kubectl get pods -n kube-system -l k8s-app=cilium -o jsonpath='{.items[0].metadata.name}')
# Execute the policy trace command inside the cilium agent
kubectl exec -it -n kube-system $CILIUM_POD -- \
  cilium policy trace \
    --src-identity $SOURCE_IDENTITY \
    --dst-identity $DEST_IDENTITY \
    --dport 50051/TCP
Expected Output & Analysis:
-> Verdict: Denied
Source Identity: 43128 -> hipstershop/currencyservice
Destination Identity: 12945 -> hipstershop/paymentservice
Traffic: TCP port 50051
Policy:
hipstershop/deny-all-ingress (Ingress)
Enforced: Yes
Rule: (no rules matched)
This output is the ultimate debugging tool. It tells you:
- The verdict: Denied.
- The policy responsible: hipstershop/deny-all-ingress.
This level of introspection allows you to resolve complex, multi-policy interaction issues with confidence, without having to guess which policy is at fault.
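When the trace points at a policy you did not expect, it also helps to dump every rule the agent has actually imported; assuming the same agent pod as above:
# Print the agent's full policy repository as JSON
kubectl exec -n kube-system $CILIUM_POD -- cilium policy get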
Conclusion: The Future is Kernel-Native
For senior engineers optimizing for performance, reliability, and security in Kubernetes, moving observability out of user-space sidecars and into the kernel via eBPF is a logical and powerful evolution. Cilium provides a mature, production-ready implementation of this vision.
By leveraging kernel-native data collection, we achieve:
- Lower request latency and a smaller CPU/memory footprint, because packets no longer detour through a user-space proxy.
- L3/L4 and L7 visibility (HTTP, gRPC, DNS) with no sidecars and no application changes.
- Identity-aware policy enforcement, with drop verdicts and reasons surfaced directly from the kernel datapath.
- Prometheus-ready metrics for long-term monitoring and alerting.
While the sidecar model still holds value for complex traffic-shifting and routing use cases, for pure observability, the performance and efficiency of eBPF are undeniable. As eBPF's capabilities continue to expand with projects like Tetragon for runtime security, the kernel is solidifying its place as the next frontier for cloud-native observability and security.