Advanced K8s Network Observability with eBPF and Cilium Hubble

Goh Ling Yong

Beyond `iptables`: The Imperative for eBPF-based Observability

For senior engineers operating Kubernetes at scale, the limitations of the default kube-proxy implementation, typically backed by iptables, are a familiar source of friction. While functional, iptables introduces significant performance bottlenecks due to its sequential rule processing in the kernel's Netfilter framework. More critically for observability, it abstracts away the true source of network traffic, often masquerading it behind SNAT (Source Network Address Translation), making it incredibly difficult to answer a simple question: "Which specific pod is talking to which specific service?"

This is where eBPF (extended Berkeley Packet Filter) represents a paradigm shift. By allowing sandboxed programs to run directly within the Linux kernel, projects like Cilium can implement a highly efficient and identity-aware networking datapath. Cilium attaches eBPF programs to network interfaces (via tc hooks) and socket calls (via sock_ops hooks), effectively bypassing iptables and kube-proxy for in-cluster traffic. This provides not only a massive performance improvement but also a revolutionary new plane for observability.

This article assumes you understand these fundamentals. We will not cover the basics of Cilium installation or the theory of eBPF. Instead, we will focus on advanced, practical applications of Hubble, Cilium's observability component, to diagnose and resolve complex, real-world networking scenarios that are nearly impossible to debug with traditional tooling.

The Cilium eBPF Datapath: A Quick Architectural Refresher

To effectively use Hubble, it's crucial to understand how Cilium's eBPF programs capture the data Hubble visualizes. Unlike a sidecar proxy that intercepts traffic in userspace, Cilium operates at the kernel level.

  • Identity-Based Security: Cilium assigns a security identity to each pod based on its labels. This identity is stored in an eBPF map. When a packet is sent, an eBPF program attached to the pod's network device (veth) looks up the source and destination identities. Policy decisions are made based on these identities, not ephemeral IP addresses.
  • Efficient Service Routing: For ClusterIP services, Cilium's eBPF programs attached at the tc (Traffic Control) hook on the network device perform direct DNAT (Destination NAT) to a selected backend pod's IP. This is done via a lookup in an eBPF map that stores the service-to-backend mappings. This avoids the entire iptables chain traversal, significantly reducing latency.
  • Connection Tracking: eBPF maps are used to create a highly efficient, localized connection tracking table. This allows Cilium to understand flows and associate packets with established connections directly in the kernel.
  • Event Reporting: Hubble taps directly into these eBPF maps and into a perf ring buffer where Cilium's eBPF programs push event notifications (e.g., new flows, policy verdicts, packet drops). This is the source of its power: it is not sampling traffic or relying on userspace agents; it reports the ground truth directly from the kernel's decision-making process.
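The state described in these bullets can be inspected directly from a Cilium agent pod. The commands below are a sketch, assuming Cilium runs as the standard `cilium` DaemonSet in `kube-system`:

```shell
# Security identities currently allocated (pod labels -> numeric identity)
kubectl -n kube-system exec ds/cilium -- cilium identity list

# Service-to-backend mappings used for eBPF-based DNAT
kubectl -n kube-system exec ds/cilium -- cilium bpf lb list

# The in-kernel connection tracking table
kubectl -n kube-system exec ds/cilium -- cilium bpf ct list global
```

Seeing a service VIP resolve to backend pod IPs in `cilium bpf lb list` is the eBPF equivalent of reading the iptables DNAT chains, minus the chain traversal.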

    Production-Ready Hubble Configuration

    To unlock the full potential of Hubble, your initial Cilium deployment configuration is critical. A default install often has Hubble disabled or minimally configured. Here is a production-oriented Helm values.yaml snippet for a Cilium installation.

    Code Example 1: values.yaml for a Production Cilium/Hubble Deployment

    yaml
    # values.yaml for Cilium Helm chart
    
    kubeProxyReplacement: strict # Completely bypass kube-proxy for maximum performance
    bpf:
      # Pre-allocation of BPF maps prevents runtime performance hits
      preallocateMaps: true
    
    # Enable Hubble for deep observability
    hubble:
      enabled: true
      listenAddress: ":4244"
      # Deploy Hubble Relay for a cluster-wide view
      relay:
        enabled: true
        # Tune buffer and timeout for high-volume clusters
        # bufferSize is the number of flows per-node peer buffer.
        bufferSize: 2048
        # Timeout after which a peer is considered disconnected
        peerServiceConnectTimeout: 20s
      # Deploy the Hubble UI for visualization
      ui:
        enabled: true
      # Expose metrics for Prometheus scraping
      metrics:
        enabled:
          - "dns"
          - "drop"
          - "tcp"
          - "flow"
          - "port-distribution"
          - "icmp"
          - "http"
        serviceMonitor:
          enabled: true # If using Prometheus Operator

    Key considerations in this configuration:

    * kubeProxyReplacement: strict: This is the most performant mode, where Cilium's eBPF datapath handles all ClusterIP, NodePort, and LoadBalancer traffic. This ensures all relevant traffic is visible to Hubble.

    * hubble.relay.enabled: true: In a multi-node cluster, each node's Cilium agent has its own Hubble instance with a view of local traffic. Hubble Relay aggregates these disparate streams, providing a single API endpoint for a complete, cluster-wide view. This is essential for observing inter-node communication.

    * hubble.metrics.enabled: This exposes a rich set of Prometheus metrics from the flow data, allowing you to build dashboards and alerts on network behavior (e.g., unexpected packet drop rates, DNS resolution failures).
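With this values.yaml in hand, deploying (or upgrading) Cilium and sanity-checking the result looks roughly like this. This is a sketch: the chart repository and release name follow upstream defaults, and `cilium`/`hubble` refer to the official CLIs:

```shell
helm repo add cilium https://helm.cilium.io/
helm repo update
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --values values.yaml

# Wait for the agent, operator, and Hubble components to become ready
cilium status --wait

# Open a local port to Hubble Relay and confirm flows are arriving
cilium hubble port-forward &
hubble status
```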

    Scenario 1: Debugging Silently Dropped Packets

    One of the most frustrating production issues is when a service can't connect to another, but there are no application-level errors, no logs, and ping or netcat simply time out. This often points to a networking issue, potentially a misconfigured NetworkPolicy.

    Let's simulate this. We have a three-tier application: frontend, api-gateway, and user-service.

    Code Example 2: Application and a Flawed CiliumNetworkPolicy

    yaml
    # app-deployment.yaml
    apiVersion: v1
    kind: Namespace
    metadata:
      name: prod-app
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: frontend
      namespace: prod-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: frontend
      template:
        metadata:
          labels:
            app: frontend
            tier: presentation
        spec:
          containers:
          - name: frontend-container
            image: nginx
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api-gateway
      namespace: prod-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: api-gateway
      template:
        metadata:
          labels:
            app: api-gateway
            tier: logic
        spec:
          containers:
          - name: api-gateway-container
            image: kennethreitz/httpbin
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: user-service
      namespace: prod-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: user-service
      template:
        metadata:
          labels:
            app: user-service
            tier: data
        spec:
          containers:
          - name: user-service-container
            image: kennethreitz/httpbin
    ---
    # faulty-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "api-access-policy"
      namespace: prod-app
    spec:
      endpointSelector:
        matchLabels:
          app: api-gateway # Policy applies to the api-gateway
      ingress:
      - fromEndpoints:
        - matchLabels:
            app: frontend
        toPorts:
        - ports:
          - port: "80"
            protocol: TCP
      egress:
      - toEndpoints:
        - matchLabels:
            # TYPO: Should be 'user-service'
            app: user-servce
        toPorts:
        - ports:
          - port: "80"
            protocol: TCP

    In the CiliumNetworkPolicy, there is a subtle typo in the egress rule: app: user-servce instead of app: user-service. This policy will be successfully applied by Kubernetes, but it will cause the api-gateway to be unable to reach the user-service. Application logs in api-gateway will only show connection timeouts.
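Applying everything is uneventful; Kubernetes and Cilium both accept the flawed policy without complaint, because any label value is legal in a selector. A quick sketch using the manifests above (`cnp` is the registered short name for `ciliumnetworkpolicies`):

```shell
kubectl apply -f app-deployment.yaml
kubectl apply -f faulty-policy.yaml

# The policy validates and is enforced -- nothing flags the typo'd label
kubectl get cnp -n prod-app
kubectl describe cnp api-access-policy -n prod-app
```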

    Diagnosis with Hubble CLI

    This is where Hubble shines. We can use the CLI to observe flows and immediately see the policy enforcement decision.

    First, port-forward to the Hubble Relay service:

    kubectl port-forward -n kube-system svc/hubble-relay 4245:80

    Now, let's observe the flows from the api-gateway pod, looking specifically for dropped packets.

    bash
    # Find the api-gateway pod name
    API_POD=$(kubectl get pods -n prod-app -l app=api-gateway -o jsonpath='{.items[0].metadata.name}')
    
    # Observe flows from this pod, filtering for DROPPED verdicts
    hubble observe -n prod-app --from-pod prod-app/${API_POD} --verdict DROPPED -o json

    You will get detailed output like this:

    json
    {
      "flow": {
        "time": "2023-10-27T18:35:12.123456789Z",
        "verdict": "DROPPED",
        "drop_reason_desc": "POLICY_DENIED",
        "IP": {
          "source": "10.0.1.150",
          "destination": "10.0.1.200",
          "ipVersion": "IPv4"
        },
        "l4": {
          "TCP": {
            "source_port": 54321,
            "destination_port": 80,
            "flags": {
              "SYN": true
            }
          }
        },
        "source": {
          "ID": 123,
          "identity": 45678,
          "namespace": "prod-app",
          "labels": ["k8s:app=api-gateway", "k8s:tier=logic"],
          "pod_name": "api-gateway-xyz-123"
        },
        "destination": {
          "ID": 456,
          "identity": 56789,
          "namespace": "prod-app",
          "labels": ["k8s:app=user-service", "k8s:tier=data"],
          "pod_name": "user-service-abc-456"
        },
        "Type": "L3_L4",
        "traffic_direction": "EGRESS",
        "policy_match_type": 2
      },
      "node_name": "k8s-worker-1"
    }

    The crucial fields are "verdict": "DROPPED" and "drop_reason_desc": "POLICY_DENIED". This tells us instantly that a network policy is the culprit. Hubble provides the full source and destination pod labels, leaving no ambiguity. We can see the traffic was intended for a pod with k8s:app=user-service, and by inspecting our policy, we can quickly spot the typo.
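On a busy cluster you will rarely eyeball individual records; it is handier to aggregate the newline-delimited JSON that `hubble observe -o json` emits. Below is a small helper function — a sketch, assuming `python3` is available; `summarize_drops` is a hypothetical name, not part of the Hubble CLI — that counts drops per source/destination pair:

```shell
# Count DROPPED flows per (source pod, destination pod, reason).
# Reads newline-delimited JSON flow records on stdin.
summarize_drops() {
  python3 -c '
import json, sys, collections
counts = collections.Counter()
for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    flow = json.loads(line).get("flow", {})
    if flow.get("verdict") != "DROPPED":
        continue
    src = flow.get("source", {}).get("pod_name", "?")
    dst = flow.get("destination", {}).get("pod_name", "?")
    reason = flow.get("drop_reason_desc", "UNKNOWN")
    counts[(src, dst, reason)] += 1
for (src, dst, reason), n in counts.most_common():
    print(f"{n} {src} -> {dst} {reason}")
'
}

# Example with a single captured record (in practice, pipe hubble output in):
echo '{"flow":{"verdict":"DROPPED","drop_reason_desc":"POLICY_DENIED","source":{"pod_name":"api-gateway-xyz-123"},"destination":{"pod_name":"user-service-abc-456"}}}' | summarize_drops
# prints: 1 api-gateway-xyz-123 -> user-service-abc-456 POLICY_DENIED
```

In live use you would pipe the CLI into it, e.g. `hubble observe -n prod-app --verdict DROPPED -o json | summarize_drops`, to see at a glance which pod pair accounts for the drops.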

    Correction

    After fixing the typo in faulty-policy.yaml and reapplying it, running hubble observe again would show "verdict": "FORWARDED".

    Code Example 3: The Corrected CiliumNetworkPolicy

    yaml
    # corrected-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "api-access-policy"
      namespace: prod-app
    spec:
      endpointSelector:
        matchLabels:
          app: api-gateway
      ingress:
      # ... (ingress unchanged)
      egress:
      - toEndpoints:
        - matchLabels:
            # FIX: Corrected the label selector
            app: user-service
        toPorts:
        - ports:
          - port: "80"
            protocol: TCP

    This level of immediate, actionable feedback directly from the kernel's packet processing path is something tcpdump or application logs could never provide so cleanly.

    Scenario 2: Achieving L7 Visibility for API Call Tracing

    A common challenge in microservices is understanding the application-level interactions. Which API endpoint is service A calling on service B? Is it using the correct HTTP method? This is traditionally the domain of service meshes like Istio or Linkerd, which inject a sidecar proxy to inspect L7 traffic. However, this comes with significant operational overhead and resource consumption.

    Cilium's eBPF datapath can parse several L7 protocols (HTTP, gRPC, Kafka, DNS) without a sidecar. The same eBPF programs that handle L3/L4 routing can be extended to understand and enforce policies on L7 data for unencrypted traffic.

    Let's extend our policy to be more granular. We want the api-gateway to be able to call GET /users on the user-service, but not POST /users.

    Code Example 4: An Advanced CiliumNetworkPolicy with L7 HTTP rules

    yaml
    # l7-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "api-access-policy"
      namespace: prod-app
    spec:
      endpointSelector:
        matchLabels:
          app: user-service # Policy now applies to the user-service
      ingress:
      - fromEndpoints:
        - matchLabels:
            app: api-gateway
        toPorts:
        - ports:
          - port: "80"
            protocol: TCP
          # L7 Rules for HTTP traffic on port 80
          rules:
            http:
            - method: "GET"
              path: "/get" # httpbin uses /get for GET requests

    After applying this policy, let's try to make both a GET and a POST request from the api-gateway pod to the user-service.

    bash
    # Exec into the api-gateway pod
    kubectl exec -it -n prod-app ${API_POD} -- /bin/bash
    
    # This request will succeed (matches the policy)
    curl -X GET http://user-service-svc.prod-app.svc.cluster.local/get
    
    # This request will be rejected by Cilium's L7 proxy with "403 Forbidden"
    curl -X POST http://user-service-svc.prod-app.svc.cluster.local/post -d '{"key":"value"}'

    Again, the application only sees an opaque 403 response with no indication of which policy rejected the request. But Hubble can tell us exactly what happened at the L7 level.

    bash
    # Observe HTTP traffic, filtering for the user-service destination
    hubble observe -n prod-app --to-label app=user-service --protocol http -o json

    The output will contain distinct flow records for each attempt:

    Successful GET Request:

    json
    {
      "flow": {
        "verdict": "FORWARDED",
        "l7": {
          "type": "REQUEST",
          "latency_ns": "5000000",
          "http": {
            "code": 200,
            "method": "GET",
            "url": "http://user-service-svc.prod-app.svc.cluster.local/get",
            "protocol": "HTTP/1.1",
            "headers": [ ... ]
          }
        },
        // ... other fields ...
      }
    }

    Denied POST Request:

    json
    {
      "flow": {
        "verdict": "DROPPED",
        "drop_reason_desc": "POLICY_DENIED",
        "l7": {
          "type": "REQUEST",
          "http": {
            "code": 0, // No response
            "method": "POST",
            "url": "http://user-service-svc.prod-app.svc.cluster.local/post",
            "protocol": "HTTP/1.1"
          }
        },
        // ... other fields ...
        "traffic_direction": "INGRESS",
        "Summary": "TCP Flags: SYN, ACK -> 10.0.1.200:80 HTTP/1.1 POST http://user-service-svc.prod-app.svc.cluster.local/post"
      }
    }

    This is incredibly powerful. We can see the exact HTTP method and URL that were denied by the policy, reported straight from the datapath. This allows us to debug application integration issues, enforce API contracts at the network layer, and gain deep visibility without the complexity of a service mesh.

    Advanced Patterns and Performance Considerations

    As you scale your usage of Hubble, several advanced patterns and performance considerations come into play.

    Performance Tuning Hubble

    Hubble's observability is not free. The eBPF programs push flow events into a perf ring buffer, which the userspace Hubble agent consumes. In clusters with extremely high connection rates, this buffer can fill up, leading to lost observability events.

    * Monitor for Drops: Watch Hubble's lost-event counters, such as the hubble_lost_events_total Prometheus metric. If this counter is increasing, the ring buffer is overflowing and you are losing visibility.

    * Increase Buffer Sizes: In the Cilium ConfigMap (kubectl edit cm -n kube-system cilium-config), you can tune hubble-event-buffer-capacity. The default is 4095. Increasing this can help absorb bursts but uses more memory.

    * Flow Sampling: For massive clusters where capturing every single flow is not feasible, consider enabling sampling. This is not yet a mature feature but is under active development in the Cilium community.
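As a concrete illustration, raising the event buffer via the ConfigMap might look like the fragment below. This is a sketch: the capacity must be a power of two minus one, and agents pick the change up on restart.

```yaml
# Fragment of the cilium-config ConfigMap (kube-system)
apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-config
  namespace: kube-system
data:
  # Default is 4095; larger values absorb bursts at the cost of agent memory
  hubble-event-buffer-capacity: "16383"
```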

    Edge Case: Inter-node Encrypted Traffic (WireGuard)

    Cilium can provide transparent encryption for all inter-node traffic using WireGuard. When enabled, Cilium encapsulates all traffic between nodes in an encrypted tunnel. This has a significant implication for Hubble's L7 visibility.

    Since the L7 data (e.g., HTTP headers) is encrypted once it leaves the source node, the Cilium agent on the destination node cannot parse it. Therefore, L7 observability in Hubble only works for traffic between pods on the same node when transparent encryption is active. L3/L4 flow data (IPs, ports, verdicts) remains fully visible for all traffic, as this is determined before encryption.

    This is a critical trade-off: you must choose between full cluster-wide L7 visibility and inter-node traffic encryption. For many security postures, encryption is non-negotiable, and you must accept the limitation on L7 observability or rely on a service mesh for that specific capability.
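For reference, transparent encryption is enabled through the same Helm values file as the rest of the configuration. A minimal sketch:

```yaml
# Addition to values.yaml: enable node-to-node WireGuard encryption
encryption:
  enabled: true
  type: wireguard
```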

    Integrating with the Broader Observability Stack

    Hubble's CLI and UI are excellent for interactive debugging, but its true power is realized when integrated with your existing monitoring and alerting systems.

    Code Example 5: Prometheus Integration

    Assuming you've enabled metrics and the ServiceMonitor in the Helm chart, Prometheus will automatically scrape Hubble. You can then build powerful alerts.

    For example, to alert when a specific application (api-gateway) experiences a high rate of policy-denied packet drops:

    promql
    # PromQL alert for high packet drop rate
    sum(rate(hubble_drop_total{reason="Policy denied", namespace="prod-app", source_pod=~"api-gateway.*"}[5m])) by (source_pod, destination_pod) > 10

    This PromQL query calculates the per-second rate of drops due to policy denials from the api-gateway over a 5-minute window and fires if it exceeds 10 drops/sec. Note that per-pod labels such as source_pod are only present on hubble_drop_total if you enable context options for the drop metric (for example, drop:sourceContext=pod;destinationContext=pod under hubble.metrics.enabled); with default settings the metric carries only reason and protocol labels, and the exact label names depend on that configuration. This transforms Hubble from a reactive debugging tool into a proactive security and reliability monitor.
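Wrapped into a PrometheusRule for the Prometheus Operator, the query might look like this. A sketch: the rule group, alert name, and namespace are assumptions, and the per-pod labels presuppose drop-metric context options are enabled.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: hubble-drop-alerts
  namespace: monitoring
spec:
  groups:
  - name: hubble.network
    rules:
    - alert: ApiGatewayPolicyDrops
      expr: |
        sum(rate(hubble_drop_total{reason="Policy denied", namespace="prod-app", source_pod=~"api-gateway.*"}[5m]))
          by (source_pod, destination_pod) > 10
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High rate of policy-denied drops from api-gateway"
```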

    Conclusion: The Kernel as the Source of Truth

    eBPF-based observability with Cilium and Hubble is not just an incremental improvement over existing tools; it is a fundamental shift in how we approach cloud-native network diagnostics. By tapping directly into the kernel, we get an unfiltered, performant, and context-rich view of how our services are interacting. We can move beyond guessing based on application logs and tcpdump traces to making definitive diagnoses based on the kernel's own policy enforcement verdicts.

    For senior engineers responsible for the stability and security of complex Kubernetes clusters, mastering these tools is no longer a luxury. It is a core competency for building resilient, observable, and secure systems in the cloud-native era.
