Advanced eBPF Policies in Cilium for Zero-Trust Microservices

Goh Ling Yong

Beyond `NetworkPolicy`: The eBPF Advantage for Granular Security

Standard Kubernetes NetworkPolicy objects provide a foundational layer of network segmentation, but they are fundamentally limited by their reliance on the underlying CNI's implementation—often iptables. For senior engineers building secure, multi-tenant, or high-performance systems, the limitations of iptables become a critical bottleneck: scalability issues with large numbers of rules, performance overhead from traversing chains, and an inability to understand L7 application protocols. This is not a tenable foundation for a zero-trust architecture.

Enter Cilium and its eBPF-powered datapath. By attaching eBPF (extended Berkeley Packet Filter) programs directly to kernel hooks like XDP (eXpress Data Path) and TC (Traffic Control), Cilium bypasses iptables and kube-proxy entirely for service routing and policy enforcement. This fundamental architectural shift is not just a performance optimization; it's an enabler for a new class of highly efficient, identity-aware, and application-aware security policies.

This article assumes you understand the basics of Kubernetes networking and Cilium's role as a CNI. We will not cover installation or introductory concepts. Instead, we will focus on advanced, production-ready policy patterns that leverage the full power of eBPF to implement a robust zero-trust model.

The Performance Case: eBPF Datapath vs. `iptables`

Before diving into policy specifics, it's crucial to understand why eBPF offers a superior foundation. In a large cluster, kube-proxy in iptables mode creates extensive iptables chains. For each packet, the kernel must traverse these chains sequentially, a process whose latency scales linearly (O(n)) with the number of services and rules. This can introduce significant network latency and CPU overhead on nodes.
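
To get a feel for the scale of the problem, you can count the rules kube-proxy generates on a node of an iptables-based cluster. This is an illustrative check to run on the node itself; the count grows roughly with the number of services and their endpoints:

bash
# Rough count of kube-proxy's iptables rules on this node
sudo iptables-save | grep -c '^-A KUBE-'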

Cilium's eBPF datapath replaces this with highly efficient hash table lookups (O(1) complexity). When a packet arrives at a network interface, Cilium's eBPF program extracts its identity (derived from Kubernetes labels, service accounts, etc., and stored in a compact numeric format) and performs a map lookup to determine the applicable policy. This is orders of magnitude faster, especially at scale.

Conceptual Packet Flow Comparison:

* iptables-based CNI: Packet -> NIC -> Kernel Netfilter Hooks -> Traverse PREROUTING Chain -> Traverse FORWARD/INPUT Chain -> Traverse Custom CNI Chains -> Decision -> Traverse POSTROUTING Chain -> Egress

* Cilium eBPF CNI: Packet -> NIC -> TC/XDP eBPF Hook -> eBPF Map Lookup (Identity, Policy) -> Decision (Forward/Drop) -> Egress

This efficiency allows us to build complex, granular policies without fearing the performance degradation inherent in legacy systems.
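
Both halves of that comparison can be inspected on a live cluster. The commands below are an illustrative sketch; they assume Cilium was installed with its default k8s-app=cilium labels in kube-system, so adjust names for your environment:

bash
# Pick one Cilium agent pod
CILIUM_POD=$(kubectl -n kube-system get pods -l k8s-app=cilium -o jsonpath='{.items[0].metadata.name}')

# The numeric security identities and the label sets they encode
kubectl -n kube-system exec "$CILIUM_POD" -c cilium-agent -- cilium identity list

# Service-to-backend mappings served from eBPF maps (the kube-proxy replacement)
kubectl -n kube-system exec "$CILIUM_POD" -c cilium-agent -- cilium bpf lb list

The identity list is the bridge between Kubernetes labels and what the eBPF programs actually match on; the policies below are expressed in terms of those identities.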

Pattern 1: Identity-Based L3/L4 Policies for Internal Trust

Zero-trust dictates that we never trust, always verify. In Kubernetes, this means moving beyond IP-based trust and establishing it on verifiable workload identity. Cilium excels at this by assigning a unique numeric security identity to every endpoint, derived from its labels.

Let's model a realistic microservices application:

* frontend: Serves the UI. Namespace: app. Labels: app=frontend, tier=presentation.

* api-gateway: The entrypoint for all API calls. Namespace: app. Labels: app=api-gateway, tier=backend.

* user-service: Manages user data. Namespace: app. Labels: app=user-service, tier=backend.

* payment-service: Processes payments. Namespace: app. Labels: app=payment-service, tier=backend.

* postgres-db: The primary database. Namespace: db. Labels: app=postgres, tier=database.

Our security requirements are:

  • Default deny all ingress and egress traffic in the app namespace.
  • Allow ingress to api-gateway from frontend on TCP port 8080.
  • Allow api-gateway to talk to user-service and payment-service on TCP port 80.
  • Allow user-service and payment-service to talk to postgres-db on TCP port 5432.
  • No other communication is allowed (e.g., frontend cannot directly access user-service).

First, we establish a default-deny posture for the app namespace. This is a critical first step.

    yaml
    # 01-default-deny.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "default-deny-app-namespace"
      namespace: app
    spec:
      endpointSelector: {}
      ingress: []
      egress: []

    Applying this policy will immediately block all traffic into and out of every pod in the app namespace, including DNS lookups to kube-dns (which is why Pattern 3 later adds an explicit DNS allow rule). Before layering in the specific allow rules, it is worth confirming that enforcement is actually active.
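
    A quick way to do that is to ask the Cilium agent for its view of the endpoints. The snippet below is an illustrative sketch; it assumes the agent runs as a DaemonSet named cilium in kube-system and that a Deployment named frontend exists in the app namespace with curl available in its image:

    bash
    # The POLICY (ingress) / POLICY (egress) ENFORCEMENT columns should now read "Enabled"
    # for the app endpoints hosted on that node
    kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium endpoint list

    # A request that worked before the default deny should now fail
    # (even the DNS lookup for api-gateway is blocked)
    kubectl -n app exec deploy/frontend -- curl -s --max-time 5 http://api-gateway:8080/ \
      || echo "blocked, as expected"

    With the lockdown confirmed, we can layer in our specific allow rules.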

    yaml
    # 02-app-l4-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "api-gateway-policy"
      namespace: app
    spec:
      endpointSelector:
        matchLabels:
          app: api-gateway
      ingress:
      - fromEndpoints:
        - matchLabels:
            app: frontend
        toPorts:
        - ports:
          - port: "8080"
            protocol: TCP
      egress:
      - toEndpoints:
        - matchLabels:
            app: user-service
        - matchLabels:
            app: payment-service
        toPorts:
        - ports:
          - port: "80"
            protocol: TCP
    
    ---
    
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "backend-services-policy"
      namespace: app
    spec:
      # This policy applies to multiple services
      endpointSelector:
        matchExpressions:
          - {key: app, operator: In, values: [user-service, payment-service]}
      # Allowing api-gateway -> user-service in the gateway's egress policy is not enough:
      # the receiving side must also permit the flow on ingress.
      ingress:
        - fromEndpoints:
          - matchLabels:
              app: api-gateway
          toPorts:
            - ports:
              - port: "80"
                protocol: TCP
      egress:
        - toEndpoints:
          # Selects endpoints in any namespace with these labels
          - matchLabels:
              app: postgres
              tier: database
          toPorts:
            - ports:
              - port: "5432"
                protocol: TCP

    Analysis and Edge Cases:

    * endpointSelector: This is the core of the policy, defining which pods it applies to. We use matchLabels for single endpoints and matchExpressions to group multiple services under one policy object.

     * fromEndpoints / toEndpoints: This is where Cilium's identity-based model shines. We are not specifying IP ranges or CIDRs; we are specifying the identity of the allowed peer, based on its labels. If a user-service pod is rescheduled and gets a new IP, this policy remains effective without any changes.

    * Cross-Namespace Communication: Note that the backend-services-policy allows egress to an endpoint with app: postgres. Cilium resolves this identity across the entire cluster, regardless of namespace. This is a powerful abstraction over IP-based rules.

     * Policy Evaluation Logic: With no policy applied, a Cilium endpoint allows all traffic. Once a policy selects an endpoint, that endpoint enters default-deny for the direction(s) the policy covers (ingress and/or egress), and any traffic not explicitly allowed by a matching rule is dropped. If multiple policies select the same endpoint, their rules are unioned: a flow is allowed if any applicable policy permits it. Remember too that a flow must be allowed at both ends, egress on the sender and ingress on the receiver, which is why backend-services-policy carries an ingress rule mirroring the gateway's egress rule; a matching egress rule for frontend and a DNS allow rule (see Pattern 3) are omitted here for brevity.
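
     Negative testing matters as much as the allow rules. A quick spot check that the frontend really cannot bypass the gateway might look like the following; this is an illustrative sketch that assumes a Deployment named frontend with curl in its image and a working Hubble CLI:

     bash
     # frontend has no egress allow rules at all, so even the DNS lookup for user-service is dropped
     kubectl -n app exec deploy/frontend -- curl -s --max-time 5 http://user-service/v1/users \
       || echo "dropped, as expected"

     # Confirm the drop and see the identities involved (placeholder pod name, as elsewhere in this article)
     hubble observe --from-pod app/frontend-xxxxx --verdict DROPPED --last 10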

    Pattern 2: Advanced L7 Policy Enforcement for APIs

    L3/L4 policies are good, but for true zero-trust, we need to understand the application protocol. What if a compromised api-gateway tries to execute a destructive DELETE operation on user-service, even though it's an allowed peer at L4?

    Cilium can enforce L7 policies, particularly for HTTP, gRPC, and Kafka, by transparently injecting a proxy (like Envoy) into the data path only for traffic that requires L7 inspection. This is done without sidecar containers, managed directly by the Cilium agent.

    Requirement: Allow api-gateway to call user-service, but only GET /v1/users and POST /v1/users. All other paths, including DELETE /v1/users, must be blocked.

    We modify the api-gateway-policy from the previous example.

    yaml
    # 03-api-gateway-l7-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "api-gateway-policy"
      namespace: app
    spec:
      endpointSelector:
        matchLabels:
          app: api-gateway
      # Ingress policy remains the same (from frontend on 8080)
      ingress:
      - fromEndpoints:
        - matchLabels:
            app: frontend
        toPorts:
        - ports:
          - port: "8080"
            protocol: TCP
      # Egress policy is now L7-aware
      egress:
      - toEndpoints:
        - matchLabels:
            app: user-service
        toPorts:
        - ports:
          - port: "80"
            protocol: TCP
          rules:
            http:
            - method: "GET"
              path: "/v1/users/?.*"
            - method: "POST"
              path: "/v1/users"
      # Egress to payment-service can remain L4 if no L7 rules are needed
      - toEndpoints:
        - matchLabels:
            app: payment-service
        toPorts:
        - ports:
          - port: "80"
            protocol: TCP

    Implementation Details and Performance Considerations:

    * rules.http: This section activates L7 policy enforcement. The eBPF program on the api-gateway's node will redirect traffic destined for user-service on port 80 to the Envoy proxy managed by the Cilium agent.

     * Path Matching: The path field is an extended regular expression. /v1/users/?.* is used to match both the base path and requests with query strings like /v1/users?id=123.

    * Proxy Overhead: L7 inspection is not free. It consumes additional CPU and memory on the node's Cilium agent and introduces a small amount of latency for the proxy hop. The key benefit of Cilium's approach is that this cost is only paid for the specific traffic flows where you define L7 rules. All other traffic (like the flow to payment-service) remains purely in the eBPF fast path.

    * Production Strategy: Use L7 policies judiciously. Apply them to critical API boundaries, sensitive data services, or inter-service communication that requires fine-grained access control. For bulk data transfer or latency-sensitive internal RPCs where L4 trust is sufficient, stick to L4 policies to maximize performance.

    * TLS Inspection: For HTTPS traffic, L7 visibility requires TLS termination or using technologies like mutual TLS (mTLS) where the proxy can be part of the trust chain. Cilium integrates with service meshes like Istio or can use its own mechanisms to manage certificates for this purpose, but that adds another layer of complexity.

    Pattern 3: DNS-Aware Egress Policies for External Services

    Microservices often need to communicate with external, third-party APIs (e.g., payment gateways, monitoring services, cloud provider APIs). Hardcoding IP addresses is brittle and unmanageable, especially for services hosted on CDNs with dynamic IPs.

    Cilium's DNS-aware policies solve this elegantly. The eBPF program can inspect DNS traffic, cache the results, and dynamically update the allowed egress IP list for a given FQDN.

    Requirement: The payment-service must be able to connect to api.stripe.com on port 443, but nowhere else on the internet.

    yaml
    # 04-payment-service-egress-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: "payment-service-egress-policy"
      namespace: app
    spec:
      endpointSelector:
        matchLabels:
          app: payment-service
      egress:
        # Rule 1: Allow DNS lookups to kube-dns
        - toEndpoints:
          - matchLabels:
              "k8s:io.kubernetes.pod.namespace": kube-system
              "k8s:k8s-app": kube-dns
          toPorts:
            - ports:
              - port: "53"
                protocol: UDP
              rules:
                dns:
                  - matchPattern: "*"
    
        # Rule 2: Allow egress to the FQDN for Stripe
        - toFQDNs:
          - matchName: "api.stripe.com"
          toPorts:
            - ports:
              - port: "443"
                protocol: TCP
    
        # Rule 3: We still need the rule to talk to our internal DB!
        - toEndpoints:
          - matchLabels:
              app: postgres
              tier: database
          toPorts:
            - ports:
              - port: "5432"
                protocol: TCP
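
     A quick functional check from inside the workload might look like this (illustrative; it assumes a Deployment named payment-service with curl in its image):

     bash
     # Allowed: the lookup goes through kube-dns, the returned IPs are whitelisted for this
     # identity, and the TCP connection on 443 succeeds (any HTTP status proves connectivity)
     kubectl -n app exec deploy/payment-service -- \
       curl -s -o /dev/null -w "%{http_code}\n" https://api.stripe.com/v1

     # Blocked: DNS resolves, but no toFQDNs rule covers this destination, so the connection times out
     kubectl -n app exec deploy/payment-service -- \
       curl -s --max-time 5 https://example.com || echo "blocked, as expected"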

    How It Works and Critical Edge Cases:

    * DNS Interception: When payment-service attempts a DNS lookup for api.stripe.com, the request (typically to kube-dns) is allowed by the first egress rule. Cilium's eBPF program on the node sees the DNS response. It extracts the IP addresses (e.g., 52.85.124.21, 52.85.124.53, etc.) and stores them in an eBPF map associated with the identity of payment-service and the FQDN api.stripe.com.

    * Dynamic IP Updates: When payment-service then tries to open a TCP connection to one of Stripe's IPs, the eBPF program on egress checks if the destination IP is in the allowed map for that pod's identity. If it is, the connection is allowed.

    * The DNS TTL Race Condition: This is the most critical edge case. What happens if the DNS TTL for api.stripe.com is very short (e.g., 60 seconds), and the IP address changes? The pod might have a cached IP that is no longer valid, or Cilium's cache might not have the newest IP yet.

    * Cilium's Mitigation: Cilium respects the DNS record's TTL. When a DNS response is intercepted, the corresponding IPs are allowed for the duration of the TTL. After the TTL expires, the rule is flushed. The application must perform a new DNS lookup to re-populate the cache.

    * Application Impact: Your application's DNS caching behavior is critical. If your application (or language runtime, like the JVM) caches DNS lookups indefinitely, it may try to connect to a stale IP after Cilium's rule has expired. This will result in a blocked connection. You must configure your application's DNS cache TTL to be lower than the TTLs of the external services you interact with.

    * Debugging: This is a notoriously difficult problem to debug. A connection that worked moments ago is now failing. The key is to check Cilium's FQDN cache and correlate it with nslookup results from within the pod.
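
     In practice, that correlation looks something like the sketch below. It assumes the agent DaemonSet is named cilium in kube-system and that nslookup is available in the payment-service image:

     bash
     # Cilium's view: which IPs are currently allowed for which FQDN, and their remaining TTL
     kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium fqdn cache list

     # The pod's view: what a fresh lookup resolves to right now
     kubectl -n app exec deploy/payment-service -- nslookup api.stripe.com

     If the application is still connecting to an IP that has already aged out of Cilium's cache, you have found the culprit.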

    Debugging and Observability with Hubble

    Complex policies are useless if you can't verify or debug them. iptables -L -v is not an option here. The tool of choice is Hubble, Cilium's observability component.

     Let's debug a failed request caused by our L7 policy. A developer adds a feature to the api-gateway that issues DELETE calls to user-service.

    bash
    # From inside the api-gateway pod
     $ curl -i -X DELETE http://user-service/v1/users/123
     HTTP/1.1 403 Forbidden

     The request is rejected with a 403 that user-service itself never sent, but from the application's side it is not obvious where the denial came from. We use the Hubble CLI to inspect recent traffic from the api-gateway.

    bash
    # On the Kubernetes node or via the hubble-cli pod
    $ hubble observe --from-pod app/api-gateway-xxxxxxxx-yyyyy -n app --to-pod app/user-service-zzzzzz-wwwww
    
    # Sample Hubble CLI Output
    TIMESTAMP            SOURCE                       DESTINATION                  TYPE          VERDICT     SUMMARY
    May 20 14:30:10.123  app/api-gateway-xxxxx:34872  app/user-service-zzzzz:80    l7-request    DENIED      HTTP/1.1 DELETE /v1/users/123
    May 20 14:30:10.124  app/api-gateway-xxxxx:34872  app/user-service-zzzzz:80    l7-response   DENIED      HTTP/1.1 403 Forbidden

    Hubble's output is unambiguous. It shows:

    * The source and destination pods.

    * The traffic TYPE is l7-request, confirming it was intercepted by the proxy.

    * The VERDICT is DENIED.

    * The SUMMARY provides the exact reason: an HTTP DELETE request to /v1/users/123 was observed, which did not match our allow policy.

    This level of detail is impossible to get from iptables logs or standard flow logs. For visual debugging, the Hubble UI can render a full service map, showing allowed flows in green and denied flows in red, allowing you to instantly spot misconfigurations.
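
     If the Hubble UI component is deployed, the cilium CLI can port-forward to it and open the service map in a browser:

     bash
     # Opens the Hubble UI (requires the cilium CLI and hubble-ui to be installed)
     cilium hubble ui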

    To debug the DNS issue, you can use Hubble to filter for DNS traffic and policy verdicts:

    bash
    # See DNS lookups and their outcomes from payment-service
     hubble observe --from-pod app/payment-service-xxxxx -n app --protocol dns
    
    # See if egress traffic to a specific IP is being dropped
    hubble observe --from-pod app/payment-service-xxxxx -n app --to-ip 52.85.124.21 --verdict DROPPED
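
     Finally, when rolling out new or modified policies, watching policy verdicts for the entire namespace is a cheap way to catch unexpected drops early:

     bash
     # Recent allow/deny decisions for everything in the app namespace
     hubble observe --namespace app --type policy-verdict --last 20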

    Conclusion: A New Baseline for Cloud-Native Security

    Implementing a zero-trust security model in Kubernetes requires moving beyond the limitations of traditional, IP-based filtering. Cilium's eBPF-powered datapath provides the necessary foundation for building highly performant, scalable, and granular security policies.

    By mastering identity-based L3/L4 rules, surgical L7 API-level enforcement, and dynamic DNS-aware egress controls, senior engineers can construct a security posture that is both robust and aligned with the ephemeral, dynamic nature of cloud-native applications. These advanced patterns are not just theoretical; they are battle-tested solutions to real-world production security challenges. The key to success lies not only in writing the policies but also in leveraging observability tools like Hubble to maintain visibility and control over the complex interactions within a modern microservices architecture.
