Advanced eBPF Network Policies in Cilium for Zero-Trust K8s

Goh Ling Yong

The Inadequacy of `iptables` for Modern Cloud-Native Security

For any engineer operating Kubernetes at scale, the limitations of the default NetworkPolicy resource, and more fundamentally, its common implementation via iptables, become painfully apparent. Standard policies operate at L3/L4, relying on IP addresses and ports. In a dynamic Kubernetes environment where pods are ephemeral and IPs change constantly, this model is brittle. Furthermore, iptables performance degrades linearly as the number of rules increases, leading to significant latency and CPU overhead in clusters with thousands of services and policies.

This isn't a critique of iptables itself; it's a testament to a tool being stretched beyond its intended design. The core problem is the sequential traversal of long rule chains in kernel space: for every packet, the kernel must walk these chains until it finds a match. In a microservices mesh with heavy east-west traffic, this per-packet overhead multiplies, creating a bottleneck that directly impacts application performance.

This is the context in which eBPF (extended Berkeley Packet Filter) emerges as a transformative technology for cloud-native networking. By allowing sandboxed programs to run directly within the Linux kernel, eBPF enables tools like Cilium to create a networking and security datapath that is more performant, programmable, and context-aware. Cilium bypasses iptables entirely for pod-to-pod traffic, attaching eBPF programs to network interfaces to make policy decisions at the earliest possible point in the packet processing pipeline. Policy lookups become O(1) operations using eBPF maps (highly efficient key-value stores in the kernel), irrespective of the number of policies.
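
If Cilium is already running in your cluster, you can confirm that the eBPF datapath (rather than iptables) is handling traffic by querying the agent. A minimal check, assuming a standard Helm install where the agent DaemonSet is named cilium in kube-system:

```bash
# Query the Cilium agent on any node; the DaemonSet name "cilium" in
# kube-system is the Helm default, adjust if your install differs.
kubectl -n kube-system exec ds/cilium -- cilium status | grep -E 'KubeProxyReplacement|Host Routing|Masquerading'
```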

This article assumes you understand the basics of Kubernetes networking and NetworkPolicy. We will dive directly into advanced security patterns that are only possible with Cilium's eBPF datapath and its custom resource definitions (CRDs), CiliumNetworkPolicy and CiliumClusterwideNetworkPolicy.

Core Mechanism: Identity-Based Security with eBPF

Before we dive into policy examples, it's crucial to understand Cilium's core concept: identity-based security. Instead of using a pod's transient IP address as its primary identifier, Cilium uses a stable identity derived from its Kubernetes labels.

Here's the workflow:

  • Identity Allocation: When a pod is created, the Cilium agent on the node inspects its labels (e.g., app=frontend, role=api).
  • Identity Registration: The agent resolves that unique set of labels to a numeric Security Identity, either through a central key-value store (like etcd) or, in the default CRD mode, through CiliumIdentity objects in the Kubernetes API. Every pod that carries the same set of labels shares the same identity cluster-wide.
  • eBPF Map Programming: The agent programs an eBPF map on the node, mapping the pod's local IP address to its newly assigned Security Identity.
  • Policy Enforcement: When a packet leaves a pod, an eBPF program attached to the pod's network interface (veth pair) looks up the destination IP in another eBPF map. This map contains the Security Identities of all other pods in the cluster. The eBPF program then checks a policy map to see if SourceIdentity is allowed to communicate with DestinationIdentity on the specified port and protocol. This entire check happens in the kernel with a few hash table lookups, resulting in near-native packet processing speed.
This abstraction away from IP addresses is what enables powerful, declarative policies that remain stable even as pods are rescheduled and reassigned IPs across the cluster.
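
You can inspect this machinery directly on a node. A quick sketch, again assuming the agent DaemonSet is named cilium in kube-system:

```bash
# List allocated Security Identities and the label sets they represent
kubectl -n kube-system exec ds/cilium -- cilium identity list

# Show the IP -> identity mappings programmed into the eBPF ipcache map
kubectl -n kube-system exec ds/cilium -- cilium bpf ipcache list

# Find a local endpoint's numeric ID, then dump its policy map
# (<endpoint-id> comes from the endpoint list output above)
kubectl -n kube-system exec ds/cilium -- cilium endpoint list
kubectl -n kube-system exec ds/cilium -- cilium bpf policy get <endpoint-id>
```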

    Production Example 1: Strict Ingress Control for a Backend API

    Let's model a common microservice scenario. We have a payment-api that should only accept ingress traffic from the checkout-api and a specific batch-processor job. Any other pod, even within the same namespace, should be denied access.

    First, let's define our pods with appropriate labels:

    ```yaml
    # checkout-api.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: checkout-api
      namespace: production
    spec:
      selector:
        matchLabels:
          app: checkout-api
      template:
        metadata:
          labels:
            app: checkout-api
            role: frontend-api
        spec:
          containers:
          - name: main
            image: my-repo/checkout-api:1.2.0
    ---
    # payment-api.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: payment-api
      namespace: production
    spec:
      selector:
        matchLabels:
          app: payment-api
      template:
        metadata:
          labels:
            app: payment-api
            role: backend-api
        spec:
          containers:
          - name: main
            image: my-repo/payment-api:2.5.1
            ports:
            - containerPort: 8080
    ---
    # batch-processor.yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: nightly-settlement
      namespace: ops
    spec:
      template:
        metadata:
          labels:
            app: batch-processor
            task: settlement
        spec:
          containers:
          - name: processor
            image: my-repo/batch-processor:3.0.0
          restartPolicy: Never
    ```

    Now, we'll use a CiliumNetworkPolicy to enforce our desired ingress rule on the payment-api.

    ```yaml
    # payment-api-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: payment-api-ingress-policy
      namespace: production
    spec:
      endpointSelector:
        matchLabels:
          app: payment-api
      ingress:
      - fromEndpoints:
        - matchLabels:
            app: checkout-api
            role: frontend-api
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": ops
            app: batch-processor
        toPorts:
        - ports:
          - port: "8080"
            protocol: TCP
    ```

    Dissection of the Policy:

    * endpointSelector: This targets the pods to which the policy applies. Here, any pod with the label app: payment-api in the production namespace.

    * ingress: This block defines the allowed incoming traffic rules. By default, if an ingress block is present, all other ingress traffic is denied (zero-trust default).

    * fromEndpoints: This is the core of identity-based security. Instead of specifying CIDRs, we specify label selectors for allowed source pods.

    * The first selector allows any pod with both app: checkout-api and role: frontend-api.

    * The second selector demonstrates a cross-namespace rule. It allows pods from the ops namespace ("k8s:io.kubernetes.pod.namespace": ops) that also have the app: batch-processor label. The k8s: prefix denotes a reserved label that Cilium automatically applies to endpoints.

    * toPorts: This specifies that the allowed traffic must be on TCP port 8080. Traffic to other ports on the payment-api pod will be dropped.

    This policy is far more robust and readable than an IP-based equivalent. It describes intent rather than network topology.
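
    To sanity-check the policy before relying on it, probe the API from a pod that none of the allow rules select and watch Cilium's verdicts. The probe below is a sketch: the curlimages/curl image, the /healthz path, and a ClusterIP Service named payment-api are illustrative assumptions, not part of the manifests above.

    ```bash
    # From a throwaway pod that matches none of the allowed selectors
    kubectl -n production run policy-probe --rm -it --restart=Never \
      --image=curlimages/curl -- curl -sS --max-time 3 http://payment-api:8080/healthz

    # In a second terminal, watch drops destined for the payment-api pod
    hubble observe --namespace production --to-pod payment-api --verdict DROPPED -f
    ```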

    Advanced Scenario: L7-Aware Policies for API Security

    L3/L4 policies are often insufficient. Consider a shared internal API gateway that routes requests to different backend services based on the HTTP path. We might want to allow a user-service to read data (GET /api/v1/users/{id}) but prevent it from deleting data (DELETE /api/v1/users/{id}). An admin-service, however, should be allowed to perform both actions.

    This requires L7 visibility, which Cilium provides by transparently integrating with an Envoy proxy. When an L7 policy is applied, Cilium's eBPF datapath intercepts the relevant traffic and redirects it to an Envoy proxy running on the same node without any application configuration changes.

    Production Example 2: Granular HTTP Method and Path Control

    Let's secure an internal-gateway pod. We have two clients: user-profile-service and admin-dashboard.

    ```yaml
    # internal-gateway-l7-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: internal-gateway-l7-policy
      namespace: core-infra
    spec:
      endpointSelector:
        matchLabels:
          app: internal-gateway
      ingress:
      - fromEndpoints:
        - matchLabels:
            app: user-profile-service
        toPorts:
        - ports:
          - port: "80"
            protocol: TCP
          rules:
            http:
            - method: "GET"
              path: "/api/v1/users/.*"
            - method: "PUT"
              path: "/api/v1/users/[0-9]+/profile"
      - fromEndpoints:
        - matchLabels:
            app: admin-dashboard
        toPorts:
        - ports:
          - port: "80"
            protocol: TCP
          rules:
            http:
            - method: "GET"
              path: "/api/v1/users/.*"
            - method: "DELETE"
              path: "/api/v1/users/[0-9]+"
            - method: "GET"
              path: "/api/v1/metrics"

    Dissection of the L7 Policy:

    * The policy is split into two ingress stanzas, one for each source identity.

    * For user-profile-service: We open TCP port 80, but add an L7 rules block. This block specifies that only GET requests to paths matching the regex /api/v1/users/.* and PUT requests to /api/v1/users/[0-9]+/profile are allowed. Any other request from this service (e.g., POST /api/v1/users or DELETE /api/v1/users/123) will receive an HTTP 403 Forbidden response from the Envoy proxy.

    * For admin-dashboard: This service has more privileges. It can GET and DELETE users, and also access the /api/v1/metrics endpoint. The path matching uses POSIX ERE (Extended Regular Expression) syntax, providing powerful matching capabilities.
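
    A quick way to see these L7 verdicts in practice is to issue a request the policy forbids and watch Hubble's HTTP flows. The commands below are a sketch: they assume the client Deployments run in core-infra, that curl is available in the client image, and that the gateway is reachable as the Service internal-gateway.

    ```bash
    # A DELETE from user-profile-service should be rejected with a 403
    kubectl -n core-infra exec deploy/user-profile-service -- \
      curl -s -o /dev/null -w "%{http_code}\n" -X DELETE http://internal-gateway/api/v1/users/123

    # Observe HTTP-level (L7) flows arriving at the gateway
    hubble observe --namespace core-infra --to-pod internal-gateway --type l7 -f
    ```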

    Performance Considerations for L7 Policies:

    Enabling L7 inspection is not free. It involves redirecting traffic from the pure eBPF datapath to the Envoy proxy in user space. This introduces latency compared to L3/L4-only policies. However, the overhead is localized to the specific pods and ports targeted by the L7 policy. Best practice is to apply L7 policies surgically only where deep packet inspection is required, while using performant L3/L4 identity-based policies for the majority of traffic.

    Advanced Scenario: DNS-Aware Egress Policies for External Services

    Controlling egress traffic is a critical component of a zero-trust posture. A common requirement is to allow a pod to connect to a specific external service (e.g., a third-party payment provider like api.stripe.com) but nothing else. The challenge is that the IP addresses for these FQDNs can change frequently and unpredictably.

    Basing egress policies on static IP addresses is a maintenance nightmare and prone to failure. Cilium solves this with DNS-aware policies.

    Production Example 3: Locking Down Egress to a Specific FQDN

    Imagine a reporting-service that needs to upload data to an S3 bucket (my-company-reports.s3.us-east-1.amazonaws.com) and send metrics to Datadog (api.datadoghq.com).

    ```yaml
    # reporting-service-egress-policy.yaml
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: reporting-service-egress
      namespace: analytics
    spec:
      endpointSelector:
        matchLabels:
          app: reporting-service
      egress:
      - toFQDNs:
        - matchName: "my-company-reports.s3.us-east-1.amazonaws.com"
        - matchName: "api.datadoghq.com"
        toPorts:
        - ports:
          - port: "443"
            protocol: TCP
      # Allow DNS traffic itself, otherwise FQDN lookups will fail!
      - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            "k8s:k8s-app": kube-dns
        toPorts:
        - ports:
          - port: "53"
            protocol: UDP
          rules:
            dns:
            - matchPattern: "*"
    ```

    Dissection of the FQDN Policy:

    * egress block: This defines allowed outbound traffic. If present, all other egress traffic is denied by default.

    * toFQDNs: This is the key element. We specify the fully qualified domain names the pod is allowed to connect to.

    * How it works: When the reporting-service pod attempts to resolve api.datadoghq.com, the Cilium agent intercepts the DNS request. It allows the request to proceed to kube-dns, but it also inspects the response. It then caches the mapping of api.datadoghq.com to its resolved IP addresses (e.g., 52.20.126.253). Cilium dynamically programs the eBPF maps on the node to allow egress traffic from the reporting-service pod to these specific destination IPs on TCP port 443.

    * DNS TTL: Cilium respects the TTL of the DNS records. When a cached record expires, the next DNS lookup for that name refreshes the cache, and the eBPF maps are updated with the new IPs. This keeps the policy effective even as DNS records change.

    * Allowing DNS: A critical and often overlooked part of FQDN policies is that you must explicitly allow the pod to make DNS requests. The second egress stanza in the example does exactly this. It allows UDP traffic on port 53 to the kube-dns service endpoints.
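
    When debugging FQDN policies, it helps to look at the DNS-to-IP cache the agent has built. A sketch, again assuming the agent DaemonSet is named cilium in kube-system:

    ```bash
    # Show the FQDN -> IP mappings learned from DNS responses, with TTLs
    kubectl -n kube-system exec ds/cilium -- cilium fqdn cache list

    # Watch DNS lookups from the reporting-service as they pass the DNS proxy
    hubble observe --namespace analytics --from-pod reporting-service --protocol DNS -f
    ```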

    Edge Cases and Production Hardening

    Implementing these policies in production requires consideration for several edge cases.

    Policy for `hostNetwork` Pods

    Pods running with hostNetwork: true (e.g., monitoring agents like Datadog or Prometheus node-exporter) bypass the pod network namespace and bind directly to the node's network interface. Standard NetworkPolicy cannot target them effectively. Cilium can enforce policies on these pods using CiliumClusterwideNetworkPolicy (CCNP) combined with a nodeSelector.

    ```yaml
    # Secure node-exporter access
    apiVersion: "cilium.io/v2"
    kind: CiliumClusterwideNetworkPolicy
    metadata:
      name: allow-prometheus-to-nodes
    spec:
      nodeSelector:
        matchLabels: {}
      ingress:
      - fromEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": monitoring
            app: prometheus
        toPorts:
        - ports:
          - port: "9100" # node-exporter port
            protocol: TCP
    ```

    This CCNP applies to the host networking stack on all nodes (the empty nodeSelector matches every node) and allows ingress on port 9100 only from pods labeled app: prometheus in the monitoring namespace. Note that host policies are only enforced when Cilium's host firewall feature is enabled.
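
    Enabling the host firewall is a Helm toggle. A minimal sketch for a Helm-managed install; the release name, namespace, and device list are assumptions to adapt to your environment:

    ```bash
    # Enable host policy enforcement (host firewall) via Helm
    helm upgrade cilium cilium/cilium -n kube-system --reuse-values \
      --set hostFirewall.enabled=true \
      --set devices='{eth0}'
    ```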

    Policy Auditing and Dry-Run

    Applying a restrictive network policy in a production environment can be risky. Cilium provides tools to mitigate this.

    * Policy Audit Mode: You can run policy enforcement in a non-enforcing, audit-only mode (for example, by enabling the agent's policy-audit-mode option). In this mode, traffic that would have been dropped is instead allowed, but a policy verdict is logged. This lets you observe the impact of a policy before turning on enforcement.

    * Hubble CLI for Observability: Hubble provides deep observability into the network flows within your cluster. You can use it to verify policy behavior in real-time.

    ```bash
        # Watch for dropped packets to the payment-api pod
        $ hubble observe --namespace production --to-pod payment-api --verdict DROPPED -f
        
        # Trace whether a specific flow would be allowed or denied
        # (run inside the Cilium agent pod on the relevant node)
        $ cilium policy trace --src-k8s-pod default:my-app --dst-k8s-pod production:payment-api --dport 8080
        ...
        Final verdict: ALLOWED
    ```
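
    The audit mode mentioned above can be enabled cluster-wide or per agent. A sketch, assuming a Helm-managed install and the default cilium DaemonSet:

    ```bash
    # Cluster-wide, via Helm: policies are evaluated but never enforced
    helm upgrade cilium cilium/cilium -n kube-system --reuse-values \
      --set policyAuditMode=true

    # Per agent, at runtime
    kubectl -n kube-system exec ds/cilium -- cilium config PolicyAuditMode=Enabled

    # Review policy verdict events (audited traffic is logged, not dropped)
    hubble observe --namespace production --type policy-verdict -f
    ```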

    Explicit Deny Policies

    While the default-deny model is powerful, sometimes you need to carve out exceptions in a generally permissive environment. CiliumNetworkPolicy supports explicit deny rules via ingressDeny and egressDeny sections. These rules take precedence over any allow rules.

    ```yaml
    # Deny access to the user database from the DMZ, even if broader allow rules exist
    apiVersion: "cilium.io/v2"
    kind: CiliumNetworkPolicy
    metadata:
      name: deny-user-database-from-dmz
      namespace: backend
    spec:
      endpointSelector:
        matchLabels:
          app: user-database
      ingressDeny:
      - fromEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": dmz
        toPorts:
        - ports:
          - port: "5432"
            protocol: TCP
    ```

    Note: Deny rules are evaluated at L3/L4 only; L7 rules (HTTP, DNS, and so on) are not supported inside ingressDeny or egressDeny blocks.

    Conclusion: A Paradigm Shift in Kubernetes Security

    Transitioning from iptables-based CNIs to an eBPF-powered solution like Cilium represents a fundamental shift in how we secure Kubernetes clusters. It's a move from a brittle, IP-centric model to a robust, identity-aware model that aligns with cloud-native principles.

    By leveraging CiliumNetworkPolicy, senior engineers can:

  • Implement True Zero-Trust: Build on a default-deny foundation where only explicitly allowed communication paths are permitted.
  • Decouple Security from Topology: Define policies based on service identity (labels), making them resilient to pod churn and IP changes.
  • Achieve Granular Control: Enforce L7 rules for APIs and DNS-aware rules for egress, providing a level of security unattainable with standard NetworkPolicy.
  • Enhance Performance and Scalability: Eliminate the iptables bottleneck, leading to lower latency and higher throughput, especially in large and high-traffic clusters.
The patterns discussed here (identity-based ingress, L7 API filtering, and FQDN-based egress) are not just features; they are the building blocks for a modern, secure, and performant microservices architecture. Mastering them is essential for any engineer responsible for the security and stability of production Kubernetes environments.
