Beyond NetworkPolicy: eBPF-Powered L7 Security with Cilium
The Inadequacy of L3/L4 for Microservice Security
As senior engineers building distributed systems on Kubernetes, we're all familiar with the native NetworkPolicy resource. It's a foundational tool for network segmentation, allowing us to control ingress and egress traffic between pods based on IP blocks (L3) and ports (L4). While essential, this model falls critically short in a microservices paradigm.
Consider a typical scenario: a user-service exposes a REST API on port 8080. A frontend-service needs to read user data (GET /api/v1/users/{id}), while an admin-service needs to delete users (DELETE /api/v1/users/{id}). A standard NetworkPolicy can only permit or deny traffic from the frontend-service to the user-service on port 8080. It has zero visibility into the HTTP method or path. Both the read and the destructive delete operations are equally allowed, leaving a significant security gap. You're left to handle this authorization logic within the application code, which can lead to inconsistencies and vulnerabilities.
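To make the gap concrete, here is roughly the best a vanilla NetworkPolicy can express for this scenario (a minimal sketch; the app=users and app=frontend labels and the policy name are assumptions for illustration):
# The most a plain NetworkPolicy can say: "frontend may reach users on TCP/8080".
# GET vs DELETE, and which path is requested, are invisible at this layer.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: user-service-l4-only
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: users
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
The admin-service would need an equivalent allow rule for the same port, at which point it could also issue GET requests; the policy simply cannot tell the two operations apart.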
This is where Container Network Interfaces (CNIs) that leverage eBPF, most notably Cilium, fundamentally change the game. By attaching eBPF programs to kernel hooks, Cilium makes identity-aware L3/L4 policy decisions directly in the kernel and, for flows that carry L7 rules, transparently steers traffic through a node-local proxy so that policy can be expressed in terms of HTTP methods and paths, gRPC calls, or Kafka topics. This provides a highly efficient, transparent, and powerful mechanism for enforcing application-aware security policies without sidecars or changes to application code.
This article will not be an introduction to Cilium. We assume you understand its role as a CNI and its identity-based security model. Instead, we will focus on the advanced implementation details of crafting, deploying, and debugging L7 policies in a production environment.
Cilium's eBPF Datapath: A Kernel-Level Deep Dive
To appreciate the power of L7 policies in Cilium, you must first understand how it works under the hood. Unlike iptables-based CNIs that rely on traversing long, complex chains of rules in kernel space, Cilium's eBPF datapath is event-driven and significantly more performant.
At the core of this model are Cilium's security identities: each unique set of pod labels maps to a numeric identity, and policy is expressed between identities rather than IP addresses. A rule allowing identity 123 to talk to identity 456 is a single rule, regardless of how many pods share those identities. Enforcement happens in eBPF programs attached to kernel hooks on each endpoint's network device (typically the veth pair connected to the pod). When a packet enters or leaves a pod, it triggers the attached eBPF program. Walking through a request from pod-A to pod-B:
* A packet is sent from pod-A to pod-B.
* The eBPF program on pod-A's egress hook is triggered.
* The program extracts the source identity (from pod-A) and destination IP.
* It performs a lookup in an eBPF map (the cilium_ipcache) to find the security identity associated with the destination IP (pod-B's identity).
* It then consults another eBPF map containing the policy rules. It checks if identity(pod-A) is allowed to communicate with identity(pod-B) on the given destination port.
* This is where L7 handling begins. If the policy includes L7 rules (e.g., for HTTP), the eBPF program does not make the final verdict on its own. Instead, it transparently redirects the connection to a node-local L7 proxy (Envoy, managed by the Cilium agent), which parses the application-layer protocol. For HTTP, it identifies the method (GET, POST), the path (/api/v1/users), and headers, and evaluates the L7 rules against them.
* Based on the full L3/L4/L7 context, a final verdict is reached: ALLOW or DROP. For policies with only L3/L4 rules, this decision happens entirely within the kernel; only connections that actually match L7 rules take the detour through the proxy.
This split, with identity lookups and L3/L4 verdicts in the kernel and a single shared per-node proxy parsing HTTP, Kafka, and gRPC only for the flows that need it, is what makes Cilium efficient. It avoids iptables chain traversal entirely and, unlike a sidecar mesh, does not place a user-space proxy in front of every single pod.
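If you want to see these building blocks directly, the agent exposes them on each node. A quick look, assuming you can exec into the Cilium agent pod for the node in question (the pod name is a placeholder; find yours with kubectl -n kube-system get pods -l k8s-app=cilium -o wide):
# Security identities and the labels they represent
kubectl -n kube-system exec <cilium-agent-pod> -- cilium identity list
# The IP-to-identity map (cilium_ipcache) consulted by the datapath
kubectl -n kube-system exec <cilium-agent-pod> -- cilium bpf ipcache list
# The per-endpoint policy map (endpoint IDs come from `cilium endpoint list`)
kubectl -n kube-system exec <cilium-agent-pod> -- cilium bpf policy get <endpoint-id>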
Production Pattern 1: Granular HTTP/REST API Control
Let's implement a policy for a setup much like the earlier scenario. We have three services:
* frontend: app=frontend
* billing-service: app=billing
* user-service: app=users
The user-service exposes its API on port 80 in this example. Our security requirements are:
* frontend can only perform GET requests to /api/v1/users and /api/v1/users/{id}.
* billing-service can only perform POST requests to /api/v1/payments.
* All other traffic to the user-service should be denied.
Here is the CiliumNetworkPolicy to enforce this:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "user-service-api-policy"
  namespace: "default"
spec:
  endpointSelector:
    matchLabels:
      app: users
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/api/v1/users/?.*"
  - fromEndpoints:
    - matchLabels:
        app: billing
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "POST"
          path: "/api/v1/payments"
Analysis of the Implementation:
* endpointSelector: This policy applies to all pods with the label app: users.
* ingress: We are defining rules for incoming traffic.
* First Rule Block (fromEndpoints: app: frontend):
* This rule applies to traffic originating from pods with the app: frontend label.
* toPorts: It targets traffic destined for port 80 on the user-service pods.
* rules.http: This is the L7 magic. We specify an array of HTTP rules.
* method: "GET": We only allow the GET method.
path: "/api/v1/users/?.": We use a regex-like pattern. This allows both /api/v1/users and any sub-path like /api/v1/users/123. Cilium supports ERE (Extended Regular Expression) syntax here, but it's important to note that complex regex can have performance implications as the matching is done per-packet/per-request.
* Second Rule Block (fromEndpoints: app: billing):
* This rule is for the billing-service, allowing only POST requests to the exact path /api/v1/payments.
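If you want to be stricter than the broad pattern used above, and user IDs are known to be numeric, the frontend rule can be tightened so that deeper sub-paths no longer match. A sketch (adjust the character class to your actual ID format):
      rules:
        http:
        - method: "GET"
          # Matches /api/v1/users and /api/v1/users/123, but not /api/v1/users/123/orders
          path: "/api/v1/users(/[0-9]+)?"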
Verification and Debugging:
To see this policy in action, you can exec into the frontend pod and test the endpoints:
# From inside the frontend pod
# This should succeed (HTTP 200 OK)
curl -s -o /dev/null -w "%{http_code}" http://user-service/api/v1/users/123
# This should be rejected by the L7 policy: the request is answered with HTTP 403 (Access Denied)
curl -X DELETE -s -o /dev/null -w "%{http_code}" http://user-service/api/v1/users/123
How do we confirm why the DELETE request was rejected? This is where Cilium's observability tools are indispensable. In the Cilium agent pod on the node running the user-service pod, run cilium monitor:
# Watch L7 verdicts for this node
$ cilium monitor --type l7
# Abbreviated example output for the denied DELETE request
<- Request http from 2756 ([k8s:app=frontend]) to 1832 ([k8s:app=users]),
   identity 51234->48756, verdict Denied DELETE http://user-service/api/v1/users/123 => 403
The monitor output shows the verdict Denied along with the parsed L7 data: the DELETE method, the path /api/v1/users/123, and the 403 returned to the client. For connections denied at L3/L4 (a peer with no matching allow rule at all), use cilium monitor --type drop instead. This level of introspection is invaluable for debugging complex policy interactions in a production system.
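Beyond the packet-level view, a couple of quick checks confirm that the policy object was accepted and that ingress enforcement is actually active on the endpoint (the agent pod name is again a placeholder):
# The CiliumNetworkPolicy should be present and valid
kubectl -n default get cnp user-service-api-policy
# Ingress policy enforcement should show as Enabled for the user-service endpoint
kubectl -n kube-system exec <cilium-agent-pod> -- cilium endpoint list
# The agent's view of the imported policy rules
kubectl -n kube-system exec <cilium-agent-pod> -- cilium policy get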
Production Pattern 2: Securing gRPC and Kafka Traffic
Modern systems are not just REST. Let's extend our policies to gRPC and Kafka, where Cilium's deep packet inspection capabilities truly shine.
Scenario: gRPC Service Protection
Imagine a product-catalog service (label app=catalog) that exposes a gRPC API. A recommendation-service (app=reco) should only be allowed to call the GetProduct RPC, while an inventory-service (app=inventory) should only be able to call the UpdateStock RPC.
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "catalog-grpc-policy"
  namespace: "default"
spec:
  endpointSelector:
    matchLabels:
      app: catalog
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: reco
    toPorts:
    - ports:
      - port: "50051"
        protocol: TCP
      rules:
        http: # gRPC is carried over HTTP/2
        - method: "POST"
          path: "/com.example.catalog.ProductService/GetProduct"
  - fromEndpoints:
    - matchLabels:
        app: inventory
    toPorts:
    - ports:
      - port: "50051"
        protocol: TCP
      rules:
        http:
        - method: "POST"
          path: "/com.example.catalog.ProductService/UpdateStock"
Key Insight: gRPC calls are essentially HTTP/2 POST requests where the path is /package.Service/Method. Cilium's HTTP parser understands this convention, allowing you to create highly specific policies for individual RPC methods. You are effectively creating a micro-firewall for your gRPC API at the kernel level.
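To exercise the policy from the client side, grpcurl makes a convenient test harness. A sketch, assuming the catalog Service is reachable as catalog:50051, the server has gRPC reflection enabled (otherwise pass the .proto files), and the request bodies shown are purely illustrative:
# From inside a reco pod: allowed by the policy
grpcurl -plaintext -d '{"id": "123"}' catalog:50051 \
  com.example.catalog.ProductService/GetProduct
# Still from the reco pod: not in reco's allow list, so the proxy rejects it
# before it reaches the catalog service
grpcurl -plaintext -d '{"id": "123", "delta": -1}' catalog:50051 \
  com.example.catalog.ProductService/UpdateStock
The denied call surfaces as a permission-denied style error on the client rather than a hung connection, because the rejection happens at the HTTP/2 request level.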
Scenario: Kafka Topic Authorization
For event-driven architectures, securing Kafka is paramount. Cilium's Kafka protocol parser allows for policy enforcement based on topic and role (produce/consume).
Requirements:
* order-service (app=orders) can produce to the orders-topic.
* shipping-service (app=shipping) can consume from the orders-topic.
* No other access to orders-topic is allowed.
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "kafka-broker-policy"
  namespace: "kafka"
spec:
  endpointSelector:
    matchLabels:
      app: kafka-broker
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: orders
    toPorts:
    - ports:
      - port: "9092"
        protocol: TCP
      rules:
        kafka:
        - role: "produce"
          topic: "orders-topic"
  - fromEndpoints:
    - matchLabels:
        app: shipping
    toPorts:
    - ports:
      - port: "9092"
        protocol: TCP
      rules:
        kafka:
        - role: "consume"
          topic: "orders-topic"
This policy, applied to the Kafka brokers themselves, ensures that only the order-service can write to the topic and only the shipping-service can read from it. Any other pod attempting to access orders-topic will have its Kafka requests parsed on the broker's node and rejected with an authorization error before they ever reach the broker. Note that fromEndpoints selectors are scoped to the policy's namespace, so the producer and consumer are assumed to run in the kafka namespace here; to select pods in other namespaces, also match on the k8s:io.kubernetes.pod.namespace label. This can offload authorization from the Kafka ACL system to the network layer, or complement it as a defense-in-depth layer.
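A quick way to validate the broker-side enforcement is to attempt a produce from a pod that is not covered by the policy. A sketch, assuming the standard Kafka console tooling is available on the test image and the broker Service is reachable as kafka-broker.kafka:9092 (both assumptions):
# From a pod that is neither `orders` nor `shipping`
kafka-console-producer.sh \
  --bootstrap-server kafka-broker.kafka:9092 \
  --topic orders-topic
# Typed messages should fail with a topic authorization error, because the
# Produce request is parsed and denied before it reaches the broker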
Advanced Edge Case: The Envoy Interception Trade-off
The pure eBPF datapath is incredibly fast, but it covers identity, L3, and L4 decisions. Anything that needs application-layer context is delegated to user space, whether that is a simple HTTP path match or logic that eBPF cannot reasonably express, like JWT token validation. The practical question is not whether a proxy is involved, but how much work you push into it and on which flows.
This is where Cilium's integration with Envoy comes into play. When a CiliumNetworkPolicy contains L7 rules, Cilium transparently redirects the matching flows to a node-local Envoy proxy that performs the L7 processing; flows covered only by L3/L4 rules never leave the eBPF datapath.
When does traffic take the proxy path?
* Any L7 rule block (http: or kafka:) in a CiliumNetworkPolicy, including the plain method/path rules from Pattern 1.
* headerMatches and other header-aware HTTP rules.
* Advanced features such as CiliumEnvoyConfig, which attaches custom Envoy filters (for example Lua scripting or external authorization checks) to the same node-local proxy.
* DNS-aware rules (toFQDNs), which are handled by a dedicated DNS proxy built into the agent.
Example: Policy with Header Matching
Let's modify our first HTTP policy to only allow requests from the frontend if they contain the header X-Request-Source: frontend-app.
# ... (previous policy structure)
      rules:
        http:
        - method: "GET"
          path: "/api/v1/users/?.*"
          headerMatches:
          - name: "X-Request-Source"
            value: "frontend-app"
Because this is an HTTP rule, the traffic was already being redirected to Envoy; headerMatches simply adds another condition for the proxy to evaluate (and lets you specify a mismatch action or source the value from a secret). You can confirm that a proxy redirect is in place by inspecting the endpoint's policy with cilium endpoint get <endpoint-id>, or by watching proxy verdicts with cilium monitor --type l7.
Performance Considerations:
* Pure eBPF (L3/L4-only policies): negligible overhead, well under a millisecond. The processing happens in the kernel with no context switches.
* Proxy (L7) path: traffic flows from the application, through the kernel (eBPF), is redirected to the node-local Envoy proxy in user space, processed, sent back to the kernel (eBPF), and then out to the network. This path involves extra context switches and memory copies, adding latency (typically sub-millisecond to a few milliseconds per request, depending on load) and consuming more CPU and memory.
As a senior engineer, the key takeaway is to keep hot paths on identity-based L3/L4 policies where they are sufficient, and reserve L7 rules for the paths that genuinely need application-aware control. Use features that add proxy work, such as header matching or custom Envoy filters, judiciously. Always benchmark the performance impact of L7 rules on your critical latency-sensitive paths.
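For that benchmarking step, even a simple load generator run from a client pod gives a useful first signal: run the same workload once with only an L3/L4 policy covering the path and once with the L7 rules applied, and compare the latency distributions (tail latencies are where proxy overhead tends to show up). A sketch using hey; any HTTP load tool will do, and the request counts are arbitrary:
# Repeat under each policy variant and compare p50/p95/p99
hey -n 5000 -c 20 -m GET http://user-service/api/v1/users/123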
Debugging in the Trenches: Kernel Versioning and Hubble
Working with eBPF means you are closer to the kernel, and this comes with its own set of challenges.
Kernel Version Dependencies: The capabilities of eBPF have evolved rapidly with Linux kernel versions. A feature available in kernel 5.10 might not be present in 4.19. Cilium is good at detecting kernel capabilities at startup, but it's crucial to be aware of this. For example, some of the more advanced eBPF-based host routing or service mesh features may require newer kernels. Always check the Cilium documentation for the minimum kernel version required for the features you intend to use. Running a cilium status command on a node will give you a quick overview of detected kernel capabilities.
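In practice, two quick checks cover most of this (the agent pod name is a placeholder, as before):
# Kernel version on the node
uname -r
# Agent view: datapath mode, host routing, kube-proxy replacement, and other detected capabilities
kubectl -n kube-system exec <cilium-agent-pod> -- cilium status --verbose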
Advanced Observability with Hubble: While cilium monitor is excellent for real-time event streams on a single node, it's not practical for debugging distributed flows across a cluster. This is where Hubble, Cilium's observability component, becomes essential.
Hubble provides a UI, CLI, and metrics to visualize and understand network flows and policy decisions. The hubble observe command is your best friend for debugging.
# Trace the flows between the frontend and user-service
hubble observe --from-label app=frontend --to-label app=users -n default --follow
# Example output showing the denied L7 request
Jan 12 10:30:01.123: default/frontend-xyz-1:49876 (ID:51234) -> default/user-service-abc-2:80 (ID:48756) http-request DROPPED (HTTP/1.1 DELETE http://user-service/api/v1/users/123)
Hubble's service map can visually represent these dependencies and highlight where policies are dropping traffic, making it exponentially faster to pinpoint issues in a complex microservices graph than tailing logs on individual nodes.
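Hubble's filters also let you cut straight to policy drops across the whole cluster instead of following one edge at a time; for example (verify the exact flag names against your Hubble CLI version):
# All dropped flows in the default namespace, streamed live
hubble observe -n default --verdict DROPPED -f
# Narrow down to L7 events only
hubble observe -n default --verdict DROPPED --type l7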
Handling Encrypted Traffic: The Visibility Challenge
An obvious question arises: how can Cilium enforce L7 policies on encrypted traffic, such as TLS?
The short answer is: it can't, not directly. eBPF operates at a layer where the application payload is just an opaque stream of encrypted bytes.
There are two primary production patterns to address this:
* Terminate TLS before enforcement. Terminate TLS at the edge (for example at an ingress gateway) and keep in-cluster traffic unencrypted at the application layer, optionally enabling Cilium's transparent encryption (WireGuard or IPsec) so the wire stays encrypted. The datapath and the L7 proxy then see cleartext, and L7 policies work unchanged.
* Pair Cilium with a service mesh. Let a mesh such as Istio terminate mTLS in the sidecar and enforce application-layer authorization after decryption (e.g., with an AuthorizationPolicy). In this model, Cilium and Istio work together: Cilium secures the network fabric, and Istio secures the application layer post-decryption.
Emerging capabilities in Cilium are exploring transparent TLS interception and even limited inspection using kernel-level TLS (kTLS), but the service mesh pattern remains the most mature and flexible solution for securing encrypted L7 traffic today.
Conclusion: A New Paradigm for Network Security
Cilium's eBPF-powered L7 policies represent a paradigm shift from traditional network security models. By moving application-aware enforcement into the kernel, we gain a level of performance, transparency, and security granularity that is unattainable with iptables-based solutions or user-space proxies alone.
For senior engineers, mastering these capabilities is no longer optional. It is a fundamental tool for building secure, observable, and high-performance cloud-native systems. The ability to craft precise rules for HTTP paths, gRPC methods, and Kafka topics, and to debug them effectively using tools like Hubble, provides a powerful defense-in-depth layer that hardens your application posture against both internal and external threats. While the learning curve is steeper and requires a deeper understanding of networking and kernel concepts, the operational and security benefits are immense.