Container Runtime Security with eBPF: Beyond Image Scanning

The Image Scanning Gap

Container image scanning identifies vulnerabilities in packages and libraries baked into the image. It answers: "what CVEs are present in this image?" What it cannot answer is anything about runtime behaviour: what system calls the container makes, what files it reads or writes during execution, what processes it spawns, or what network connections it initiates.

The gap is critical. A container with zero known CVEs can be compromised through a logic vulnerability in the application, a zero-day exploit, or a supply chain attack that injects malicious behaviour at runtime. The image scan passes; the container is actively compromised. Without runtime visibility, you find out when customer data appears on a dark web forum.

Conversely, a container with dozens of CVEs may run for years without any of them being exploited — the vulnerable code paths are never reached, the vulnerable service isn't network-accessible, or the CVE requires local access. Runtime visibility shows you which vulnerabilities are actually being triggered versus which are theoretical risks.

eBPF for Security

eBPF (extended Berkeley Packet Filter) is a Linux kernel technology that allows running sandboxed programs in the kernel in response to events. Originally designed for network packet filtering, it now supports hooking into virtually any kernel event: system calls, kernel functions, network events, file operations, process scheduling.

For security, eBPF provides a privileged observation point. Every system call a container makes (execve to spawn a process, connect to open a network connection, openat to open a file) passes through hooks that eBPF programs can observe and, in some cases, act upon. Because those programs run in the kernel, malware confined to the container's user space cannot switch them off or hide from them the way it can hide from tools running inside the container's namespaces. The guarantee holds only while the kernel itself is intact: an attacker who gains root on the host or kernel code execution can unload or blind the probes.

The security-relevant capabilities: observing system calls with their arguments, attaching to kernel probes (kprobes/tracepoints) for deeper visibility, blocking specific operations in real time (for example by overriding a system call's return value with an error code, or by attaching to LSM hooks), and maintaining kernel-level audit trails that are hard to tamper with from inside the container.

Falco: Runtime Threat Detection

Falco (a CNCF graduated project) is one of the most widely deployed open-source runtime security tools. It uses eBPF (or a kernel module) to observe system calls and matches them against a rule engine to generate security alerts. Rules are YAML documents whose condition field is a filter expression over system call, process, and container attributes.

# Falco rule: detect crypto-mining by process name
- rule: Container Cryptomining
  desc: Detect execution of known cryptomining binaries
  condition: >
    spawned_process and
    container and
    proc.name in (xmrig, minerd, cpuminer, ethminer, stratum)
  output: >
    Cryptominer detected (proc=%proc.name user=%user.name
    container=%container.name image=%container.image.repository)
  priority: CRITICAL
  tags: [container, cryptomining, mitre_execution]

# Falco rule: detect shell spawned from web server
- rule: Shell Spawned by Web Server
  desc: Web server process spawning a shell indicates potential RCE
  condition: >
    spawned_process and
    proc.name in (bash, sh, zsh, ash) and
    proc.pname in (nginx, apache2, httpd, php-fpm, node, python3)
  output: >
    Shell spawned by web server (shell=%proc.name parent=%proc.pname
    cmdline=%proc.cmdline container=%container.name)
  priority: ERROR

Default Falco rules cover the most common attack patterns: process spawning from unexpected parents (shell from a web server indicating RCE), reading sensitive files (/etc/shadow, private keys), writing to binary directories (/bin, /usr/bin), and outbound connections to unexpected ports. Custom rules can be written for application-specific baselines.
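
As an illustration of the sensitive-file pattern, a custom rule might look like the sketch below. It assumes the open_read and container macros from Falco's default ruleset are loaded; the rule name and the choice of paths are illustrative, not copied from the shipped rules.

```yaml
# Sketch: alert when a containerised process opens a credential file for reading.
# Assumes the stock `open_read` and `container` macros are already loaded.
- rule: Sensitive File Read in Container
  desc: A process inside a container opened /etc/shadow or a root SSH file for reading
  condition: >
    open_read and
    container and
    (fd.name = /etc/shadow or fd.name startswith /root/.ssh/)
  output: >
    Sensitive file read (file=%fd.name proc=%proc.name user=%user.name
    container=%container.name image=%container.image.repository)
  priority: WARNING
  tags: [container, filesystem, mitre_credential_access]
```

A rule this broad also fires on legitimate reads, so expect to add exceptions for known processes while tuning.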

Cilium Tetragon: Enforcement at the Kernel Level

Tetragon extends eBPF-based observability to enforcement. While Falco generates alerts that trigger downstream response automation, Tetragon can block specific system calls in real-time — the malicious action is prevented before it completes, rather than detected after. This is the difference between detection and prevention at the kernel level.

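Tetragon uses TracingPolicies to define what to observe and what to enforce. A TracingPolicy specifies which kernel functions or system calls to hook, which arguments to match on, and what action to take on a match (for example Post, Override, or Sigkill). Override makes the hooked call return an error code instead of executing. Sigkill sends SIGKILL to the offending process from inside the kernel as soon as the matching call is observed. Sigkill on its own does not guarantee that the triggering call is aborted, so policies that must stop the operation itself typically use Override as well.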

Practical use cases for enforcement: blocking container processes from calling ptrace (used for code injection), blocking writes to /proc/sys/kernel/core_pattern (used in container escapes), killing processes that attempt to create new namespaces without explicit permission, and blocking outbound connections to known-bad IPs at the kernel level rather than network level.
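
The ptrace case could be expressed roughly as follows. This is a minimal sketch, not a vetted policy: the policy name is hypothetical, the field names follow the upstream TracingPolicy CRD and should be checked against your installed Tetragon version, and the Override action additionally assumes a kernel built with kprobe override support.

```yaml
# Sketch: make ptrace(2) fail for any process outside the host PID namespace.
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: block-ptrace-in-containers   # hypothetical name
spec:
  kprobes:
  - call: "sys_ptrace"
    syscall: true
    selectors:
    - matchNamespaces:
      - namespace: Pid
        operator: NotIn
        values:
        - "host_ns"                  # skip processes in the host PID namespace
      matchActions:
      - action: Override
        argError: -1                 # the call returns -EPERM instead of running
```

Pods that run with hostPID share the host PID namespace and would not be matched by this selector, so a real policy needs a different scoping mechanism for them. Swapping Override for Sigkill terminates the caller rather than failing the call.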

Seccomp and AppArmor

eBPF tools operate alongside, not instead of, kernel security mechanisms. Seccomp (Secure Computing Mode) restricts the set of system calls a container process can make. The Docker default seccomp profile blocks around 44 system calls that most containers do not need but that are commonly used in exploitation. Kubernetes does not apply this profile automatically: unless the kubelet runs with --seccomp-default, a pod whose seccompProfile is unset runs Unconfined, so set seccompProfile.type to RuntimeDefault explicitly.

Custom seccomp profiles for specific workloads can be far more restrictive than the default. A Go web server needs perhaps 30-40 system calls to operate. Restricting it to exactly those calls means that any exploit trying to use ptrace, mount, or unshare for privilege escalation is blocked by the kernel before any userspace tool sees it. (The Go runtime itself uses clone to create threads, so clone has to stay on the allow list.)
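
Wiring a profile into a pod could look like the sketch below. The pod name, image, and profile file name are placeholders; the profile itself is a separate JSON file that must already exist under the kubelet's seccomp directory on every node that may run the pod.

```yaml
# Sketch: runtime default for the pod, custom allow-list profile for the app container.
apiVersion: v1
kind: Pod
metadata:
  name: go-web                                  # hypothetical
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault                      # baseline for every container in the pod
  containers:
  - name: app
    image: registry.example.com/go-web:1.0      # placeholder image
    securityContext:
      allowPrivilegeEscalation: false
      seccompProfile:
        type: Localhost
        localhostProfile: profiles/go-web.json  # path relative to the kubelet seccomp root
```

The container-level setting overrides the pod-level one, so sidecars without their own profile still get RuntimeDefault.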

AppArmor provides MAC (Mandatory Access Control) at the file and network level: what files a process can access, what capabilities it can use, and what kinds of network sockets it can create. AppArmor and seccomp enforce the policy; eBPF observes and alerts on policy violations and on any behaviour that the policies don't cover.

Key Attack Detection Patterns

Container escape attempts: Calls to unshare with the CLONE_NEWNS or CLONE_NEWPID flags, writes to /proc/sysrq-trigger, mounts of host paths, or access to the host PID namespace. Any of these from an application container is an indicator of compromise.
Credential access: Reads of /etc/shadow, ~/.ssh/, Kubernetes service account tokens at /var/run/secrets/, or environment variable dumps via /proc/self/environ.
Lateral movement: Outbound connections to new internal IPs or ports that were not present in the baseline, particularly connections to the Kubernetes API server or cloud metadata endpoint from application containers.
Persistence: Writes to cron directories, /etc/init.d, ~/.bashrc, or startup script locations from processes that do not normally write there.
Cryptomining: High CPU usage combined with outbound connections to common mining pool ports (3333, 4444, 5555, 14444, 45700).
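
Two of these patterns translate fairly directly into Falco rules. The sketch below assumes the outbound and container macros from the default ruleset; the list and rule names are made up for illustration, and 169.254.169.254 is the link-local metadata address used by the major cloud providers.

```yaml
# Sketch: network-based detections for cryptomining and metadata access.
- list: mining_pool_ports
  items: [3333, 4444, 5555, 14444, 45700]

- rule: Outbound Connection to Mining Pool Port
  desc: A container opened a connection to a port commonly used by mining pools
  condition: >
    outbound and
    container and
    fd.sport in (mining_pool_ports)
  output: >
    Possible mining traffic (connection=%fd.name proc=%proc.name
    cmdline=%proc.cmdline container=%container.name)
  priority: WARNING
  tags: [container, network, cryptomining]

- rule: Metadata Endpoint Contacted from Container
  desc: An application container connected to the cloud instance metadata service
  condition: >
    outbound and
    container and
    fd.sip = "169.254.169.254"
  output: >
    Metadata service contacted (connection=%fd.name proc=%proc.name
    container=%container.name image=%container.image.repository)
  priority: WARNING
  tags: [container, network, mitre_credential_access]
```

Port matching alone is noisy because these ports are not exclusive to mining, which is why the pattern above pairs it with CPU usage from a separate metrics source.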

Deployment Considerations

Start in audit mode: Deploy Falco with alerts only. Run for 2-4 weeks to establish a baseline of normal behaviour and tune out false positives before integrating with automated response.
Integrate with SIEM: Falco can emit alerts as JSON, which can be shipped to Elasticsearch, Splunk, or most other SIEMs. Correlate runtime alerts with image scan findings: a runtime alert from a container with a known critical CVE is high priority.
Kubernetes admission controllers: Use OPA Gatekeeper or Kyverno to reject deployments that don't have seccomp profiles, AppArmor annotations, or Falco-compatible configurations. Runtime security and admission control enforce the same policies from different angles.
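Performance overhead: eBPF probes avoid the stability risk of loading an out-of-tree kernel module, because the verifier rejects programs that could crash the kernel, and their runtime cost is modest. Falco in eBPF mode typically adds 1-5% CPU overhead on typical workloads, though syscall-heavy services can see more. Measure against your specific load profile before production deployment.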
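
For the audit-mode and SIEM steps, the relevant part of falco.yaml might look like the following sketch. The forwarding URL is a placeholder for whatever collector sits in front of your SIEM, and the minimum priority is a starting point to tune, not a recommendation from the Falco project.

```yaml
# Sketch: falco.yaml fragment for alert-only operation with JSON output.
json_output: true                    # emit each alert as a JSON object
json_include_output_property: true   # keep the human-readable message in the JSON
priority: notice                     # drop rules below this priority while tuning

stdout_output:
  enabled: true                      # picked up by the node's log shipper

http_output:
  enabled: true
  url: http://alert-collector.example.internal:2801/   # placeholder endpoint
```

Nothing here triggers automated response. That is added downstream only after the baseline period has removed the noisy rules.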

The security stack: Image scanning at build and admission, seccomp/AppArmor at runtime boundary, eBPF observation throughout execution, and SIEM correlation across the full event stream. No single layer is sufficient — the defense is in their combination.
