K8s eBPF Observability: 5 Practical Patterns from Kernel Tracing to Full-Stack Monitoring

When Traditional Monitoring Hits the K8s Kernel Black Hole

Have you ever experienced this—Prometheus metrics look perfectly normal, but service latency inexplicably spikes? Your Sidecar proxy consumes 15% CPU but only tells you "connection timeout"? Logs are full of application-level errors, but you have zero visibility into what's happening at the kernel level?

This is the "triple blind spot" of K8s observability. First, traditional monitoring sees only user space and is blind to kernel-level events. Second, sidecar injection adds overhead, with Istio's data plane typically adding on the order of 2-5ms of latency per request. Third, distributed tracing samples requests, so a failure that affects only 1% of traffic can easily fall outside the sample.

eBPF changes this. It lets you observe system calls, network events, and scheduler activity directly in kernel space, without modifying the kernel, injecting sidecars, or changing application code. From TCP retransmissions to process execution, from packet drops to security events, eBPF gives K8s clusters "full-stack X-ray vision."

This article walks you through 5 eBPF observability patterns from scratch, covering kernel tracing, network monitoring, security auditing, and performance analysis across the entire stack.

Core Concepts Reference Table

| Concept | Full Name | Description |
|---|---|---|
| eBPF | Extended Berkeley Packet Filter | Sandbox VM in the Linux kernel allowing safe execution of custom programs in kernel space |
| BPF Program | BPF Program | eBPF code written and loaded into the kernel, attached to specific hook points |
| BPF Map | BPF Mapping Table | Data sharing structure between kernel and user space, supporting hash/array/ring types |
| bpftrace | bpftrace | High-level eBPF tracing language with awk-like syntax, ideal for quick prototyping |
| Cilium | Cilium | eBPF-based K8s CNI plugin providing networking, security, and observability |
| Hubble | Hubble | Cilium's observability component providing network traffic visualization and service dependency mapping |
| Kprobe | Kernel Probe | Dynamic kernel probe that can attach to kernel function entry/exit points |
| Tracepoint | Tracepoint | Static kernel tracing points predefined by kernel developers, more stable than kprobes |
| XDP | eXpress Data Path | eBPF hook for processing network packets at the NIC driver level with ultra-low latency |
| BPF Verifier | BPF Verifier | Safety checker in the kernel ensuring eBPF programs cannot crash the kernel |
| BTF | BPF Type Format | eBPF type information format enabling CO-RE (Compile Once, Run Everywhere) |
| Perf Event | Performance Event | Linux performance event subsystem, an important attachment point for eBPF programs |

Five Challenges: Why K8s eBPF Observability Isn't "Just Install a Plugin"

Challenge 1: Kernel Version Compatibility Hell

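eBPF features arrive incrementally across kernel versions. Kernel BTF (CONFIG_DEBUG_INFO_BTF) landed in 5.2 and is exposed at /sys/kernel/btf/vmlinux from 5.4, the BPF trampoline requires 5.5+, and the ring buffer map (BPF_MAP_TYPE_RINGBUF) used throughout this article requires 5.8+. Yet many enterprise K8s nodes still run 4.19 or 5.4 kernels, so a program that loads on one node may fail to load on another.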
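
To make the version problem concrete, here is a minimal Go sketch of a per-node feature gate that could run before loading any program. The feature names and the `minKernel` table are illustrative assumptions, and the versions are upstream minimums as I understand them; distribution kernels often backport features, so treat a version check as a first filter rather than proof.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minKernel maps a feature name (hypothetical labels) to its upstream minimum kernel version.
var minKernel = map[string][2]int{
	"btf-sysfs":  {5, 4}, // /sys/kernel/btf/vmlinux
	"trampoline": {5, 5}, // BPF trampoline (fentry/fexit)
	"ringbuf":    {5, 8}, // BPF_MAP_TYPE_RINGBUF
}

// parseRelease extracts major and minor from a release string such as "5.4.0-150-generic".
func parseRelease(release string) (major, minor int, err error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unexpected kernel release %q", release)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, err
	}
	// The minor field may carry a suffix (e.g. "1-rc3"), so keep only the leading digits.
	digits := parts[1]
	for i, r := range digits {
		if r < '0' || r > '9' {
			digits = digits[:i]
			break
		}
	}
	if minor, err = strconv.Atoi(digits); err != nil {
		return 0, 0, err
	}
	return major, minor, nil
}

// supports reports whether the release meets the minimum version for a feature.
func supports(release, feature string) bool {
	major, minor, err := parseRelease(release)
	need, ok := minKernel[feature]
	if err != nil || !ok {
		return false
	}
	return major > need[0] || (major == need[0] && minor >= need[1])
}

func main() {
	for _, rel := range []string{"4.19.0-25-amd64", "5.4.0-150-generic", "5.15.0-91-generic"} {
		fmt.Printf("%-20s btf-sysfs=%v trampoline=%v ringbuf=%v\n", rel,
			supports(rel, "btf-sysfs"), supports(rel, "trampoline"), supports(rel, "ringbuf"))
	}
}
```

On a real node the release string would come from `uname -r` or /proc/sys/kernel/osrelease. Probing the feature itself (for example, trying to create a small ring buffer map) is more reliable where backports are common.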

Challenge 2: BPF Verifier's Strict Restrictions

The BPF verifier rejects any program it cannot prove safe. Loops must be bounded, pointers returned by map lookups must be null-checked before use, and stack space is limited to 512 bytes. Even moderately complex tracing logic may need several rounds of adjustment before it passes verification.

Challenge 3: Production Environment Safety Concerns

eBPF programs run in kernel space. While the verifier provides safety guarantees, many security teams remain cautious about "running custom code in the kernel." In finance and healthcare, where compliance requirements are strict, eBPF adoption usually requires a rigorous security audit.

Challenge 4: Observability Data Explosion

eBPF can capture massive amounts of events from the kernel—every system call, every network packet, every context switch. In large K8s clusters, unfiltered eBPF data can generate millions of events per second, overwhelming storage and analysis systems.
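
One simple way to keep volume under control is to sample in user space before events reach storage. The sketch below is an illustration only (the `Sampler` type and the process-name key are my assumptions, not part of any library): it keeps the first event for each key and every Nth one after that, so rare sources still show up while chatty ones are thinned out.

```go
package main

import "fmt"

// Sampler keeps the first event of each key and every Nth event after it.
type Sampler struct {
	every uint64
	seen  map[string]uint64
}

// NewSampler returns a sampler that keeps 1 in every `every` events per key.
func NewSampler(every uint64) *Sampler {
	if every == 0 {
		every = 1 // keep everything rather than divide by zero
	}
	return &Sampler{every: every, seen: make(map[string]uint64)}
}

// Keep reports whether the event for this key should be forwarded.
func (s *Sampler) Keep(key string) bool {
	n := s.seen[key]
	s.seen[key] = n + 1
	return n%s.every == 0
}

func main() {
	s := NewSampler(100)
	kept := map[string]int{}
	for i := 0; i < 1000; i++ {
		if s.Keep("nginx") {
			kept["nginx"]++
		}
	}
	for i := 0; i < 3; i++ {
		if s.Keep("cron") {
			kept["cron"]++
		}
	}
	fmt.Println(kept) // chatty sources are thinned, rare ones are still represented
}
```

Sampling in user space still pays the cost of moving every event out of the kernel; filtering or aggregating inside the BPF program cuts that cost as well.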

Challenge 5: Multi-Cluster Correlation Tracing

When requests span multiple K8s clusters, kernel events captured by eBPF lack unified correlation identifiers. You can see TCP retransmissions in cluster A and DNS timeouts in cluster B, but correlating them to the same user request chain is extremely difficult.
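
The article does not prescribe a correlation scheme, so the following Go sketch shows just one possible approach: derive a direction-independent flow ID from the connection endpoints, so that an event seen from the client side in cluster A and one seen from the server side in cluster B land in the same bucket. It assumes addresses are not rewritten by NAT between clusters, which often does not hold; the `Event` type and its fields are hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
)

// FlowID identifies a connection regardless of which side observed it.
type FlowID struct{ Lo, Hi netip.AddrPort }

// less orders endpoints by address, then by port.
func less(a, b netip.AddrPort) bool {
	if c := a.Addr().Compare(b.Addr()); c != 0 {
		return c < 0
	}
	return a.Port() < b.Port()
}

// flowOf parses "ip:port" endpoints and returns the same FlowID for both directions.
func flowOf(src, dst string) (FlowID, error) {
	s, err := netip.ParseAddrPort(src)
	if err != nil {
		return FlowID{}, err
	}
	d, err := netip.ParseAddrPort(dst)
	if err != nil {
		return FlowID{}, err
	}
	if less(d, s) {
		s, d = d, s
	}
	return FlowID{Lo: s, Hi: d}, nil
}

// Event is a hypothetical, already-decoded kernel event tagged with its cluster.
type Event struct{ Cluster, Kind, Src, Dst string }

// group buckets events from any cluster by flow; unparseable events are skipped.
func group(events []Event) map[FlowID][]Event {
	out := make(map[FlowID][]Event)
	for _, e := range events {
		if id, err := flowOf(e.Src, e.Dst); err == nil {
			out[id] = append(out[id], e)
		}
	}
	return out
}

func main() {
	events := []Event{
		{"cluster-a", "tcp_retransmit", "10.0.1.5:41000", "10.8.3.9:443"},
		{"cluster-b", "tcp_reset", "10.8.3.9:443", "10.0.1.5:41000"},
	}
	for id, evs := range group(events) {
		fmt.Printf("%v <-> %v: %d events\n", id.Lo, id.Hi, len(evs))
	}
}
```

A tuple-based ID only ties together events on one connection. Linking several hops of a single user request still needs an application-level trace ID.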

Five-Step Implementation: From Kernel Tracing to Full-Stack Monitoring

Step 1: eBPF Program Basics—bpftrace One-Liners and C BPF Programs

bpftrace quick tracing:

# Trace all TCP connection establishment events
bpftrace -e 'kprobe:tcp_connect { printf("PID: %d, Comm: %s\n", pid, comm); }'

# Trace TCP retransmissions, count by process
bpftrace -e 'kprobe:tcp_retransmit_skb { @retrans[comm] = count(); }'

# Trace process execution (security audit)
bpftrace -e 'tracepoint:sched:sched_process_exec { printf("%s -> %s\n", comm, args->filename); }'

# Trace VFS read latency distribution (vfs_read only)
bpftrace -e 'kprobe:vfs_read { @start[tid] = nsecs; } kretprobe:vfs_read /@start[tid]/ { @ns = hist(nsecs - @start[tid]); delete(@start[tid]); }'

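# Trace TCP state changes (tcp_set_state takes only the new state, so use the tracepoint that carries both)
bpftrace -e 'tracepoint:sock:inet_sock_set_state { printf("state: %d -> %d, pid: %d\n", args->oldstate, args->newstate, pid); }'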

C language eBPF program (TCP connection tracing):

// tcp_connect.bpf.c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

struct tcp_connect_event {
    u32 pid;
    u32 saddr;
    u32 daddr;
    u16 dport;
    char comm[16];
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);
} tcp_connect_events SEC(".maps");

SEC("kprobe/tcp_connect")
int BPF_KPROBE(trace_tcp_connect, struct sock *sk)
{
    struct tcp_connect_event *event;
    event = bpf_ringbuf_reserve(&tcp_connect_events, sizeof(*event), 0);
    if (!event)
        return 0;

    event->pid = bpf_get_current_pid_tgid() >> 32;
    event->saddr = BPF_CORE_READ(sk, __sk_common.skc_rcv_saddr);
    event->daddr = BPF_CORE_READ(sk, __sk_common.skc_daddr);
    event->dport = BPF_CORE_READ(sk, __sk_common.skc_dport);
    bpf_get_current_comm(&event->comm, sizeof(event->comm));

    bpf_ringbuf_submit(event, 0);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";

Step 2: Go-based eBPF Loader (cilium/ebpf library)

// main.go - eBPF TCP Connection Tracer
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"log"
	"net"
	"os"
	"os/signal"
	"syscall"

	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/ringbuf"
	"github.com/cilium/ebpf/rlimit"
)

//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -type tcp_connect_event bpf tcp_connect.bpf.c

type tcpConnectEvent struct {
	Pid   uint32
	Saddr uint32
	Daddr uint32
	Dport uint16
	Comm  [16]byte
}

func main() {
	if err := rlimit.RemoveMemlock(); err != nil {
		log.Fatalf("Failed to remove memlock limit: %v", err)
	}

	objs := bpfObjects{}
	if err := loadBpfObjects(&objs, nil); err != nil {
		log.Fatalf("Failed to load eBPF objects: %v", err)
	}
	defer objs.Close()

	kp, err := link.Kprobe("tcp_connect", objs.TraceTcpConnect, nil)
	if err != nil {
		log.Fatalf("Failed to attach kprobe: %v", err)
	}
	defer kp.Close()

	rd, err := ringbuf.NewReader(objs.TcpConnectEvents)
	if err != nil {
		log.Fatalf("Failed to create ringbuf reader: %v", err)
	}
	defer rd.Close()

	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)

	fmt.Println("TCP connection tracing started, press Ctrl+C to exit...")
	fmt.Println("PID\tComm\t\tSrcAddr\t\tDstAddr")

	go func() {
		<-sig
		fmt.Println("\nStopping tracing...")
		rd.Close()
	}()

	for {
		record, err := rd.Read()
		if err != nil {
			if errors.Is(err, ringbuf.ErrClosed) {
				fmt.Println("Ringbuf closed")
				return
			}
			log.Printf("Failed to read ringbuf: %v", err)
			continue
		}

		var event tcpConnectEvent
		if err := binary.Read(bytes.NewReader(record.RawSample), binary.LittleEndian, &event); err != nil {
			log.Printf("Failed to parse event: %v", err)
			continue
		}

		srcIP := net.IP(uint32ToBytes(event.Saddr))
		dstIP := net.IP(uint32ToBytes(event.Daddr))
		// Dport holds network-order bytes that were decoded as little-endian, so swap the two bytes.
		dstPort := event.Dport>>8 | event.Dport<<8

		fmt.Printf("%d\t%s\t\t%s\t%s:%d\n",
			event.Pid,
			string(bytes.TrimRight(event.Comm[:], "\x00")),
			srcIP,
			dstIP,
			dstPort,
		)
	}
}

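// uint32ToBytes restores the original in-memory byte order of an address that
// was decoded as little-endian. It returns a slice so it can be converted to net.IP.
func uint32ToBytes(v uint32) []byte {
	b := make([]byte, 4)
	binary.LittleEndian.PutUint32(b, v)
	return b
}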
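
Byte order is the easiest thing to get wrong when decoding these events: the kernel stores the addresses and the port in network byte order, while the PID is in host order. The following standalone sketch decodes a hand-built 32-byte sample to show one way of handling that, by keeping the network-order fields as raw byte arrays. The `Conn` type and the sample bytes are illustrative assumptions, and a little-endian host is assumed for the PID.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"net/netip"
)

// rawEvent mirrors struct tcp_connect_event; network-order fields stay as byte arrays.
type rawEvent struct {
	Pid   uint32
	Saddr [4]byte
	Daddr [4]byte
	Dport [2]byte
	Comm  [16]byte
}

// Conn is the decoded, human-friendly form (a hypothetical type for this sketch).
type Conn struct {
	PID  uint32
	Comm string
	Src  netip.Addr
	Dst  netip.AddrPort
}

// decode parses one ring buffer sample into a Conn.
func decode(raw []byte) (Conn, error) {
	var ev rawEvent
	if err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &ev); err != nil {
		return Conn{}, err
	}
	comm := ev.Comm[:]
	if i := bytes.IndexByte(comm, 0); i >= 0 {
		comm = comm[:i] // cut at the first NUL terminator
	}
	port := binary.BigEndian.Uint16(ev.Dport[:]) // network order is big-endian
	return Conn{
		PID:  ev.Pid,
		Comm: string(comm),
		Src:  netip.AddrFrom4(ev.Saddr),
		Dst:  netip.AddrPortFrom(netip.AddrFrom4(ev.Daddr), port),
	}, nil
}

func main() {
	// pid=1234, 10.0.0.1 -> 10.0.0.2:443, comm="curl", plus 2 bytes of struct padding.
	sample := []byte{
		0xd2, 0x04, 0, 0,
		10, 0, 0, 1,
		10, 0, 0, 2,
		0x01, 0xbb,
		'c', 'u', 'r', 'l', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
		0, 0,
	}
	c, err := decode(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d\t%s\t%s\t%s\n", c.PID, c.Comm, c.Src, c.Dst)
}
```

Keeping the network-order fields as byte arrays avoids the swap-and-swap-back arithmetic entirely, at the cost of a Go struct that no longer matches the C field types one to one.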

Generated bindings (bpf2go output, illustrative structure):

// bpf_bpfel.go - Sketch of what bpf2go generates; the real file differs in detail.
// Code generated by bpf2go; DO NOT EDIT.
package main

import (
	"errors"

	"github.com/cilium/ebpf"
)

type bpfTcpConnectEvent struct {
	Pid   uint32
	Saddr uint32
	Daddr uint32
	Dport uint16
	Comm  [16]byte
}

type bpfPrograms struct {
	TraceTcpConnect *ebpf.Program `ebpf:"trace_tcp_connect"`
}

type bpfMaps struct {
	TcpConnectEvents *ebpf.Map `ebpf:"tcp_connect_events"`
}

// bpfObjects embeds the programs and maps, so callers can write objs.TraceTcpConnect.
type bpfObjects struct {
	bpfPrograms
	bpfMaps
}

// Close releases every program and map held by the collection.
func (o *bpfObjects) Close() error {
	return errors.Join(o.TraceTcpConnect.Close(), o.TcpConnectEvents.Close())
}

func loadBpfObjects(obj *bpfObjects, opts *ebpf.CollectionOptions) error {
	return errors.New("placeholder: run go generate to produce the real loader")
}

Step 3: Cilium Hubble Network Observability Setup

# cilium-values.yaml - Helm values for Cilium + Hubble
kubeProxyReplacement: true
hubble:
  enabled: true
  listenAddress: ":4244"
  relay:
    enabled: true
  ui:
    enabled: true
  metrics:
    enabled:
      - dns
      - drop
      - tcp
      - flow
      - icmp
      - http
    enableOpenMetrics: true
    dashboards:
      enabled: true
      namespace: monitoring
operator:
  replicas: 2
  prometheus:
    enabled: true
hostPort:
  enabled: true
ipam:
  mode: kubernetes
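# The old "tunnel" value was replaced by routingMode/tunnelProtocol in recent Cilium releases
routingMode: tunnel
tunnelProtocol: vxlan

# Install Cilium with Hubble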
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium --version 1.17.0 \
  --namespace kube-system \
  -f cilium-values.yaml

# Forward the Hubble Relay port to localhost:4245, then stream recent flows
cilium hubble port-forward &
hubble observe --since 1m --output json

# View DNS queries
hubble observe --protocol dns --since 5m

# View dropped TCP flows
hubble observe --protocol tcp --verdict DROPPED --since 10m

# View traffic for a specific service (format: namespace/name)
hubble observe --to-service default/my-app --since 5m

# Export flow logs to file
hubble observe --output json --since 1h > hubble-flows.json
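
An exported flow file can be summarized offline. The Go sketch below counts dropped flows per source pod and drop reason from JSON-lines input. The field names (`flow.verdict`, `flow.source.pod_name`, `flow.drop_reason_desc`) follow the Hubble JSON output as I understand it and should be checked against your version; the embedded sample is made up for illustration.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// flowLine picks out only the fields this summary needs from one JSON line.
type flowLine struct {
	Flow struct {
		Verdict string `json:"verdict"`
		Source  struct {
			Namespace string `json:"namespace"`
			PodName   string `json:"pod_name"`
		} `json:"source"`
		DropReasonDesc string `json:"drop_reason_desc"`
	} `json:"flow"`
}

// countDrops returns drop counts keyed by "namespace/pod reason".
func countDrops(jsonLines string) (map[string]int, error) {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(jsonLines))
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // flow lines can be long
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var fl flowLine
		if err := json.Unmarshal([]byte(line), &fl); err != nil {
			return nil, fmt.Errorf("bad flow line: %w", err)
		}
		if fl.Flow.Verdict != "DROPPED" {
			continue
		}
		key := fl.Flow.Source.Namespace + "/" + fl.Flow.Source.PodName + " " + fl.Flow.DropReasonDesc
		counts[key]++
	}
	return counts, sc.Err()
}

// sample is fabricated input in the assumed format, not real cluster output.
const sample = `{"flow":{"verdict":"DROPPED","source":{"namespace":"default","pod_name":"web-1"},"drop_reason_desc":"POLICY_DENIED"}}
{"flow":{"verdict":"FORWARDED","source":{"namespace":"default","pod_name":"web-1"}}}
{"flow":{"verdict":"DROPPED","source":{"namespace":"default","pod_name":"web-1"},"drop_reason_desc":"POLICY_DENIED"}}`

func main() {
	counts, err := countDrops(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(counts)
}
```

For a real export, read hubble-flows.json with os.ReadFile and pass its contents to `countDrops`, or adapt the function to take an io.Reader for large files.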

Hubble API Client (Go):

// hubble_client.go - Hubble Flow Monitoring Client
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"os/signal"
	"syscall"
	"time"

	// The Hubble API packages live in the cilium/cilium repository.
	"github.com/cilium/cilium/api/v1/flow"
	"github.com/cilium/cilium/api/v1/observer"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/protobuf/types/known/timestamppb"
)

func main() {
	// 4245 is the local port opened by `cilium hubble port-forward`.
	conn, err := grpc.NewClient("localhost:4245",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatalf("Failed to connect to Hubble gRPC: %v", err)
	}
	defer conn.Close()

	client := observer.NewObserverClient(conn)

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	stream, err := client.GetFlows(ctx, &observer.GetFlowsRequest{
		Whitelist: []*flow.FlowFilter{
			{Verdict: []flow.Verdict{flow.Verdict_DROPPED}},
		},
		// Since is a protobuf timestamp. With Follow set the stream stays open, so no Until is given.
		Since:  timestamppb.New(time.Now().Add(-5 * time.Minute)),
		Follow: true,
	})
	if err != nil {
		log.Fatalf("Failed to subscribe to Hubble flows: %v", err)
	}

	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)

	fmt.Println("Monitoring dropped network traffic...")
	fmt.Println("Time\t\tSource Pod\t\tDest Pod\t\tReason")

	go func() {
		<-sig
		cancel()
	}()

	for {
		resp, err := stream.Recv()
		if err != nil {
			if ctx.Err() != nil {
				return // cancelled by the signal handler, not a failure
			}
			log.Printf("Failed to receive flow data: %v", err)
			return
		}

		if f := resp.GetFlow(); f != nil {
			srcPod := f.GetSource().GetPodName()
			dstPod := f.GetDestination().GetPodName()
			reason := f.GetDropReasonDesc().String()

			fmt.Printf("%s\t%s\t%s\t%s\n",
				f.GetTime().AsTime().Local().Format("15:04:05"), // when the flow was observed
				srcPod,
				dstPod,
				reason,
			)
		}
	}
}

Step 4: Security Tracing—Process Execution Monitoring

// exec_monitor.bpf.c - Process Execution Security Monitor
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

#define MAX_COMM_LEN 16
#define MAX_ARGS_LEN 128
#define MAX_FILENAME_LEN 128

struct exec_event {
    u32 pid;
    u32 ppid;
    u32 uid;
    u32 gid;
    char comm[MAX_COMM_LEN];
    char filename[MAX_FILENAME_LEN];
    char args[MAX_ARGS_LEN];
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);
} exec_events SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, u32);
    __type(value, struct exec_event);
} pending_execs SEC(".maps");

SEC("tracepoint/sched/sched_process_exec")
int trace_exec(struct trace_event_raw_sched_process_exec *ctx)
{
    struct exec_event *event;
    event = bpf_ringbuf_reserve(&exec_events, sizeof(*event), 0);
    if (!event)
        return 0;

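    // Ring buffer memory is not zeroed; clear it so unused bytes (such as args) never carry stale data.
    __builtin_memset(event, 0, sizeof(*event));

    event->pid = bpf_get_current_pid_tgid() >> 32;
    u64 uid_gid = bpf_get_current_uid_gid();
    event->uid = uid_gid & 0xFFFFFFFF;
    event->gid = uid_gid >> 32;

    bpf_get_current_comm(&event->comm, sizeof(event->comm));

    // filename is a __data_loc field: the low 16 bits hold its offset from the start of ctx.
    unsigned int fname_off = ctx->__data_loc_filename & 0xFFFF;
    bpf_probe_read_kernel_str(&event->filename, sizeof(event->filename), (void *)ctx + fname_off);

    struct task_struct *task = (struct task_struct *)bpf_get_current_task();
    event->ppid = BPF_CORE_READ(task, real_parent, tgid);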

    bpf_ringbuf_submit(event, 0);
    return 0;
}

SEC("tracepoint/sched/sched_process_exit")
int trace_exit(struct trace_event_raw_sched_process_template *ctx)
{
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    bpf_map_delete_elem(&pending_execs, &pid);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";

Security Monitoring Policy Engine (Go):

// security_monitor.go - Process Execution Security Monitor
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"log"
	"os"
	"os/signal"
	"strings"
	"syscall"

	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/ringbuf"
	"github.com/cilium/ebpf/rlimit"
)

//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -type exec_event bpf exec_monitor.bpf.c

type execEvent struct {
	Pid      uint32
	Ppid     uint32
	Uid      uint32
	Gid      uint32
	Comm     [16]byte
	Filename [128]byte
	Args     [128]byte
}

type SecurityRule struct {
	Name        string
	Description string
	Check       func(event execEvent) bool
}

var securityRules = []SecurityRule{
	{
		Name:        "suspicious_shell",
		Description: "Detect suspicious shell execution",
		Check: func(e execEvent) bool {
			comm := strings.TrimSpace(string(bytes.TrimRight(e.Comm[:], "\x00")))
			return comm == "bash" || comm == "sh" || comm == "zsh"
		},
	},
	{
		Name:        "privilege_escalation",
		Description: "Detect potential privilege escalation",
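		Check: func(e execEvent) bool {
			filename := strings.TrimSpace(string(bytes.TrimRight(e.Filename[:], "\x00")))
			// Match the binary name exactly; Contains(filename, "su") would also flag
			// unrelated paths such as /usr/bin/resume.
			return strings.HasSuffix(filename, "/sudo") ||
				strings.HasSuffix(filename, "/su") ||
				strings.HasSuffix(filename, "/pkexec")
		},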
	},
	{
		Name:        "container_escape",
		Description: "Detect container escape risk",
		Check: func(e execEvent) bool {
			filename := strings.TrimSpace(string(bytes.TrimRight(e.Filename[:], "\x00")))
			return strings.Contains(filename, "nsenter") ||
				strings.Contains(filename, "docker") ||
				strings.Contains(filename, "crictl")
		},
	},
}

func main() {
	if err := rlimit.RemoveMemlock(); err != nil {
		log.Fatalf("Failed to remove memlock limit: %v", err)
	}

	objs := bpfObjects{}
	if err := loadBpfObjects(&objs, nil); err != nil {
		log.Fatalf("Failed to load eBPF objects: %v", err)
	}
	defer objs.Close()

	tpExec, err := link.Tracepoint("sched", "sched_process_exec", objs.TraceExec, nil)
	if err != nil {
		log.Fatalf("Failed to attach exec tracepoint: %v", err)
	}
	defer tpExec.Close()

	tpExit, err := link.Tracepoint("sched", "sched_process_exit", objs.TraceExit, nil)
	if err != nil {
		log.Fatalf("Failed to attach exit tracepoint: %v", err)
	}
	defer tpExit.Close()

	rd, err := ringbuf.NewReader(objs.ExecEvents)
	if err != nil {
		log.Fatalf("Failed to create ringbuf reader: %v", err)
	}
	defer rd.Close()

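	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
	done := make(chan struct{}) // closed on shutdown so the read loop can tell why Read failed

	fmt.Println("Security monitoring started...")

	go func() {
		<-sig
		close(done)
		rd.Close()
	}()

	for {
		record, err := rd.Read()
		if err != nil {
			// Read may wrap its "closed" error, so a plain == comparison against
			// ringbuf.ErrClosed is unreliable; check the shutdown signal instead.
			select {
			case <-done:
				return
			default:
			}
			log.Printf("Failed to read event: %v", err)
			continue
		}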

		var event execEvent
		if err := binary.Read(bytes.NewReader(record.RawSample), binary.LittleEndian, &event); err != nil {
			log.Printf("Failed to parse event: %v", err)
			continue
		}

		for _, rule := range securityRules {
			if rule.Check(event) {
				comm := string(bytes.TrimRight(event.Comm[:], "\x00"))
				filename := string(bytes.TrimRight(event.Filename[:], "\x00"))
				log.Printf("[ALERT] %s: PID=%d PPID=%d UID=%d Comm=%s File=%s",
					rule.Name, event.Pid, event.Ppid, event.Uid, comm, filename)
			}
		}
	}
}
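
Two details in rule matching are easy to get wrong: cutting a fixed-size kernel buffer at the first NUL, and matching a binary by its exact name rather than by substring. The standalone sketch below illustrates both; `cString` and `isBinary` are hypothetical helpers, not part of the monitor above.

```go
package main

import (
	"bytes"
	"fmt"
	"path"
)

// cString returns the bytes before the first NUL, ignoring whatever follows it.
func cString(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		b = b[:i]
	}
	return string(b)
}

// isBinary reports whether the final path element of filename is exactly one of names.
func isBinary(filename string, names ...string) bool {
	base := path.Base(filename)
	for _, n := range names {
		if base == n {
			return true
		}
	}
	return false
}

func main() {
	// A buffer as it might arrive from the kernel: a string, a NUL, then leftover bytes.
	var buf [128]byte
	copy(buf[:], "/usr/bin/sudo\x00stale-bytes")

	filename := cString(buf[:])
	fmt.Println(filename, isBinary(filename, "sudo", "su", "pkexec"))
	fmt.Println("/usr/bin/resume", isBinary("/usr/bin/resume", "sudo", "su", "pkexec"))
}
```

Exact-name matching is still easy to evade (a copied or renamed binary defeats it), so rules like these work best as a first signal alongside parent-process and UID context.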

Step 5: eBPF Performance Profiling—CPU Flame Graphs

# Generate CPU flame graph data using bpftrace
bpftrace -e 'profile:hz:99 /pid/ { @stacks[ustack, kstack] = count(); }' > profile.out

# Generate flame graph using BCC tools (-f emits the folded format that flamegraph.pl expects)
profile -F 99 -f -p <pid> 60 > perf.out
flamegraph.pl perf.out > cpu_flame.svg
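
flamegraph.pl consumes "folded" stacks: one line per unique stack, frames joined by semicolons from root to leaf, followed by a space and the sample count. The Go sketch below shows that folding step on already-symbolized samples. The input shape (a slice of frame names per sample) is an assumption for illustration; symbol resolution is out of scope here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// fold aggregates stack samples into folded lines: "root;child;leaf count".
// Each sample lists frame names ordered from root to leaf.
func fold(samples [][]string) []string {
	counts := make(map[string]int)
	for _, frames := range samples {
		counts[strings.Join(frames, ";")]++
	}
	lines := make([]string, 0, len(counts))
	for stack, n := range counts {
		lines = append(lines, fmt.Sprintf("%s %d", stack, n))
	}
	sort.Strings(lines) // stable output; flamegraph.pl does not require ordering
	return lines
}

func main() {
	samples := [][]string{
		{"main", "handler", "parse"},
		{"main", "handler", "parse"},
		{"main", "gc"},
	}
	for _, line := range fold(samples) {
		fmt.Println(line)
	}
}
```

Writing these lines to a file and passing it to flamegraph.pl produces the same kind of SVG as the BCC pipeline above.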

Go Performance Profiler:

// cpu_profiler.go - eBPF CPU Performance Profiler
package main

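import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"log"
	"os"
	"os/signal"
	"runtime"
	"syscall"
	"time"
	"unsafe"

	"github.com/cilium/ebpf/perf"
	"github.com/cilium/ebpf/rlimit"
	"golang.org/x/sys/unix"
)

//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -type stack_event bpf cpu_profiler.bpf.c

type stackEvent struct {
	Pid       uint32
	Tid       uint32
	KernelIp  [10]uint64
	UserIp    [10]uint64
	KstackLen uint32
	UstackLen uint32
}

// attachSampler opens a 99 Hz CPU-clock perf event on one CPU and attaches the
// BPF program to it with the PERF_EVENT_IOC_SET_BPF ioctl.
func attachSampler(progFD, cpu int) (int, error) {
	attr := unix.PerfEventAttr{
		Type:   unix.PERF_TYPE_SOFTWARE,
		Config: unix.PERF_COUNT_SW_CPU_CLOCK,
		Size:   uint32(unsafe.Sizeof(unix.PerfEventAttr{})),
		Sample: 99,
		Bits:   unix.PerfBitFreq, // treat Sample as a frequency in Hz
	}
	fd, err := unix.PerfEventOpen(&attr, -1, cpu, -1, unix.PERF_FLAG_FD_CLOEXEC)
	if err != nil {
		return -1, fmt.Errorf("perf_event_open on CPU %d: %w", cpu, err)
	}
	if err := unix.IoctlSetInt(fd, unix.PERF_EVENT_IOC_SET_BPF, progFD); err != nil {
		unix.Close(fd)
		return -1, fmt.Errorf("attach BPF program on CPU %d: %w", cpu, err)
	}
	if err := unix.IoctlSetInt(fd, unix.PERF_EVENT_IOC_ENABLE, 0); err != nil {
		unix.Close(fd)
		return -1, fmt.Errorf("enable perf event on CPU %d: %w", cpu, err)
	}
	return fd, nil
}

func main() {
	if err := rlimit.RemoveMemlock(); err != nil {
		log.Fatalf("Failed to remove memlock limit: %v", err)
	}

	objs := bpfObjects{}
	if err := loadBpfObjects(&objs, nil); err != nil {
		log.Fatalf("Failed to load eBPF objects: %v", err)
	}
	defer objs.Close()

	// One perf event per CPU. runtime.NumCPU assumes CPUs are numbered 0..N-1 and all online.
	for cpu := 0; cpu < runtime.NumCPU(); cpu++ {
		fd, err := attachSampler(objs.DoProfile.FD(), cpu)
		if err != nil {
			log.Fatalf("Failed to attach perf event: %v", err)
		}
		defer unix.Close(fd)
	}

	rd, err := perf.NewReader(objs.ProfileEvents, os.Getpagesize()*64)
	if err != nil {
		log.Fatalf("Failed to create perf reader: %v", err)
	}
	defer rd.Close()

	stackCounts := make(map[string]int)
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)

	ticker := time.NewTicker(30 * time.Second)
	defer ticker.Stop()

	fmt.Println("CPU profiling started, output every 30 seconds...")

	go func() {
		<-sig
		rd.Close()
	}()

	for {
		select {
		case <-ticker.C:
			fmt.Printf("\n=== CPU Profile at %s ===\n", time.Now().Format("15:04:05"))
			for stack, count := range stackCounts {
				if count > 10 {
					fmt.Printf("  %s: %d samples\n", stack, count)
				}
			}
			stackCounts = make(map[string]int)
		default:
			// Read blocks until a sample arrives; at 99 Hz per CPU that is frequent
			// enough for the ticker branch to be serviced.
			record, err := rd.Read()
			if err != nil {
				if errors.Is(err, perf.ErrClosed) {
					return
				}
				continue
			}

			if record.LostSamples != 0 {
				log.Printf("Lost %d samples", record.LostSamples)
				continue
			}

			var event stackEvent
			if err := binary.Read(bytes.NewReader(record.RawSample), binary.LittleEndian, &event); err != nil {
				continue
			}

			stackKey := fmt.Sprintf("pid=%d kstack=%d ustack=%d",
				event.Pid, event.KstackLen, event.UstackLen)
			stackCounts[stackKey]++
		}
	}
}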

CPU Profiler eBPF C Program:

// cpu_profiler.bpf.c - CPU Performance Sampling
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

#define MAX_STACK_DEPTH 10

struct stack_event {
    u32 pid;
    u32 tid;
    u64 kernel_ip[MAX_STACK_DEPTH];
    u64 user_ip[MAX_STACK_DEPTH];
    u32 kstack_len;
    u32 ustack_len;
};

struct {
    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
    __uint(key_size, sizeof(u32));
    __uint(value_size, sizeof(u32));
} profile_events SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_STACK_TRACE);
    __uint(max_entries, 10000);
    __uint(key_size, sizeof(u32));
    __uint(value_size, MAX_STACK_DEPTH * sizeof(u64));
} stacks SEC(".maps");

SEC("perf_event")
int do_profile(struct bpf_perf_event_data *ctx)
{
    // profile_events is a perf event array, not a ring buffer, so build the event
    // on the stack (176 bytes, under the 512-byte limit) and emit it with
    // bpf_perf_event_output.
    struct stack_event event = {};

    u64 pid_tgid = bpf_get_current_pid_tgid();
    event.pid = pid_tgid >> 32;
    event.tid = pid_tgid & 0xFFFFFFFF;

    int kstack_id = bpf_get_stackid(ctx, &stacks, 0);
    int ustack_id = bpf_get_stackid(ctx, &stacks, BPF_F_USER_STACK);

    // This sketch only records whether each stack was captured; the IP arrays stay zeroed.
    event.kstack_len = (kstack_id >= 0) ? MAX_STACK_DEPTH : 0;
    event.ustack_len = (ustack_id >= 0) ? MAX_STACK_DEPTH : 0;

    bpf_perf_event_output(ctx, &profile_events, BPF_F_CURRENT_CPU, &event, sizeof(event));
    return 0;
}

char LICENSE[] SEC("license") = "GPL";

Five Pitfalls to Avoid

Pitfall 1: Loading eBPF Programs Without Removing memlock Limits

Wrong approach:

// Load eBPF program directly without adjusting memlock
objs := bpfObjects{}
err := loadBpfObjects(&objs, nil)
// Error: failed to load eBPF objects: map create: operation not permitted

Correct approach:

// Remove memlock limit first, then load eBPF program
if err := rlimit.RemoveMemlock(); err != nil {
    log.Fatalf("Failed to remove memlock limit: %v", err)
}
objs := bpfObjects{}
if err := loadBpfObjects(&objs, nil); err != nil {
    log.Fatalf("Failed to load eBPF objects: %v", err)
}

Pitfall 2: Using Infinite Loops in eBPF Programs

Wrong approach:

// BPF verifier will reject infinite loops
SEC("kprobe/tcp_connect")
int trace_tcp(struct pt_regs *ctx) {
    while (1) {
        // Verifier error: back-edge in program
    }
    return 0;
}

Correct approach:

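// Use bounded loops; the verifier must be able to prove the loop terminates.
// Kernels 5.3+ accept bounded loops directly; #pragma unroll keeps older kernels working.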
SEC("kprobe/tcp_connect")
int trace_tcp(struct pt_regs *ctx) {
    #pragma unroll
    for (int i = 0; i < 10; i++) {
        // Max 10 iterations, verifier can accept this
    }
    return 0;
}

Pitfall 3: Ignoring BTF Compatibility Causing CO-RE Failures

Wrong approach:

# Run directly on target kernel without checking BTF support
./ebpf-program
# Error: CO-RE relocation failed: kernel does not support BTF

Correct approach:

# Check kernel BTF support first
bpftool btf list
ls /sys/kernel/btf/vmlinux

Add a BTF compatibility check in Go code:

// checkBTFSupport reports whether the running kernel exposes its own BTF.
func checkBTFSupport() error {
    if _, err := os.Stat("/sys/kernel/btf/vmlinux"); err != nil {
        return fmt.Errorf("kernel BTF not found (needs 5.4+ built with CONFIG_DEBUG_INFO_BTF=y, or an external BTF file): %w", err)
    }
    return nil
}
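
When the kernel has no built-in BTF, a common workaround is to ship a BTF file generated for that exact kernel and fall back to it. The sketch below shows only the lookup order; the /var/lib/ebpf/btf directory and the file naming are hypothetical, chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// btfCandidates lists where to look for BTF, in priority order, for a kernel release.
func btfCandidates(release string) []string {
	return []string{
		"/sys/kernel/btf/vmlinux",                          // BTF exposed by the kernel itself
		filepath.Join("/var/lib/ebpf/btf", release+".btf"), // hypothetical directory of pre-generated files
	}
}

// firstExisting returns the first candidate path that exists on disk.
func firstExisting(candidates []string) (string, error) {
	for _, p := range candidates {
		if _, err := os.Stat(p); err == nil {
			return p, nil
		}
	}
	return "", errors.New("no BTF file found in any candidate location")
}

func main() {
	path, err := firstExisting(btfCandidates("5.4.0-150-generic"))
	if err != nil {
		fmt.Println("CO-RE programs cannot be loaded on this node:", err)
		return
	}
	fmt.Println("using BTF from", path)
}
```

With cilium/ebpf, a fallback file can, as far as I know, be loaded with `btf.LoadSpec` and passed through `ebpf.CollectionOptions.Programs.KernelTypes`. Verify both names against the library version you use.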

Pitfall 4: Ring Buffer Not Properly Handled Causing Data Loss

Wrong approach:

// Ring buffer sized too small in the BPF program: events are dropped under high load
rd, err := ringbuf.NewReader(objs.Events) // capacity comes from max_entries in the C map definition
// Dropped events are never counted or reported

Correct approach:

// Set a sufficiently large ring buffer in eBPF C code
// __uint(max_entries, 256 * 1024); // 256KB

// Handle data loss in Go code
record, err := rd.Read()
if err != nil {
    if errors.Is(err, ringbuf.ErrClosed) {
        return
    }
    log.Printf("Read failed: %v", err)
    continue
}
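// Note: the ringbuf reader has no LostSamples counter (perf.Reader does). With a ring buffer,
// drops show up in the kernel as bpf_ringbuf_reserve returning NULL, so count them there.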
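
Loss can also happen after the read, when slow processing stalls the loop that drains the buffer. One way to keep the reader fast is to hand samples to a bounded queue and count what does not fit. The sketch below shows that pattern in isolation; the `Queue` type is an assumption for illustration, not part of cilium/ebpf.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Queue decouples the ring buffer reader from slower event processing.
type Queue struct {
	ch      chan []byte
	dropped atomic.Uint64
}

// NewQueue creates a queue that holds at most capacity pending samples.
func NewQueue(capacity int) *Queue {
	return &Queue{ch: make(chan []byte, capacity)}
}

// Offer enqueues without blocking. When the queue is full it counts a drop and returns false.
func (q *Queue) Offer(sample []byte) bool {
	select {
	case q.ch <- sample:
		return true
	default:
		q.dropped.Add(1)
		return false
	}
}

// Dropped returns how many samples were discarded because the queue was full.
func (q *Queue) Dropped() uint64 { return q.dropped.Load() }

func main() {
	q := NewQueue(4)
	// No consumer is running here, so everything past the capacity is dropped.
	for i := 0; i < 10; i++ {
		q.Offer([]byte{byte(i)})
	}
	fmt.Printf("queued=%d dropped=%d\n", len(q.ch), q.Dropped())
}
```

In a real loader the read loop would call `Offer(record.RawSample)` and a worker goroutine would receive from the channel; exporting `Dropped()` as a metric makes loss visible instead of silent.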

Pitfall 5: Hubble Not Properly Configured Causing Invisible Traffic

Wrong approach:

# Only enabled Hubble without configuring metrics and relay
hubble:
  enabled: true
  # Missing relay and metrics configuration

Correct approach:

hubble:
  enabled: true
  listenAddress: ":4244"
  relay:
    enabled: true
    rollOutPods: true
  ui:
    enabled: true
  metrics:
    enabled:
      - dns
      - drop
      - tcp
      - flow
      - icmp
      - http
    enableOpenMetrics: true
  networkPolicy:
    enabled: true

Error Troubleshooting Reference Table

| Error Message | Cause | Solution |
|---|---|---|
| failed to load eBPF objects: map create: operation not permitted | memlock limit not removed | Call rlimit.RemoveMemlock() or set ulimit -l unlimited |
| back-edge in program | eBPF program contains a loop the verifier cannot bound | Use bounded loops; add #pragma unroll on kernels older than 5.3 |
| CO-RE relocation failed: kernel does not support BTF | Kernel too old or built without BTF | Use a kernel built with CONFIG_DEBUG_INFO_BTF=y (5.4+ exposes /sys/kernel/btf/vmlinux), or supply a BTF file generated for that kernel |
| map create: read-only | Insufficient eBPF Map permissions | Check CAP_BPF/CAP_SYS_ADMIN capabilities |
| invalid argument: couldn't find kprobe target | Kernel function doesn't exist | Confirm the symbol in /proc/kallsyms or with bpftrace -l 'kprobe:NAME' |
| ringbuf reserve failed | Ring buffer is full | Increase ring buffer size, or reduce event frequency |
| Hubble agent not ready | Hubble not properly started | Check cilium status, confirm hubble-relay Pod is running |
| connection refused:4245 | Hubble gRPC port not exposed | Run cilium hubble port-forward |
| BPF verifier: unreachable instruction | Dead code or branches unverifiable by verifier | Simplify conditional logic, remove unreachable code |
| failed to attach perf event: invalid argument | Perf event parameters incorrect | Check CPU frequency and sampling rate parameters |

Three Advanced Optimization Techniques

Technique 1: eBPF Map Batch Operations to Reduce System Call Overhead

When user space talks to the kernel one map entry at a time, every lookup or update costs a system call. Batch operations (kernel 5.6+) process many entries per call:

// Batch update eBPF Map
func batchUpdateMap(m *ebpf.Map, entries map[uint32]uint64) error {
    keys := make([]uint32, 0, len(entries))
    values := make([]uint64, 0, len(entries))
    for k, v := range entries {
        keys = append(keys, k)
        values = append(values, v)
    }

    var batchSize = uint32(64)
    var done uint32

    for done < uint32(len(keys)) {
        remaining := uint32(len(keys)) - done
        if remaining < batchSize {
            batchSize = remaining
        }

        batchKeys := keys[done : done+batchSize]
        batchValues := values[done : done+batchSize]

        // BatchUpdate returns the number of entries written along with the error.
        _, err := m.BatchUpdate(batchKeys, batchValues, nil)
        if err != nil {
            return fmt.Errorf("batch update failed(offset=%d): %w", done, err)
        }
        done += batchSize
    }
    return nil
}
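
The slicing arithmetic is the part of batching that tends to hide off-by-one bugs, so it can help to isolate it in a small function that is testable without a kernel. The sketch below is a standalone illustration; `chunks` is a hypothetical helper, not part of cilium/ebpf.

```go
package main

import "fmt"

// chunks returns [start, end) index pairs that cover n items in batches of at most size.
func chunks(n, size int) [][2]int {
	if size <= 0 {
		return nil
	}
	var out [][2]int
	for start := 0; start < n; start += size {
		end := start + size
		if end > n {
			end = n // the final batch may be shorter
		}
		out = append(out, [2]int{start, end})
	}
	return out
}

func main() {
	keys := make([]uint32, 130)
	for _, c := range chunks(len(keys), 64) {
		batch := keys[c[0]:c[1]]
		fmt.Printf("batch [%d:%d) with %d keys\n", c[0], c[1], len(batch))
	}
}
```

Each pair can be used to slice the key and value arrays identically before handing them to the batch call.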

Technique 2: Tail Call-Based eBPF Program Chaining

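When a single eBPF program's logic grows too complex, tail calls let you split it into several programs. Each program is verified separately, so the chain as a whole can do more than a single program would be allowed to; the kernel limits how deep a chain may go: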

// tail_call_chain.bpf.c - Tail Call Chaining
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

#define MAX_TAIL_CALLS 4

struct {
    __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
    __uint(max_entries, MAX_TAIL_CALLS);
    __type(key, __u32);
    __type(value, __u32);
} tail_call_map SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);
} events SEC(".maps");

struct event_data {
    u32 phase;
    u32 pid;
    char comm[16];
};

SEC("kprobe/tcp_connect")
int phase0(struct pt_regs *ctx)
{
    struct event_data *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (!e) return 0;

    e->phase = 0;
    e->pid = bpf_get_current_pid_tgid() >> 32;
    bpf_get_current_comm(&e->comm, sizeof(e->comm));
    bpf_ringbuf_submit(e, 0);

    bpf_tail_call(ctx, &tail_call_map, 1);
    return 0;
}

SEC("kprobe/tcp_connect")
int phase1(struct pt_regs *ctx)
{
    struct event_data *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (!e) return 0;

    e->phase = 1;
    e->pid = bpf_get_current_pid_tgid() >> 32;
    bpf_get_current_comm(&e->comm, sizeof(e->comm));
    bpf_ringbuf_submit(e, 0);

    bpf_tail_call(ctx, &tail_call_map, 2);
    return 0;
}

SEC("kprobe/tcp_connect")
int phase2(struct pt_regs *ctx)
{
    struct event_data *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (!e) return 0;

    e->phase = 2;
    e->pid = bpf_get_current_pid_tgid() >> 32;
    bpf_get_current_comm(&e->comm, sizeof(e->comm));
    bpf_ringbuf_submit(e, 0);

    return 0;
}

char LICENSE[] SEC("license") = "GPL";

Registering the tail-call targets (Go):

// Register tail call sub-programs; each index must match the one passed to bpf_tail_call
progArray := objs.TailCallMap
if err := progArray.Update(uint32(1), uint32(objs.Phase1.FD()), ebpf.UpdateAny); err != nil {
    log.Fatalf("Failed to register tail call phase1: %v", err)
}
if err := progArray.Update(uint32(2), uint32(objs.Phase2.FD()), ebpf.UpdateAny); err != nil {
    log.Fatalf("Failed to register tail call phase2: %v", err)
}

Technique 3: eBPF Event Aggregation and Sampling to Reduce Data Volume

In high-traffic scenarios, kernel-space aggregation and sampling dramatically reduce the number of events user space needs to process:

// aggregate.bpf.c - Kernel-Space Event Aggregation
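#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>   // PT_REGS_PARM1
#include <bpf/bpf_core_read.h> // BPF_CORE_READ

struct flow_key {
    u32 saddr;
    u32 daddr;
    u16 dport;
    u8 protocol;
    u8 pad; // explicit padding so every byte of the hash key is initialized
};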

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 65536);
    __type(key, struct flow_key);
    __type(value, u64);
} flow_counter SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 65536);
    __type(key, struct flow_key);
    __type(value, u64);
} flow_latency SEC(".maps");

SEC("kprobe/tcp_sendmsg")
int count_sendmsg(struct pt_regs *ctx)
{
    struct flow_key key = {};
    struct sock *sk = (struct sock *)PT_REGS_PARM1(ctx);

    key.saddr = BPF_CORE_READ(sk, __sk_common.skc_rcv_saddr);
    key.daddr = BPF_CORE_READ(sk, __sk_common.skc_daddr);
    key.dport = BPF_CORE_READ(sk, __sk_common.skc_dport);
    key.protocol = IPPROTO_TCP;

    u64 *count = bpf_map_lookup_elem(&flow_counter, &key);
    if (count) {
        __sync_fetch_and_add(count, 1);
    } else {
        u64 init = 1;
        bpf_map_update_elem(&flow_counter, &key, &init, BPF_ANY);
    }

    return 0;
}

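char LICENSE[] SEC("license") = "GPL";

User-space polling of the aggregated data (Go, needs "encoding/binary", "fmt", "log", "net", "time"):

// flowKey mirrors struct flow_key in aggregate.bpf.c, including its padding byte.
type flowKey struct {
    Saddr    uint32
    Daddr    uint32
    Dport    uint16
    Protocol uint8
    Pad      uint8
}

// intToIP converts an address held in network byte order to net.IP (little-endian host assumed).
func intToIP(v uint32) net.IP {
    b := make([]byte, 4)
    binary.LittleEndian.PutUint32(b, v)
    return net.IP(b)
}

// pollAggregatedMap periodically prints flows whose counter exceeds 100.
func pollAggregatedMap(m *ebpf.Map, interval time.Duration) {
    ticker := time.NewTicker(interval)
    defer ticker.Stop()

    for range ticker.C {
        var key flowKey
        var value uint64
        iter := m.Iterate()

        fmt.Printf("\n=== Flow Stats at %s ===\n", time.Now().Format("15:04:05"))

        for iter.Next(&key, &value) {
            if value > 100 {
                srcIP := intToIP(key.Saddr)
                dstIP := intToIP(key.Daddr)
                port := key.Dport>>8 | key.Dport<<8 // the port is stored in network byte order
                fmt.Printf("  %s -> %s:%d: %d requests\n",
                    srcIP, dstIP, port, value)
            }
        }

        if err := iter.Err(); err != nil {
            log.Printf("Map iteration failed: %v", err)
        }
    }
}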
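
The counters in the map are cumulative, so a poller usually wants the change since the previous poll rather than the raw value. The sketch below computes per-second rates from two snapshots. It is standalone Go with a trimmed-down `flowKey`; the counter-reset handling (treating a smaller value as a fresh counter) is my assumption about how evicted and re-created entries should be interpreted.

```go
package main

import "fmt"

// flowKey is a reduced key for this sketch; the real one mirrors struct flow_key.
type flowKey struct {
	Saddr, Daddr uint32
	Dport        uint16
}

// rates turns two cumulative snapshots into events per second for each flow in cur.
func rates(prev, cur map[flowKey]uint64, seconds float64) map[flowKey]float64 {
	out := make(map[flowKey]float64, len(cur))
	if seconds <= 0 {
		return out
	}
	for k, now := range cur {
		before := prev[k] // zero for flows that are new in this interval
		delta := now
		if now >= before {
			delta = now - before
		} // else: the entry was reset, so count from zero
		out[k] = float64(delta) / seconds
	}
	return out
}

func main() {
	a := flowKey{Saddr: 1, Daddr: 2, Dport: 443}
	b := flowKey{Saddr: 1, Daddr: 3, Dport: 53}

	prev := map[flowKey]uint64{a: 100}
	cur := map[flowKey]uint64{a: 400, b: 50}

	for k, r := range rates(prev, cur, 10) {
		fmt.Printf("%+v: %.1f req/s\n", k, r)
	}
}
```

Keeping the previous snapshot in the polling loop and swapping it after each tick is enough to feed this function.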

Observability Solution Comparison Analysis

| Dimension | eBPF | Prometheus | OpenTelemetry | Istio | Datadog |
|---|---|---|---|---|---|
| Data Source | Kernel space | App/Exporter | App SDK | Sidecar proxy | Agent+SDK |
| Performance Overhead | Very low (<1%) | Low | Medium (SDK overhead) | Medium-High (Sidecar) | Medium |
| Code Intrusiveness | Zero | Needs Exporter | Needs SDK | Needs Sidecar | Needs Agent |
| Kernel Visibility | Complete | None | None | None | Partial |
| Network Visibility | L3-L7 | L7 metrics | L7 tracing | L4-L7 | L3-L7 |
| Security Auditing | Native support | Needs extra tools | Needs extra tools | Policy logs | Native support |
| Real-time | Microsecond | Second | Millisecond | Millisecond | Second |
| Learning Curve | Steep | Gentle | Medium | Medium | Gentle |
| Multi-Cluster Support | Needs custom build | Federation | Native | Multi-cluster Mesh | Native |
| Cost | Open source free | Open source free | Open source free | Open source free | Commercial paid |
| Use Case | Deep kernel tracing | Metrics monitoring | Distributed tracing | Service mesh | All-in-one monitoring |

Summary

eBPF is not a silver bullet for observability, but it fills the kernel-space monitoring gap that user-space tools cannot reach. In a K8s observability stack, eBPF should serve as the lowest-level data source, complementing Prometheus metrics and OpenTelemetry traces: eBPF tells you "what happened in the kernel," Prometheus tells you "how the system is performing," and OpenTelemetry tells you "what the request experienced." Combining all three is what delivers full-stack observability.
