Build Your First eBPF App

Introduction
Introduction to eBPF and BCC
Most Basic Example
Turn It Up A Notch
Crank It Up Again
Prometheus
Tidbits

Wednesday, 29th March 2023

eBPF is a game-changing technology that lets you supercharge the Linux kernel and peek into its inner workings like never before! We'll be using the power of C to craft our eBPF masterpiece and trusty Python to manage and decode its events. Sure, Rust enthusiasts might argue that you can do all this with Rust too, but why complicate things? Let's keep it simple and fun, and trust me, you'll be blown away by what we can achieve together! Are you ready? Let's get started!

Introduction to eBPF and BCC

If you haven't done so yet, I highly recommend catching up on my previous eBPF video to get all the juicy context and purpose behind this incredible technology. Trust me, you won't want to miss it!

Now, let's gear up and prep our workspace for some eBPF magic! We'll be installing the BPF Compiler Collection toolkit (BCC) to give us all the tools we need for eBPF wizardry.

Install BCC and its dependencies:
Ubuntu: sudo apt-get install bpfcc-tools libbpfcc-dev
Fedora: sudo dnf install bcc
Other distributions: install the equivalent packages

Set up a Python virtual environment:
python3 -m venv venv
source venv/bin/activate

Most Basic Example

We'll be tapping into the pulse of the sys_clone function, which springs into action when a brand-new process is born.

That may sound confusing, so here's an example. Let's say you run a program or command on your computer, like opening a web browser or running a script. When this happens, the operating system needs to create a new process to manage the execution of that program or command. The sys_clone function plays a crucial role in this process creation. We can create a hook which taps into that kernel function and lets us monitor it.

Let me show you. Also, make sure to check out the comments, as I will post the source code.

Let's kick things off by crafting our eBPF program in C. Create a file called ebpf-watch.c.

lang c

int kprobe__sys_clone(void *ctx) {
    bpf_trace_printk("Hello, World!\\n");
    return 0;
}

We prefix our function with kprobe__, which tells the BCC library to attach it to a kprobe. The rest of the name, sys_clone, selects the kernel function we want to watch. The ctx variable is a pointer to the current event context. With bpf_trace_printk, we light up the kernel trace buffer with a simple message. Perfect for debugging and testing our creation! Finally, we return 0 to signal that the function finished without a problem.

Next, let's whip up our Python script to join forces with the eBPF program. Let's call it ebpf-runner.py:

lang python

#!/usr/bin/env python3

from bcc import BPF
from pathlib import Path

bpf_source = Path('ebpf-watch.c').read_text()
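# Compile the C source and load it into the kernel.
bpf = BPF(text=bpf_source)
# Print trace buffer lines until interrupted with Ctrl+C.
bpf.trace_print()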

We import the bcc library, fresh from our operating system's package manager. Loading the C code from its file, we put the pedal to the metal, compiling it with BPF and loading it into the kernel. The trace_print call keeps us in the loop: it blocks and prints each new line that arrives in the trace buffer.

Ready for liftoff? Remember to run the code with privileged access:

lang shell

$ sudo python ebpf-runner.py
b'            node-699799  [014] d..31 78173.681052: bpf_trace_printk: Hello, World!\\n'
b'            node-762656  [003] d..31 78173.706845: bpf_trace_printk: Hello, World!\\n'
b'              sh-762656  [003] d..31 78173.706960: bpf_trace_printk: Hello, World!\\n'
b'            node-699799  [000] d..31 78173.720799: bpf_trace_printk: Hello, World!\\n'
b'            node-762659  [004] d..31 78173.745352: bpf_trace_printk: Hello, World!\\n'
b'              sh-762659  [004] d..31 78173.745465: bpf_trace_printk: Hello, World!\\n'
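Each of those lines packs several fields together: the task name and PID, the CPU, some flags, a timestamp, and finally our message. As a rough sketch, assuming the layout seen in the sample output above, a small regular expression can pull them apart. The parse_trace_line helper is a made-up name for illustration; BCC also offers a trace_fields() method that returns these fields directly.

```python
import re

# Layout assumed from the sample output above:
#   <task>-<pid>  [<cpu>] <flags> <timestamp>: <function>: <message>
TRACE_LINE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+"
    r"\[(?P<cpu>\d+)\]\s+"
    r"(?P<flags>\S+)\s+"
    r"(?P<timestamp>\d+\.\d+):\s+"
    r"(?P<function>\w+):\s+"
    r"(?P<message>.*)$"
)


def parse_trace_line(line):
    """Split one trace buffer line into a dict, or return None if it does not match."""
    if isinstance(line, bytes):
        line = line.decode("utf-8", errors="replace")
    match = TRACE_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["cpu"] = int(fields["cpu"])
    fields["timestamp"] = float(fields["timestamp"])
    return fields


sample = b"            node-699799  [014] d..31 78173.681052: bpf_trace_printk: Hello, World!"
print(parse_trace_line(sample))
```

This is only handy for quick debugging. For real data, a structured channel between the kernel and user space is a better fit than parsing text.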

And there you have it—our first eBPF program! It's simple, fast, and, well, a little useless for now. But just wait until we unleash its full power!

Turn It Up A Notch

Let's turn it up a notch and make it even more powerful and informative. No more plain "Hello, World" messages; we want real actionable data!

Our eBPF program runs in a super-secure sandboxed environment right alongside the mighty kernel. Meanwhile, our Python application is chilling in "user space" – a fancy term for where our everyday processes hang out. Now, even though our eBPF program has some limitations (you know, for security and stability reasons), that doesn't mean it can't flex its muscles! In fact, it's able to send data over to our user space application, which can then report, record, and perform all sorts of amazing tricks with the received data.

Let's do this. We can enhance our C code by adding extra header files to unlock access to vital data structures and functions. Next, we'll define a data_t structure to store essential process information like PID (Process ID), PPID (Parent Process ID), and the process name itself:

lang c

#include <uapi/linux/ptrace.h>
#include <linux/sched.h>
#include <bcc/proto.h>

struct data_t {
    u32 pid;
    u32 ppid;
    char comm[TASK_COMM_LEN];
};

BPF_PERF_OUTPUT(events);

int kprobe__sys_clone(void *ctx) {
    struct data_t data = {};
    struct task_struct *task;

    task = (struct task_struct *)bpf_get_current_task();
    // Upper 32 bits hold the process ID (tgid); the lower 32 bits hold the thread ID.
    data.pid = bpf_get_current_pid_tgid() >> 32;
    data.ppid = task->real_parent->tgid;
    bpf_get_current_comm(&data.comm, sizeof(data.comm));

    events.perf_submit(ctx, &data, sizeof(data));
    return 0;
}

Now, it's time to unleash the power of "BPF_PERF_OUTPUT"! This nifty definition allows us to send data from our eBPF program right back to our Python application running in user space. We'll use the 'events' reference later to transmit our data.

With everything set, we can gather the current task using the bpf_get_current_task function and send our data back to our Python application with events.perf_submit.
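One detail worth knowing: bpf_get_current_pid_tgid returns a single 64-bit value that packs two IDs together. The upper 32 bits are the tgid, which is what user space calls the process ID, and the lower 32 bits are the kernel's pid, which is really the thread ID. Storing the raw value in a u32 field would keep only the lower half, while shifting right by 32 gives the process ID. Here is a minimal Python sketch of that arithmetic; split_pid_tgid and the sample numbers are made up for illustration.

```python
def split_pid_tgid(pid_tgid):
    """Return (process_id, thread_id) from a packed 64-bit pid_tgid value."""
    process_id = pid_tgid >> 32        # upper 32 bits: tgid, the user-space PID
    thread_id = pid_tgid & 0xFFFFFFFF  # lower 32 bits: kernel pid, the thread ID
    return process_id, thread_id


# Hypothetical value: process 699799 with a worker thread 699801.
packed = (699799 << 32) | 699801
print(split_pid_tgid(packed))
```

For a single-threaded process both halves are the same number, which is why the difference is easy to miss.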

But hold on, there's more!

Our Python code needs a little makeover to properly report the collected information. Let's say goodbye to the trace_print() method since we're done with printing log lines straight from our eBPF buffer. Instead, we'll listen to the perf_output from the eBPF application and pass the results to our brand-new logging method.

lang python

#!/usr/bin/env python3

from bcc import BPF
from pathlib import Path


def process_event(cpu, data, size):
    event = bpf["events"].event(data)
    print(f"Process {event.comm.decode()} (PID: {event.pid}, PPID: {event.ppid}) called sys_clone")


bpf_source = Path('ebpf-watch.c').read_text()
bpf = BPF(text=bpf_source)

bpf["events"].open_perf_buffer(process_event)

print("Monitoring sys_clone events... Press Ctrl+C to exit.")
while True:
    try:
        bpf.perf_buffer_poll()
    except KeyboardInterrupt:
        break

For now, let's keep things simple. Our process_event function takes in three parameters: the CPU index where the event was generated, the event data itself, and the size of the event data.
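If you're curious what the data parameter really is, it's just raw bytes laid out like our C struct. The call to bpf["events"].event(data) turns those bytes into an object with named fields. Below is a rough sketch of the same idea using only ctypes, assuming the data_t layout from our C code. The Data class and decode_event function are hand-written for illustration and are not part of BCC.

```python
import ctypes

TASK_COMM_LEN = 16  # length of the kernel's task name buffer


class Data(ctypes.Structure):
    """Hand-written mirror of struct data_t from the C program."""
    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("ppid", ctypes.c_uint32),
        ("comm", ctypes.c_char * TASK_COMM_LEN),
    ]


def decode_event(raw):
    """Interpret raw event bytes as a Data record."""
    return Data.from_buffer_copy(raw)


# Build a fake event by hand; a real one would arrive from the perf buffer.
fake = bytes(Data(pid=769242, ppid=699799, comm=b"sh"))
event = decode_event(fake)
print(f"Process {event.comm.decode()} (PID: {event.pid}, PPID: {event.ppid})")
```

The takeaway is that the field order and sizes on the Python side have to match the C struct exactly, which BCC normally handles for us.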

Finally, we continuously poll the perf buffer for new events!

If we run this, we can see our output packed with exciting process data:

lang shell

$ sudo python ebpf-runner.py
Monitoring sys_clone events... Press Ctrl+C to exit.
Process node (PID: 699799, PPID: 698907) called sys_clone
Process sh (PID: 769242, PPID: 699799) called sys_clone
Process sh (PID: 769242, PPID: 699799) called sys_clone
Process node (PID: 699799, PPID: 698907) called sys_clone

Crank It Up Again

Let's take our eBPF probe to new heights by expanding its reach to even more events!

First, we'll whip up a shiny new data structure to hold the payload for our latest event – monitoring file open events. Then, we'll create a brand-new perf_output to send this data back to our Python application.

lang c

#include <uapi/linux/ptrace.h>
#include <linux/sched.h>
#include <linux/fs.h>
#include <bcc/proto.h>

struct clone_data_t {
    u32 pid;
    u32 ppid;
    char comm[TASK_COMM_LEN];
};

struct open_data_t {
    u32 pid;
    u64 timestamp;
    char comm[TASK_COMM_LEN];
    char filename[NAME_MAX];
};

BPF_PERF_OUTPUT(clone_events);
BPF_PERF_OUTPUT(open_events);

int kprobe__sys_clone(void *ctx) {
    struct clone_data_t data = {};
    struct task_struct *task;

    task = (struct task_struct *)bpf_get_current_task();
    // Upper 32 bits hold the process ID (tgid); the lower 32 bits hold the thread ID.
    data.pid = bpf_get_current_pid_tgid() >> 32;
    data.ppid = task->real_parent->tgid;
    bpf_get_current_comm(&data.comm, sizeof(data.comm));

    clone_events.perf_submit(ctx, &data, sizeof(data));
    return 0;
}

// The kernel function is do_sys_openat2(dfd, filename, how); we only need the first two arguments.
int kprobe__do_sys_openat2(struct pt_regs *ctx, int dfd, const char __user *filename) {
    struct open_data_t data = {};

    // Keep the process ID (tgid) from the upper 32 bits.
    data.pid = bpf_get_current_pid_tgid() >> 32;
    data.timestamp = bpf_ktime_get_ns();
    bpf_get_current_comm(&data.comm, sizeof(data.comm));

    bpf_probe_read_user_str(&data.filename, sizeof(data.filename), filename);

    open_events.perf_submit(ctx, &data, sizeof(data));
    return 0;
}

Time to do a bit of housekeeping: let's tweak our sys_clone function to point to its dedicated perf_output. And now, for the grand finale, we'll craft an awesome new function called kprobe__do_sys_openat2! This beauty will hook into the kernel's do_sys_openat2 function like a charm, collecting valuable file opening information.

I think you guys know what's next. A few swift changes to our Python code, and we'll be reporting on both events in no time! Just create a new event handler, print that fresh event information, and voilà – we're now in full control of an eBPF probe that's more powerful and informative than ever before! I think it is time to take a break to celebrate our eBPF mastery!

lang python

#!/usr/bin/env python3

from bcc import BPF
from pathlib import Path

def process_clone_event(cpu, data, size):
    event = bpf["clone_events"].event(data)
    print(f"Process {event.comm.decode()} (PID: {event.pid}, PPID: {event.ppid}) called sys_clone")

def process_open_event(cpu, data, size):
    event = bpf["open_events"].event(data)
    print(f"[{event.timestamp / 1e9:.6f}] Process {event.comm.decode()} (PID: {event.pid}) opened file: {event.filename.decode()}")

bpf_source = Path('ebpf_watch_both.c').read_text()
bpf = BPF(text=bpf_source)

bpf["clone_events"].open_perf_buffer(process_clone_event)
bpf["open_events"].open_perf_buffer(process_open_event)
print("Monitoring sys_clone and file open events... Press Ctrl+C to exit.")

while True:
    try:
        bpf.perf_buffer_poll()
    except KeyboardInterrupt:
        break
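A quick note on that timestamp: bpf_ktime_get_ns counts nanoseconds since boot, so the printed value is not a calendar time. If you'd rather log wall-clock times, one approach is to work out the offset between the monotonic clock and Unix time once at startup and add it to every event. This is a sketch under that assumption; the function names are made up, and the result is approximate because the two clocks are read at slightly different moments.

```python
import time
from datetime import datetime, timezone


def monotonic_to_wall_offset():
    """Seconds to add to a CLOCK_MONOTONIC reading to get Unix time (approximate)."""
    return time.time() - time.clock_gettime(time.CLOCK_MONOTONIC)


def format_event_time(timestamp_ns, offset_seconds):
    """Convert a monotonic nanosecond timestamp into an ISO 8601 UTC string."""
    unix_seconds = offset_seconds + timestamp_ns / 1e9
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).isoformat()


offset = monotonic_to_wall_offset()  # compute once at startup
now_ns = time.clock_gettime_ns(time.CLOCK_MONOTONIC)  # stand-in for event.timestamp
print(format_event_time(now_ns, offset))
```

Inside process_open_event you would pass event.timestamp in place of now_ns.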

Prometheus

We can now think about how to bring this functionality into our organisation. Let's integrate it with Prometheus, the go-to open-source monitoring and alerting toolkit that'll make your metrics shine!

First things first, let's snatch the Python prometheus_client package with a quick pip install. With that in place, let's get our hands on it by importing the package into our Python application.

Now, it's time to create a couple of Prometheus metrics: sys_clone_calls_total and sys_openat_calls_total. Fire up an HTTP server to host our Prometheus metrics and let it work its magic in the background.

Next up, let's expand our process_clone_event and process_open_event functions by incrementing our new Prometheus metrics. Once it is running, all we have to do is browse to our new server on port 3000, and we will see the metrics.

lang python

#!/usr/bin/env python3

from bcc import BPF
from pathlib import Path
from prometheus_client import start_http_server, Counter

sys_clone_counter = Counter('sys_clone_calls_total', 'Number of sys_clone calls')
sys_openat_counter = Counter('sys_openat_calls_total', 'Number of sys_openat calls')

start_http_server(3000)


def process_clone_event(cpu, data, size):
    event = bpf["clone_events"].event(data)
    print(f"Process {event.comm.decode()} (PID: {event.pid}, PPID: {event.ppid}) called sys_clone")
    sys_clone_counter.inc()


def process_open_event(cpu, data, size):
    event = bpf["open_events"].event(data)
    print(
        f"[{event.timestamp / 1e9:.6f}] Process {event.comm.decode()} (PID: {event.pid}) opened file: {event.filename.decode()}")
    sys_openat_counter.inc()


bpf_source = Path('ebpf_watch_both.c').read_text()
bpf = BPF(text=bpf_source)

bpf["clone_events"].open_perf_buffer(process_clone_event)
bpf["open_events"].open_perf_buffer(process_open_event)
print("Monitoring sys_clone and file open events... Press Ctrl+C to exit.")

while True:
    try:
        bpf.perf_buffer_poll()
    except KeyboardInterrupt:
        break

Once it has started scraping, Prometheus will let us automate alerts and query this new valuable data. Why not step it up a notch and integrate with tools like Grafana to transform your data into stunning charts?

Unleash the full potential of your implementation by tracking even more metrics per syscall. Add labels to filter metrics, such as by file path, and get ready to be blown away by the insights you'll uncover!
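One word of caution on labels: every distinct label value becomes its own time series in Prometheus, so using the full file path as a label can balloon quickly. A common trick is to collapse each path into a coarser bucket first. Here is a minimal sketch of that idea, where path_label is a made-up helper and grouping by top-level directory is just one possible choice. You could then pass its result as the label value on a Counter declared with a label name.

```python
from pathlib import PurePosixPath


def path_label(filename):
    """Collapse a file path to its top-level directory to keep label values few."""
    path = PurePosixPath(filename)
    if not path.is_absolute():
        return "(relative)"      # relative paths share one bucket
    if len(path.parts) == 1:
        return "/"               # the root directory itself
    return "/" + path.parts[1]   # e.g. /etc/passwd -> /etc


for name in ["/etc/passwd", "/usr/lib/x86_64-linux-gnu/libc.so.6", "notes.txt", "/"]:
    print(name, "->", path_label(name))
```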

Tidbits

Some quick tidbits to leave you guys with:

Streamline your eBPF programs! Keep your code lean, mean, and efficient. Offload as much logic as you can to your userspace application.
Test across kernel versions! Don't let kernel updates catch you off guard. Ensure your eBPF programs are rock-solid by testing them on different kernel versions.
Explore the world of syscalls! There's a treasure trove of syscalls waiting to be discovered. Dive deep into their purpose and unlock endless possibilities for your eBPF applications.
Leverage the power of strace! This fantastic tool lets you peek under the hood and see what syscalls an application is using. It's like having X-ray vision for your apps!
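
To make the strace tip concrete, here is a small sketch that tallies which syscalls show up in strace-style output, which can help you pick what to hook next. The sample lines are hand-written in strace's usual "name(args) = result" shape, not captured from a real run, and count_syscalls is a made-up helper. strace can also print a summary itself with its -c flag.

```python
import re
from collections import Counter

# A syscall line starts with the call name followed by "(".
# An optional leading PID (as printed when following child processes) is skipped.
SYSCALL = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(?P<name>\w+)\(")


def count_syscalls(lines):
    """Count how often each syscall name appears in strace-style output."""
    counts = Counter()
    for line in lines:
        match = SYSCALL.match(line)
        if match:
            counts[match.group("name")] += 1
    return counts


# Hand-written sample lines, not captured output.
sample = [
    'openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
    'read(3, "...", 832) = 832',
    'openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 4',
    '[pid 12345] clone(child_stack=NULL, flags=SIGCHLD) = 12346',
    '+++ exited with 0 +++',
]
print(count_syscalls(sample).most_common())
```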

Now it's time for you to go forth and harness the power of eBPF in your own projects.
