eBPF is a powerful and versatile technology that allows you to run custom programs in the Linux kernel. It's used for a wide range of purposes, from networking to security, and even observability. If you're eager to dive deeper into observability, don't miss our two previous thrilling videos on eBPF!

Today, we are going to meet XDP, the eXpress Data Path - a high-performance data path that harnesses the power of eBPF to process packets at the lowest possible level in the kernel. With this remarkable duo, you can create lightning-fast, flexible, and efficient networking applications that will leave you in awe!

We will discover the 'why' and 'how' behind this technology and set up an example project. Make sure you have the bcc package installed on your Linux machine.


Basic Packet Viewer

Without further ado, let's dive right into the action and kick things off by crafting our very own ebpf-probe.c file:

lang c

// Include necessary header files for eBPF, XDP, and networking
#include <uapi/linux/bpf.h>
#include <uapi/linux/if_ether.h>
#include <uapi/linux/if_packet.h>
#include <uapi/linux/ip.h>
#include <linux/in.h>
#include <bcc/helpers.h>

// Create a BPF_ARRAY named packet_count_map with a single __u64 element
// This map will store the packet count
BPF_ARRAY(packet_count_map, __u64, 1);

// Define the XDP program function, xdp_packet_counter
// This function is triggered for every incoming packet on the attached network interface
int xdp_packet_counter(struct xdp_md *ctx) {
    // Declare a key variable to access the packet_count_map
    __u32 key = 0;
    // Declare a pointer to a __u64 variable, which will point to the counter value in the packet_count_map
    __u64 *counter;

    // Lookup the counter value in the packet_count_map using the key
    counter = packet_count_map.lookup(&key);
    // If the lookup fails (counter is NULL), abort the XDP program
    if (!counter)
        return XDP_ABORTED;

    // Atomically increment the counter value by 1
    // This ensures that the counter is updated correctly even when multiple packets are processed concurrently
    __sync_fetch_and_add(counter, 1);

    // Pass the packet along to the next stage in the networking stack
    return XDP_PASS;
}

First up, we create a function named xdp_packet_counter, which is triggered for every incoming packet on the attached network interface.

We'll craft a BPF_ARRAY called packet_count_map, which elegantly holds a single __u64 element, keeping track of the packet count. As each packet rolls in, our program deftly searches for the counter value in the packet_count_map using the key 0.

When the lookup succeeds, our program increments the counter value by 1 with the help of __sync_fetch_and_add, ensuring that even when multiple packets are processed simultaneously, the counter stays accurate. Finally, like a relay race, the packet is handed off to the next stage in the networking stack as the program returns XDP_PASS.

Now we need some Python code to load and manage our fancy new eBPF program. We do this by creating a new file, ebpf-runner.py:

lang python

# Import necessary libraries
from bcc import BPF
from time import sleep
from pathlib import Path
import signal


class TerminateSignal(Exception):
    pass


# Signal handler for SIGTERM
def handle_sigterm(signum, frame):
    raise TerminateSignal("Received SIGTERM, terminating...")


# Load and compile the eBPF program from the source file
def load_bpf_program():
    bpf_source = Path('ebpf-probe.c').read_text()
    bpf = BPF(text=bpf_source)
    return bpf


# Attach the eBPF program to the specified interface
def attach_xdp_program(bpf, interface):
    xdp_fn = bpf.load_func("xdp_packet_counter", BPF.XDP)
    bpf.attach_xdp(interface, xdp_fn, 0)
    return bpf


# Detach the eBPF program from the specified interface
def detach_xdp_program(bpf, interface):
    bpf.remove_xdp(interface, 0)


# Main function to execute the script
def main():
    # Register the signal handler for SIGTERM
    signal.signal(signal.SIGTERM, handle_sigterm)

    # Define the network interface to monitor
    INTERFACE = "enp5s0"

    # Load the eBPF program and attach it to the network interface
    bpf = load_bpf_program()
    attach_xdp_program(bpf, INTERFACE)

    # Access the packet_count_map defined in the eBPF program
    packet_count_map = bpf.get_table("packet_count_map")

    try:
        print("Counting packets, press Ctrl+C to stop...")
        prev_total_packets = 0
        while True:
            # Sleep for 1 second before checking the packet count again
            sleep(1)
            total_packets = 0
            # Iterate over keys in the packet_count_map and sum their values
            for key in packet_count_map.keys():
                counter = packet_count_map[key]
                if counter:
                    total_packets += counter.value

            # Calculate the number of packets received per second
            packets_per_second = total_packets - prev_total_packets
            prev_total_packets = total_packets
            print(f"Packets per second: {packets_per_second}")
    except (KeyboardInterrupt, TerminateSignal) as e:
        # KeyboardInterrupt carries no message, so fall back to a fixed label
        print(f"{str(e) or 'Received Ctrl+C'}. Interrupting eBPF runner.")
    finally:
        print("Detaching eBPF program and exiting.")
        # Detach the eBPF program from the network interface and clean up when the script is terminated
        detach_xdp_program(bpf, INTERFACE)


# Execute the main function when the script is run directly
if __name__ == "__main__":
    main()

This Python script will unleash the power of the bcc library to load, compile, and attach an eBPF program to your chosen network interface! The eBPF program, aptly named xdp_packet_counter, takes on the crucial role of counting incoming packets.

Our script features a custom exception class TerminateSignal and a signal handler handle_sigterm to tackle the SIGTERM signal, making sure the script exits gracefully when needed. It also defines functions to load the eBPF program, attach the program to a network interface, and detach it from the interface.
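The SIGTERM-to-exception pattern can be tried out without any eBPF involved. The sketch below is a minimal, self-contained illustration: run_until_terminated and its cleanup callback are hypothetical names, and the process sends SIGTERM to itself purely to demonstrate the flow.

```python
import os
import signal
import time


class TerminateSignal(Exception):
    pass


def handle_sigterm(signum, frame):
    # Turn the asynchronous signal into an ordinary exception
    raise TerminateSignal("Received SIGTERM, terminating...")


def run_until_terminated(cleanup):
    """Hypothetical helper: wait for SIGTERM, then always run cleanup()."""
    previous = signal.signal(signal.SIGTERM, handle_sigterm)
    try:
        # For demonstration only: deliver SIGTERM to this very process
        os.kill(os.getpid(), signal.SIGTERM)
        for _ in range(100):
            time.sleep(0.01)  # the handler interrupts this wait
        return "not terminated"
    except TerminateSignal as e:
        return str(e)
    finally:
        cleanup()  # stands in for detach_xdp_program(bpf, INTERFACE)
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)  # restore the old handler


events = []
message = run_until_terminated(lambda: events.append("detached"))
print(message, events)
```

Because the cleanup sits in the finally block, it runs whether the loop ends through SIGTERM, Ctrl+C, or an unexpected error.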

In the main function, our script registers the signal handler for SIGTERM, loads the eBPF program, and attaches it to the specified network interface like a well-oiled machine. Be sure to change the INTERFACE variable to whatever interface your computer is using. Then, it grabs the packet_count_map from the eBPF program and dives into a loop to count packets. This loop takes a 1-second power nap before checking the packet count again, calculates the number of packets received per second, and proudly displays the results. The script can be interrupted with a simple Ctrl+C press or by receiving a SIGTERM signal.
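If you're unsure which name to put in INTERFACE, Python's standard library can list the interfaces the kernel knows about. A small sketch; the helper name list_interfaces is only for illustration:

```python
import socket


def list_interfaces():
    """Return the names of all network interfaces on this machine."""
    # socket.if_nameindex() yields (index, name) tuples, e.g. (1, 'lo')
    return [name for _index, name in socket.if_nameindex()]


if __name__ == "__main__":
    for name in list_interfaces():
        print(name)
```

Pick the interface that carries your traffic and assign its name to INTERFACE.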

As the script comes to an end, it detaches the eBPF program from the network interface and makes a stylish exit. Detaching the eBPF program is vital; otherwise, it would stay attached to the interface and keep running in the kernel, unmanaged, even after the script has exited.

When we run our code with sudo python ebpf-runner.py, we will be greeted with this output:

Counting packets, press Ctrl+C to stop...
Packets per second: 4
Packets per second: 19
Packets per second: 18
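
Each printed figure is simply the difference between two consecutive readings of the cumulative counter. As a sketch, the cumulative totals below are made-up values, chosen so that the differences reproduce the output above:

```python
def per_second_rates(totals):
    """Turn cumulative packet totals, sampled once per second, into per-second rates."""
    rates = []
    prev_total = 0
    for total in totals:
        rates.append(total - prev_total)
        prev_total = total
    return rates


# Hypothetical cumulative readings of packet_count_map[0]
print(per_second_rates([4, 23, 41]))  # [4, 19, 18]
```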

Advanced XDP Actions

Let's delve into some advanced XDP actions that allow you to manipulate and process packets in various ways.

XDP boasts a collection of powerful actions that grant you ultimate control over packet processing:

  • XDP_PASS: Effortlessly pass the packet to the next stage in the network stack.
  • XDP_DROP: Silently drop the packet, halting further processing in its tracks.
  • XDP_TX: Transmit the packet straight back out of the interface it arrived on.
  • XDP_REDIRECT: Redirect the packet to a different interface, another CPU, or a user-space AF_XDP socket.
  • XDP_ABORTED: Drop the packet and signal an error by raising the xdp_exception tracepoint.

With these dynamic actions at your fingertips, you'll have unparalleled flexibility and control over packet processing, empowering you to create cutting-edge networking applications.
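
Under the hood, these actions are plain integers defined by enum xdp_action in the kernel's linux/bpf.h. Mirroring them in Python can be handy when decoding return codes in user-space tooling; the XdpAction class below is an illustrative name, not part of bcc:

```python
from enum import IntEnum


class XdpAction(IntEnum):
    """Numeric values of enum xdp_action from linux/bpf.h."""
    XDP_ABORTED = 0
    XDP_DROP = 1
    XDP_PASS = 2
    XDP_TX = 3
    XDP_REDIRECT = 4


print(XdpAction(1).name)  # XDP_DROP
```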


Creating A Basic Firewall

Time to drop these pesky unwanted network packets with XDP_DROP. By tweaking our existing code, we'll craft a nifty firewall that puts a stop to incoming connections from 8.8.8.8.

lang c

#include <uapi/linux/bpf.h>
#include <uapi/linux/if_ether.h>
#include <uapi/linux/if_packet.h>
#include <uapi/linux/ip.h>
#include <linux/in.h>
#include <bcc/helpers.h>

BPF_ARRAY(packet_count_map, __u64, 1);

// The function takes in a pointer to an XDP metadata struct and the source IP address to block
static int drop_packet_to_destination(struct xdp_md *ctx, __be32 blocked_ip) {
    // Extract the end of the packet data from the metadata struct
    void *data_end = (void *)(long)ctx->data_end;

    // Extract the start of the packet data from the metadata struct
    void *data = (void *)(long)ctx->data;

    // Get a pointer to the Ethernet header of the packet
    struct ethhdr *eth = data;

    // If the Ethernet header extends beyond the end of the packet data, pass the packet
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    // If the Ethernet header does not indicate an IP packet, pass the packet
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    // Get a pointer to the IP header of the packet
    struct iphdr *iph = (struct iphdr *)(data + ETH_HLEN);

    // If the IP header extends beyond the end of the packet data, pass the packet
    if ((void *)(iph + 1) > data_end)
        return XDP_PASS;

    // If the source IP address of the packet matches the blocked IP address, drop the packet
    if (iph->saddr == blocked_ip) {
        return XDP_DROP;
    }

    // Otherwise, pass the packet
    return XDP_PASS;
}


// Define the XDP program function, xdp_packet_counter
// This function is triggered for every incoming packet on the attached network interface
int xdp_packet_counter(struct xdp_md *ctx) {
    // Declare a key variable to access the packet_count_map
    __u32 key = 0;
    // Declare a pointer to a __u64 variable, which will point to the counter value in the packet_count_map
    __u64 *counter;

    // Lookup the counter value in the packet_count_map using the key
    counter = packet_count_map.lookup(&key);
    // If the lookup fails (counter is NULL), abort the XDP program
    if (!counter)
        return XDP_ABORTED;

    // Atomically increment the counter value by 1
    // This ensures that the counter is updated correctly even when multiple packets are processed concurrently
    __sync_fetch_and_add(counter, 1);

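    // Build the blocked source IP (8.8.8.8) and hand it to drop_packet_to_destination.
    // Note: the shifts yield a host-order value. It equals the network-order saddr here only
    // because all four octets are identical; wrap other addresses in bpf_htonl().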
    __be32 blocked_ip = (8 << 24) | (8 << 16) | (8 << 8) | 8;
    return drop_packet_to_destination(ctx, blocked_ip);
}

Introducing the drop_packet_to_destination function! This addition checks if a packet's source IP address matches a given blocked IP, and if it does – BAM – the packet gets dropped. We'll integrate this into our xdp_packet_counter function, right after incrementing the packet counter.
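
To see the parsing logic in isolation, here is a user-space Python model of the same checks. It is only a sketch for experimenting with raw frames, not how the kernel runs the program; xdp_verdict and build_frame are hypothetical helpers.

```python
import socket
import struct

ETH_HLEN = 14        # Ethernet header length in bytes
ETH_P_IP = 0x0800    # EtherType for IPv4
IPHDR_LEN = 20       # sizeof(struct iphdr), without options


def xdp_verdict(frame: bytes, blocked_ip: str) -> str:
    """Mirror drop_packet_to_destination: return 'XDP_DROP' or 'XDP_PASS'."""
    # Bounds check: is there a full Ethernet header?
    if len(frame) < ETH_HLEN:
        return "XDP_PASS"
    # EtherType sits in the last 2 bytes of the Ethernet header (big-endian)
    (eth_proto,) = struct.unpack("!H", frame[12:14])
    if eth_proto != ETH_P_IP:
        return "XDP_PASS"
    # Bounds check: is there a full IPv4 header?
    if len(frame) < ETH_HLEN + IPHDR_LEN:
        return "XDP_PASS"
    # saddr is at offset 12 of the IPv4 header, in network byte order
    saddr = frame[ETH_HLEN + 12:ETH_HLEN + 16]
    if saddr == socket.inet_aton(blocked_ip):
        return "XDP_DROP"
    return "XDP_PASS"


def build_frame(src_ip: str, dst_ip: str) -> bytes:
    """Build a minimal Ethernet + IPv4 frame (zeroed MACs, no payload, checksum left at 0)."""
    eth = bytes(12) + struct.pack("!H", ETH_P_IP)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, IPHDR_LEN, 0, 0, 64, 1, 0,
                     socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    return eth + ip


print(xdp_verdict(build_frame("8.8.8.8", "192.0.2.1"), "8.8.8.8"))  # XDP_DROP
print(xdp_verdict(build_frame("1.1.1.1", "192.0.2.1"), "8.8.8.8"))  # XDP_PASS
```

The two length checks play the role of the data_end comparisons in the C code, which the eBPF verifier insists on before any header field may be read.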

We'll build the blocked IP address without any external dependencies, using shifts and a bit of bitwise OR magic. Here's how we conjure up the blocked_ip variable:

  1. (8 << 24) zaps the value 8 left by 24 bits, transforming it into 0x08000000.
  2. (8 << 16) shifts the value 8 left by 16 bits, morphing it into 0x00080000.
  3. (8 << 8) slides the value 8 left by 8 bits, creating 0x00000800.
  4. 8 stays as is, becoming 0x00000008.

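When these values unite with bitwise OR (|), they form 0x08080808, corresponding to the IPv4 address 8.8.8.8. The __be32 type serves as an alias for a 32-bit big-endian integer, which is widely used to represent IPv4 addresses in network byte order. One caveat: the shifts produce a value in host byte order. That is harmless here because all four octets of 8.8.8.8 are identical, but on a little-endian machine any other address has to be converted with bpf_htonl before it is compared with iph->saddr.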
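
The role of byte order is easy to check from Python. The sketch below compares the integer produced by the shifts with the integer a little-endian CPU would read from the address bytes in the packet; shifted_value and as_read_little_endian are illustrative names.

```python
import socket
import struct


def shifted_value(a, b, c, d):
    """Combine four octets the way the C code does: (a << 24) | (b << 16) | (c << 8) | d."""
    return (a << 24) | (b << 16) | (c << 8) | d


def as_read_little_endian(ip: str) -> int:
    """Integer a little-endian CPU sees when loading the 4 network-order address bytes."""
    return struct.unpack("<I", socket.inet_aton(ip))[0]


# 8.8.8.8 is a byte-order palindrome, so both views agree
print(hex(shifted_value(8, 8, 8, 8)), hex(as_read_little_endian("8.8.8.8")))
# 1.2.3.4 is not: the shifted value must be byte-swapped (htonl) before comparing
print(hex(shifted_value(1, 2, 3, 4)), hex(as_read_little_endian("1.2.3.4")))
```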

Good news – no Python code updates needed!

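Now, if you dare to ping through this interface using ping -I enp5s0 8.8.8.8, you'll witness the impressive power of our code: the echo requests still leave your machine, but every reply from 8.8.8.8 is dropped before it reaches the network stack, so ping reports 100% packet loss. Victory!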


Adding Transparency To Our Firewall

Imagine effortlessly gaining insights into the packets being dropped by our eBPF program! Let's amp up our observability game and make it happen.

Start by defining a new BPF_PERF_OUTPUT(debug_events); right beneath the BPF_ARRAY definition. This will allow us to send events from our eBPF program to user space.

Next, it's time for a little makeover of our drop if block:

lang c

if (iph->saddr == blocked_ip) {
    __be32 saddr_copy = iph->saddr;
    debug_events.perf_submit(ctx, &saddr_copy, sizeof(saddr_copy));

    return XDP_DROP;
}

With this change, every dropped packet is reported to our ebpf-runner application, which we can then update as follows:

lang python

# Import necessary libraries
from bcc import BPF
from time import sleep
from pathlib import Path
import signal
import ctypes
import socket
import struct


class TerminateSignal(Exception):
    pass


# Signal handler for SIGTERM
def handle_sigterm(signum, frame):
    raise TerminateSignal("Received SIGTERM, terminating...")


# Load and compile the eBPF program from the source file
def load_bpf_program():
    bpf_source = Path('ebpf-probe.c').read_text()
    bpf = BPF(text=bpf_source)
    return bpf


# Attach the eBPF program to the specified interface
def attach_xdp_program(bpf, interface):
    xdp_fn = bpf.load_func("xdp_packet_counter", BPF.XDP)
    bpf.attach_xdp(interface, xdp_fn, 0)
    return bpf


# Detach the eBPF program from the specified interface
def detach_xdp_program(bpf, interface):
    bpf.remove_xdp(interface, 0)


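# Callback invoked for every event submitted through debug_events
def print_debug_event(cpu, data, size):
    # The payload is the 4-byte source address, stored in network byte order.
    # Read it as a native integer and pack it back natively to preserve the byte layout.
    src_ip = ctypes.cast(data, ctypes.POINTER(ctypes.c_uint32)).contents.value
    print(f"Packet from {socket.inet_ntoa(struct.pack('=L', src_ip))} dropped")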

# Main function to execute the script
def main():
    # Register the signal handler for SIGTERM
    signal.signal(signal.SIGTERM, handle_sigterm)

    # Define the network interface to monitor
    INTERFACE = "enp5s0"

    # Load the eBPF program and attach it to the network interface
    bpf = load_bpf_program()
    attach_xdp_program(bpf, INTERFACE)

    # Access the packet_count_map defined in the eBPF program
    packet_count_map = bpf.get_table("packet_count_map")
    # Register print_debug_event as the callback for events sent via debug_events
    bpf["debug_events"].open_perf_buffer(print_debug_event)

    try:
        print("Counting packets, press Ctrl+C to stop...")
        prev_total_packets = 0
        while True:
            # Sleep for 1 second before checking the packet count again
            sleep(1)
            total_packets = 0
            # Iterate over keys in the packet_count_map and sum their values
            for key in packet_count_map.keys():
                counter = packet_count_map[key]
                if counter:
                    total_packets += counter.value

            # Calculate the number of packets received per second
            packets_per_second = total_packets - prev_total_packets
            prev_total_packets = total_packets
            print(f"Packets per second: {packets_per_second}")
            # Drain pending drop events, waiting at most 1 ms
            bpf.perf_buffer_poll(1)
    except (KeyboardInterrupt, TerminateSignal) as e:
        # KeyboardInterrupt carries no message, so fall back to a fixed label
        print(f"{str(e) or 'Received Ctrl+C'}. Interrupting eBPF runner.")
    finally:
        print("Detaching eBPF program and exiting.")
        # Detach the eBPF program from the network interface and clean up when the script is terminated
        detach_xdp_program(bpf, INTERFACE)


# Execute the main function when the script is run directly
if __name__ == "__main__":
    main()

We extend our original Python code to print a message whenever a packet is dropped because its source IP address matches the blocked IP address (8.8.8.8). The new print_debug_event function does the printing, and we register it as the callback with open_perf_buffer. And don't forget, we call bpf.perf_buffer_poll(1) on every loop iteration to pick up pending events (the argument is a timeout in milliseconds), keeping us in the loop. From here, the same events could be forwarded to Prometheus or any other observability stack we fancy.
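
The decoding step in the callback can be exercised on its own. A minimal sketch, assuming the event payload is exactly the 4 address bytes copied from the packet; format_drop_event is a hypothetical helper that takes those raw bytes:

```python
import socket


def format_drop_event(payload: bytes) -> str:
    """Render a 4-byte, network-order IPv4 source address as a log line."""
    if len(payload) < 4:
        raise ValueError("expected at least 4 bytes of address data")
    # inet_ntoa interprets the bytes in network order, exactly as they sit in the packet
    return f"Packet from {socket.inet_ntoa(payload[:4])} dropped"


print(format_drop_event(bytes([8, 8, 8, 8])))  # Packet from 8.8.8.8 dropped
```

Working on the raw bytes sidesteps byte-order questions entirely, since no integer conversion takes place.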


Offload to your NIC

Running our XDP program on the host's CPU is cool, but doing it on the NIC? That's next-level awesome! Let's look at what is happening in our current flow:

  1. Packet arrives at the physical Network Interface Card (NIC) from the outside world.
  2. Our NIC processes the packet and generates an interrupt to signal the CPU.
  3. The Linux kernel receives the interrupt and schedules the appropriate driver to handle the incoming packet.
  4. The packet is placed in a receive buffer by the NIC driver.
  5. The eBPF XDP program is executed by the kernel, processing the packet before it reaches the full network stack.
  6. Depending on the XDP action chosen by the eBPF program:
     • If XDP_PASS, the packet continues through the Linux kernel networking stack.
     • If XDP_DROP or XDP_ABORTED, the packet is dropped and does not continue through the network stack.
     • If XDP_TX or XDP_REDIRECT, the packet is transmitted or redirected to another interface without going through the rest of the network stack.
  7. If the packet continues through the network stack (XDP_PASS), it goes through the following stages:
     • Network layer processing (e.g., IP layer)
     • Transport layer processing (e.g., TCP, UDP)
  8. The packet is handed off to the appropriate socket buffer.
  9. The userspace application reads the packet from the socket buffer.

Our current implementation runs solely on our host's CPU, after being initialised by the Linux kernel. By letting our trusty Network Interface Card handle the heavy lifting, we can avoid the need for our CPU to step in for packets that are dropped or redirected on the card:

  1. Packet arrives at the physical Network Interface Card (NIC) from the outside world.
  2. The eBPF XDP program is executed directly on the NIC hardware, processing the packet before it reaches the Linux kernel.
  3. Depending on the XDP action chosen by the eBPF program:
     • If XDP_PASS, the packet continues to the Linux kernel.
     • If XDP_DROP or XDP_ABORTED, the packet is dropped and does not continue to the kernel.
     • If XDP_TX or XDP_REDIRECT, the packet is transmitted or redirected to another interface without necessarily going through the Linux kernel networking stack.
  4. If the packet continues to the Linux kernel (XDP_PASS), the kernel processes the packet as normal.

To unleash the power of NIC offloading, make sure your hardware is up to the task. While consumer-grade devices and laptops might not be quite ready, high-performance and data center grade NICs are born for this job. Keep in mind, though, that not all XDP features are supported in offloaded mode. Consult your NIC's documentation to ensure seamless compatibility.
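
The attach mode is selected through the flags argument that our script currently passes as 0. The sketch below defines the XDP_FLAGS_* bit values from the kernel's linux/if_link.h and a hypothetical xdp_flags_for helper that maps a mode name to them. To our knowledge, bcc exposes the same constants as BPF.XDP_FLAGS_SKB_MODE, BPF.XDP_FLAGS_DRV_MODE and BPF.XDP_FLAGS_HW_MODE, and its bundled XDP example also passes the interface name as a device argument when loading a program for offload; treat those details as assumptions and check them against your bcc version.

```python
# Bit values of the XDP attach flags from linux/if_link.h
XDP_FLAGS_SKB_MODE = 1 << 1   # generic mode, runs inside the network stack
XDP_FLAGS_DRV_MODE = 1 << 2   # native mode, runs in the NIC driver
XDP_FLAGS_HW_MODE = 1 << 3    # offloaded mode, runs on the NIC itself


def xdp_flags_for(mode: str) -> int:
    """Hypothetical helper: map a mode name to the flags value for attach_xdp."""
    flags = {
        "default": 0,  # let the kernel pick (what our script does today)
        "generic": XDP_FLAGS_SKB_MODE,
        "native": XDP_FLAGS_DRV_MODE,
        "offload": XDP_FLAGS_HW_MODE,
    }
    if mode not in flags:
        raise ValueError(f"unknown XDP mode: {mode}")
    return flags[mode]


# e.g. bpf.attach_xdp(INTERFACE, xdp_fn, xdp_flags_for("offload"))
print(xdp_flags_for("offload"))  # 8
```

Whatever flags value is used to attach would also need to be passed to remove_xdp when detaching, so that the right program is removed.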


Outro

By embracing optimization techniques and prioritizing observability, you can take your projects to new heights. So, get ready to unleash your creativity and explore the endless possibilities that XDP has to offer. Happy coding, and here's to your next high-performance networking masterpiece!