Netdata's extended Berkeley Packet Filter (eBPF) collector monitors kernel-level metrics for file descriptors, virtual filesystem IO, and process management on Linux systems. You can use our eBPF collector to analyze how and when a process accesses files, when it makes system calls, whether it leaks memory or creates zombie processes, and more.
Netdata's eBPF monitoring toolkit uses two custom eBPF programs. The default, called entry, monitors calls to a variety of kernel functions, such as _do_fork, and more. The return program also monitors the return of each kernel function to deliver more granular metrics about how your system and its applications interact with the Linux kernel.
eBPF monitoring can help you troubleshoot and debug how applications interact with the Linux kernel. See our guide on troubleshooting apps with eBPF metrics for configuration and troubleshooting tips.
Enable the collector on Linux
The eBPF collector is installed and enabled by default on most new installations of the Agent. The eBPF collector does not currently work with static build installations, but improved support is in active development.
eBPF monitoring only works on Linux systems and with specific Linux kernels, including all kernels newer than 4.11.0 and all kernels on CentOS 7.6 or later.
If your Agent is v1.22 or older, you may need to enable the collector yourself. See the configuration section for details.
The eBPF collector creates an eBPF menu in the Agent's dashboard along with three sub-menus: File, VFS, and Process. All the charts in this section update every second. The collector stores the actual value inside of its process, but charts only show the difference between the values collected in the previous and current seconds.
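The relationship between the stored values and the charted values can be sketched as follows. This is only an illustration of the "difference between consecutive seconds" idea, not the collector's actual code; the function name and the sample numbers are hypothetical.

```python
from typing import List


def per_second_deltas(cumulative: List[int]) -> List[int]:
    """Return the difference between each pair of consecutive samples."""
    return [curr - prev for prev, curr in zip(cumulative, cumulative[1:])]


# Hypothetical cumulative call counts, sampled once per second.
samples = [100, 130, 130, 175]
print(per_second_deltas(samples))  # [30, 0, 45]
```

A counter that does not change between two samples produces a zero on the chart, even though the stored total is non-zero.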
This group has two charts demonstrating how software interacts with the Linux kernel to open and close file descriptors.
This chart contains two dimensions that show the number of calls to the functions
Software does not commonly call these functions directly, but they are behind the system calls
This chart shows the number of times some software tried and failed to open or close a file descriptor.
A virtual file system (VFS) is a layer on top of regular filesystems. The functions present inside this API are used for all filesystems, so it's possible the charts in this group won't show all the actions that occurred on your system.
This chart monitors calls for vfs_unlink. This function is responsible for removing objects from the file system.
This chart shows the number of calls to the functions
This chart also monitors
vfs_write, but instead shows the total of bytes read and written with these
The Agent displays the number of bytes written as negative because they are moving down to disk.
The Agent counts and shows the number of instances where a running program experiences a read or write error.
For this group, the eBPF collector monitors process/thread creation and process end, and then displays any errors in the following charts.
Internally, the Linux kernel treats both processes and threads as tasks. To create a thread, the kernel offers a few
clone(2). In turn, each of these system calls uses the function
generate this chart, the eBPF collector monitors _do_fork to populate the process dimension, and monitors sys_clone to identify threads.
Ending a task requires two steps. The first is a call to the internal function do_exit, which notifies the operating system that the task is finishing its work. The second step is to release the kernel information with the internal function release_task. The difference between the two dimensions can help you discover zombie processes.
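As a rough illustration of how the two dimensions can be compared, the sketch below sums per-second call counts for do_exit and release_task over a window and reports the gap. The function name and the sample counts are hypothetical, and a positive gap is only a hint worth investigating, not proof of zombie processes.

```python
from typing import Iterable, Tuple


def pending_release(samples: Iterable[Tuple[int, int]]) -> int:
    """Sum per-second (do_exit, release_task) call counts and return the gap."""
    exited = released = 0
    for exit_calls, release_calls in samples:
        exited += exit_calls
        released += release_calls
    return exited - released


# Hypothetical per-second counts: (calls to do_exit, calls to release_task).
window = [(4, 4), (6, 3), (2, 2)]
print(pending_release(window))  # 3
```

A gap that keeps growing across many windows suggests that tasks are exiting without being released.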
The functions responsible for ending tasks do not return values, so this chart contains information about failures on process and thread creation.
Enable or disable the entire eBPF collector by editing
To enable the collector, scroll down to the [plugins] section and ensure the relevant line references
ebpf_process), is uncommented, and is set to
You can also configure the eBPF collector's behavior by editing
[global] section defines settings for the whole eBPF collector.
ebpf load mode
The collector has two different eBPF programs. These programs monitor the same functions inside the kernel, but they monitor, process, and display different kinds of information.
By default, this plugin uses the entry mode. Changing this mode can create significant overhead on your operating system, but can also offer valuable information if you are developing or debugging software. The ebpf load mode option accepts the following values:
- entry: This is the default mode. In this mode, the eBPF collector only monitors calls for the functions described in the sections above, and does not show charts related to errors.
- return: In the return mode, the eBPF collector monitors the same kernel functions as entry, but also creates new charts for the return of these functions, such as errors. Monitoring function returns can help in debugging software, such as failing to close file descriptors or creating zombie processes.
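To make the two accepted values concrete, here is a minimal sketch that reads the ebpf load mode option from a [global] section and rejects anything other than entry or return. It assumes the configuration uses a `key = value` syntax that Python's configparser can read; the function name and the sample text are hypothetical, and this is not how the collector itself parses its configuration.

```python
import configparser

# The two values described above.
VALID_LOAD_MODES = {"entry", "return"}


def read_load_mode(config_text: str) -> str:
    """Return the configured load mode, defaulting to 'entry' when unset."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    mode = parser.get("global", "ebpf load mode", fallback="entry").strip()
    if mode not in VALID_LOAD_MODES:
        raise ValueError(f"unsupported ebpf load mode: {mode!r}")
    return mode


# Hypothetical configuration text (assumed syntax).
sample = """
[global]
ebpf load mode = return
"""
print(read_load_mode(sample))  # return
```

Falling back to entry when the option is absent mirrors the documented default.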
The eBPF collector also creates charts for each running application through an integration with the apps.plugin. This integration helps you understand how specific applications interact with the Linux kernel.
When the integration is enabled, your dashboard will also show the following charts using low-level Linux metrics:
- eBPF syscall
- Number of calls to open files.
- Number of files closed.
- Number of calls to delete files.
- Number of calls to
- Number of calls to
- Number of bytes written through
- Number of bytes read through
- Number of processes created through
- Number of threads created through
__x86_64_sys_clone, depending on your system's kernel version.
- Number of times that a process called
- Number of calls to open files that returned errors.
- Number of calls to close files that returned errors.
- Number of calls to write a file that returned errors.
- Number of calls to read a file that returned errors.
- eBPF net
- Number of bytes transmitted per second.
If you want to disable these charts, change the setting
disable apps to
The eBPF collector enables and runs the following eBPF programs by default:
- process: This eBPF program creates charts that show information about process creation, VFS IO, and files removed. When in return mode, it also creates charts showing errors when these operations are executed.
- network viewer: This eBPF program creates charts with information about
UDP functions, including the bandwidth consumed by each.
If the eBPF collector does not work, you can troubleshoot it by running the ebpf.plugin command and investigating its output.
You can also use grep to search the Agent's error.log for messages related to eBPF monitoring.
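If you prefer to script the log search, the following sketch does the equivalent of a case-insensitive grep for "ebpf". The function name is hypothetical and the sample lines are made-up placeholders, not real Agent output; the location of error.log on your system is something you need to supply yourself.

```python
from typing import Iterable, List


def ebpf_log_lines(lines: Iterable[str]) -> List[str]:
    """Keep only the lines that mention eBPF, ignoring case."""
    return [line.rstrip("\n") for line in lines if "ebpf" in line.lower()]


# Placeholder lines for illustration only; real log messages look different.
placeholder_log = [
    "placeholder message mentioning ebpf.plugin\n",
    "placeholder message about something else\n",
    "placeholder message mentioning EBPF in upper case\n",
]
for line in ebpf_log_lines(placeholder_log):
    print(line)
```

To use it on a real file, pass an open file object, for example `ebpf_log_lines(open(path_to_error_log))`.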
Confirm kernel compatibility
The eBPF collector only works on Linux systems and with specific Linux kernels. We support all kernels more recent than 4.11.0, and all kernels on CentOS 7.6 or later.
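The version rule can be sketched as a simple comparison. This example is not the helper script; it only parses a kernel release string (such as the one returned by `uname -r` or Python's `platform.release()`) and checks whether it is more recent than 4.11.0. It reads "more recent than" literally, so 4.11.0 itself is not counted, and it does not handle the CentOS 7.6 exception. The function names are hypothetical.

```python
import re
from typing import Tuple

# Threshold taken from the text above.
MIN_KERNEL = (4, 11, 0)


def parse_kernel_release(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a release string like '5.4.0-42-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def is_newer_than_minimum(release: str) -> bool:
    """True when the release is strictly newer than MIN_KERNEL."""
    return parse_kernel_release(release) > MIN_KERNEL


# Example release strings; replace with your own system's value.
for release in ("5.4.0-42-generic", "4.9.210", "4.11.0"):
    print(release, is_newer_than_minimum(release))
```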
You can run our helper script to determine whether your system can support eBPF monitoring.
If this script returns no output, your system is ready to compile and run the eBPF collector.
If you see a warning about a missing kernel configuration (KPROBES, KPROBES_ON_FTRACE, HAVE_KPROBES, BPF, BPF_SYSCALL, BPF_JIT), you will need to recompile your kernel to support this configuration. The process of recompiling Linux kernels varies based on your distribution and version. Read the documentation for your system's distribution to learn more about the specific workflow for recompiling the kernel, ensuring that you set all the necessary
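The kind of check involved can be sketched as follows. This is not the helper script; it is a minimal illustration that scans kernel configuration text for the six options named above, using the standard `CONFIG_` prefix. Where that text lives on your system (commonly a file under /boot or /proc/config.gz) is an assumption you need to verify, and the function name is hypothetical.

```python
from typing import List

# Options named in the warning described above.
REQUIRED_OPTIONS = [
    "KPROBES",
    "KPROBES_ON_FTRACE",
    "HAVE_KPROBES",
    "BPF",
    "BPF_SYSCALL",
    "BPF_JIT",
]


def missing_kernel_options(config_text: str) -> List[str]:
    """Return the required options that are not set to 'y' or 'm'."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line[len("CONFIG_"):].split("=", 1)
            if value in ("y", "m"):
                enabled.add(name)
    return [opt for opt in REQUIRED_OPTIONS if opt not in enabled]


# Hypothetical configuration excerpt with one option disabled.
sample = """
CONFIG_KPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_HAVE_KPROBES=y
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
# CONFIG_BPF_JIT is not set
"""
print(missing_kernel_options(sample))  # ['BPF_JIT']
```

An empty list means every option in the list is enabled in the text you passed in.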
The eBPF collector also requires both the
debugfs filesystems. Try mounting the
filesystems using the commands below:
If they are already mounted, you will see an error. You can also configure your system's /etc/fstab configuration to mount these filesystems on startup. More information can be found in the ftrace documentation.
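Before mounting anything, you can check what is already mounted. The sketch below parses text in the format of /proc/mounts (device, mount point, filesystem type, options) and reports which required filesystem types are absent. The function names and the sample text are hypothetical; only debugfs is checked in the example, so pass any other filesystem types your setup requires.

```python
from typing import Iterable, List, Set


def mounted_fs_types(mounts_text: str) -> Set[str]:
    """Collect the filesystem type (third field) of every mount entry."""
    types = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            types.add(fields[2])
    return types


def missing_filesystems(mounts_text: str, required: Iterable[str]) -> List[str]:
    """Return the required filesystem types that are not mounted."""
    mounted = mounted_fs_types(mounts_text)
    return [fs for fs in required if fs not in mounted]


# Hypothetical mount table without debugfs.
sample_mounts = """sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
"""
print(missing_filesystems(sample_mounts, ["debugfs"]))  # ['debugfs']
```

On a live system, you could pass the contents of /proc/mounts instead of the sample text.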
Because eBPF monitoring is complex, we are evaluating the performance of this new collector in various real-world conditions, across various system loads, and when monitoring complex applications.
Our initial testing shows the performance of the eBPF collector is nearly identical to our apps.plugin collector, despite collecting and displaying much more sophisticated metrics. You can now use the eBPF collector to gather deeper insights without affecting the performance of your complex applications at any load.
When SELinux is enabled, it may prevent
starting correctly. Check the Agent's
error.log file for errors like the ones below:
You can also check for errors related to
If you see similar errors, you will have to adjust SELinux's policies to enable the eBPF collector.
Creation of bpf policies
To allow ebpf.plugin to run on a distribution with SELinux enabled, take the following steps.
First, stop the Netdata Agent.
Next, create a policy with the audit.log file you examined earlier.
This will create two new files:
netdata_ebpf.te file to change the options
allow. You should have the following at the end of
Then compile your netdata_ebpf.te file with the following commands to create a binary that loads the new policies:
Finally, you can load the new policy and start the Netdata Agent again:
Beginning with version 5.4, the Linux kernel has a feature called "lockdown," which may affect ebpf.plugin depending on how the kernel was compiled. The following table shows how the lockdown module impacts ebpf.plugin based on the selected options:
| Enforcing kernel lockdown | Enable lockdown LSM early in init | Default lockdown mode | Can |
If you or your distribution compiled the kernel with the last combination, your system cannot load shared libraries
required to run
The eBPF collector adds entries to the file /sys/kernel/debug/tracing/kprobe_events, and cleans them on exit, unless another process prevents it. If you need to clean the eBPF entries safely, you can manually run the script