BPF Updates 13
This is issue 13 of the regular newsletter around BPF written by Alexander Alemayhu. It summarizes ongoing development, presentations, videos and other information related to BPF and XDP. It is released roughly once a week.
The v4.15 merge window is open and LWN.net already has a summary of part 1 out, which contains a BPF section listing some of the new things:
BPF
The user-space bpftool utility can be used to examine and manipulate BPF programs and maps; see this man page for more information.
Hooks have been added to allow security modules to control access to BPF objects; see this changelog for more information.
A new BPF-based device controller has been added; it uses the version-2 control-group interface. Documentation for this feature is entirely absent, but one can look at the sample program added in this commit that uses it.
The highlights since last time
- New helper function bpf_getsockopt to retrieve socket options. It supports only TCP_CONGESTION for now. The new BPF_SOCK_OPS_BASE_RTT feature significantly improves TCP-NV.
- It is now possible to attach multiple programs to tracepoints / kprobes / uprobes. The programs run in sequence. With this change for tracepoints, one application no longer excludes others from attaching to the same call.
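For a feel of what the new socket-option helper returns, here is the userspace equivalent of reading TCP_CONGESTION, sketched in Python. This is only an analogy and not the BPF helper itself: it uses the ordinary `getsockopt` call on a socket the process owns, whereas the helper is called from inside a sock_ops BPF program. The function name `congestion_control` is made up for illustration, and the sketch assumes Linux.

```python
import socket


def congestion_control(sock: socket.socket) -> str:
    """Return the name of the TCP congestion control algorithm used by sock."""
    # The kernel returns a NUL-padded name; 16 bytes matches TCP_CA_NAME_MAX.
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
    # Keep only the bytes before the first NUL terminator.
    return raw.split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    # An unconnected TCP socket is enough to query the default algorithm.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(congestion_control(s))
```

The printed name depends on the system's configuration, so no particular output should be expected.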
More interesting topics
- New helper function bpf_override_function under discussion to allow for error injection via kprobes.
- BPF runtime finally gets a FAQ section in the kernel's documentation directory.
- bpftool gets support for dumping JSON.
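JSON output makes bpftool much easier to script. As a rough sketch, the Python below counts loaded programs per type from text shaped like the output of `bpftool --json prog show`. The assumption here is that the output is an array of objects with a `type` field; the `SAMPLE` data and the function name `count_programs_by_type` are hypothetical and only serve the illustration.

```python
import json
from collections import Counter


def count_programs_by_type(bpftool_json: str) -> Counter:
    """Count BPF programs per type in a JSON array of program objects."""
    progs = json.loads(bpftool_json)
    # Fall back to "unknown" if an entry lacks the assumed "type" field.
    return Counter(p.get("type", "unknown") for p in progs)


# Hypothetical sample data, not captured from a real system.
SAMPLE = '[{"id": 1, "type": "xdp"}, {"id": 2, "type": "kprobe"}, {"id": 3, "type": "xdp"}]'

if __name__ == "__main__":
    for prog_type, n in count_programs_by_type(SAMPLE).most_common():
        print(prog_type, n)
```

In practice the input string would come from running bpftool itself, for example by capturing its standard output with `subprocess.run`.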
Presentations
Cilium - Kernel Native Security & DDOS Mitigation for Microservices with BPF
The slides of Cynthia's talk were already in the last issue. Docker has since published the recording as well, and it is definitely worth watching. Fun talk on Cilium, BPF, and Kafka.
Linux Networking Development
Focusing on development areas in the kernel. Also some advice in there for aspiring kernel developers. ;-)
XDP: The Future of Networks
Great introduction to BPF and XDP. With some myth busting and potential improvements.
A Gentle Introduction to [e]BPF - Michael Schubert, Kinvolk GmbH
Good introduction to BPF. Also nice that it shows the structures, links to some tools and verifier.
LISA 17 - Fast and Safe Production Monitoring of JVM Applications with BPF Magic
Focusing on the tracing case with Java but the approaches could still be applied to other environments.
LISA17 Container Performance Analysis
Goes through some of the tools used at Netflix and a lot of other smaller tools for tracing. The emphasis on identifying the bottlenecks sounds good.
LISA17 Linux Performance Monitoring With BPF
Lab session for tracing tools with BCC. This is useful for learning about tracing on Linux. It also answers basic questions such as what tracepoints, kprobes, and uprobes are, and what some of the limitations of dynamic tracing are. Looks like a lot of fun.
XDP – eXpress Data Path An in-kernel network fast-path A technology overview
Great introduction to BPF and XDP. Also explains the problems and why it is needed.
In case you missed it
Reports from Netconf and Netdev
LWN.net coverage of the discussions from netconf and all the talks from netdev. A lot of interesting BPF topics in there. Check it out!
security things in Linux v4.14
The security summary contains sections on eBPF JIT 32-bit ARM support and seccomp improvements.
SystemTap 3.2 release
SystemTap now has an experimental eBPF backend.
Another attempt to address the tracepoint ABI problem
Steven Rostedt proposes a different scheme where tracepoints are placed but no trace event is defined. A kernel module would then have to be loaded from userspace, and there would be no need to add this to the kernel ABI. Will moving the ABI to a module really solve this problem?
Using eBPF and XDP in Suricata
LWN.net coverage of Eric Leblond's talk from Kernel Recipes. The recording was already in the last issue.
Projects
awesome-ebpf
A curated list of awesome projects related to eBPF
k8s-snowflake
Configs and scripts for bootstrapping an opinionated Kubernetes cluster anywhere.
libseccomp
The libseccomp library provides an easy to use, platform independent, interface to the Linux Kernel's syscall filtering mechanism. The libseccomp API is designed to abstract away the underlying BPF based syscall filter language and present a more conventional function-call based filtering interface that should be familiar to, and easily adopted by, application developers.
cbpf-rust
Userspace cBPF interpreter and cBPF to eBPF converter
vltrace
vltrace is a syscall tracing tool which utilizes eBPF - an efficient tracing feature of the Linux kernel.
Random cool note
We blew way past 7Mpps with UDP+XDP. I’m sure you know that already though :)
Patches
Please note that netdev and llvm-commits receive a lot of patches and the list below is not meant to be comprehensive.
LLVM
- Alexei Starovoitov, [llvm] r318615 - [bpf] remove unused variable
- Alexei Starovoitov, [llvm] r318614 - [bpf] allow direct and indirect calls
- Yonghong Song, [llvm] r318358 - bpf: enable llvm-objdump to print out symbolized jmp target
- Yonghong Song, [llvm] r318442 - bpf: print backward branch target properly
- Yonghong Song, [llvm] r316469 - bpf: fix a bug in bpf-isel trunc-op optimization
- Yonghong Song, [llvm] r316519 - bpf: fix an uninitialized variable issue
- Yonghong Song, [llvm] r316481 - bpf: fix a bug in trunc-op optimization
netdev
- Jakub Kicinski, [PATCH net 00/10] bpf: offload: check netdev pointer in the drivers and namespace trouble
- [PATCH net 01/10] bpf: offload: add comment warning developers about double destroy
- [PATCH net 02/10] bpf: offload: limit offload to cls_bpf and xdp programs only
- [PATCH net 03/10] bpf: offload: rename the ifindex field
- [PATCH net 04/10] bpf: offload: move offload device validation out to the drivers
- [PATCH net 05/10] net: xdp: don't allow device-bound programs in driver mode
- [PATCH net 06/10] bpf: turn bpf_prog_get_type() into a wrapper
- [PATCH net 07/10] bpf: offload: ignore namespace moves
- [PATCH net 08/10] bpftool: revert printing program device bound info
- [PATCH net 09/10] bpf: revert report offload info to user space
- [PATCH net 10/10] bpf: make bpf_prog_offload_verifier_prep() static inline
- Song Liu, [RFC v2 0/6] enable creating [k,u]probe with perf_event_open
- [RFC v2 1/6] perf: Add new type PERF_TYPE_PROBE
- [RFC v2 2/6] perf: copy new perf_event.h to tools/include/uapi
- [RFC v2 3/6] perf: implement kprobe support to PERF_TYPE_PROBE
- [RFC v2 4/6] perf: implement uprobe support to PERF_TYPE_PROBE
- [RFC v2 5/6] bpf: add option for bpf_load.c to use PERF_TYPE_PROBE
- [RFC v2 6/6] bpf: add new test test_many_kprobe
- [RFC] bcc: Try use new API to create [k,u]probe with perf_event_open
- [RFC] perf_event_open.2: add new type PERF_TYPE_PROBE
- Yonghong Song, [PATCH net-next 0/3 v3] bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semantics
- Lawrence Brakmo, [PATCH net-next v2 0/6] bpf: Fix bugs in sock_ops samples
- [PATCH net-next v2 1/6] bpf: Fix tcp_synrto_kern.c sample program
- [PATCH net-next v2 2/6] bpf: Fix tcp_rwnd_kern.c sample program
- [PATCH net-next v2 3/6] bpf: Fix tcp_bufs_kern.c sample program
- [PATCH net-next v2 4/6] bpf: Fix tcp_cong_kern.c sample program
- [PATCH net-next v2 5/6] bpf: Fix tcp_iw_kern.c sample program
- [PATCH net-next v2 6/6] bpf: Fix tcp_clamp_kern.c sample program
- Prashant Bhole, [PATCH net-next V4 0/3] tools: bpftool: show filenames of pinned objects
- Jakub Kicinski, [PATCH net-next v2 00/15] bpf: add offload as a first class citizen
- [PATCH net-next v2 01/15] net: bpf: rename ndo_xdp to ndo_bpf
- [PATCH net-next v2 02/15] bpf: offload: add infrastructure for loading programs for a specific netdev
- [PATCH net-next v2 03/15] bpf: report offload info to user space
- [PATCH net-next v2 04/15] bpftool: print program device bound info
- [PATCH net-next v2 05/15] xdp: allow attaching programs loaded for specific device
- [PATCH net-next v2 06/15] cls_bpf: allow attaching programs loaded for specific device
- [PATCH net-next v2 07/15] nfp: bpf: drop support for cls_bpf with legacy actions
- [PATCH net-next v2 08/15] nfp: bpf: remove the register renumbering leftovers
- [PATCH net-next v2 09/15] nfp: bpf: remove unnecessary include of nfp_net.h
- [PATCH net-next v2 10/15] nfp: bpf: refactor offload logic
- [PATCH net-next v2 11/15] nfp: bpf: require seamless reload for program replace
- [PATCH net-next v2 12/15] nfp: bpf: move program prepare and free into offload.c
- [PATCH net-next v2 13/15] nfp: bpf: move translation prepare to offload.c
- [PATCH net-next v2 14/15] nfp: bpf: move to new BPF program offload infrastructure
- [PATCH net-next v2 15/15] bpf: remove old offload/analyzer
- Christina Jacob, [PATCH v4 0/1] XDP program for ip forward
- Dan Carpenter, [PATCH net-next] xdp: sample: Missing curly braces in read_route()
- Josef Bacik, [PATCH 0/4] [v6] Add the ability to do BPF directed error injection
- Lawrence Brakmo, [PATCH net-next] bpf: Rename tcp_bbf.readme to tcp_bpf.readme
- Sandipan Das, [RFC PATCH] bpf: Add helpers to read useful task_struct members
- Roman Gushchin, [PATCH v3 net-next 0/5] eBPF-based device cgroup controller
- [PATCH v3 net-next 1/5] device_cgroup: add DEVCG_ prefix to ACC_* and DEV_* constants
- [PATCH v3 net-next 2/5] device_cgroup: prepare code for bpf-based device controller
- [PATCH v3 net-next 3/5] bpf, cgroup: implement eBPF-based device controller for cgroup v2
- [PATCH v3 net-next 4/5] bpf: move cgroup_helpers from samples/bpf/ to tools/testing/selftesting/bpf/
- [PATCH v3 net-next 5/5] selftests/bpf: add a test for device cgroup controller
- Jakub Kicinski, [PATCH net-next] tools: bpftool: move p_err() and p_info() from main.h to common.c
- Colin King, [PATCH net-next] net: sched: cls_bpf: use bitwise & rather than logical && on gen_flags
- Craig Gallek, [PATCH] [net-next v2] bpf: fix verifier NULL pointer dereference
- Arnd Bergmann, [PATCH 1/2] [net-next] bpf: fix link error without CONFIG_NET
- Eric Dumazet, [PATCH net] bpf: fix lockdep splat
- Prashant Bhole, tools: bpf: handle long path in jit disasm
- Jakub Kicinski, [PATCH net-next 0/8] nfp: TC block fixes, app fallback and dev_alloc()
- [PATCH net-next 1/8] nfp: flower: app should use struct nfp_repr
- [PATCH net-next 2/8] nfp: flower: vxlan - ensure no sleep in atomic context
- [PATCH net-next 3/8] nfp: bpf: reject TC offload if XDP loaded
- [PATCH net-next 4/8] nfp: reorganize the app table
- [PATCH net-next 5/8] nfp: bpf: fall back to core NIC app if BPF not selected
- [PATCH net-next 6/8] nfp: switch to dev_alloc_page()
- [PATCH net-next 7/8] nfp: use a counter instead of log message for allocation failures
- [PATCH net-next 8/8] nfp: improve defines for constants in ethtool
- Daniel Borkmann, [PATCH net-next 0/3] BPF range marking improvements for meta data
- Jakub Kicinski, [PATCH net-next] security: bpf: replace include of linux/bpf.h with forward declarations
- Jakub Kicinski, [PATCH net-next 0/2] nfp: bpf: rename ALU_OP_NEG and support BPF_NEG
- Jesper Dangaard Brouer, [net-next PATCH] bpf: cpumap micro-optimization in cpu_map_enqueue
- Alexei Starovoitov, [PATCH net-next] bpf: fix verifier memory leaks
- John Fastabend, [net PATCH] bpf: remove SK_REDIRECT from UAPI
- Alexei Starovoitov, [PATCH v2 net-next] bpf: reduce verifier memory consumption
- Jakub Kicinski, [RFC] net: dummy: add BPF offload callbacks for test purposes
- Björn Töpel, [RFC PATCH 00/14] Introducing AF_PACKET V4 support
- [RFC PATCH 01/14] packet: introduce AF_PACKET V4 userspace API
- [RFC PATCH 02/14] packet: implement PACKET_MEMREG setsockopt
- [RFC PATCH 03/14] packet: enable AF_PACKET V4 rings
- [RFC PATCH 04/14] packet: enable Rx for AF_PACKET V4
- [RFC PATCH 05/14] packet: enable Tx support for AF_PACKET V4
- [RFC PATCH 06/14] netdevice: add AF_PACKET V4 zerocopy ops
- [RFC PATCH 07/14] packet: wire up zerocopy for AF_PACKET V4
- [RFC PATCH 08/14] i40e: AF_PACKET V4 ndo_tp4_zerocopy Rx support
- [RFC PATCH 09/14] i40e: AF_PACKET V4 ndo_tp4_zerocopy Tx support
- [RFC PATCH 10/14] samples/tpacket4: added tpbench
- [RFC PATCH 11/14] veth: added support for PACKET_ZEROCOPY
- [RFC PATCH 12/14] samples/tpacket4: added veth support
- [RFC PATCH 13/14] i40e: added XDP support for TP4 enabled queue pairs
- [RFC PATCH 14/14] xdp: introducing XDP_PASS_TO_KERNEL for PACKET_ZEROCOPY use
- Jason Wang, [PATCH net-next V2 0/3] support changing steering policies in tuntap
- Alexei Starovoitov, [PATCH net-next] bpf: document answers to common questions about BPF
- Alexei Starovoitov, [PATCH net-next] bpf: reduce verifier memory consumption
- Yonghong Song, [PATCH net-next] bpf: avoid rcu_dereference inside bpf_event_mutex lock region
- Alexei Starovoitov, [PATCH net-next] selftests/bpf: remove useless bpf_trace_printk
- Tushar Dave, [PATCH net-next] samples/bpf: adjust rlimit RLIMIT_MEMLOCK for xdp_redirect_map
- Tushar Dave, [PATCH net-next] samples/bpf: adjust rlimit RLIMIT_MEMLOCK for xdp1
- John Fastabend, [net PATCH 0/2] sockmap fixes
- Quentin Monnet, [PATCH net-next] tools: bpftool: add bash completion for bpftool
- Gianluca Borello, [PATCH net-next] bpf: remove tail_call and get_stackid helper declarations from bpf.h
- Chenbo Feng, [PATCH net-next v7 0/5] bpf: security: New file mode and LSM hooks for eBPF object permission control
- [PATCH net-next v7 1/5] bpf: Add file mode configuration into bpf maps
- [PATCH net-next v7 2/5] bpf: Add tests for eBPF file mode
- [PATCH net-next v7 3/5] security: bpf: Add LSM hooks for bpf object related syscall
- [PATCH net-next v7 4/5] selinux: bpf: Add selinux check for eBPF syscall operations
- [PATCH net-next v7 5/5] selinux: bpf: Add addtional check for bpf object file receive
- Jakub Kicinski, [PATCH net-next 00/12] tools: bpftool: Add JSON output to bpftool
- [PATCH net-next 01/12] tools: bpftool: copy JSON writer from iproute2 repository
- [PATCH net-next 02/12] tools: bpftool: add option parsing to bpftool, --help and --version
- [PATCH net-next 03/12] tools: bpftool: introduce --json and --pretty options
- [PATCH net-next 04/12] tools: bpftool: add JSON output for bpftool prog show * command
- [PATCH net-next 05/12] tools: bpftool: add JSON output for bpftool prog dump jited * command
- [PATCH net-next 06/12] tools: bpftool: add JSON output for bpftool prog dump xlated * command
- [PATCH net-next 07/12] tools: bpftool: add JSON output for bpftool map * commands
- [PATCH net-next 08/12] tools: bpftool: add JSON output for bpftool batch file FILE command
- [PATCH net-next 09/12] tools: bpftool: turn err() and info() macros into functions
- [PATCH net-next 10/12] tools: bpftool: provide JSON output for all possible commands
- [PATCH net-next 11/12] tools: bpftool: add cosmetic changes for the manual pages
- [PATCH net-next 12/12] tools: bpftool: update documentation for --json and --pretty usage
- Jakub Kicinski, [PATCH net-next 0/8] tools: bpftool: add a "version" command, and fix several items
- [PATCH net-next 1/8] tools: bpftool: add pointer to file argument to print_hex()
- [PATCH net-next 2/8] tools: bpftool: fix return value when all eBPF programs have been shown
- [PATCH net-next 3/8] tools: bpftool: use err() instead of info() if there are too many insns
- [PATCH net-next 4/8] tools: bpftool: add bpftool prog help as real command i.r.t exit code
- [PATCH net-next 5/8] tools: bpftool: print only one error message on byte parsing failure
- [PATCH net-next 6/8] tools: bpftool: print all relevant byte opcodes for "load double word"
- [PATCH net-next 7/8] tools: bpftool: show that opcodes or file FILE should be exclusive
- [PATCH net-next 8/8] tools: bpftool: add a command to display bpftool version
- Lawrence Brakmo, [PATCH net-next 0/5] bpf: add support for BASE_RTT
- [PATCH net-next 1/5] bpf: add support for BPF_SOCK_OPS_BASE_RTT
- [PATCH net-next 2/5] bpf: Adding helper function bpf_getsockops
- [PATCH net-next 3/5] bpf: Add BPF_SOCKET_OPS_BASE_RTT support to tcp_nv
- [PATCH net-next 4/5] bpf: sample BPF_SOCKET_OPS_BASE_RTT program
- [PATCH net-next 5/5] bpf: create samples/bpf/tcp_bpf.readme
- John Fastabend, [net PATCH 0/5] sockmap fixes for net
- [net PATCH 1/5] bpf: enforce TCP only support for sockmap
- [net PATCH 2/5] bpf: avoid preempt enable/disable in sockmap using tcp_skb_cb region
- [net PATCH 3/5] bpf: remove mark access for SK_SKB program types
- [net PATCH 4/5] bpf: require CAP_NET_ADMIN when using sockmap maps
- [net PATCH 5/5] bpf: require CAP_NET_ADMIN when using devmap
- Daniel Borkmann, [PATCH net 0/3] Two BPF fixes for range marking
- Jesper Dangaard Brouer, [net-next PATCH] bpf: cpumap fix potential lost wake-up problem
- Jakub Kicinski, [PATCH net-next 0/9] nfp: bpf: stack support in offload
- [PATCH net-next 1/9] nfp: bpf: add helper for emitting nops
- [PATCH net-next 2/9] nfp: bpf: refactor nfp_bpf_check_ptr()
- [PATCH net-next 3/9] nfp: bpf: add stack write support
- [PATCH net-next 4/9] nfp: bpf: add stack read support
- [PATCH net-next 5/9] nfp: bpf: optimize the RMW for stack accesses
- [PATCH net-next 6/9] nfp: bpf: allow stack accesses via modified stack registers
- [PATCH net-next 7/9] nfp: bpf: support accessing the stack beyond 64 bytes
- [PATCH net-next 8/9] nfp: bpf: support stack accesses via non-constant pointers
- [PATCH net-next 9/9] nfp: bpf: optimize mov64 a little
- Yonghong Song, [PATCH net-next v3 0/3] bpf: permit multiple bpf attachments for a single perf tracepoint event
- Quentin Monnet, [PATCH net-next] tools: bpftool: try to mount bpffs if required for pinning objects
- John Fastabend, [net PATCH] bpf: devmap fix arithmetic overflow in bitmap_size calculation
- Alexei Starovoitov, [PATCH v2 net-next] selftests/bpf: fix broken build of test_maps