summaryrefslogtreecommitdiff
path: root/tools/lib/bpf
AgeCommit message (Collapse)AuthorFilesLines
2023-10-25libbpf: Add link-based API for netkitDaniel Borkmann5-0/+76
This adds bpf_program__attach_netkit() API to libbpf. Overall it is very similar to tcx. The API looks as following: LIBBPF_API struct bpf_link * bpf_program__attach_netkit(const struct bpf_program *prog, int ifindex, const struct bpf_netkit_opts *opts); The struct bpf_netkit_opts is done in similar way as struct bpf_tcx_opts for supporting bpf_mprog control parameters. The attach location for the primary and peer device is derived from the program section "netkit/primary" and "netkit/peer", respectively. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20231024214904.29825-4-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-17libbpf: Don't assume SHT_GNU_verdef presence for SHT_GNU_versym sectionAndrii Nakryiko1-6/+10
Fix too eager assumption that SHT_GNU_verdef ELF section is going to be present whenever binary has SHT_GNU_versym section. It seems like either SHT_GNU_verdef or SHT_GNU_verneed can be used, so failing on missing SHT_GNU_verdef actually breaks use cases in production. One specific reported issue, which was used to manually test this fix, was trying to attach to `readline` function in BASH binary. Fixes: bb7fa09399b9 ("libbpf: Support symbol versioning for uprobe") Reported-by: Liam Wisehart <liamwisehart@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Manu Bretelle <chantr4@gmail.com> Reviewed-by: Fangrui Song <maskray@google.com> Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Link: https://lore.kernel.org/bpf/20231016182840.4033346-1-andrii@kernel.org
2023-10-12libbpf: Add support for cgroup unix socket address hooksDaan De Meyer1-0/+10
Add the necessary plumbing to hook up the new cgroup unix sockaddr hooks into libbpf. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-6-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-04libbpf: Fix syscall access arguments on riscvAlexandre Ghiti1-2/+0
Since commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers"), riscv selects ARCH_HAS_SYSCALL_WRAPPER so let's use the generic implementation of PT_REGS_SYSCALL_REGS(). Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-2-bjorn@kernel.org
2023-09-30libbpf: Allow Golang symbols in uprobe secdefHengqi Chen1-6/+16
Golang symbols in ELF files are different from C/C++ which contains special characters like '*', '(' and ')'. With generics, things get more complicated, there are symbols like: github.com/cilium/ebpf/internal.(*Deque[go.shape.interface { Format(fmt.State, int32); TypeName() string;github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type}]).Grow Matching such symbols using `%m[^\n]` in sscanf, this excludes newline which typically does not appear in ELF symbols. This should work in most use-cases and also work for unicode letters in identifiers. If newline do show up in ELF symbols, users can still attach to such symbol by specifying bpf_uprobe_opts::func_name. A working example can be found at this repo ([0]). [0]: https://github.com/chenhengqi/libbpf-go-symbols Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230929155954.92448-1-hengqi.chen@gmail.com
2023-09-26libbpf: Add ring__consumeMartin Kelly3-0/+22
Add ring__consume to consume a single ringbuffer, analogous to ring_buffer__consume. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-14-martin.kelly@crowdstrike.com
2023-09-26libbpf: Add ring__map_fdMartin Kelly3-0/+15
Add ring__map_fd to get the file descriptor underlying a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-12-martin.kelly@crowdstrike.com
2023-09-26libbpf: Add ring__sizeMartin Kelly3-0/+16
Add ring__size to get the total size of a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-10-martin.kelly@crowdstrike.com
2023-09-26libbpf: Add ring__avail_data_sizeMartin Kelly3-0/+21
Add ring__avail_data_size for querying the currently available data in the ringbuffer, similar to the BPF_RB_AVAIL_DATA flag in bpf_ringbuf_query. This is racy during ongoing operations but is still useful for overall information on how a ringbuffer is behaving. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-8-martin.kelly@crowdstrike.com
2023-09-26libbpf: Add ring__producer_pos, ring__consumer_posMartin Kelly3-0/+34
Add APIs to get the producer and consumer position for a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-6-martin.kelly@crowdstrike.com
2023-09-26libbpf: Add ring_buffer__ringMartin Kelly3-0/+24
Add a new function ring_buffer__ring, which exposes struct ring * to the user, representing a single ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-4-martin.kelly@crowdstrike.com
2023-09-26libbpf: Switch rings to array of pointersMartin Kelly1-8/+12
Switch rb->rings to be an array of pointers instead of a contiguous block. This allows for each ring pointer to be stable after ring_buffer__add is called, which allows us to expose struct ring * to the user without gotchas. Without this change, the realloc in ring_buffer__add could invalidate a struct ring *, making it unsafe to give to the user. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-3-martin.kelly@crowdstrike.com
2023-09-26libbpf: Refactor cleanup in ring_buffer__addMartin Kelly1-6/+9
Refactor the cleanup code in ring_buffer__add to use a unified err_out label. This reduces code duplication, as well as plugging a potential leak if mmap_sz != (__u64)(size_t)mmap_sz (currently this would miss unmapping tmp because ringbuf_unmap_ring isn't called). Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-2-martin.kelly@crowdstrike.com
2023-09-23libbpf: Support symbol versioning for uprobeHengqi Chen2-12/+124
In current implementation, we assume that symbol found in .dynsym section would have a version suffix and use it to compare with symbol user supplied. According to the spec ([0]), this assumption is incorrect, the version info of dynamic symbols are stored in .gnu.version and .gnu.version_d sections of ELF objects. For example: $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock 000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5 000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34 000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5 $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock 706: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5 2568: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34 2571: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5 In this case, specify pthread_rwlock_wrlock@@GLIBC_2.34 or pthread_rwlock_wrlock@GLIBC_2.2.5 in bpf_uprobe_opts::func_name won't work. Because the qualified name does NOT match `pthread_rwlock_wrlock` (without version suffix) in .dynsym sections. This commit implements the symbol versioning for dynsym and allows user to specify symbol in the following forms: - func - func@LIB_VERSION - func@@LIB_VERSION In case of symbol conflicts, error out and users should resolve it by specifying a qualified name. [0]: https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230918024813.237475-3-hengqi.chen@gmail.com
2023-09-23libbpf: Resolve symbol conflicts at the same offset for uprobeHengqi Chen1-1/+4
Dynamic symbols in shared library may have the same name, for example: $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock 000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5 000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34 000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5 $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock 706: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5 2568: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34 2571: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5 Currently, users can't attach a uprobe to pthread_rwlock_wrlock because there are two symbols named pthread_rwlock_wrlock and both are global bind. And libbpf considers it as a conflict. Since both of them are at the same offset we could accept one of them harmlessly. Note that we already does this in elf_resolve_syms_offsets. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230918024813.237475-2-hengqi.chen@gmail.com
2023-09-16libbpf: Add support for custom exception callbacksKumar Kartikeya Dwivedi1-5/+109
Add support to libbpf to append exception callbacks when loading a program. The exception callback is found by discovering the declaration tag 'exception_callback:<value>' and finding the callback in the value of the tag. The process is done in two steps. First, for each main program, the bpf_object__sanitize_and_load_btf function finds and marks its corresponding exception callback as defined by the declaration tag on it. Second, bpf_object__reloc_code is modified to append the indicated exception callback at the end of the instruction iteration (since exception callback will never be appended in that loop, as it is not directly referenced). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230912233214.1518551-16-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-16libbpf: Refactor bpf_object__reloc_codeKumar Kartikeya Dwivedi1-19/+33
Refactor bpf_object__append_subprog_code out of bpf_object__reloc_code to be able to reuse it to append subprog related code for the exception callback to the main program. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230912233214.1518551-15-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-08libbpf: Add __percpu_kptr macro definitionYonghong Song1-0/+1
Add __percpu_kptr macro definition in bpf_helpers.h. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152800.1998492-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-08libbpf: Add basic BTF sanity validationAndrii Nakryiko1-0/+160
Implement a simple and straightforward BTF sanity check when parsing BTF data. Right now it's very basic and just validates that all the string offsets and type IDs are within valid range. For FUNC we also check that it points to FUNC_PROTO kinds. Even with such simple checks it fixes a bunch of crashes found by OSS fuzzer ([0]-[5]) and will allow fuzzer to make further progress. Some other invariants will be checked in follow up patches (like ensuring there is no infinite type loops), but this seems like a good start already. Adding FUNC -> FUNC_PROTO check revealed that one of selftests has a problem with FUNC pointing to VAR instead, so fix it up in the same commit. [0] https://github.com/libbpf/libbpf/issues/482 [1] https://github.com/libbpf/libbpf/issues/483 [2] https://github.com/libbpf/libbpf/issues/485 [3] https://github.com/libbpf/libbpf/issues/613 [4] https://github.com/libbpf/libbpf/issues/618 [5] https://github.com/libbpf/libbpf/issues/619 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Song Liu <song@kernel.org> Closes: https://github.com/libbpf/libbpf/issues/617 Link: https://lore.kernel.org/bpf/20230825202152.1813394-1-andrii@kernel.org
2023-08-24libbpf: fix signedness determination in CO-RE relo handling logicAndrii Nakryiko1-1/+1
Extracting btf_int_encoding() is only meaningful for BTF_KIND_INT, so we need to check that first before inferring signedness. Closes: https://github.com/libbpf/libbpf/issues/704 Reported-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230824000016.2658017-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-24libbpf: Add bpf_object__unpin()Daniel Xu3-0/+17
For bpf_object__pin_programs() there is bpf_object__unpin_programs(). Likewise bpf_object__unpin_maps() for bpf_object__pin_maps(). But no bpf_object__unpin() for bpf_object__pin(). Adding the former adds symmetry to the API. It's also convenient for cleanup in application code. It's an API I would've used if it was available for a repro I was writing earlier. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/b2f9d41da4a350281a0b53a804d11b68327e14e5.1692832478.git.dxu@dxuuu.xyz
2023-08-23libbpf: Free btf_vmlinux when closing bpf_objectHao Luo1-0/+1
I hit a memory leak when testing bpf_program__set_attach_target(). Basically, set_attach_target() may allocate btf_vmlinux, for example, when setting attach target for bpf_iter programs. But btf_vmlinux is freed only in bpf_object_load(), which means if we only open bpf object but not load it, setting attach target may leak btf_vmlinux. So let's free btf_vmlinux in bpf_object__close() anyway. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230822193840.1509809-1-haoluo@google.com
2023-08-22libbpf: Add uprobe multi link support to bpf_program__attach_usdtJiri Olsa2-17/+82
Adding support for usdt_manager_attach_usdt to use uprobe_multi link to attach to usdt probes. The uprobe_multi support is detected before the usdt program is loaded and its expected_attach_type is set accordingly. If uprobe_multi support is detected the usdt_manager_attach_usdt gathers uprobes info and calls bpf_program__attach_uprobe to create all needed uprobes. If uprobe_multi support is not detected the old behaviour stays. Also adding usdt.s program section for sleepable usdt probes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-18-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add uprobe multi link detectionJiri Olsa2-0/+38
Adding uprobe-multi link detection. It will be used later in bpf_program__attach_usdt function to check and use uprobe_multi link over standard uprobe links. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-17-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add support for u[ret]probe.multi[.s] program sectionsJiri Olsa1-0/+36
Adding support for several uprobe_multi program sections to allow auto attach of multi_uprobe programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-16-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add bpf_program__attach_uprobe_multi functionJiri Olsa3-0/+166
Adding bpf_program__attach_uprobe_multi function that allows to attach multiple uprobes with uprobe_multi link. The user can specify uprobes with direct arguments: binary_path/func_pattern/pid or with struct bpf_uprobe_multi_opts opts argument fields: const char **syms; const unsigned long *offsets; const unsigned long *ref_ctr_offsets; const __u64 *cookies; User can specify 2 mutually exclusive set of inputs: 1) use only path/func_pattern/pid arguments 2) use path/pid with allowed combinations of: syms/offsets/ref_ctr_offsets/cookies/cnt - syms and offsets are mutually exclusive - ref_ctr_offsets and cookies are optional Any other usage results in error. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-15-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add bpf_link_create support for multi uprobesJiri Olsa2-1/+21
Adding new uprobe_multi struct to bpf_link_create_opts object to pass multiple uprobe data to link_create attr uapi. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-14-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add elf_resolve_pattern_offsets functionJiri Olsa3-1/+67
Adding elf_resolve_pattern_offsets function that looks up offsets for symbols specified by pattern argument. The 'pattern' argument allows wildcards (*?' supported). Offsets are returned in allocated array together with its size and needs to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-13-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add elf_resolve_syms_offsets functionJiri Olsa2-0/+112
Adding elf_resolve_syms_offsets function that looks up offsets for symbols specified in syms array argument. Offsets are returned in allocated array with the 'cnt' size, that needs to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-12-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add elf symbol iteratorJiri Olsa1-64/+115
Adding elf symbol iterator object (and some functions) that follow open-coded iterator pattern and some functions to ease up iterating elf object symbols. The idea is to iterate single symbol section with: struct elf_sym_iter iter; struct elf_sym *sym; if (elf_sym_iter_new(&iter, elf, binary_path, SHT_DYNSYM)) goto error; while ((sym = elf_sym_iter_next(&iter))) { ... } I considered opening the elf inside the iterator and iterate all symbol sections, but then it gets more complicated wrt user checks for when the next section is processed. Plus side is the we don't need 'exit' function, because caller/user is in charge of that. The returned iterated symbol object from elf_sym_iter_next function is placed inside the struct elf_sym_iter, so no extra allocation or argument is needed. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-11-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add elf_open/elf_close functionsJiri Olsa3-42/+57
Adding elf_open/elf_close functions and using it in elf_find_func_offset_from_file function. It will be used in following changes to save some common code. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-10-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Move elf_find_func_offset* functions to elf objectJiri Olsa4-186/+202
Adding new elf object that will contain elf related functions. There's no functional change. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-9-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add uprobe_multi attach type and link namesJiri Olsa1-0/+2
Adding new uprobe_multi attach type and link names, so the functions can resolve the new values. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-8-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-18libbpf: Support triple-underscore flavors for kfunc relocationDave Marchevsky1-1/+19
The function signature of kfuncs can change at any time due to their intentional lack of stability guarantees. As kfuncs become more widely used, BPF program writers will need facilities to support calling different versions of a kfunc from a single BPF object. Consider this simplified example based on a real scenario we ran into at Meta: /* initial kfunc signature */ int some_kfunc(void *ptr) /* Oops, we need to add some flag to modify behavior. No problem, change the kfunc. flags = 0 retains original behavior */ int some_kfunc(void *ptr, long flags) If the initial version of the kfunc is deployed on some portion of the fleet and the new version on the rest, a fleetwide service that uses some_kfunc will currently need to load different BPF programs depending on which some_kfunc is available. Luckily CO-RE provides a facility to solve a very similar problem, struct definition changes, by allowing program writers to declare my_struct___old and my_struct___new, with ___suffix being considered a 'flavor' of the non-suffixed name and being ignored by bpf_core_type_exists and similar calls. This patch extends the 'flavor' facility to the kfunc extern relocation process. BPF program writers can now declare extern int some_kfunc___old(void *ptr) extern int some_kfunc___new(void *ptr, int flags) then test which version of the kfunc exists with bpf_ksym_exists. Relocation and verifier's dead code elimination will work in concert as expected, allowing this pattern: if (bpf_ksym_exists(some_kfunc___old)) some_kfunc___old(ptr); else some_kfunc___new(ptr, 0); Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Vernet <void@manifault.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230817225353.2570845-1-davemarchevsky@fb.com
2023-08-14libbpf: Set close-on-exec flag on gzopenMarco Vedovati1-2/+2
Enable the close-on-exec flag when using gzopen. This is especially important for multithreaded programs making use of libbpf, where a fork + exec could race with libbpf library calls, potentially resulting in a file descriptor leaked to the new process. This got missed in 59842c5451fe ("libbpf: Ensure libbpf always opens files with O_CLOEXEC"). Fixes: 59842c5451fe ("libbpf: Ensure libbpf always opens files with O_CLOEXEC") Signed-off-by: Marco Vedovati <marco.vedovati@crowdstrike.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230810214350.106301-1-martin.kelly@crowdstrike.com
2023-08-05libbpf: Use local includes inside the librarySergey Kacheev2-3/+3
In our monrepo, we try to minimize special processing when importing (aka vendor) third-party source code. Ideally, we try to import directly from the repositories with the code without changing it, we try to stick to the source code dependency instead of the artifact dependency. In the current situation, a patch has to be made for libbpf to fix the includes in bpf headers so that they work directly from libbpf/src. Signed-off-by: Sergey Kacheev <s.kacheev@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/CAJVhQqUg6OKq6CpVJP5ng04Dg+z=igevPpmuxTqhsR3dKvd9+Q@mail.gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-02libbpf: fix typos in MakefileRandy Dunlap1-2/+2
Capitalize ABI (acronym) and fix spelling of "destination". Fixes: 706819495921 ("libbpf: Improve usability of libbpf Makefile") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Cc: Xin Liu <liuxin350@huawei.com> Link: https://lore.kernel.org/r/20230722065236.17010-1-rdunlap@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19libbpf: Add helper macro to clear opts structsDaniel Borkmann1-0/+16
Add a small and generic LIBBPF_OPTS_RESET() helper macros which clears an opts structure and reinitializes its .sz member to place the structure size. Additionally, the user can pass option-specific data to reinitialize via varargs. I found this very useful when developing selftests, but it is also generic enough as a macro next to the existing LIBBPF_OPTS() which hides the .sz initialization, too. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20230719140858.13224-6-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19libbpf: Add link-based API for tcxDaniel Borkmann5-11/+88
Implement tcx BPF link support for libbpf. The bpf_program__attach_fd() API has been refactored slightly in order to pass bpf_link_create_opts pointer as input. A new bpf_program__attach_tcx() has been added on top of this which allows for passing all relevant data via extensible struct bpf_tcx_opts. The program sections tcx/ingress and tcx/egress correspond to the hook locations for tc ingress and egress, respectively. For concrete usage examples, see the extensive selftests that have been developed as part of this series. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230719140858.13224-5-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19libbpf: Add opts-based attach/detach/query API for tcxDaniel Borkmann4-53/+159
Extend libbpf attach opts and add a new detach opts API so this can be used to add/remove fd-based tcx BPF programs. The old-style bpf_prog_detach() and bpf_prog_detach2() APIs are refactored to reuse the new bpf_prog_detach_opts() internally. The bpf_prog_query_opts() API got extended to be able to handle the new link_ids, link_attach_flags and revision fields. For concrete usage examples, see the extensive selftests that have been developed as part of this series. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230719140858.13224-4-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19xsk: add new netlink attribute dedicated for ZC max fragsMaciej Fijalkowski2-1/+7
Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will carry maximum fragments that underlying ZC driver is able to handle on TX side. It is going to be included in netlink response only when driver supports ZC. Any value higher than 1 implies multi-buffer ZC support on underlying device. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-11-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-11libbpf: Remove HASHMAP_INIT static initialization helperJohn Sanpe1-10/+0
Remove the wrong HASHMAP_INIT. It's not used anywhere in libbpf. Signed-off-by: John Sanpe <sanpeqf@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230711070712.2064144-1-sanpeqf@gmail.com
2023-07-11libbpf: Fix realloc API handling in zero-sized edge casesAndrii Nakryiko2-4/+16
realloc() and reallocarray() can either return NULL or a special non-NULL pointer, if their size argument is zero. This requires a bit more care to handle NULL-as-valid-result situation differently from NULL-as-error case. This has caused real issues before ([0]), and just recently bit again in production when performing bpf_program__attach_usdt(). This patch fixes 4 places that do or potentially could suffer from this mishandling of NULL, including the reported USDT-related one. There are many other places where realloc()/reallocarray() is used and NULL is always treated as an error value, but all those have guarantees that their size is always non-zero, so those spot don't need any extra handling. [0] d08ab82f59d5 ("libbpf: Fix double-free when linker processes empty sections") Fixes: 999783c8bbda ("libbpf: Wire up spec management and other arch-independent USDT logic") Fixes: b63b3c490eee ("libbpf: Add bpf_program__set_insns function") Fixes: 697f104db8a6 ("libbpf: Support custom SEC() handlers") Fixes: b12688267280 ("libbpf: Change the order of data and text relocations.") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230711024150.1566433-1-andrii@kernel.org
2023-07-09libbpf: only reset sec_def handler when necessaryAndrii Nakryiko1-8/+19
Don't reset recorded sec_def handler unconditionally on bpf_program__set_type(). There are two situations where this is wrong. First, if the program type didn't actually change. In that case original SEC handler should work just fine. Second, catch-all custom SEC handler is supposed to work with any BPF program type and SEC() annotation, so it also doesn't make sense to reset that. This patch fixes both issues. This was reported recently in the context of breaking perf tool, which uses custom catch-all handler for fancy BPF prologue generation logic. This patch should fix the issue. [0] https://lore.kernel.org/linux-perf-users/ab865e6d-06c5-078e-e404-7f90686db50d@amd.com/ Fixes: d6e6286a12e7 ("libbpf: disassociate section handler on explicit bpf_program__set_type() call") Reported-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230707231156.1711948-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-07libbpf: Use available_filter_functions_addrs with multi-kprobesJackie Liu1-1/+61
Now that kernel provides a new available_filter_functions_addrs file which can help us avoid the need to cross-validate available_filter_functions and kallsyms, we can improve efficiency of multi-attach kprobes. For example, on my device, the sample program [1] of start time: $ sudo ./funccount "tcp_*" before after 1.2s 1.0s [1]: https://github.com/JackieLiu1/ketones/tree/master/src/funccount Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230705091209.3803873-2-liu.yun@linux.dev
2023-07-07libbpf: Cross-join available_filter_functions and kallsyms for multi-kprobesJackie Liu1-13/+97
When using regular expression matching with "kprobe multi", it scans all the functions under "/proc/kallsyms" that can be matched. However, not all of them can be traced by kprobe.multi. If any one of the functions fails to be traced, it will result in the failure of all functions. The best approach is to filter out the functions that cannot be traced to ensure proper tracking of the functions. Closes: https://lore.kernel.org/oe-kbuild-all/202307030355.TdXOHklM-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Jiri Olsa <jolsa@kernel.org> Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230705091209.3803873-1-liu.yun@linux.dev
2023-06-30libbpf: Add netfilter link attach helperFlorian Westphal5-0/+72
Add new api function: bpf_program__attach_netfilter. It takes a bpf program (netfilter type), and a pointer to a option struct that contains the desired attachment (protocol family, priority, hook location, ...). It returns a pointer to a 'bpf_link' structure or NULL on error. Next patch adds new netfilter_basic test that uses this function to attach a program to a few pf/hook/priority combinations. v2: change name and use bpf_link_create. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/bpf/CAEf4BzZrmUv27AJp0dDxBDMY_B8e55-wLs8DUKK69vCWsCG_pQ@mail.gmail.com/ Link: https://lore.kernel.org/bpf/CAEf4BzZ69YgrQW7DHCJUT_X+GqMq_ZQQPBwopaJJVGFD5=d5Vg@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20230628152738.22765-2-fw@strlen.de
2023-06-30libbpf: Skip modules BTF loading when CAP_SYS_ADMIN is missingAndrea Terzolo1-0/+4
If during CO-RE relocations libbpf is not able to find the target type in the running kernel BTF, it searches for it in modules' BTF. The downside of this approach is that loading modules' BTF requires CAP_SYS_ADMIN and this prevents BPF applications from running with more granular capabilities (e.g. CAP_BPF) when they don't need to search types into modules' BTF. This patch skips by default modules' BTF loading phase when CAP_SYS_ADMIN is missing. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Co-developed-by: Federico Di Pierro <nierro92@gmail.com> Signed-off-by: Federico Di Pierro <nierro92@gmail.com> Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/CAGQdkDvYU_e=_NX+6DRkL_-TeH3p+QtsdZwHkmH0w3Fuzw0C4w@mail.gmail.com Link: https://lore.kernel.org/bpf/20230626093614.21270-1-andreaterzolo3@gmail.com
2023-06-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-1/+4
Cross-merge networking fixes after downstream PR. Conflicts: net/sched/sch_taprio.c d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping") dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()") net/ipv4/sysctl_net_ipv4.c e209fee4118f ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294") ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear") https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-06bpf: netfilter: Add BPF_NETFILTER bpf_attach_typeFlorian Westphal2-1/+4
Andrii Nakryiko writes: And we currently don't have an attach type for NETLINK BPF link. Thankfully it's not too late to add it. I see that link_create() in kernel/bpf/syscall.c just bypasses attach_type check. We shouldn't have done that. Instead we need to add BPF_NETLINK attach type to enum bpf_attach_type. And wire all that properly throughout the kernel and libbpf itself. This adds BPF_NETFILTER and uses it. This breaks uabi but this wasn't in any non-rc release yet, so it should be fine. v2: check link_attack prog type in link_create too Fixes: 84601d6ee68a ("bpf: add bpf_link support for BPF_NETFILTER programs") Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/CAEf4BzZ69YgrQW7DHCJUT_X+GqMq_ZQQPBwopaJJVGFD5=d5Vg@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20230605131445.32016-1-fw@strlen.de