summaryrefslogtreecommitdiff
path: root/tools/lib
AgeCommit message (Collapse)AuthorFilesLines
2024-03-22Merge tag 'kbuild-v6.9' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Generate a list of built DTB files (arch/*/boot/dts/dtbs-list) - Use more threads when building Debian packages in parallel - Fix warnings shown during the RPM kernel package uninstallation - Change OBJECT_FILES_NON_STANDARD_*.o etc. to take a relative path to Makefile - Support GCC's -fmin-function-alignment flag - Fix a null pointer dereference bug in modpost - Add the DTB support to the RPM package - Various fixes and cleanups in Kconfig * tag 'kbuild-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (67 commits) kconfig: tests: test dependency after shuffling choices kconfig: tests: add a test for randconfig with dependent choices kconfig: tests: support KCONFIG_SEED for the randconfig runner kbuild: rpm-pkg: add dtb files in kernel rpm kconfig: remove unneeded menu_is_visible() call in conf_write_defconfig() kconfig: check prompt for choice while parsing kconfig: lxdialog: remove unused dialog colors kconfig: lxdialog: fix button color for blackbg theme modpost: fix null pointer dereference kbuild: remove GCC's default -Wpacked-bitfield-compat flag kbuild: unexport abs_srctree and abs_objtree kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1 kconfig: remove named choice support kconfig: use linked list in get_symbol_str() to iterate over menus kconfig: link menus to a symbol kbuild: fix inconsistent indentation in top Makefile kbuild: Use -fmin-function-alignment when available alpha: merge two entries for CONFIG_ALPHA_GAMMA alpha: merge two entries for CONFIG_ALPHA_EV4 kbuild: change DTC_FLAGS_<basetarget>.o to take the path relative to $(obj) ...
2024-03-15Merge tag 'perf-tools-for-v6.9-2024-03-13' of ↵Linus Torvalds4-8/+18
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "perf stat: - Support new 'cluster' aggregation mode for shared resources depending on the hardware configuration: $ sudo perf stat -a --per-cluster -e cycles,instructions sleep 1 Performance counter stats for 'system wide': S0-D0-CLS0 2 85,051,822 cycles S0-D0-CLS0 2 73,909,908 instructions # 0.87 insn per cycle S0-D0-CLS2 2 93,365,918 cycles S0-D0-CLS2 2 83,006,158 instructions # 0.89 insn per cycle S0-D0-CLS4 2 104,157,523 cycles S0-D0-CLS4 2 53,234,396 instructions # 0.51 insn per cycle S0-D0-CLS6 2 65,891,079 cycles S0-D0-CLS6 2 41,478,273 instructions # 0.63 insn per cycle 1.002407989 seconds time elapsed - Various fixes and cleanups for event metrics including NaN handling perf script: - Use libcapstone if available to disassemble the instructions. This enables 'perf script -F disasm' and 'perf script --insn-trace=disasm' (for Intel-PT): $ perf script -F event,ip,disasm cycles:P: ffffffffa988d428 wrmsr cycles:P: ffffffffa9839d25 movq %rax, %r14 cycles:P: ffffffffa9cdcaf0 endbr64 cycles:P: ffffffffa988d428 wrmsr cycles:P: ffffffffa988d428 wrmsr cycles:P: ffffffffaa401f86 iretq cycles:P: ffffffffa99c4de5 movq 0x30(%rcx), %r8 cycles:P: ffffffffa988d428 wrmsr cycles:P: ffffffffaa401f86 iretq cycles:P: ffffffffa9907983 movl 0x68(%rbx), %eax cycles:P: ffffffffa988d428 wrmsr - Expose sample ID / stream ID to python scripts perf test: - Add more perf test cases from Redhat internal test suites. This time it adds the base infra and a few perf probe tests. More to come. :) - Add 'perf test -p' for parallel execution and fix some issues found by the parallel test - Support symbol test to print symbols in given (active) module: $ perf test -F -v Symbols --dso /lib/modules/$(uname -r)/kernel/fs/ext4/ext4.ko --- start --- Testing /lib/modules/6.5.13-1rodete2-amd64/kernel/fs/ext4/ext4.ko Overlapping symbols: 7a990-7a9a0 l __pfx_ext4_exit_fs 7a990-7a9a0 g __pfx_cleanup_module Overlapping symbols: 7a9a0-7aa1c l ext4_exit_fs 7a9a0-7aa1c g cleanup_module ... JSON metric updates: - A new round of Intel metric updates - Support Power11 PVR (compatible to Power10) - Fix cache latency events on Zen 4 to set SliceId properly Internal: - Fix reference counting for 'map' data structure, tireless work from Ian! - More memory optimization for struct thread and annotate histogram. Now, 'perf report' (TUI) and 'perf annotate' should be much lighter-weight in terms of memory footprint - Support cross-arch perf register access. Clean up the build configuration so that it can detect arch-register support at runtime. This can allow to parse register data in sample which was recorded in a different arch Others: - Sync task state in 'perf sched' to kernel using trace event fields. The task states have been changed so tools cannot assume a fixed encoding - Clean up 'perf mem' to generalize the arch-specific events - Add support for local and global variables to data type profiling. This would increase the success rate of type resolution with DWARF - Add short option -H for --hierarchy in 'perf report' and 'perf top'" * tag 'perf-tools-for-v6.9-2024-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (154 commits) perf annotate: Add comments in the data structures perf annotate: Remove sym_hist.addr[] array perf annotate: Calculate instruction overhead using hashmap perf annotate: Add a hashmap for symbol histogram perf threads: Reduce table size from 256 to 8 perf threads: Switch from rbtree to hashmap perf threads: Move threads to its own files perf machine: Move machine's threads into its own abstraction perf machine: Move fprintf to for_each loop and a callback perf trace: Ignore thread hashing in summary perf report: Sort child tasks by tid perf vendor events amd: Fix Zen 4 cache latency events perf version: Display availability of OpenCSD support perf vendor events intel: Add umasks/occ_sel to PCU events. perf map: Fix map reference count issues libperf evlist: Avoid out-of-bounds access perf lock contention: Account contending locks too perf metrics: Fix segv for metrics with no events perf metrics: Fix metric matching perf pmu: Fix a potential memory leak in perf_pmu__lookup() ...
2024-03-12libbpf: Recognize __arena global variables.Andrii Nakryiko2-13/+107
LLVM automatically places __arena variables into ".arena.1" ELF section. In order to use such global variables bpf program must include definition of arena map in ".maps" section, like: struct { __uint(type, BPF_MAP_TYPE_ARENA); __uint(map_flags, BPF_F_MMAPABLE); __uint(max_entries, 1000); /* number of pages */ __ulong(map_extra, 2ull << 44); /* start of mmap() region */ } arena SEC(".maps"); libbpf recognizes both uses of arena and creates single `struct bpf_map *` instance in libbpf APIs. ".arena.1" ELF section data is used as initial data image, which is exposed through skeleton and bpf_map__initial_value() to the user, if they need to tune it before the load phase. During load phase, this initial image is copied over into mmap()'ed region corresponding to arena, and discarded. Few small checks here and there had to be added to make sure this approach works with bpf_map__initial_value(), mostly due to hard-coded assumption that map->mmaped is set up with mmap() syscall and should be munmap()'ed. For arena, .arena.1 can be (much) smaller than maximum arena size, so this smaller data size has to be tracked separately. Given it is enforced that there is only one arena for entire bpf_object instance, we just keep it in a separate field. This can be generalized if necessary later. All global variables from ".arena.1" section are accessible from user space via skel->arena->name_of_var. For bss/data/rodata the skeleton/libbpf perform the following sequence: 1. addr = mmap(MAP_ANONYMOUS) 2. user space optionally modifies global vars 3. map_fd = bpf_create_map() 4. bpf_update_map_elem(map_fd, addr) // to store values into the kernel 5. mmap(addr, MAP_FIXED, map_fd) after step 5 user spaces see the values it wrote at step 2 at the same addresses arena doesn't support update_map_elem. Hence skeleton/libbpf do: 1. addr = malloc(sizeof SEC ".arena.1") 2. user space optionally modifies global vars 3. map_fd = bpf_create_map(MAP_TYPE_ARENA) 4. real_addr = mmap(map->map_extra, MAP_SHARED | MAP_FIXED, map_fd) 5. memcpy(real_addr, addr) // this will fault-in and allocate pages At the end look and feel of global data vs __arena global data is the same from bpf prog pov. Another complication is: struct { __uint(type, BPF_MAP_TYPE_ARENA); } arena SEC(".maps"); int __arena foo; int bar; ptr1 = &foo; // relocation against ".arena.1" section ptr2 = &arena; // relocation against ".maps" section ptr3 = &bar; // relocation against ".bss" section Fo the kernel ptr1 and ptr2 has point to the same arena's map_fd while ptr3 points to a different global array's map_fd. For the verifier: ptr1->type == unknown_scalar ptr2->type == const_ptr_to_map ptr3->type == ptr_to_map_value After verification, from JIT pov all 3 ptr-s are normal ld_imm64 insns. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20240308010812.89848-11-alexei.starovoitov@gmail.com
2024-03-12libbpf: Add support for bpf_arena.Alexei Starovoitov2-8/+46
mmap() bpf_arena right after creation, since the kernel needs to remember the address returned from mmap. This is user_vm_start. LLVM will generate bpf_arena_cast_user() instructions where necessary and JIT will add upper 32-bit of user_vm_start to such pointers. Fix up bpf_map_mmap_sz() to compute mmap size as map->value_size * map->max_entries for arrays and PAGE_SIZE * map->max_entries for arena. Don't set BTF at arena creation time, since it doesn't support it. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240308010812.89848-9-alexei.starovoitov@gmail.com
2024-03-12libbpf: Add __arg_arena to bpf_helpers.hAlexei Starovoitov1-0/+1
Add __arg_arena to bpf_helpers.h Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240308010812.89848-8-alexei.starovoitov@gmail.com
2024-03-10kbuild: unexport abs_srctree and abs_objtreeMasahiro Yamada1-1/+1
Commit 25b146c5b8ce ("kbuild: allow Kbuild to start from any directory") exported abs_srctree and abs_objtree to avoid recomputation after the sub-make. However, this approach turned out to be fragile. Commit 5fa94ceb793e ("kbuild: set correct abs_srctree and abs_objtree for package builds") moved them above "ifneq ($(sub_make_done),1)", eliminating the need for exporting them. These are only needed in the top Makefile. If an absolute path is required in sub-directories, you can use $(abspath ) or $(realpath ) as needed. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-03-08libbpf: Allow specifying 64-bit integers in map BTF.Alexei Starovoitov2-2/+43
__uint() macro that is used to specify map attributes like: __uint(type, BPF_MAP_TYPE_ARRAY); __uint(map_flags, BPF_F_MMAPABLE); It is limited to 32-bit, since BTF_KIND_ARRAY has u32 "number of elements" field in "struct btf_array". Introduce __ulong() macro that allows specifying values bigger than 32-bit. In map definition "map_extra" is the only u64 field, so far. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20240307031228.42896-5-alexei.starovoitov@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-03-07libbpf: Rewrite btf datasec names starting from '?'Eduard Zingerman3-2/+41
Optional struct_ops maps are defined using question mark at the start of the section name, e.g.: SEC("?.struct_ops") struct test_ops optional_map = { ... }; This commit teaches libbpf to detect if kernel allows '?' prefix in datasec names, and if it doesn't then to rewrite such names by replacing '?' with '_', e.g.: DATASEC ?.struct_ops -> DATASEC _.struct_ops Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-13-eddyz87@gmail.com
2024-03-07libbpf: Struct_ops in SEC("?.struct_ops") / SEC("?.struct_ops.link")Eduard Zingerman1-1/+14
Allow using two new section names for struct_ops maps: - SEC("?.struct_ops") - SEC("?.struct_ops.link") To specify maps that have bpf_map->autocreate == false after open. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-12-eddyz87@gmail.com
2024-03-07libbpf: Replace elf_state->st_ops_* fields with SEC_ST_OPS sec_typeEduard Zingerman1-29/+32
The next patch would add two new section names for struct_ops maps. To make working with multiple struct_ops sections more convenient: - remove fields like elf_state->st_ops_{shndx,link_shndx}; - mark section descriptions hosting struct_ops as elf_sec_desc->sec_type == SEC_ST_OPS; After these changes struct_ops sections could be processed uniformly by iterating bpf_object->efile.secs entries. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-11-eddyz87@gmail.com
2024-03-07libbpf: Sync progs autoload with maps autocreate for struct_ops mapsEduard Zingerman1-0/+43
Automatically select which struct_ops programs to load depending on which struct_ops maps are selected for automatic creation. E.g. for the BPF code below: SEC("struct_ops/test_1") int BPF_PROG(foo) { ... } SEC("struct_ops/test_2") int BPF_PROG(bar) { ... } SEC(".struct_ops.link") struct test_ops___v1 A = { .foo = (void *)foo }; SEC(".struct_ops.link") struct test_ops___v2 B = { .foo = (void *)foo, .bar = (void *)bar, }; And the following libbpf API calls: bpf_map__set_autocreate(skel->maps.A, true); bpf_map__set_autocreate(skel->maps.B, false); The autoload would be enabled for program 'foo' and disabled for program 'bar'. During load, for each struct_ops program P, referenced from some struct_ops map M: - set P.autoload = true if M.autocreate is true for some M; - set P.autoload = false if M.autocreate is false for all M; - don't change P.autoload, if P is not referenced from any map. Do this after bpf_object__init_kern_struct_ops_maps() to make sure that shadow vars assignment is done. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-9-eddyz87@gmail.com
2024-03-07libbpf: Honor autocreate flag for struct_ops mapsEduard Zingerman1-3/+15
Skip load steps for struct_ops maps not marked for automatic creation. This should allow to load bpf object in situations like below: SEC("struct_ops/foo") int BPF_PROG(foo) { ... } SEC("struct_ops/bar") int BPF_PROG(bar) { ... } struct test_ops___v1 { int (*foo)(void); }; struct test_ops___v2 { int (*foo)(void); int (*does_not_exist)(void); }; SEC(".struct_ops.link") struct test_ops___v1 map_for_old = { .test_1 = (void *)foo }; SEC(".struct_ops.link") struct test_ops___v2 map_for_new = { .test_1 = (void *)foo, .does_not_exist = (void *)bar }; Suppose program is loaded on old kernel that does not have definition for 'does_not_exist' struct_ops member. After this commit it would be possible to load such object file after the following tweaks: bpf_program__set_autoload(skel->progs.bar, false); bpf_map__set_autocreate(skel->maps.map_for_new, false); Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20240306104529.6453-4-eddyz87@gmail.com
2024-03-07libbpf: Tie struct_ops programs to kernel BTF ids, not to local idsEduard Zingerman1-23/+26
Enforce the following existing limitation on struct_ops programs based on kernel BTF id instead of program-local BTF id: struct_ops BPF prog can be re-used between multiple .struct_ops & .struct_ops.link as long as it's the same struct_ops struct definition and the same function pointer field This allows reusing same BPF program for versioned struct_ops map definitions, e.g.: SEC("struct_ops/test") int BPF_PROG(foo) { ... } struct some_ops___v1 { int (*test)(void); }; struct some_ops___v2 { int (*test)(void); }; SEC(".struct_ops.link") struct some_ops___v1 a = { .test = foo } SEC(".struct_ops.link") struct some_ops___v2 b = { .test = foo } Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-3-eddyz87@gmail.com
2024-03-07libbpf: Allow version suffixes (___smth) for struct_ops typesEduard Zingerman1-1/+5
E.g. allow the following struct_ops definitions: struct bpf_testmod_ops___v1 { int (*test)(void); }; struct bpf_testmod_ops___v2 { int (*test)(void); }; SEC(".struct_ops.link") struct bpf_testmod_ops___v1 a = { .test = ... } SEC(".struct_ops.link") struct bpf_testmod_ops___v2 b = { .test = ... } Where both bpf_testmod_ops__v1 and bpf_testmod_ops__v2 would be resolved as 'struct bpf_testmod_ops' from kernel BTF. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-2-eddyz87@gmail.com
2024-03-04libbpf: Correct debug message in btf__load_vmlinux_btfChen Shen1-1/+1
In the function btf__load_vmlinux_btf, the debug message incorrectly refers to 'path' instead of 'sysfs_btf_path'. Signed-off-by: Chen Shen <peterchenshen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240302062218.3587-1-peterchenshen@gmail.com
2024-03-01libbpf: Convert st_ops->data to shadow type.Kui-Feng Lee1-2/+38
Convert st_ops->data to the shadow type of the struct_ops map. The shadow type of a struct_ops type is a variant of the original struct type providing a way to access/change the values in the maps of the struct_ops type. bpf_map__initial_value() will return st_ops->data for struct_ops types. The skeleton is going to use it as the pointer to the shadow type of the original struct type. One of the main differences between the original struct type and the shadow type is that all function pointers of the shadow type are converted to pointers of struct bpf_program. Users can replace these bpf_program pointers with other BPF programs. The st_ops->progs[] will be updated before updating the value of a map to reflect the changes made by users. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240229064523.2091270-3-thinker.li@gmail.com
2024-03-01libbpf: Set btf_value_type_id of struct bpf_map for struct_ops.Kui-Feng Lee1-0/+5
For a struct_ops map, btf_value_type_id is the type ID of it's struct type. This value is required by bpftool to generate skeleton including pointers of shadow types. The code generator gets the type ID from bpf_map__btf_value_type_id() in order to get the type information of the struct type of a map. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240229064523.2091270-2-thinker.li@gmail.com
2024-03-01libperf evlist: Avoid out-of-bounds accessIan Rogers2-8/+14
Parallel testing appears to show a race between allocating and setting evsel ids. As there is a bounds check on the xyarray it yields a segv like: ``` AddressSanitizer:DEADLYSIGNAL ================================================================= ==484408==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 ==484408==The signal is caused by a WRITE memory access. ==484408==Hint: address points to the zero page. #0 0x55cef5d4eff4 in perf_evlist__id_hash tools/lib/perf/evlist.c:256 #1 0x55cef5d4f132 in perf_evlist__id_add tools/lib/perf/evlist.c:274 #2 0x55cef5d4f545 in perf_evlist__id_add_fd tools/lib/perf/evlist.c:315 #3 0x55cef5a1923f in store_evsel_ids util/evsel.c:3130 #4 0x55cef5a19400 in evsel__store_ids util/evsel.c:3147 #5 0x55cef5888204 in __run_perf_stat tools/perf/builtin-stat.c:832 #6 0x55cef5888c06 in run_perf_stat tools/perf/builtin-stat.c:960 #7 0x55cef58932db in cmd_stat tools/perf/builtin-stat.c:2878 ... ``` Avoid this crash by early exiting the perf_evlist__id_add_fd and perf_evlist__id_add is the access is out-of-bounds. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240229070757.796244-1-irogers@google.com
2024-02-22bpf: Clarify batch lookup/lookup_and_delete semanticsMartin Kelly1-5/+12
The batch lookup and lookup_and_delete APIs have two parameters, in_batch and out_batch, to facilitate iterative lookup/lookup_and_deletion operations for supported maps. Except NULL for in_batch at the start of these two batch operations, both parameters need to point to memory equal or larger than the respective map key size, except for various hashmaps (hash, percpu_hash, lru_hash, lru_percpu_hash) where the in_batch/out_batch memory size should be at least 4 bytes. Document these semantics to clarify the API. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240221211838.1241578-1-martin.kelly@crowdstrike.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-02-22tools subcmd: Add a no exec function call optionIan Rogers2-0/+4
Tools like perf fork tests in case they crash, but they don't want to exec a full binary. Add an option to call a function rather than do an exec. The child process exits with the result of the function call and is passed the struct of the run_command, things like container_of can then allow the child process function to determine additional arguments. Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Bill Wendling <morbo@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: llvm@lists.linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240221034155.1500118-5-irogers@google.com
2024-02-14libbpf: Make remark about zero-initializing bpf_*_info structsMatt Bobrowski1-5/+17
In some situations, if you fail to zero-initialize the bpf_{prog,map,btf,link}_info structs supplied to the set of LIBBPF helpers bpf_{prog,map,btf,link}_get_info_by_fd(), you can expect the helper to return an error. This can possibly leave people in a situation where they're scratching their heads for an unnnecessary amount of time. Make an explicit remark about the requirement of zero-initializing the supplied bpf_{prog,map,btf,link}_info structs for the respective LIBBPF helpers. Internally, LIBBPF helpers bpf_{prog,map,btf,link}_get_info_by_fd() call into bpf_obj_get_info_by_fd() where the bpf(2) BPF_OBJ_GET_INFO_BY_FD command is used. This specific command is effectively backed by restrictions enforced by the bpf_check_uarg_tail_zero() helper. This function ensures that if the size of the supplied bpf_{prog,map,btf,link}_info structs are larger than what the kernel can handle, trailing bits are zeroed. This can be a problem when compiling against UAPI headers that don't necessarily match the sizes of the same underlying types known to the kernel. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/ZcyEb8x4VbhieWsL@google.com
2024-02-13libbpf: Add support to GCC in CORE macro definitionsCupertino Miranda1-7/+38
Due to internal differences between LLVM and GCC the current implementation for the CO-RE macros does not fit GCC parser, as it will optimize those expressions even before those would be accessible by the BPF backend. As examples, the following would be optimized out with the original definitions: - As enums are converted to their integer representation during parsing, the IR would not know how to distinguish an integer constant from an actual enum value. - Types need to be kept as temporary variables, as the existing type casts of the 0 address (as expanded for LLVM), are optimized away by the GCC C parser, never really reaching GCCs IR. Although, the macros appear to add extra complexity, the expanded code is removed from the compilation flow very early in the compilation process, not really affecting the quality of the generated assembly. Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240213173543.1397708-1-cupertino.miranda@oracle.com
2024-02-06libbpf: Use OPTS_SET() macro in bpf_xdp_query()Toke Høiland-Jørgensen1-2/+2
When the feature_flags and xdp_zc_max_segs fields were added to the libbpf bpf_xdp_query_opts, the code writing them did not use the OPTS_SET() macro. This causes libbpf to write to those fields unconditionally, which means that programs compiled against an older version of libbpf (with a smaller size of the bpf_xdp_query_opts struct) will have its stack corrupted by libbpf writing out of bounds. The patch adding the feature_flags field has an early bail out if the feature_flags field is not part of the opts struct (via the OPTS_HAS) macro, but the patch adding xdp_zc_max_segs does not. For consistency, this fix just changes the assignments to both fields to use the OPTS_SET() macro. Fixes: 13ce2daa259a ("xsk: add new netlink attribute dedicated for ZC max frags") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240206125922.1992815-1-toke@redhat.com
2024-02-06libbpf: fix return value for PERF_EVENT __arg_ctx type fix up checkAndrii Nakryiko1-3/+3
If PERF_EVENT program has __arg_ctx argument with matching architecture-specific pt_regs/user_pt_regs/user_regs_struct pointer type, libbpf should still perform type rewrite for old kernels, but not emit the warning. Fix copy/paste from kernel code where 0 is meant to signify "no error" condition. For libbpf we need to return "true" to proceed with type rewrite (which for PERF_EVENT program will be a canonical `struct bpf_perf_event_data *` type). Fixes: 9eea8fafe33e ("libbpf: fix __arg_ctx type enforcement for perf_event programs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240206002243.1439450-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-02-02libbpf: Add missed btf_ext__raw_data() APIAndrii Nakryiko3-3/+7
Another API that was declared in libbpf.map but actual implementation was missing. btf_ext__get_raw_data() was intended as a discouraged alias to consistently-named btf_ext__raw_data(), so make this an actuality. Fixes: 20eccf29e297 ("libbpf: hide and discourage inconsistently named getters") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240201172027.604869-5-andrii@kernel.org
2024-02-02libbpf: Add btf__new_split() API that was declared but not implementedAndrii Nakryiko2-1/+7
Seems like original commit adding split BTF support intended to add btf__new_split() API, and even declared it in libbpf.map, but never added (trivial) implementation. Fix this. Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240201172027.604869-4-andrii@kernel.org
2024-02-02libbpf: Add missing LIBBPF_API annotation to libbpf_set_memlock_rlim APIAndrii Nakryiko1-1/+1
LIBBPF_API annotation seems missing on libbpf_set_memlock_rlim API, so add it to make this API callable from libbpf's shared library version. Fixes: e542f2c4cd16 ("libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF") Fixes: ab9a5a05dc48 ("libbpf: fix up few libbpf.map problems") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240201172027.604869-3-andrii@kernel.org
2024-02-02libbpf: Call memfd_create() syscall directlyAndrii Nakryiko1-1/+10
Some versions of Android do not implement memfd_create() wrapper in their libc implementation, leading to build failures ([0]). On the other hand, memfd_create() is available as a syscall on quite old kernels (3.17+, while bpf() syscall itself is available since 3.18+), so it is ok to assume that syscall availability and call into it with syscall() helper to avoid Android-specific workarounds. Validated in libbpf-bootstrap's CI ([1]). [0] https://github.com/libbpf/libbpf-bootstrap/actions/runs/7701003207/job/20986080319#step:5:83 [1] https://github.com/libbpf/libbpf-bootstrap/actions/runs/7715988887/job/21031767212?pr=253 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240201172027.604869-2-andrii@kernel.org
2024-02-01libbpf: Remove unnecessary null check in kernel_supports()Eduard Zingerman1-1/+1
After recent changes, Coverity complained about inconsistent null checks in kernel_supports() function: kernel_supports(const struct bpf_object *obj, ...) [...] // var_compare_op: Comparing obj to null implies that obj might be null if (obj && obj->gen_loader) return true; // var_deref_op: Dereferencing null pointer obj if (obj->token_fd) return feat_supported(obj->feat_cache, feat_id); [...] - The original null check was introduced by commit [0], which introduced a call `kernel_supports(NULL, ...)` in function bump_rlimit_memlock(); - This call was refactored to use `feat_supported(NULL, ...)` in commit [1]. Looking at all places where kernel_supports() is called: - There is either `obj->...` access before the call; - Or `obj` comes from `prog->obj` expression, where `prog` comes from enumeration of programs in `obj`; - Or `obj` comes from `prog->obj`, where `prog` is a parameter to one of the API functions: - bpf_program__attach_kprobe_opts; - bpf_program__attach_kprobe; - bpf_program__attach_ksyscall. Assuming correct API usage, it appears that `obj` can never be null when passed to kernel_supports(). Silence the Coverity warning by removing redundant null check. [0] e542f2c4cd16 ("libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF") [1] d6dd1d49367a ("libbpf: Further decouple feature checking logic from bpf_object") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240131212615.20112-1-eddyz87@gmail.com
2024-01-31libbpf: add bpf_core_cast() macroAndrii Nakryiko1-0/+13
Add bpf_core_cast() macro that wraps bpf_rdonly_cast() kfunc. It's more ergonomic than kfunc, as it automatically extracts btf_id with bpf_core_type_id_kernel(), and works with type names. It also casts result to (T *) pointer. See the definition of the macro, it's self-explanatory. libbpf declares bpf_rdonly_cast() extern as __weak __ksym and should be safe to not conflict with other possible declarations in user code. But we do have a conflict with current BPF selftests that declare their externs with first argument as `void *obj`, while libbpf opts into more permissive `const void *obj`. This causes conflict, so we fix up BPF selftests uses in the same patch. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240130212023.183765-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-01-30libbpf: add __arg_trusted and __arg_nullable tag macrosAndrii Nakryiko1-0/+2
Add __arg_trusted to annotate global func args that accept trusted PTR_TO_BTF_ID arguments. Also add __arg_nullable to combine with __arg_trusted (and maybe other tags in the future) to force global subprog itself (i.e., callee) to do NULL checks, as opposed to default non-NULL semantics (and thus caller's responsibility to ensure non-NULL values). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240130000648.2144827-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-30libbpf: Add some details for BTF parsing failuresIan Rogers1-4/+18
As CONFIG_DEBUG_INFO_BTF is default off the existing "failed to find valid kernel BTF" message makes diagnosing the kernel build issue somewhat cryptic. Add a little more detail with the hope of helping users. Before: ``` libbpf: failed to find valid kernel BTF libbpf: Error loading vmlinux BTF: -3 ``` After not accessible: ``` libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled? libbpf: failed to find valid kernel BTF libbpf: Error loading vmlinux BTF: -3 ``` After not readable: ``` libbpf: failed to read kernel BTF from (/sys/kernel/btf/vmlinux): -1 ``` Closes: https://lore.kernel.org/bpf/CAP-5=fU+DN_+Y=Y4gtELUsJxKNDDCOvJzPHvjUVaUoeFAzNnig@mail.gmail.com/ Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240125231840.1647951-1-irogers@google.com
2024-01-29libbpf: fix __arg_ctx type enforcement for perf_event programsAndrii Nakryiko1-1/+20
Adjust PERF_EVENT type enforcement around __arg_ctx to match exactly what kernel is doing. Fixes: 76ec90a996e3 ("libbpf: warn on unexpected __arg_ctx type when rewriting BTF") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240125205510.3642094-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-29libbpf: integrate __arg_ctx feature detector into kernel_supports()Andrii Nakryiko3-64/+61
Now that feature detection code is in bpf-next tree, integrate __arg_ctx kernel-side support into kernel_supports() framework. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240125205510.3642094-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-29libbpf: Fix faccessat() usage on AndroidAndrii Nakryiko1-0/+14
Android implementation of libc errors out with -EINVAL in faccessat() if passed AT_EACCESS ([0]), this leads to ridiculous issue with libbpf refusing to load /sys/kernel/btf/vmlinux on Androids ([1]). Fix by detecting Android and redefining AT_EACCESS to 0, it's equivalent on Android. [0] https://android.googlesource.com/platform/bionic/+/refs/heads/android13-release/libc/bionic/faccessat.cpp#50 [1] https://github.com/libbpf/libbpf-bootstrap/issues/250#issuecomment-1911324250 Fixes: 6a4ab8869d0b ("libbpf: Fix the case of running as non-root with capabilities") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240126220944.2497665-1-andrii@kernel.org
2024-01-27Merge tag 'for-netdev' of ↵Jakub Kicinski13-490/+800
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-01-26 We've added 107 non-merge commits during the last 4 day(s) which contain a total of 101 files changed, 6009 insertions(+), 1260 deletions(-). The main changes are: 1) Add BPF token support to delegate a subset of BPF subsystem functionality from privileged system-wide daemons such as systemd through special mount options for userns-bound BPF fs to a trusted & unprivileged application. With addressed changes from Christian and Linus' reviews, from Andrii Nakryiko. 2) Support registration of struct_ops types from modules which helps projects like fuse-bpf that seeks to implement a new struct_ops type, from Kui-Feng Lee. 3) Add support for retrieval of cookies for perf/kprobe multi links, from Jiri Olsa. 4) Bigger batch of prep-work for the BPF verifier to eventually support preserving boundaries and tracking scalars on narrowing fills, from Maxim Mikityanskiy. 5) Extend the tc BPF flavor to support arbitrary TCP SYN cookies to help with the scenario of SYN floods, from Kuniyuki Iwashima. 6) Add code generation to inline the bpf_kptr_xchg() helper which improves performance when stashing/popping the allocated BPF objects, from Hou Tao. 7) Extend BPF verifier to track aligned ST stores as imprecise spilled registers, from Yonghong Song. 8) Several fixes to BPF selftests around inline asm constraints and unsupported VLA code generation, from Jose E. Marchesi. 9) Various updates to the BPF IETF instruction set draft document such as the introduction of conformance groups for instructions, from Dave Thaler. 10) Fix BPF verifier to make infinite loop detection in is_state_visited() exact to catch some too lax spill/fill corner cases, from Eduard Zingerman. 11) Refactor the BPF verifier pointer ALU check to allow ALU explicitly instead of implicitly for various register types, from Hao Sun. 12) Fix the flaky tc_redirect_dtime BPF selftest due to slowness in neighbor advertisement at setup time, from Martin KaFai Lau. 13) Change BPF selftests to skip callback tests for the case when the JIT is disabled, from Tiezhu Yang. 14) Add a small extension to libbpf which allows to auto create a map-in-map's inner map, from Andrey Grafin. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (107 commits) selftests/bpf: Add missing line break in test_verifier bpf, docs: Clarify definitions of various instructions bpf: Fix error checks against bpf_get_btf_vmlinux(). bpf: One more maintainer for libbpf and BPF selftests selftests/bpf: Incorporate LSM policy to token-based tests selftests/bpf: Add tests for LIBBPF_BPF_TOKEN_PATH envvar libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar selftests/bpf: Add tests for BPF object load with implicit token selftests/bpf: Add BPF object loading tests with explicit token passing libbpf: Wire up BPF token support at BPF object level libbpf: Wire up token_fd into feature probing logic libbpf: Move feature detection code into its own file libbpf: Further decouple feature checking logic from bpf_object libbpf: Split feature detectors definitions from cached results selftests/bpf: Utilize string values for delegate_xxx mount options bpf: Support symbolic BPF FS delegation mount options bpf: Fail BPF_TOKEN_CREATE if no delegation option was set on BPF FS bpf,selinux: Allocate bpf_security_struct per BPF token selftests/bpf: Add BPF token-enabled tests libbpf: Add BPF token support to bpf_prog_load() API ... ==================== Link: https://lore.kernel.org/r/20240126215710.19855-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvarAndrii Nakryiko2-0/+14
To allow external admin authority to override default BPF FS location (/sys/fs/bpf) for implicit BPF token creation, teach libbpf to recognize LIBBPF_BPF_TOKEN_PATH envvar. If it is specified and user application didn't explicitly specify bpf_token_path option, it will be treated exactly like bpf_token_path option, overriding default /sys/fs/bpf location and making BPF token mandatory. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-29-andrii@kernel.org
2024-01-25libbpf: Wire up BPF token support at BPF object levelAndrii Nakryiko4-11/+131
Add BPF token support to BPF object-level functionality. BPF token is supported by BPF object logic either as an explicitly provided BPF token from outside (through BPF FS path), or implicitly (unless prevented through bpf_object_open_opts). Implicit mode is assumed to be the most common one for user namespaced unprivileged workloads. The assumption is that privileged container manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token delegation options (delegate_{cmds,maps,progs,attachs} mount options). BPF object during loading will attempt to create BPF token from /sys/fs/bpf location, and pass it for all relevant operations (currently, map creation, BTF load, and program load). In this implicit mode, if BPF token creation fails due to whatever reason (BPF FS is not mounted, or kernel doesn't support BPF token, etc), this is not considered an error. BPF object loading sequence will proceed with no BPF token. In explicit BPF token mode, user provides explicitly custom BPF FS mount point path. In such case, BPF object will attempt to create BPF token from provided BPF FS location. If BPF token creation fails, that is considered a critical error and BPF object load fails with an error. Libbpf provides a way to disable implicit BPF token creation, if it causes any troubles (BPF token is designed to be completely optional and shouldn't cause any problems even if provided, but in the world of BPF LSM, custom security logic can be installed that might change outcome depending on the presence of BPF token). To disable libbpf's default BPF token creation behavior user should provide either invalid BPF token FD (negative), or empty bpf_token_path option. BPF token presence can influence libbpf's feature probing, so if BPF object has associated BPF token, feature probing is instructed to use BPF object-specific feature detection cache and token FD. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-26-andrii@kernel.org
2024-01-25libbpf: Wire up token_fd into feature probing logicAndrii Nakryiko5-47/+97
Adjust feature probing callbacks to take into account optional token_fd. In unprivileged contexts, some feature detectors would fail to detect kernel support just because BPF program, BPF map, or BTF object can't be loaded due to privileged nature of those operations. So when BPF object is loaded with BPF token, this token should be used for feature probing. This patch is setting support for this scenario, but we don't yet pass non-zero token FD. This will be added in the next patch. We also switched BPF cookie detector from using kprobe program to tracepoint one, as tracepoint is somewhat less dangerous BPF program type and has higher likelihood of being allowed through BPF token in the future. This change has no effect on detection behavior. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240124022127.2379740-25-andrii@kernel.org
2024-01-25libbpf: Move feature detection code into its own fileAndrii Nakryiko6-466/+481
It's quite a lot of well isolated code, so it seems like a good candidate to move it out of libbpf.c to reduce its size. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240124022127.2379740-24-andrii@kernel.org
2024-01-25libbpf: Further decouple feature checking logic from bpf_objectAndrii Nakryiko3-11/+22
Add feat_supported() helper that accepts feature cache instead of bpf_object. This allows low-level code in bpf.c to not know or care about higher-level concept of bpf_object, yet it will be able to utilize custom feature checking in cases where BPF token might influence the outcome. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240124022127.2379740-23-andrii@kernel.org
2024-01-25libbpf: Split feature detectors definitions from cached resultsAndrii Nakryiko1-6/+12
Split a list of supported feature detectors with their corresponding callbacks from actual cached supported/missing values. This will allow to have more flexible per-token or per-object feature detectors in subsequent refactorings. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240124022127.2379740-22-andrii@kernel.org
2024-01-25libbpf: Add BPF token support to bpf_prog_load() APIAndrii Nakryiko2-2/+4
Wire through token_fd into bpf_prog_load(). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-16-andrii@kernel.org
2024-01-25libbpf: Add BPF token support to bpf_btf_load() APIAndrii Nakryiko2-2/+9
Allow user to specify token_fd for bpf_btf_load() API that wraps kernel's BPF_BTF_LOAD command. This allows loading BTF from unprivileged process as long as it has BPF token allowing BPF_BTF_LOAD command, which can be created and delegated by privileged process. Wire through new btf_flags as well, so that user can provide BPF_F_TOKEN_FD flag, if necessary. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-15-andrii@kernel.org
2024-01-25libbpf: Add BPF token support to bpf_map_create() APIAndrii Nakryiko2-4/+7
Add ability to provide token_fd for BPF_MAP_CREATE command through bpf_map_create() API. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-14-andrii@kernel.org
2024-01-25libbpf: Add bpf_token_create() APIAndrii Nakryiko3-0/+42
Add low-level wrapper API for BPF_TOKEN_CREATE command in bpf() syscall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-13-andrii@kernel.org
2024-01-25libbpf: Ensure undefined bpf_attr field stays 0Martin KaFai Lau1-1/+1
The commit 9e926acda0c2 ("libbpf: Find correct module BTFs for struct_ops maps and progs.") sets a newly added field (value_type_btf_obj_fd) to -1 in libbpf when the caller of the libbpf's bpf_map_create did not define this field by passing a NULL "opts" or passing in a "opts" that does not cover this new field. OPT_HAS(opts, field) is used to decide if the field is defined or not: ((opts) && opts->sz >= offsetofend(typeof(*(opts)), field)) Once OPTS_HAS decided the field is not defined, that field should be set to 0. For this particular new field (value_type_btf_obj_fd), its corresponding map_flags "BPF_F_VTYPE_BTF_OBJ_FD" is not set. Thus, the kernel does not treat it as an fd field. Fixes: 9e926acda0c2 ("libbpf: Find correct module BTFs for struct_ops maps and progs.") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240124224418.2905133-1-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24libbpf: Correct bpf_core_read.h comment wrt bpf_core_relo structDima Tisnek1-1/+1
Past commit ([0]) removed the last vestiges of struct bpf_field_reloc, it's called struct bpf_core_relo now. [0] 28b93c64499a ("libbpf: Clean up and improve CO-RE reloc logging") Signed-off-by: Dima Tisnek <dimaqq@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240121060126.15650-1-dimaqq@gmail.com
2024-01-24libbpf: Find correct module BTFs for struct_ops maps and progs.Kui-Feng Lee4-12/+38
Locate the module BTFs for struct_ops maps and progs and pass them to the kernel. This ensures that the kernel correctly resolves type IDs from the appropriate module BTFs. For the map of a struct_ops object, the FD of the module BTF is set to bpf_map to keep a reference to the module BTF. The FD is passed to the kernel as value_type_btf_obj_fd when the struct_ops object is loaded. For a bpf_struct_ops prog, attach_btf_obj_fd of bpf_prog is the FD of a module BTF in the kernel. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240119225005.668602-13-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-01-24libbpf: call dup2() syscall directlyAndrii Nakryiko1-1/+11
We've ran into issues with using dup2() API in production setting, where libbpf is linked into large production environment and ends up calling unintended custom implementations of dup2(). These custom implementations don't provide atomic FD replacement guarantees of dup2() syscall, leading to subtle and hard to debug issues. To prevent this in the future and guarantee that no libc implementation will do their own custom non-atomic dup2() implementation, call dup2() syscall directly with syscall(SYS_dup2). Note that some architectures don't seem to provide dup2 and have dup3 instead. Try to detect and pick best syscall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240119210201.1295511-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>