2023-09-28  samples/bpf: Add -fsanitize=bounds to userspace programs  (Ruowen Qin, 1 file, +3/-0)
The sanitizer flag, which is supported by both clang and gcc, makes it easier to debug array index out-of-bounds problems in these programs. Make the Makefile smarter: detect ubsan support in the compiler and add '-fsanitize=bounds' accordingly. Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com> Signed-off-by: Jinghao Jia <jinghao7@illinois.edu> Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230927045030.224548-2-ruowenq2@illinois.edu
2023-09-26  Merge branch 'bpf: Add missed stats for kprobes'  (Andrii Nakryiko, 15 files, +322/-13)
Jiri Olsa says:

====================
hi,
at the moment we can't retrieve the number of missed kprobe executions and subsequent execution of BPF programs.

This patchset adds:
  - counting of missed execution on attach layer for:
      . kprobes attached through perf link (kprobe/ftrace)
      . kprobes attached through kprobe.multi link (fprobe)
  - counting of recursion_misses for BPF kprobe programs

It's still technically possible to create kprobe without perf link (using SET_BPF perf ioctl) in which case we don't have a way to retrieve the kprobe's 'missed' count. However both libbpf and cilium/ebpf libraries use perf link if it's available, and for old kernels without perf link support we can use BPF program to retrieve the kprobe missed count.

v3 changes:
  - added acks [Song]
  - make test_missed not serial [Andrii]

Also available at:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git bpf/missed_stats

thanks,
jirka
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-09-26  selftests/bpf: Add test for recursion counts of perf event link tracepoint  (Jiri Olsa, 2 files, +81/-0)
Adding selftest that puts kprobe on bpf_fentry_test1 that calls bpf_printk and invokes bpf_trace_printk tracepoint. The bpf_trace_printk tracepoint has test[234] programs attached to it. Because kprobe execution goes through bpf_prog_active check, programs attached to the tracepoint will fail the recursion check and increment the recursion_misses stats. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-10-jolsa@kernel.org
2023-09-26  selftests/bpf: Add test for recursion counts of perf event link kprobe  (Jiri Olsa, 3 files, +100/-0)
Adding selftest that puts kprobe.multi on bpf_fentry_test1, which calls the bpf_kfunc_common_test kfunc that has 3 perf event kprobes and 1 kprobe.multi attached. Because fprobe (the kprobe.multi attach layer) does not have a strict recursion check, the kprobe's bpf_prog_active check is hit for test2-5. Disabling this test for arm64, because there's no fprobe support yet. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-9-jolsa@kernel.org
2023-09-26  selftests/bpf: Add test for missed counts of perf event link kprobe  (Jiri Olsa, 4 files, +84/-0)
Adding test that puts a kprobe on bpf_fentry_test1, which calls the bpf_kfunc_common_test kfunc that also has a kprobe on it. The latter won't get triggered due to the kprobe recursion check, and the kprobe missed counter is incremented. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-8-jolsa@kernel.org
2023-09-26  bpftool: Display missed count for kprobe perf link  (Jiri Olsa, 1 file, +3/-0)
Adding 'missed' field to display missed counts for kprobes attached by perf event link, like:

  # bpftool link
  5: perf_event  prog 82
          kprobe ffffffff815203e0 ksys_write
  6: perf_event  prog 83
          kprobe ffffffff811d1e50 scheduler_tick
          missed 682217

  # bpftool link -jp
  [{
          "id": 5,
          "type": "perf_event",
          "prog_id": 82,
          "retprobe": false,
          "addr": 18446744071584220128,
          "func": "ksys_write",
          "offset": 0,
          "missed": 0
      },{
          "id": 6,
          "type": "perf_event",
          "prog_id": 83,
          "retprobe": false,
          "addr": 18446744071580753488,
          "func": "scheduler_tick",
          "offset": 0,
          "missed": 693469
      }
  ]

Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-7-jolsa@kernel.org
2023-09-26  bpftool: Display missed count for kprobe_multi link  (Jiri Olsa, 1 file, +3/-0)
Adding 'missed' field to display missed counts for kprobes attached by kprobe multi link, like:

  # bpftool link
  5: kprobe_multi  prog 76
          kprobe.multi  func_cnt 1  missed 1
          addr             func [module]
          ffffffffa039c030 fp3_test [fprobe_test]

  # bpftool link -jp
  [{
          "id": 5,
          "type": "kprobe_multi",
          "prog_id": 76,
          "retprobe": false,
          "func_cnt": 1,
          "missed": 1,
          "funcs": [{
                  "addr": 18446744072102723632,
                  "func": "fp3_test",
                  "module": "fprobe_test"
              }
          ]
      }
  ]

Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-6-jolsa@kernel.org
2023-09-26  bpf: Count missed stats in trace_call_bpf  (Jiri Olsa, 2 files, +19/-0)
Increase the misses stats in case the bpf prog array execution is skipped because of the recursion check in trace_call_bpf. Adding bpf_prog_inc_misses_counters, which increases the misses count for all bpf programs in a bpf_prog_array. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-5-jolsa@kernel.org
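As a rough illustration of the helper described above, here is a sketch of how such a counter bump over a bpf_prog_array could look (kernel context; it assumes the existing per-program bpf_prog_inc_misses_counter() helper and the prog-terminated items layout, and the actual patch may differ):

  /* Sketch only: bump the misses counter of every program in the array.
   * Assumes the per-program bpf_prog_inc_misses_counter() helper; the
   * real implementation in the patch may differ.
   */
  static void bpf_prog_inc_misses_counters(const struct bpf_prog_array *array)
  {
          const struct bpf_prog_array_item *item;

          if (unlikely(!array))
                  return;

          /* the items array is terminated by an entry with prog == NULL */
          for (item = &array->items[0]; item->prog; item++)
                  bpf_prog_inc_misses_counter(item->prog);
  }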
2023-09-26  bpf: Add missed value to kprobe perf link info  (Jiri Olsa, 6 files, +28/-13)
Add a missed value to the link info of kprobes attached through perf link, to hold the stats of missed kprobe handler executions. The kprobe's missed counter gets incremented when the kprobe handler is not executed due to another kprobe running on the same cpu. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-4-jolsa@kernel.org
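For illustration, userspace could read the new counter back through the link info; a minimal libbpf sketch follows (it assumes a kernel and libbpf that already expose the perf_event.kprobe.missed field, and is not part of the series):

  #include <stdio.h>
  #include <bpf/bpf.h>

  /* Sketch: print the missed count of a kprobe attached via perf link. */
  static void print_kprobe_missed(int link_fd)
  {
          struct bpf_link_info info = {};
          __u32 len = sizeof(info);

          if (bpf_link_get_info_by_fd(link_fd, &info, &len))
                  return;

          if (info.type == BPF_LINK_TYPE_PERF_EVENT &&
              info.perf_event.type == BPF_PERF_EVENT_KPROBE)
                  printf("missed: %llu\n",
                         (unsigned long long)info.perf_event.kprobe.missed);
  }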
2023-09-26  bpf: Add missed value to kprobe_multi link info  (Jiri Olsa, 3 files, +3/-0)
Add missed value to kprobe_multi link info to hold the stats of missed kprobe_multi probe. The missed counter gets incremented when fprobe fails the recursion check or there's no rethook available for return probe. In either case the attached bpf program is not executed. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-3-jolsa@kernel.org
2023-09-26  bpf: Count stats for kprobe_multi programs  (Jiri Olsa, 1 file, +1/-0)
Adding support to gather missed stats for kprobe_multi programs due to bpf_prog_active protection. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-2-jolsa@kernel.org
2023-09-26  Merge branch 'add libbpf getters for individual ringbuffers'  (Andrii Nakryiko, 5 files, +193/-13)
Martin Kelly says:

====================
This patch series adds a new ring__ API to libbpf exposing getters for accessing the individual ringbuffers inside a struct ring_buffer. This is useful for polling individually, getting available data, or similar use cases. The API looks like this, and was roughly proposed by Andrii Nakryiko in another thread:

Getting a ring struct:
  struct ring *ring_buffer__ring(struct ring_buffer *rb, unsigned int idx);

Using the ring struct:
  unsigned long ring__consumer_pos(const struct ring *r);
  unsigned long ring__producer_pos(const struct ring *r);
  size_t ring__avail_data_size(const struct ring *r);
  size_t ring__size(const struct ring *r);
  int ring__map_fd(const struct ring *r);
  int ring__consume(struct ring *r);

Changes in v2:
  - Addressed all feedback from Andrii Nakryiko
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
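To illustrate how these getters compose, here is a minimal usage sketch (not part of the series; error handling is trimmed, and the ring_buffer is assumed to have been created with ring_buffer__new() elsewhere):

  #include <stdio.h>
  #include <bpf/libbpf.h>

  /* Sketch: inspect and drain one ring inside an existing ring_buffer. */
  static void drain_first_ring(struct ring_buffer *rb)
  {
          struct ring *r = ring_buffer__ring(rb, 0);   /* first ringbuf map added */

          if (!r)
                  return;

          printf("size=%zu avail=%zu prod=%lu cons=%lu map_fd=%d\n",
                 ring__size(r), ring__avail_data_size(r),
                 ring__producer_pos(r), ring__consumer_pos(r),
                 ring__map_fd(r));

          /* consume only this ring, not the whole ring_buffer */
          ring__consume(r);
  }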
2023-09-26  selftests/bpf: Add tests for ring__consume  (Martin Kelly, 1 file, +4/-0)
Add tests for new API ring__consume. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-15-martin.kelly@crowdstrike.com
2023-09-26  libbpf: Add ring__consume  (Martin Kelly, 3 files, +22/-0)
Add ring__consume to consume a single ringbuffer, analogous to ring_buffer__consume. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-14-martin.kelly@crowdstrike.com
2023-09-26  selftests/bpf: Add tests for ring__map_fd  (Martin Kelly, 1 file, +4/-0)
Add tests for the new API ring__map_fd. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-13-martin.kelly@crowdstrike.com
2023-09-26  libbpf: Add ring__map_fd  (Martin Kelly, 3 files, +15/-0)
Add ring__map_fd to get the file descriptor underlying a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-12-martin.kelly@crowdstrike.com
2023-09-26  selftests/bpf: Add tests for ring__size  (Martin Kelly, 1 file, +3/-1)
Add tests for the new API ring__size. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-11-martin.kelly@crowdstrike.com
2023-09-26  libbpf: Add ring__size  (Martin Kelly, 3 files, +16/-0)
Add ring__size to get the total size of a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-10-martin.kelly@crowdstrike.com
2023-09-26  selftests/bpf: Add tests for ring__avail_data_size  (Martin Kelly, 1 file, +3/-1)
Add test for the new API ring__avail_data_size. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-9-martin.kelly@crowdstrike.com
2023-09-26  libbpf: Add ring__avail_data_size  (Martin Kelly, 3 files, +21/-0)
Add ring__avail_data_size for querying the currently available data in the ringbuffer, similar to the BPF_RB_AVAIL_DATA flag in bpf_ringbuf_query. This is racy during ongoing operations but is still useful for overall information on how a ringbuffer is behaving. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-8-martin.kelly@crowdstrike.com
2023-09-26  selftests/bpf: Add tests for ring__*_pos  (Martin Kelly, 1 file, +14/-0)
Add tests for the new APIs ring__producer_pos and ring__consumer_pos. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-7-martin.kelly@crowdstrike.com
2023-09-26  libbpf: Add ring__producer_pos, ring__consumer_pos  (Martin Kelly, 3 files, +34/-0)
Add APIs to get the producer and consumer position for a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-6-martin.kelly@crowdstrike.com
2023-09-26  selftests/bpf: Add tests for ring_buffer__ring  (Martin Kelly, 1 file, +15/-0)
Add tests for the new API ring_buffer__ring. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-5-martin.kelly@crowdstrike.com
2023-09-26  libbpf: Add ring_buffer__ring  (Martin Kelly, 3 files, +24/-0)
Add a new function ring_buffer__ring, which exposes struct ring * to the user, representing a single ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-4-martin.kelly@crowdstrike.com
2023-09-26  libbpf: Switch rings to array of pointers  (Martin Kelly, 1 file, +12/-8)
Switch rb->rings to be an array of pointers instead of a contiguous block. This allows for each ring pointer to be stable after ring_buffer__add is called, which allows us to expose struct ring * to the user without gotchas. Without this change, the realloc in ring_buffer__add could invalidate a struct ring *, making it unsafe to give to the user. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-3-martin.kelly@crowdstrike.com
2023-09-26  libbpf: Refactor cleanup in ring_buffer__add  (Martin Kelly, 1 file, +9/-6)
Refactor the cleanup code in ring_buffer__add to use a unified err_out label. This reduces code duplication, as well as plugging a potential leak if mmap_sz != (__u64)(size_t)mmap_sz (currently this would miss unmapping tmp because ringbuf_unmap_ring isn't called). Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-2-martin.kelly@crowdstrike.com
2023-09-23  Merge branch 'libbpf: Support symbol versioning for uprobe'  (Andrii Nakryiko, 8 files, +337/-17)
Hengqi Chen says:

====================
Dynamic symbols in a shared library may have the same name, for example:

  $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock
  000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5
  000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34
  000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5

  $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock
   706: 000000000009b1a0  878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5
  2568: 000000000009b1a0  878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34
  2571: 000000000009b1a0  878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5

There are two pthread_rwlock_wrlock symbols in libc.so's .dynsym section. The one with @@ is the default version, the other is hidden. Note that the version info is stored in the .gnu.version and .gnu.version_d sections of libc and the two symbols are at the _same_ offset.

Currently, specifying `pthread_rwlock_wrlock`, `pthread_rwlock_wrlock@@GLIBC_2.34` or `pthread_rwlock_wrlock@GLIBC_2.2.5` in bpf_uprobe_opts::func_name won't work, because there are two `pthread_rwlock_wrlock` entries in the .dynsym section without the version suffix and both are global bind.

We could solve this by introducing symbol versioning ([0]), so that users can specify func, func@LIB_VERSION or func@@LIB_VERSION to attach a uprobe.

This patchset resolves symbol conflicts and adds symbol versioning for uprobe:
  - Patch 1 resolves symbol conflicts at the same offset
  - Patch 2 adds symbol versioning for dynsym
  - Patch 3 adds selftests for the above changes

Changes from v3:
  - Address comments from Andrii

Changes from v2:
  - Add uretprobe selftest (Alan)
  - Check symbol exact match (Alan)
  - Fix typo (Jiri)

Changes from v1:
  - Address comments from Alan and Jiri
  - Add selftests (Someone reminds me that there is an attempt at [1] and part of the selftest code from Andrii is taken from there)

[0]: https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html
[1]: https://lore.kernel.org/lkml/CAEf4BzZTrjjyyOm3ak9JsssPSh6T_ZmGd677a2rt5e5rBLUrpQ@mail.gmail.com/
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-09-23  selftests/bpf: Add tests for symbol versioning for uprobe  (Hengqi Chen, 6 files, +209/-4)
This exercises the newly added dynsym symbol versioning logic. Now we accept symbols in the form of func, func@LIB_VERSION or func@@LIB_VERSION. The test relies on liburandom_read.so. For liburandom_read.so, we have:

  $ nm -D liburandom_read.so
                   w __cxa_finalize@GLIBC_2.17
                   w __gmon_start__
                   w _ITM_deregisterTMCloneTable
                   w _ITM_registerTMCloneTable
  0000000000000000 A LIBURANDOM_READ_1.0.0
  0000000000000000 A LIBURANDOM_READ_2.0.0
  000000000000081c T urandlib_api@@LIBURANDOM_READ_2.0.0
  0000000000000814 T urandlib_api@LIBURANDOM_READ_1.0.0
  0000000000000824 T urandlib_api_sameoffset@LIBURANDOM_READ_1.0.0
  0000000000000824 T urandlib_api_sameoffset@@LIBURANDOM_READ_2.0.0
  000000000000082c T urandlib_read_without_sema@@LIBURANDOM_READ_1.0.0
  00000000000007c4 T urandlib_read_with_sema@@LIBURANDOM_READ_1.0.0
  0000000000011018 D urandlib_read_with_sema_semaphore@@LIBURANDOM_READ_1.0.0

For `urandlib_api`, specifying `urandlib_api` will cause a conflict because there are two symbols named urandlib_api and both are global bind. For `urandlib_api_sameoffset`, there are also two symbols in the .so, but both are at the same offset and essentially refer to the same function, so there is no conflict. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230918024813.237475-4-hengqi.chen@gmail.com
2023-09-23  libbpf: Support symbol versioning for uprobe  (Hengqi Chen, 2 files, +124/-12)
In the current implementation, we assume that a symbol found in the .dynsym section has a version suffix and use it to compare with the symbol the user supplied. According to the spec ([0]), this assumption is incorrect: the version info of dynamic symbols is stored in the .gnu.version and .gnu.version_d sections of ELF objects. For example:

  $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock
  000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5
  000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34
  000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5

  $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock
   706: 000000000009b1a0  878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5
  2568: 000000000009b1a0  878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34
  2571: 000000000009b1a0  878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5

In this case, specifying pthread_rwlock_wrlock@@GLIBC_2.34 or pthread_rwlock_wrlock@GLIBC_2.2.5 in bpf_uprobe_opts::func_name won't work, because the qualified name does NOT match `pthread_rwlock_wrlock` (without version suffix) in the .dynsym section.

This commit implements symbol versioning for dynsym and allows users to specify symbols in the following forms:
  - func
  - func@LIB_VERSION
  - func@@LIB_VERSION

In case of symbol conflicts, error out; users should resolve it by specifying a qualified name.

[0]: https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230918024813.237475-3-hengqi.chen@gmail.com
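With this in place, a caller can disambiguate by using the version suffix; a minimal attach sketch follows (the libc path and symbol are examples, and 'prog' is assumed to be a loaded BPF program from a hypothetical skeleton):

  #include <bpf/libbpf.h>

  /* Sketch: attach a uprobe to a specific symbol version. */
  static struct bpf_link *attach_versioned(struct bpf_program *prog)
  {
          LIBBPF_OPTS(bpf_uprobe_opts, opts,
                      .func_name = "pthread_rwlock_wrlock@@GLIBC_2.34",
                      .retprobe = false);

          /* pid -1 == all processes; offset 0 because func_name resolves it */
          return bpf_program__attach_uprobe_opts(prog, -1,
                                                 "/lib/x86_64-linux-gnu/libc.so.6",
                                                 0, &opts);
  }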
2023-09-23  libbpf: Resolve symbol conflicts at the same offset for uprobe  (Hengqi Chen, 1 file, +4/-1)
Dynamic symbols in a shared library may have the same name, for example:

  $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock
  000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5
  000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34
  000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5

  $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 | grep rwlock_wrlock
   706: 000000000009b1a0  878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5
  2568: 000000000009b1a0  878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34
  2571: 000000000009b1a0  878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5

Currently, users can't attach a uprobe to pthread_rwlock_wrlock because there are two symbols named pthread_rwlock_wrlock, both are global bind, and libbpf considers that a conflict. Since both of them are at the same offset, we can accept one of them harmlessly. Note that we already do this in elf_resolve_syms_offsets. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230918024813.237475-2-hengqi.chen@gmail.com
2023-09-22  bpf, docs: Add loongarch64 as arch supporting BPF JIT  (Tiezhu Yang, 2 files, +3/-2)
As BPF JIT support for loongarch64 was added about one year ago with commit 5dc615520c4d ("LoongArch: Add BPF JIT support"), it is appropriate to add loongarch64 as arch supporting BPF JIT in bpf and sysctl docs as well. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/1695111937-19697-1-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-09-22  samples/bpf: syscall_tp_user: Fix array out-of-bound access  (Jinghao Jia, 1 file, +20/-3)
Commit 06744f24696e ("samples/bpf: Add openat2() enter/exit tracepoint to syscall_tp sample") added two more eBPF programs to support the openat2() syscall. However, it did not increase the size of the array that holds the corresponding bpf_links. This leads to an out-of-bounds access on that array in the bpf_object__for_each_program loop and could corrupt other variables on the stack. On our testing QEMU, it corrupts the map1_fds array and causes the sample to fail:

  # ./syscall_tp
  prog #0: map ids 4 5
  verify map:4 val: 5
  map_lookup failed: Bad file descriptor

Dynamically allocate the array based on the number of programs reported by libbpf to prevent similar inconsistencies in the future. Fixes: 06744f24696e ("samples/bpf: Add openat2() enter/exit tracepoint to syscall_tp sample") Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com> Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu> Signed-off-by: Jinghao Jia <jinghao7@illinois.edu> Link: https://lore.kernel.org/r/20230917214220.637721-4-jinghao7@illinois.edu Signed-off-by: Alexei Starovoitov <ast@kernel.org>
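A hedged sketch of the allocation pattern described above (the names are illustrative and not necessarily those used in the sample):

  #include <stdlib.h>
  #include <bpf/libbpf.h>

  /* Sketch: size the links array from what libbpf reports instead of
   * hard-coding it, so adding programs can't overflow it again.
   */
  static struct bpf_link **alloc_links(struct bpf_object *obj, int *cnt)
  {
          struct bpf_program *prog;
          int n = 0;

          bpf_object__for_each_program(prog, obj)
                  n++;                            /* count programs in the object */

          *cnt = n;
          return calloc(n, sizeof(struct bpf_link *));    /* one slot per program */
  }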
2023-09-22  samples/bpf: syscall_tp_user: Rename num_progs into nr_tests  (Jinghao Jia, 1 file, +12/-12)
The variable name num_progs causes confusion because that variable really controls the number of rounds for which the test should be executed. Rename num_progs into nr_tests for the sake of clarity. Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com> Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu> Signed-off-by: Jinghao Jia <jinghao7@illinois.edu> Link: https://lore.kernel.org/r/20230917214220.637721-3-jinghao7@illinois.edu Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22  Merge branch 'implement-cpuv4-support-for-s390x'  (Alexei Starovoitov, 10 files, +331/-164)
Ilya Leoshkevich says:

====================
Implement cpuv4 support for s390x

v1: https://lore.kernel.org/bpf/20230830011128.1415752-1-iii@linux.ibm.com/

v1 -> v2:
  - Redo "Disable zero-extension for BPF_MEMSX" as Puranjay and Alexei suggested.
  - Drop the bpf_ct_insert_entry() patch, it went in via the bpf tree.
  - Rebase, don't apply A-bs because there were fixed conflicts.

Hi,

This series adds the cpuv4 support to the s390x eBPF JIT. Patches 1-3 are preliminary bugfixes. Patches 4-8 implement the new instructions. Patches 9-10 enable the tests.

Best regards,
Ilya
====================

Link: https://lore.kernel.org/r/20230919101336.2223655-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22  selftests/bpf: Trim DENYLIST.s390x  (Ilya Leoshkevich, 1 file, +0/-25)
Enable all selftests, except the 2 that have to do with the userspace unwinding, and the new exceptions test, in the s390x CI. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-11-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22  selftests/bpf: Enable the cpuv4 tests for s390x  (Ilya Leoshkevich, 6 files, +12/-6)
Now that all the cpuv4 support is in place, enable the tests. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-10-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22  s390/bpf: Implement signed division  (Ilya Leoshkevich, 1 file, +125/-47)
Implement the cpuv4 signed division. It is encoded as unsigned division, but with off field set to 1. s390x has the necessary instructions: dsgfr, dsgf and dsgr. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-9-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22  s390/bpf: Implement unconditional jump with 32-bit offset  (Ilya Leoshkevich, 1 file, +9/-3)
Implement the cpuv4 unconditional jump with 32-bit offset, which is encoded as BPF_JMP32 | BPF_JA and stores the offset in the imm field. Reuse the existing BPF_JMP | BPF_JA logic. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-8-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22  s390/bpf: Implement unconditional byte swap  (Ilya Leoshkevich, 1 file, +1/-0)
Implement the cpuv4 unconditional byte swap, which is encoded as BPF_ALU64 | BPF_END | BPF_FROM_LE. Since s390x is big-endian, it's the same as the existing BPF_ALU | BPF_END | BPF_FROM_LE. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-7-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22  s390/bpf: Implement BPF_MEMSX  (Ilya Leoshkevich, 1 file, +27/-5)
Implement the cpuv4 load with sign-extension, which is encoded as BPF_MEMSX (and, for internal use cases only, BPF_PROBE_MEMSX). This is the same as BPF_MEM and BPF_PROBE_MEM, but with sign extension instead of zero extension, and s390x has the necessary instructions: lgb, lgh and lgf. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-6-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22  s390/bpf: Implement BPF_MOV | BPF_X with sign-extension  (Ilya Leoshkevich, 1 file, +40/-8)
Implement the cpuv4 register-to-register move with sign extension. It is distinguished from the normal moves by non-zero values in insn->off, which determine the source size. s390x has instructions to deal with all of them: lbr, lhr, lgbr, lghr and lgfr. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-5-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22  selftests/bpf: Add big-endian support to the ldsx test  (Ilya Leoshkevich, 2 files, +90/-62)
Prepare the ldsx test to run on big-endian systems by adding the necessary endianness checks around narrow memory accesses. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-4-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
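The general shape of such an endianness check, as an illustration only (the actual test code differs):

  #include <stdint.h>

  /* Illustrative only: the low-order byte of a wider value sits at a
   * different offset depending on endianness.
   */
  static int8_t low_byte(uint64_t val)
  {
  #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
          return *(int8_t *)&val;                         /* first byte on LE */
  #else
          return *((int8_t *)&val + sizeof(val) - 1);     /* last byte on BE */
  #endif
  }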
2023-09-22  selftests/bpf: Unmount the cgroup2 work directory  (Ilya Leoshkevich, 1 file, +26/-7)
test_progs -t bind_perm,bpf_obj_pinning/mounted-str-rel fails when the selftests directory is mounted under /mnt, which is a reasonable thing to do when sharing the selftests residing on the host with a virtual machine, e.g., using 9p. The reason is that cgroup2 is mounted at /mnt and not unmounted, causing subsequent tests that need to access the selftests directory to fail. Fix by unmounting it. The kernel maintains a mount stack, so this reveals what was mounted there before. Introduce cgroup_workdir_mounted in order to maintain idempotency. Make it thread-local in order to support test_progs -j. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
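A hedged sketch of the mount/unmount bookkeeping described here (the path macro and helper names are illustrative; the selftests' actual cgroup_helpers.c may differ):

  #include <stdbool.h>
  #include <sys/mount.h>

  #define CGROUP2_MOUNT_PATH "/mnt"       /* illustrative; matches the path mentioned above */

  /* thread-local so that test_progs -j workers don't clobber each other */
  static __thread bool cgroup_workdir_mounted;

  static int mount_cgroup2(void)
  {
          if (mount("none", CGROUP2_MOUNT_PATH, "cgroup2", 0, NULL))
                  return -1;
          cgroup_workdir_mounted = true;
          return 0;
  }

  static void unmount_cgroup2(void)
  {
          if (!cgroup_workdir_mounted)
                  return;
          /* the kernel keeps a mount stack: whatever was mounted at this
           * path before becomes visible again after the umount
           */
          umount(CGROUP2_MOUNT_PATH);
          cgroup_workdir_mounted = false;
  }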
2023-09-22  bpf: Disable zero-extension for BPF_MEMSX  (Ilya Leoshkevich, 1 file, +1/-1)
On the architectures that use bpf_jit_needs_zext(), e.g., s390x, the verifier incorrectly inserts a zero-extension after BPF_MEMSX, leading to miscompilations like the one below:

  24:       89 1a ff fe 00 00 00 00   "r1 = *(s16 *)(r10 - 2);"       # zext_dst set
  0x3ff7fdb910e:   lgh    %r2,-2(%r13,%r0)                            # load halfword
  0x3ff7fdb9114:   llgfr  %r2,%r2                                     # wrong!
  25:       65 10 00 03 00 00 7f ff   if r1 s> 32767 goto +3 <l0_1>   # check_cond_jmp_op()

Disable such zero-extensions. The JITs need to insert sign-extension themselves, if necessary. Suggested-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Puranjay Mohan <puranjay12@gmail.com> Link: https://lore.kernel.org/r/20230919101336.2223655-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-21  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Paolo Abeni, 396 files, +3555/-2020)
Cross-merge networking fixes after downstream PR. No conflicts. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-21  Merge tag 'net-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Linus Torvalds, 93 files, +1489/-540)
Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter and bpf.

  Current release - regressions:
   - bpf: adjust size_index according to the value of KMALLOC_MIN_SIZE
   - netfilter: fix entries val in rule reset audit log
   - eth: stmmac: fix incorrect rxq|txq_stats reference

  Previous releases - regressions:
   - ipv4: fix null-deref in ipv4_link_failure
   - netfilter:
      - fix several GC related issues
      - fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
   - eth: team: fix null-ptr-deref when team device type is changed
   - eth: i40e: fix VF VLAN offloading when port VLAN is configured
   - eth: ionic: fix 16bit math issue when PAGE_SIZE >= 64KB

  Previous releases - always broken:
   - core: fix ETH_P_1588 flow dissector
   - mptcp: fix several connection hang-up conditions
   - bpf:
      - avoid deadlock when using queue and stack maps from NMI
      - add override check to kprobe multi link attach
   - hsr: properly parse HSRv1 supervisor frames.
   - eth: igc: fix infinite initialization loop with early XDP redirect
   - eth: octeon_ep: fix tx dma unmap len values in SG
   - eth: hns3: fix GRE checksum offload issue"

* tag 'net-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
  sfc: handle error pointers returned by rhashtable_lookup_get_insert_fast()
  igc: Expose tx-usecs coalesce setting to user
  octeontx2-pf: Do xdp_do_flush() after redirects.
  bnxt_en: Flush XDP for bnxt_poll_nitroa0()'s NAPI
  net: ena: Flush XDP packets on error.
  net/handshake: Fix memory leak in __sock_create() and sock_alloc_file()
  net: hinic: Fix warning-hinic_set_vlan_fliter() warn: variable dereferenced before check 'hwdev'
  netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
  netfilter: nf_tables: fix memleak when more than 255 elements expired
  netfilter: nf_tables: disable toggling dormant table state more than once
  vxlan: Add missing entries to vxlan_get_size()
  net: rds: Fix possible NULL-pointer dereference
  team: fix null-ptr-deref when team device type is changed
  net: bridge: use DEV_STATS_INC()
  net: hns3: add 5ms delay before clear firmware reset irq source
  net: hns3: fix fail to delete tc flower rules during reset issue
  net: hns3: only enable unicast promisc when mac table full
  net: hns3: fix GRE checksum offload issue
  net: hns3: add cmdq check for vf periodic service task
  net: stmmac: fix incorrect rxq|txq_stats reference
  ...
2023-09-21  Merge tag 'v6.6-rc3.vfs.ctime.revert' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs  (Linus Torvalds, 10 files, +38/-178)
Pull finegrained timestamp reverts from Christian Brauner:
 "Earlier this week we sent a few minor fixes for the multi-grained timestamp work in [1]. While we were polishing those up after Linus realized that there might be a nicer way to fix them we received a regression report in [2] that fine grained timestamps break gnulib tests and thus possibly other tools.

  The kernel will elide fine-grain timestamp updates when no one is actively querying for them to avoid performance impacts. So a sequence like:

    write(f1)
    stat(f2)
    write(f2)
    stat(f2)
    write(f1)
    stat(f1)

  may result in timestamp f1 being older than the final f2 timestamp even though f1 was last written to, but the second write didn't update the timestamp.

  Such plotholes can lead to subtle bugs when programs compare timestamps. For example, the nap() function in [2] will estimate that it needs to wait one ns on a fine-grain timestamp enabled filesystem between subsequent calls to observe a timestamp change. But in general we don't update timestamps with more than one jiffie if we think that no one is actively querying for fine-grain timestamps to avoid performance impacts.

  While discussing various fixes the decision was to go back to the drawing board and ultimately to explore a solution that involves only exposing such fine-grained timestamps to nfs internally and never to userspace.

  As there are multiple solutions discussed the honest thing to do here is not to fix this up or disable it but to cleanly revert. The general infrastructure will probably come back but there is no reason to keep this code in mainline.

  The general changes to timestamp handling are valid and a good cleanup that will stay. The revert is fully bisectable"

Link: https://lore.kernel.org/all/20230918-hirte-neuzugang-4c2324e7bae3@brauner [1]
Link: https://lore.kernel.org/all/bf0524debb976627693e12ad23690094e4514303.camel@linuxfromscratch.org [2]

* tag 'v6.6-rc3.vfs.ctime.revert' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  Revert "fs: add infrastructure for multigrain timestamps"
  Revert "btrfs: convert to multigrain timestamps"
  Revert "ext4: switch to multigrain timestamps"
  Revert "xfs: switch to multigrain timestamps"
  Revert "tmpfs: add support for multigrain timestamps"
2023-09-21  Merge tag 'powerpc-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds, 6 files, +60/-26)
Pull powerpc fixes from Michael Ellerman:
 - A fix for breakpoint handling which was using get_user() while atomic
 - Fix the Power10 HASHCHK handler which was using get_user() while atomic
 - A few build fixes for issues caused by recent changes

Thanks to Benjamin Gray, Christophe Leroy, Kajol Jain, and Naveen N Rao.

* tag 'powerpc-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/dexcr: Move HASHCHK trap handler
  powerpc/82xx: Select FSL_SOC
  powerpc: Fix build issue with LD_DEAD_CODE_DATA_ELIMINATION and FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
  powerpc/watchpoints: Annotate atomic context in more places
  powerpc/watchpoint: Disable pagefaults when getting user instruction
  powerpc/watchpoints: Disable preemption in thread_change_pc()
  powerpc/perf/hv-24x7: Update domain value check
2023-09-21  Merge tag 'for-linus-6.6a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip  (Linus Torvalds, 16 files, +123/-155)
Pull xen fixes from Juergen Gross:
 - remove some unused functions in the Xen event channel handling
 - fix a regression (introduced during the merge window) when booting as Xen PV guest
 - small cleanup removing another strncpy() instance

* tag 'for-linus-6.6a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/efi: refactor deprecated strncpy
  x86/xen: allow nesting of same lazy mode
  x86/xen: move paravirt lazy code
  arm/xen: remove lazy mode related definitions
  xen: simplify evtchn_do_upcall() call maze
2023-09-21  Merge tag 'fixes-2023-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock  (Linus Torvalds, 6 files, +10/-5)
Pull memblock test fixes from Mike Rapoport:
 "Fix several compilation errors and warnings in memblock tests"

* tag 'fixes-2023-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock tests: fix warning ‘struct seq_file’ declared inside parameter list
  memblock tests: fix warning: "__ALIGN_KERNEL" redefined
  memblock tests: Fix compilation errors.