summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-04-05samples: bpf: Fix linking xdp_router_ipv4 after migrationAlexander Lobakin1-0/+1
Users of the xdp_sample_user infra should be explicitly linked with the standard math library (`-lm`). Otherwise, the following happens: /usr/bin/ld: xdp_sample_user.c:(.text+0x59fc): undefined reference to `ceil' /usr/bin/ld: xdp_sample_user.c:(.text+0x5a0d): undefined reference to `ceil' /usr/bin/ld: xdp_sample_user.c:(.text+0x5adc): undefined reference to `floor' /usr/bin/ld: xdp_sample_user.c:(.text+0x5b01): undefined reference to `ceil' /usr/bin/ld: xdp_sample_user.c:(.text+0x5c1e): undefined reference to `floor' /usr/bin/ld: xdp_sample_user.c:(.text+0x5c43): undefined reference to `ceil [...] That happened previously, so there's a block of linkage flags in the Makefile. xdp_router_ipv4 has been transferred to this infra quite recently, but hasn't been added to it. Fix. Fixes: 85bf1f51691c ("samples: bpf: Convert xdp_router_ipv4 to XDP samples helper") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220404115451.1116478-1-alexandr.lobakin@intel.com
2022-04-05sample: bpf: syscall_tp_user: Print result of verify_mapSong Chen1-0/+3
At the end of the test, we already print out prog <prog number>: map ids <...> <...> Value is the number read from kernel through bpf map, further print out verify map:<map id> val:<...> will help users to understand the program runs successfully. Signed-off-by: Song Chen <chensong_2000@189.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1648889828-12417-1-git-send-email-chensong_2000@189.cn
2022-04-04libbpf: Don't return -EINVAL if hdr_len < offsetofend(core_relo_len)Yuntao Wang1-4/+2
Since core relos is an optional part of the .BTF.ext ELF section, we should skip parsing it instead of returning -EINVAL if header size is less than offsetofend(struct btf_ext_header, core_relo_len). Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220404005320.1723055-1-ytcoode@gmail.com
2022-04-04Merge branch 'libbpf: name-based u[ret]probe attach'Andrii Nakryiko9-25/+541
Alan Maguire says: ==================== This patch series focuses on supporting name-based attach - similar to that supported for kprobes - for uprobe BPF programs. Currently attach for such probes is done by determining the offset manually, so the aim is to try and mimic the simplicity of kprobe attach, making use of uprobe opts to specify a name string. Patch 1 supports expansion of the binary_path argument used for bpf_program__attach_uprobe_opts(), allowing it to determine paths for programs and shared objects automatically, allowing for specification of "libc.so.6" rather than the full path "/usr/lib64/libc.so.6". Patch 2 adds the "func_name" option to allow uprobe attach by name; the mechanics are described there. Having name-based support allows us to support auto-attach for uprobes; patch 3 adds auto-attach support while attempting to handle backwards-compatibility issues that arise. The format supported is u[ret]probe/binary_path:[raw_offset|function[+offset]] For example, to attach to libc malloc: SEC("uprobe//usr/lib64/libc.so.6:malloc") ..or, making use of the path computation mechanisms introduced in patch 1 SEC("uprobe/libc.so.6:malloc") Finally patch 4 add tests to the attach_probe selftests covering attach by name, with patch 5 covering skeleton auto-attach. Changes since v4 [1]: - replaced strtok_r() usage with copying segments from static char *; avoids unneeded string allocation (Andrii, patch 1) - switched to using access() instead of stat() when checking path-resolved binary (Andrii, patch 1) - removed computation of .plt offset for instrumenting shared library calls within binaries. Firstly it proved too brittle, and secondly it was somewhat unintuitive in that this form of instrumentation did not support function+offset as the "local function in binary" and "shared library function in shared library" cases did. We can still instrument library calls, just need to do it in the library .so (patch 2) - added binary path logging in cases where it was missing (Andrii, patch 2) - avoid strlen() calcuation in checking name match (Andrii, patch 2) - reword comments for func_name option (Andrii, patch 2) - tightened SEC() name validation to support "u[ret]probe" and fail on other permutations that do not support auto-attach (i.e. have u[ret]probe/binary_path:func format (Andrii, patch 3) - fixed selftests to fail independently rather than skip remainder on failure (Andrii, patches 4,5) Changes since v3 [2]: - reworked variable naming to fit better with libbpf conventions (Andrii, patch 2) - use quoted binary path in log messages (Andrii, patch 2) - added path determination mechanisms using LD_LIBRARY_PATH/PATH and standard locations (patch 1, Andrii) - changed section lookup to be type+name (if name is specified) to simplify use cases (patch 2, Andrii) - fixed .plt lookup scheme to match symbol table entries with .plt index via the .rela.plt table; also fix the incorrect assumption that the code in the .plt that does library linking is the same size as .plt entries (it just happens to be on x86_64) - aligned with pluggable section support such that uprobe SEC() names that do not conform to auto-attach format do not cause skeleton load failure (patch 3, Andrii) - no longer need to look up absolute path to libraries used by test_progs since we have mechanism to determine path automatically - replaced CHECK()s with ASSERT*()s for attach_probe test (Andrii, patch 4) - added auto-attach selftests also (Andrii, patch 5) Changes since RFC [3]: - used "long" for addresses instead of ssize_t (Andrii, patch 1). - used gelf_ interfaces to avoid assumptions about 64-bit binaries (Andrii, patch 1) - clarified string matching in symbol table lookups (Andrii, patch 1) - added support for specification of shared object functions in a non-shared object binary. This approach instruments the Procedure Linking Table (PLT) - malloc@PLT. - changed logic in symbol search to check dynamic symbol table first, then fall back to symbol table (Andrii, patch 1). - modified auto-attach string to require "/" separator prior to path prefix i.e. uprobe//path/to/binary (Andrii, patch 2) - modified auto-attach string to use ':' separator (Andrii, patch 2) - modified auto-attach to support raw offset (Andrii, patch 2) - modified skeleton attach to interpret -ESRCH errors as a non-fatal "unable to auto-attach" (Andrii suggested -EOPNOTSUPP but my concern was it might collide with other instances where that value is returned and reflects a failure to attach a to-be-expected attachment rather than skip a program that does not present an auto-attachable section name. Admittedly -EOPNOTSUPP seems a more natural value here). - moved library path retrieval code to trace_helpers (Andrii, patch 3) [1] https://lore.kernel.org/bpf/1647000658-16149-1-git-send-email-alan.maguire@oracle.com/ [2] https://lore.kernel.org/bpf/1643645554-28723-1-git-send-email-alan.maguire@oracle.com/ [3] https://lore.kernel.org/bpf/1642678950-19584-1-git-send-email-alan.maguire@oracle.com/ ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-04-04selftests/bpf: Add tests for uprobe auto-attach via skeletonAlan Maguire2-0/+90
tests that verify auto-attach works for function entry/return for local functions in program and library functions in a library. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1648654000-21758-6-git-send-email-alan.maguire@oracle.com
2022-04-04selftests/bpf: Add tests for u[ret]probe attach by nameAlan Maguire2-17/+109
add tests that verify attaching by name for 1. local functions in a program 2. library functions in a shared object ...succeed for uprobe and uretprobes using new "func_name" option for bpf_program__attach_uprobe_opts(). Also verify auto-attach works where uprobe, path to binary and function name are specified, but fails with -EOPNOTSUPP with a SEC name that does not specify binary path/function. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1648654000-21758-5-git-send-email-alan.maguire@oracle.com
2022-04-04libbpf: Add auto-attach for uprobes based on section nameAlan Maguire4-6/+76
Now that u[ret]probes can use name-based specification, it makes sense to add support for auto-attach based on SEC() definition. The format proposed is SEC("u[ret]probe/binary:[raw_offset|[function_name[+offset]]") For example, to trace malloc() in libc: SEC("uprobe/libc.so.6:malloc") ...or to trace function foo2 in /usr/bin/foo: SEC("uprobe//usr/bin/foo:foo2") Auto-attach is done for all tasks (pid -1). prog can be an absolute path or simply a program/library name; in the latter case, we use PATH/LD_LIBRARY_PATH to resolve the full path, falling back to standard locations (/usr/bin:/usr/sbin or /usr/lib64:/usr/lib) if the file is not found via environment-variable specified locations. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1648654000-21758-4-git-send-email-alan.maguire@oracle.com
2022-04-04libbpf: Support function name-based attach uprobesAlan Maguire2-1/+213
kprobe attach is name-based, using lookups of kallsyms to translate a function name to an address. Currently uprobe attach is done via an offset value as described in [1]. Extend uprobe opts for attach to include a function name which can then be converted into a uprobe-friendly offset. The calcualation is done in several steps: 1. First, determine the symbol address using libelf; this gives us the offset as reported by objdump 2. If the function is a shared library function - and the binary provided is a shared library - no further work is required; the address found is the required address 3. Finally, if the function is local, subtract the base address associated with the object, retrieved from ELF program headers. The resultant value is then added to the func_offset value passed in to specify the uprobe attach address. So specifying a func_offset of 0 along with a function name "printf" will attach to printf entry. The modes of operation supported are then 1. to attach to a local function in a binary; function "foo1" in "/usr/bin/foo" 2. to attach to a shared library function in a shared library - function "malloc" in libc. [1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1648654000-21758-3-git-send-email-alan.maguire@oracle.com
2022-04-04libbpf: auto-resolve programs/libraries when necessary for uprobesAlan Maguire1-1/+53
bpf_program__attach_uprobe_opts() requires a binary_path argument specifying binary to instrument. Supporting simply specifying "libc.so.6" or "foo" should be possible too. Library search checks LD_LIBRARY_PATH, then /usr/lib64, /usr/lib. This allows users to run BPF programs prefixed with LD_LIBRARY_PATH=/path2/lib while still searching standard locations. Similarly for non .so files, we check PATH and /usr/bin, /usr/sbin. Path determination will be useful for auto-attach of BPF uprobe programs using SEC() definition. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1648654000-21758-2-git-send-email-alan.maguire@oracle.com
2022-04-04samples: bpf: Convert xdp_router_ipv4 to XDP samples helperLorenzo Bianconi4-459/+381
Rely on the libbpf skeleton facility and other utilities provided by XDP sample helpers in xdp_router_ipv4 sample. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/7f4d98ee2c13c04d5eb924eebf79ced32fee8418.1647414711.git.lorenzo@kernel.org
2022-04-04bpf: Correct the comment for BTF kind bitfieldHaiyue Wang2-4/+4
The commit 8fd886911a6a ("bpf: Add BTF_KIND_FLOAT to uapi") has extended the BTF kind bitfield from 4 to 5 bits, correct the comment. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220403115327.205964-1-haiyue.wang@intel.com
2022-04-04selftests/bpf: Fix cd_flavor_subdir() of test_progsYuntao Wang1-2/+4
Currently, when we run test_progs with just executable file name, for example 'PATH=. test_progs-no_alu32', cd_flavor_subdir() will not check if test_progs is running as a flavored test runner and switch into corresponding sub-directory. This will cause test_progs-no_alu32 executed by the 'PATH=. test_progs-no_alu32' command to run in the wrong directory and load the wrong BPF objects. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220403135245.1713283-1-ytcoode@gmail.com
2022-04-04selftests/bpf: Return true/false (not 1/0) from bool functionsHaowen Bai2-7/+7
Return boolean values ("true" or "false") instead of 1 or 0 from bool functions. This fixes the following warnings from coccicheck: ./tools/testing/selftests/bpf/progs/test_xdp_noinline.c:567:9-10: WARNING: return of 0/1 in function 'get_packet_dst' with return type bool ./tools/testing/selftests/bpf/progs/test_l4lb_noinline.c:221:9-10: WARNING: return of 0/1 in function 'get_packet_dst' with return type bool Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/1648779354-14700-1-git-send-email-baihaowen@meizu.com
2022-04-04selftests/bpf: Fix vfs_link kprobe definitionNikolay Borisov1-2/+3
Since commit 6521f8917082 ("namei: prepare for idmapped mounts") vfs_link's prototype was changed, the kprobe definition in profiler selftest in turn wasn't updated. The result is that all argument after the first are now stored in different registers. This means that self-test has been broken ever since. Fix it by updating the kprobe definition accordingly. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220331140949.1410056-1-nborisov@suse.com
2022-04-04bpf: Replace usage of supported with dedicated list iterator variableJakob Koschel1-16/+14
To move the list iterator variable into the list_for_each_entry_*() macro in the future it should be avoided to use the list iterator variable after the loop body. To *never* use the list iterator variable after the loop it was concluded to use a separate iterator variable instead of a found boolean [1]. This removes the need to use the found variable (existed & supported) and simply checking if the variable was set, can determine if the break/goto was hit. [1] https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220331091929.647057-1-jakobkoschel@gmail.com
2022-04-02bpf, test_offload.py: Skip base maps without namesYauheni Kaliuta1-1/+1
The test fails: # ./test_offload.py [...] Test bpftool bound info reporting (own ns)... FAIL: 3 BPF maps loaded, expected 2 File "/root/bpf-next/tools/testing/selftests/bpf/./test_offload.py", line 1177, in <module> check_dev_info(False, "") File "/root/bpf-next/tools/testing/selftests/bpf/./test_offload.py", line 645, in check_dev_info maps = bpftool_map_list(expected=2, ns=ns) File "/root/bpf-next/tools/testing/selftests/bpf/./test_offload.py", line 190, in bpftool_map_list fail(True, "%d BPF maps loaded, expected %d" % File "/root/bpf-next/tools/testing/selftests/bpf/./test_offload.py", line 86, in fail tb = "".join(traceback.extract_stack().format()) Some base maps do not have names and they cannot be added due to compatibility with older kernels, see [0]. So, just skip the unnamed maps. [0] https://lore.kernel.org/bpf/CAEf4BzY66WPKQbDe74AKZ6nFtZjq5e+G3Ji2egcVytB9R6_sGQ@mail.gmail.com/ Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220329081100.9705-1-ykaliuta@redhat.com
2022-04-01bpf: Remove redundant assignment to smap->map.value_sizeYuntao Wang1-1/+0
The attr->value_size is already assigned to smap->map.value_size in bpf_map_init_from_attr(), there is no need to do it again in stack_map_alloc(). Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/bpf/20220323073626.958652-1-ytcoode@gmail.com
2022-04-01selftests/bpf: Remove unused variable from bpf_sk_assign testEyal Birger1-3/+1
Was never used in bpf_sk_assign_test(), and was removed from handle_{tcp,udp}() in commit 0b9ad56b1ea6 ("selftests/bpf: Use SOCKMAP for server sockets in bpf_sk_assign test"). Fixes: 0b9ad56b1ea6 ("selftests/bpf: Use SOCKMAP for server sockets in bpf_sk_assign test") Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220329154914.3718658-1-eyal.birger@gmail.com
2022-04-01bpf: Use swap() instead of open coding itJiapeng Chong1-4/+2
Clean the following coccicheck warning: ./kernel/trace/bpf_trace.c:2263:34-35: WARNING opportunity for swap(). ./kernel/trace/bpf_trace.c:2264:40-41: WARNING opportunity for swap(). Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220322062149.109180-1-jiapeng.chong@linux.alibaba.com
2022-04-01bpf, tests: Add load store test case for tail callXu Kuohai1-0/+30
Add test case to enusre that the caller and callee's fp offsets are correct during tail call (mainly asserting for arm64 JIT). Tested on both big-endian and little-endian arm64 qemu, result: test_bpf: Summary: 1026 PASSED, 0 FAILED, [1014/1014 JIT'ed] test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed] test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220321152852.2334294-6-xukuohai@huawei.com
2022-04-01bpf, tests: Add tests for BPF_LDX/BPF_STX with different offsetsXu Kuohai1-4/+281
This patch adds tests to verify the behavior of BPF_LDX/BPF_STX + BPF_B/BPF_H/BPF_W/BPF_DW with negative offset, small positive offset, large positive offset, and misaligned offset. Tested on both big-endian and little-endian arm64 qemu, result: test_bpf: Summary: 1026 PASSED, 0 FAILED, [1014/1014 JIT'ed]'] test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed] test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220321152852.2334294-5-xukuohai@huawei.com
2022-04-01bpf, arm64: Adjust the offset of str/ldr(immediate) to positive numberXu Kuohai1-27/+138
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220321152852.2334294-4-xukuohai@huawei.com
2022-04-01bpf, arm64: Optimize BPF store/load using arm64 str/ldr(immediate offset)Xu Kuohai2-15/+127
The current BPF store/load instruction is translated by the JIT into two instructions. The first instruction moves the immediate offset into a temporary register. The second instruction uses this temporary register to do the real store/load. In fact, arm64 supports addressing with immediate offsets. So This patch introduces optimization that uses arm64 str/ldr instruction with immediate offset when the offset fits. Example of generated instuction for r2 = *(u64 *)(r1 + 0): without optimization: mov x10, 0 ldr x1, [x0, x10] with optimization: ldr x1, [x0, 0] If the offset is negative, or is not aligned correctly, or exceeds max value, rollback to the use of temporary register. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220321152852.2334294-3-xukuohai@huawei.com
2022-04-01arm64, insn: Add ldr/str with immediate offsetXu Kuohai2-14/+62
This patch introduces ldr/str with immediate offset support to simplify the JIT implementation of BPF LDX/STX instructions on arm64. Although arm64 ldr/str immediate is available in pre-index, post-index and unsigned offset forms, the unsigned offset form is sufficient for BPF, so this patch only adds this type. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220321152852.2334294-2-xukuohai@huawei.com
2022-03-31Merge tag 'net-5.18-rc1' of ↵Linus Torvalds58-337/+588
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull more networking updates from Jakub Kicinski: "Networking fixes and rethook patches. Features: - kprobes: rethook: x86: replace kretprobe trampoline with rethook Current release - regressions: - sfc: avoid null-deref on systems without NUMA awareness in the new queue sizing code Current release - new code bugs: - vxlan: do not feed vxlan_vnifilter_dump_dev with non-vxlan devices - eth: lan966x: fix null-deref on PHY pointer in timestamp ioctl when interface is down Previous releases - always broken: - openvswitch: correct neighbor discovery target mask field in the flow dump - wireguard: ignore v6 endpoints when ipv6 is disabled and fix a leak - rxrpc: fix call timer start racing with call destruction - rxrpc: fix null-deref when security type is rxrpc_no_security - can: fix UAF bugs around echo skbs in multiple drivers Misc: - docs: move netdev-FAQ to the 'process' section of the documentation" * tag 'net-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits) vxlan: do not feed vxlan_vnifilter_dump_dev with non vxlan devices openvswitch: Add recirc_id to recirc warning rxrpc: fix some null-ptr-deref bugs in server_key.c rxrpc: Fix call timer start racing with call destruction net: hns3: fix software vlan talbe of vlan 0 inconsistent with hardware net: hns3: fix the concurrency between functions reading debugfs docs: netdev: move the netdev-FAQ to the process pages docs: netdev: broaden the new vs old code formatting guidelines docs: netdev: call out the merge window in tag checking docs: netdev: add missing back ticks docs: netdev: make the testing requirement more stringent docs: netdev: add a question about re-posting frequency docs: netdev: rephrase the 'should I update patchwork' question docs: netdev: rephrase the 'Under review' question docs: netdev: shorten the name and mention msgid for patch status docs: netdev: note that RFC postings are allowed any time docs: netdev: turn the net-next closed into a Warning docs: netdev: move the patch marking section up docs: netdev: minor reword docs: netdev: replace references to old archives ...
2022-03-31Merge tag 'v5.18-p1' of ↵Linus Torvalds5-23/+27
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Missing Kconfig dependency on arm that leads to boot failure - x86 SLS fixes - Reference leak in the stm32 driver * tag 'v5.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: x86/sm3 - Fixup SLS crypto: x86/poly1305 - Fixup SLS crypto: x86/chacha20 - Avoid spurious jumps to other functions crypto: stm32 - fix reference leak in stm32_crc_remove crypto: arm/aes-neonbs-cbc - Select generic cbc and aes
2022-03-31vxlan: do not feed vxlan_vnifilter_dump_dev with non vxlan devicesEric Dumazet1-0/+6
vxlan_vnifilter_dump_dev() assumes it is called only for vxlan devices. Make sure it is the case. BUG: KASAN: slab-out-of-bounds in vxlan_vnifilter_dump_dev+0x9a0/0xb40 drivers/net/vxlan/vxlan_vnifilter.c:349 Read of size 4 at addr ffff888060d1ce70 by task syz-executor.3/17662 CPU: 0 PID: 17662 Comm: syz-executor.3 Tainted: G W 5.17.0-syzkaller-12888-g77c9387c0c5b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description.constprop.0.cold+0xeb/0x495 mm/kasan/report.c:313 print_report mm/kasan/report.c:429 [inline] kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491 vxlan_vnifilter_dump_dev+0x9a0/0xb40 drivers/net/vxlan/vxlan_vnifilter.c:349 vxlan_vnifilter_dump+0x3ff/0x650 drivers/net/vxlan/vxlan_vnifilter.c:428 netlink_dump+0x4b5/0xb70 net/netlink/af_netlink.c:2270 __netlink_dump_start+0x647/0x900 net/netlink/af_netlink.c:2375 netlink_dump_start include/linux/netlink.h:245 [inline] rtnetlink_rcv_msg+0x70c/0xb80 net/core/rtnetlink.c:5953 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2496 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:725 ____sys_sendmsg+0x6e2/0x800 net/socket.c:2413 ___sys_sendmsg+0xf3/0x170 net/socket.c:2467 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f87b8e89049 Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Roopa Prabhu <roopa@nvidia.com> Link: https://lore.kernel.org/r/20220330194643.2706132-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-31openvswitch: Add recirc_id to recirc warningStéphane Graber1-2/+2
When hitting the recirculation limit, the kernel would currently log something like this: [ 58.586597] openvswitch: ovs-system: deferred action limit reached, drop recirc action Which isn't all that useful to debug as we only have the interface name to go on but can't track it down to a specific flow. With this change, we now instead get: [ 58.586597] openvswitch: ovs-system: deferred action limit reached, drop recirc action (recirc_id=0x9e) Which can now be correlated with the flow entries from OVS. Suggested-by: Frode Nordahl <frode.nordahl@canonical.com> Signed-off-by: Stéphane Graber <stgraber@ubuntu.com> Tested-by: Stephane Graber <stgraber@ubuntu.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Link: https://lore.kernel.org/r/20220330194244.3476544-1-stgraber@ubuntu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-31Merge tag 'linux-can-fixes-for-5.18-20220331' of ↵Jakub Kicinski7-32/+37
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2022-03-31 The first patch is by Oliver Hartkopp and fixes MSG_PEEK feature in the CAN ISOTP protocol (broken in net-next for v5.18 only). Tom Rix's patch for the mcp251xfd driver fixes the propagation of an error value in case of an error. A patch by me for the m_can driver fixes a use-after-free in the xmit handler for m_can IP cores v3.0.x. Hangyu Hua contributes 3 patches fixing the same double free in the error path of the xmit handler in the ems_usb, usb_8dev and mcba_usb USB CAN driver. Pavel Skripkin contributes a patch for the mcba_usb driver to properly check the endpoint type. The last patch is by me and fixes a mem leak in the gs_usb, which was introduced in net-next for v5.18. * tag 'linux-can-fixes-for-5.18-20220331' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: gs_usb: gs_make_candev(): fix memory leak for devices with extended bit timing configuration can: mcba_usb: properly check endpoint type can: mcba_usb: mcba_usb_start_xmit(): fix double dev_kfree_skb in error path can: usb_8dev: usb_8dev_start_xmit(): fix double dev_kfree_skb() in error path can: ems_usb: ems_usb_start_xmit(): fix double dev_kfree_skb() in error path can: m_can: m_can_tx_handler(): fix use after free of skb can: mcp251xfd: mcp251xfd_register_get_dev_id(): fix return of error value can: isotp: restore accidentally removed MSG_PEEK feature ==================== Link: https://lore.kernel.org/r/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-31rxrpc: fix some null-ptr-deref bugs in server_key.cXiaolong Huang1-2/+5
Some function calls are not implemented in rxrpc_no_security, there are preparse_server_key, free_preparse_server_key and destroy_server_key. When rxrpc security type is rxrpc_no_security, user can easily trigger a null-ptr-deref bug via ioctl. So judgment should be added to prevent it The crash log: user@syzkaller:~$ ./rxrpc_preparse_s [ 37.956878][T15626] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 37.957645][T15626] #PF: supervisor instruction fetch in kernel mode [ 37.958229][T15626] #PF: error_code(0x0010) - not-present page [ 37.958762][T15626] PGD 4aadf067 P4D 4aadf067 PUD 4aade067 PMD 0 [ 37.959321][T15626] Oops: 0010 [#1] PREEMPT SMP [ 37.959739][T15626] CPU: 0 PID: 15626 Comm: rxrpc_preparse_ Not tainted 5.17.0-01442-gb47d5a4f6b8d #43 [ 37.960588][T15626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014 [ 37.961474][T15626] RIP: 0010:0x0 [ 37.961787][T15626] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [ 37.962480][T15626] RSP: 0018:ffffc9000d9abdc0 EFLAGS: 00010286 [ 37.963018][T15626] RAX: ffffffff84335200 RBX: ffff888012a1ce80 RCX: 0000000000000000 [ 37.963727][T15626] RDX: 0000000000000000 RSI: ffffffff84a736dc RDI: ffffc9000d9abe48 [ 37.964425][T15626] RBP: ffffc9000d9abe48 R08: 0000000000000000 R09: 0000000000000002 [ 37.965118][T15626] R10: 000000000000000a R11: f000000000000000 R12: ffff888013145680 [ 37.965836][T15626] R13: 0000000000000000 R14: ffffffffffffffec R15: ffff8880432aba80 [ 37.966441][T15626] FS: 00007f2177907700(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000 [ 37.966979][T15626] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 37.967384][T15626] CR2: ffffffffffffffd6 CR3: 000000004aaf1000 CR4: 00000000000006f0 [ 37.967864][T15626] Call Trace: [ 37.968062][T15626] <TASK> [ 37.968240][T15626] rxrpc_preparse_s+0x59/0x90 [ 37.968541][T15626] key_create_or_update+0x174/0x510 [ 37.968863][T15626] __x64_sys_add_key+0x139/0x1d0 [ 37.969165][T15626] do_syscall_64+0x35/0xb0 [ 37.969451][T15626] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 37.969824][T15626] RIP: 0033:0x43a1f9 Signed-off-by: Xiaolong Huang <butterflyhuangxx@gmail.com> Tested-by: Xiaolong Huang <butterflyhuangxx@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2022-March/005069.html Fixes: 12da59fcab5a ("rxrpc: Hand server key parsing off to the security class") Link: https://lore.kernel.org/r/164865013439.2941502.8966285221215590921.stgit@warthog.procyon.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31rxrpc: Fix call timer start racing with call destructionDavid Howells4-15/+50
The rxrpc_call struct has a timer used to handle various timed events relating to a call. This timer can get started from the packet input routines that are run in softirq mode with just the RCU read lock held. Unfortunately, because only the RCU read lock is held - and neither ref or other lock is taken - the call can start getting destroyed at the same time a packet comes in addressed to that call. This causes the timer - which was already stopped - to get restarted. Later, the timer dispatch code may then oops if the timer got deallocated first. Fix this by trying to take a ref on the rxrpc_call struct and, if successful, passing that ref along to the timer. If the timer was already running, the ref is discarded. The timer completion routine can then pass the ref along to the call's work item when it queues it. If the timer or work item where already queued/running, the extra ref is discarded. Fixes: a158bdd3247b ("rxrpc: Fix call timeouts") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2022-March/005073.html Link: https://lore.kernel.org/r/164865115696.2943015.11097991776647323586.stgit@warthog.procyon.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31Merge branch 'net-hns3-add-two-fixes-for-net'Paolo Abeni4-8/+15
Guangbin Huang says: ==================== net: hns3: add two fixes for -net This series adds two fixes for the HNS3 ethernet driver. ==================== Link: https://lore.kernel.org/r/20220330134506.36635-1-huangguangbin2@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31net: hns3: fix software vlan talbe of vlan 0 inconsistent with hardwareGuangbin Huang1-3/+3
When user delete vlan 0, as driver will not delete vlan 0 for hardware in function hclge_set_vlan_filter_hw(), so vlan 0 in software vlan talbe should not be deleted. Fixes: fe4144d47eef ("net: hns3: sync VLAN filter entries when kill VLAN ID failed") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31net: hns3: fix the concurrency between functions reading debugfsYufeng Mo3-5/+12
Currently, the debugfs mechanism is that all functions share a global variable to save the pointer for obtaining data. When different functions concurrently access the same file node, repeated release exceptions occur. Therefore, the granularity of the pointer for storing the obtained data is adjusted to be private for each function. Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31Merge branch 'docs-update-and-move-the-netdev-faq'Paolo Abeni5-48/+73
Jakub Kicinski says: ==================== docs: update and move the netdev-FAQ A section of documentation for tree-specific process quirks had been created a while back. There's only one tree in it, so far, the tip tree, but the contents seem to answer similar questions as we answer in the netdev-FAQ. Move the netdev-FAQ. Take this opportunity to touch up and update a few sections. v3: remove some confrontational? language from patch 7 v2: remove non-git in patch 3 add patch 5 ==================== Link: https://lore.kernel.org/r/20220330042505.2902770-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: move the netdev-FAQ to the process pagesJakub Kicinski5-2/+5
The documentation for the tip tree is really in quite a similar spirit to the netdev-FAQ. Move the netdev-FAQ to the process docs as well. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: broaden the new vs old code formatting guidelinesJakub Kicinski1-4/+4
Convert the "should I use new or old comment formatting" to cover all formatting. This makes the question itself shorter. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: call out the merge window in tag checkingJakub Kicinski1-1/+3
Add the most important case to the question about "where are we in the cycle" - the case of net-next being closed. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: add missing back ticksJakub Kicinski1-6/+6
I think double back ticks are more correct. Add where they are missing. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: make the testing requirement more stringentJakub Kicinski1-5/+9
These days we often ask for selftests so let's update our testing requirements. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: add a question about re-posting frequencyJakub Kicinski1-0/+11
We have to tell people to stop reposting to often lately, or not to repost while the discussion is ongoing. Document this. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: rephrase the 'should I update patchwork' questionJakub Kicinski1-3/+5
Make the question shorter and adjust the start of the answer accordingly. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: rephrase the 'Under review' questionJakub Kicinski1-3/+5
The semantics of "Under review" have shifted. Reword the question about it a bit and focus it on the response time. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: shorten the name and mention msgid for patch statusJakub Kicinski1-3/+5
Cut down the length of the question so it renders better in docs. Mention that Message-ID can be used to search patchwork. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: note that RFC postings are allowed any timeJakub Kicinski1-0/+3
Document that RFCs are allowed during the merge window. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: turn the net-next closed into a WarningJakub Kicinski1-2/+3
Use the sphinx Warning box to make the net-next being closed stand out more. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: move the patch marking section upJakub Kicinski1-14/+11
We want people to mark their patches with net and net-next in the subject. Many miss doing that. Move the FAQ section which points that out up, and place it after the section which enumerates the trees, that seems like a pretty logical place for it. Since the two sections are together we can remove a little bit (not too much) of the repetition. v2: also remove the text for non-git setups, we want people to use git. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: minor rewordJakub Kicinski1-1/+1
that -> those Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31docs: netdev: replace references to old archivesJakub Kicinski1-4/+2
Most people use (or should use) lore at this point. Replace the pointers to older archiving systems. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-31can: gs_usb: gs_make_candev(): fix memory leak for devices with extended bit ↵Marc Kleine-Budde1-0/+2
timing configuration Some CAN-FD capable devices offer extended bit timing information for the data bit timing. The information must be read with an USB control message. The memory for this message is allocated but not free()ed (in the non error case). This patch adds the missing free. Fixes: 6679f4c5e5a6 ("can: gs_usb: add extended bt_const feature") Link: https://lore.kernel.org/all/20220329193450.659726-1-mkl@pengutronix.de Reported-by: syzbot+4d0ae90a195b269f102d@syzkaller.appspotmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>