path: root/arch/s390/net
Age | Commit message | Author | Files | Lines
2024-07-08 | s390/bpf: Implement exceptions | Ilya Leoshkevich | 1 | -2/+53
Implement the following three pieces required from the JIT:
- A "top-level" BPF prog (exception_boundary) must save all non-volatile registers, and not only the ones that it clobbers.
- A "handler" BPF prog (exception_cb) must switch stack to that of exception_boundary, and restore the registers that exception_boundary saved.
- arch_bpf_stack_walk() must unwind the stack and provide the results in a way that satisfies both bpf_throw() and exception_cb.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703005047.40915-3-iii@linux.ibm.com
2024-07-08 | s390/bpf: Change seen_reg to a mask | Ilya Leoshkevich | 1 | -16/+16
Using a mask instead of an array saves a small amount of memory and allows marking multiple registers as seen with a simple "or". Another positive side-effect is that it speeds up verification with jitterbug. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703005047.40915-2-iii@linux.ibm.com
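The difference between the two representations can be sketched in C as follows. This is a minimal illustration, not the kernel's code: the names `seen_regs`, `mark_seen()`, `mark_seen_range()` and `is_seen()` are hypothetical, and a 16-bit mask is assumed because s390x has 16 general-purpose registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: one bit per general-purpose register (%r0..%r15). */
static uint16_t seen_regs;

/* Mark a single register as seen. */
static void mark_seen(unsigned int reg)
{
	seen_regs |= (uint16_t)(1u << reg);
}

/* Mark registers first..last (inclusive) as seen with a single "or". */
static void mark_seen_range(unsigned int first, unsigned int last)
{
	uint16_t mask = (uint16_t)(((1u << (last - first + 1)) - 1) << first);

	seen_regs |= mask;
}

/* Query whether a register has been marked. */
static bool is_seen(unsigned int reg)
{
	return (seen_regs >> reg) & 1u;
}
```

With an array, marking %r6 through %r15 would take a loop over ten elements; with the mask it is a single `mark_seen_range(6, 15)` call that ORs in one constant.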
2024-07-02 | s390/bpf: Support arena atomics | Ilya Leoshkevich | 1 | -10/+94
s390x supports most BPF atomics using single instructions, which makes implementing arena support a matter of adding the arena address to the base register (unfortunately atomics do not support index registers), and wrapping the respective native instruction in probing sequences. An exception is BPF_XCHG, which is implemented using two different memory accesses and a loop. Make sure there are enough extable entries for both instructions. Compute the base address once for both memory accesses. Since on exception we need to land after the loop, emit the nops manually. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-10-iii@linux.ibm.com
2024-07-02 | s390/bpf: Enable arena | Ilya Leoshkevich | 1 | -0/+5
Now that BPF_PROBE_MEM32 and address space cast instructions are implemented, tell the verifier that the JIT supports arena. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-9-iii@linux.ibm.com
2024-07-02 | s390/bpf: Support address space cast instruction | Ilya Leoshkevich | 1 | -0/+18
The new address cast instruction translates arena offsets to userspace addresses. NULL pointers must not be translated. The common code sets up the mappings in such a way that it's enough to replace the higher 32 bits to achieve the desired result. s390x has just the instruction for this: INSERT IMMEDIATE. Implement the sequence using three instructions: LOAD AND TEST, BRANCH RELATIVE ON CONDITION and INSERT IMMEDIATE. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-8-iii@linux.ibm.com
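The effect of the three-instruction sequence can be sketched in plain C. This is only an illustration of the semantics stated in the commit message: the function name `arena_cast_user()` and the parameter `user_vm_start_hi` are hypothetical, and the sketch assumes the NULL test covers the whole 64-bit value, which the message does not spell out.

```c
#include <stdint.h>

/*
 * Hypothetical sketch of the cast semantics described above.
 * user_vm_start_hi stands for the upper 32 bits of the userspace
 * arena mapping; the name is illustrative only.
 */
static uint64_t arena_cast_user(uint64_t ptr, uint32_t user_vm_start_hi)
{
	/* LOAD AND TEST + BRANCH RELATIVE ON CONDITION: leave NULL alone. */
	if (ptr == 0)
		return 0;
	/* INSERT IMMEDIATE: overwrite bits 63..32, keep bits 31..0. */
	return ((uint64_t)user_vm_start_hi << 32) | (ptr & 0xffffffffULL);
}
```

For example, under these assumptions `arena_cast_user(0x0000000100001234, 0x7f)` yields `0x0000007f00001234`, while a NULL input stays NULL.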
2024-07-02 | s390/bpf: Support BPF_PROBE_MEM32 | Ilya Leoshkevich | 1 | -27/+110
BPF_PROBE_MEM32 is a new mode for LDX, ST and STX instructions. The JIT is supposed to add the start address of the kernel arena mapping to the %dst register, and use a probing variant of the respective memory access. Reuse the existing probing infrastructure for that. Put the arena address into the literal pool, load it into %r1 and use that as an index register. Do not clear any registers in ex_handler_bpf() for failing ST and STX instructions. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-7-iii@linux.ibm.com
2024-07-02 | s390/bpf: Land on the next JITed instruction after exception | Ilya Leoshkevich | 1 | -3/+4
Currently we land on the nop, which is unnecessary: we can just as well begin executing the next instruction. Furthermore, the upcoming arena support for the loop-based BPF_XCHG implementation will require landing on an instruction that comes after the loop. So land on the next JITed instruction, which covers both cases. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-6-iii@linux.ibm.com
2024-07-02 | s390/bpf: Introduce pre- and post- probe functions | Ilya Leoshkevich | 1 | -14/+44
Currently probe insns are handled by two "if" statements at the beginning and at the end of bpf_jit_insn(). The first one needs to be in sync with the huge insn->code statement that follows it, which was not a problem so far, since the check is small. The introduction of arena will make it significantly larger, and it will no longer be obvious whether it is in sync with the opcode switch. Move these statements to the new bpf_jit_probe_load_pre() and bpf_jit_probe_post() functions, and call them only from cases that need them. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-5-iii@linux.ibm.com
2024-07-02 | s390/bpf: Get rid of get_probe_mem_regno() | Ilya Leoshkevich | 1 | -26/+7
Commit 7fc8c362e782 ("s390/bpf: encode register within extable entry") introduced explicit passing of the number of the register to be cleared to ex_handler_bpf(), which replaced deducing it from the respective native load instruction using get_probe_mem_regno(). Replace the second and last usage in the same manner, and remove this function. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-4-iii@linux.ibm.com
2024-07-02 | s390/bpf: Factor out emitting probe nops | Ilya Leoshkevich | 1 | -22/+40
The upcoming arena support for the loop-based BPF_XCHG implementation requires emitting nop and extable entries separately. Move nop handling into a separate function, and keep track of the nop offset. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240701234304.14336-3-iii@linux.ibm.com
2024-05-13 | s390/bpf: Emit a barrier for BPF_FETCH instructions | Ilya Leoshkevich | 1 | -2/+6
BPF_ATOMIC_OP() macro documentation states that "BPF_ADD | BPF_FETCH" should be the same as atomic_fetch_add(), which is currently not the case on s390x: the serialization instruction "bcr 14,0" is missing. This applies to "and", "or" and "xor" variants too.

s390x is allowed to reorder stores with subsequent fetches from different addresses, so code relying on BPF_FETCH acting as a barrier, for example:

    stw [%r0], 1
    afadd [%r1], %r2
    ldxw %r3, [%r4]

may be broken. Fix it by emitting "bcr 14,0".

Note that a separate serialization instruction is not needed for BPF_XCHG and BPF_CMPXCHG, because COMPARE AND SWAP performs serialization itself.

Fixes: ba3b86b9cef0 ("s390/bpf: Implement new atomic ops") Reported-by: Puranjay Mohan <puranjay12@gmail.com> Closes: https://lore.kernel.org/bpf/mb61p34qvq3wf.fsf@kernel.org/ Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20240507000557.12048-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-29 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 1 | -26/+20
Cross-merge networking fixes after downstream PR. No conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-20 | s390/bpf: Fix bpf_plt pointer arithmetic | Ilya Leoshkevich | 1 | -26/+20
Kui-Feng Lee reported a crash on s390x triggered by the dummy_st_ops/dummy_init_ptr_arg test [1]:

    [<0000000000000002>] 0x2
    [<00000000009d5cde>] bpf_struct_ops_test_run+0x156/0x250
    [<000000000033145a>] __sys_bpf+0xa1a/0xd00
    [<00000000003319dc>] __s390x_sys_bpf+0x44/0x50
    [<0000000000c4382c>] __do_syscall+0x244/0x300
    [<0000000000c59a40>] system_call+0x70/0x98

This is caused by GCC moving memcpy() after assignments in bpf_jit_plt(), resulting in NULL pointers being written instead of the return and the target addresses.

Looking at the GCC internals, the reordering is allowed because the alias analysis thinks that the memcpy() destination and the assignments' left-hand-sides are based on different objects: new_plt and bpf_plt_ret/bpf_plt_target respectively, and therefore they cannot alias.

This is in turn due to a violation of the C standard:

    When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object ...

From C's perspective, bpf_plt_ret and bpf_plt are distinct objects and cannot be subtracted. In practical terms, doing so confuses GCC's alias analysis.

The code was written this way in order to let the C side know a few offsets defined in the assembly. While nice, this is by no means necessary. Fix the noncompliance by hardcoding these offsets.

[1] https://lore.kernel.org/bpf/c9923c1d-971d-4022-8dc8-1364e929d34c@gmail.com/

Fixes: f1d5df84cd8c ("s390/bpf: Implement bpf_arch_text_poke()") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Message-ID: <20240320015515.11883-1-iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-15 | bpf: Take return from set_memory_rox() into account with bpf_jit_binary_lock_ro() | Christophe Leroy | 1 | -1/+5
set_memory_rox() can fail, leaving memory unprotected. Check return and bail out when bpf_jit_binary_lock_ro() returns an error. Link: https://github.com/KSPP/linux/issues/7 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: linux-hardening@vger.kernel.org <linux-hardening@vger.kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Puranjay Mohan <puranjay12@gmail.com> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn> # LoongArch Reviewed-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> # MIPS Part Message-ID: <036b6393f23a2032ce75a1c92220b2afcb798d5d.1709850515.git.christophe.leroy@csgroup.eu> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04 | s390/bpf: Fix gotol with large offsets | Ilya Leoshkevich | 1 | -1/+1
The gotol implementation uses a wrong data type for the offset: it should be s32, not s16. Fixes: c690191e23d8 ("s390/bpf: Implement unconditional jump with 32-bit offset") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20240102193531.3169422-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
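Why the data type matters can be shown with a small C sketch. The helper names `offset_as_s16()` and `offset_as_s32()` are hypothetical and do not appear in the JIT; the sketch also relies on the two's-complement narrowing conversion that GCC and Clang implement.

```c
#include <stdint.h>

/* Offset as the buggy code would see it: squeezed into 16 bits. */
static int32_t offset_as_s16(int32_t imm)
{
	int16_t off = (int16_t)imm;	/* upper 16 bits are lost here */

	return off;
}

/* Offset as the fixed code sees it: the full 32-bit immediate. */
static int32_t offset_as_s32(int32_t imm)
{
	int32_t off = imm;

	return off;
}
```

Small offsets survive both paths unchanged, which is presumably why the bug only showed up with large ones: an offset of 40000, for instance, turns into -25536 when stored in a 16-bit variable.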
2023-12-18 | s390/bpf: Fix indirect trampoline generation | Alexei Starovoitov | 1 | -1/+2
The func_addr used to be NULL for indirect trampolines used by struct_ops. Now func_addr is a valid function pointer. Hence use BPF_TRAMP_F_INDIRECT flag to detect such condition. Fixes: 2cd3e3772e41 ("x86/cfi,bpf: Fix bpf_struct_ops CFI") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20231216004549.78355-1-alexei.starovoitov@gmail.com
2023-12-07 | bpf: Add arch_bpf_trampoline_size() | Song Liu | 1 | -22/+34
This helper will be used to calculate the size of the trampoline before allocating the memory. arch_prepare_bpf_trampoline() for arm64 and riscv64 can use arch_bpf_trampoline_size() to check the trampoline fits in the image. OTOH, arch_prepare_bpf_trampoline() for s390 has to call the JIT process twice, so it cannot use arch_bpf_trampoline_size(). Signed-off-by: Song Liu <song@kernel.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # on s390x Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> # on riscv Link: https://lore.kernel.org/r/20231206224054.492250-6-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-10-17 | Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | Jakub Kicinski | 1 | -63/+202
Daniel Borkmann says:

====================
pull-request: bpf-next 2023-10-16

We've added 90 non-merge commits during the last 25 day(s) which contain a total of 120 files changed, 3519 insertions(+), 895 deletions(-).

The main changes are:

1) Add missed stats for kprobes to retrieve the number of missed kprobe executions and subsequent executions of BPF programs, from Jiri Olsa.
2) Add cgroup BPF sockaddr hooks for unix sockets. The use case is for systemd to reimplement the LogNamespace feature which allows running multiple instances of systemd-journald to process the logs of different services, from Daan De Meyer.
3) Implement BPF CPUv4 support for s390x BPF JIT, from Ilya Leoshkevich.
4) Improve BPF verifier log output for scalar registers to better disambiguate their internal state wrt defaults vs min/max values matching, from Andrii Nakryiko.
5) Extend the BPF fib lookup helpers for IPv4/IPv6 to support retrieving the source IP address with a new BPF_FIB_LOOKUP_SRC flag, from Martynas Pumputis.
6) Add support for open-coded task_vma iterator to help with symbolization for BPF-collected user stacks, from Dave Marchevsky.
7) Add libbpf getters for accessing individual BPF ring buffers which is useful for polling them individually, for example, from Martin Kelly.
8) Extend AF_XDP selftests to validate the SHARED_UMEM feature, from Tushar Vyavahare.
9) Improve BPF selftests cross-building support for riscv arch, from Björn Töpel.
10) Add the ability to pin a BPF timer to the same calling CPU, from David Vernet.
11) Fix libbpf's bpf_tracing.h macros for riscv to use the generic implementation of PT_REGS_SYSCALL_REGS() to access syscall arguments, from Alexandre Ghiti.
12) Extend libbpf to support symbol versioning for uprobes, from Hengqi Chen.
13) Fix bpftool's skeleton code generation to guarantee that ELF data is 8 byte aligned, from Ian Rogers.
14) Inherit system-wide cpu_mitigations_off() setting for Spectre v1/v4 security mitigations in BPF verifier, from Yafang Shao.
15) Annotate struct bpf_stack_map with __counted_by attribute to prepare BPF side for upcoming __counted_by compiler support, from Kees Cook.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (90 commits)
  bpf: Ensure proper register state printing for cond jumps
  bpf: Disambiguate SCALAR register state output in verifier logs
  selftests/bpf: Make align selftests more robust
  selftests/bpf: Improve missed_kprobe_recursion test robustness
  selftests/bpf: Improve percpu_alloc test robustness
  selftests/bpf: Add tests for open-coded task_vma iter
  bpf: Introduce task_vma open-coded iterator kfuncs
  selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c
  bpf: Don't explicitly emit BTF for struct btf_iter_num
  bpf: Change syscall_nr type to int in struct syscall_tp_t
  net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set
  bpf: Avoid unnecessary audit log for CPU security mitigations
  selftests/bpf: Add tests for cgroup unix socket address hooks
  selftests/bpf: Make sure mount directory exists
  documentation/bpf: Document cgroup unix socket address hooks
  bpftool: Add support for cgroup unix socket address hooks
  libbpf: Add support for cgroup unix socket address hooks
  bpf: Implement cgroup sockaddr hooks for unix sockets
  bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf
  bpf: Propagate modified uaddrlen from cgroup sockaddr programs
  ...
====================

Link: https://lore.kernel.org/r/20231016204803.30153-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 1 | -5/+20
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

kernel/bpf/verifier.c
  829955981c55 ("bpf: Fix verifier log for async callback return values")
  a923819fb2c5 ("bpf: Treat first argument as return value for bpf_throw")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11 | s390/bpf: Fix unwinding past the trampoline | Ilya Leoshkevich | 1 | -3/+14
When functions called by the trampoline panic, the backtrace that is printed stops at the trampoline, because the trampoline does not store its caller's frame address (backchain) on stack; it also stores the return address at a wrong location. Store both the same way as is already done for the regular eBPF programs. Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231010203512.385819-3-iii@linux.ibm.com
2023-10-11 | s390/bpf: Fix clobbering the caller's backchain in the trampoline | Ilya Leoshkevich | 1 | -2/+6
One of the first things that s390x kernel functions do is storing the caller's frame address (backchain) on stack. This makes unwinding possible. The backchain is always stored at frame offset 152, which is inside the 160-byte stack area that functions allocate for their callees. The callees must preserve the backchain; the remaining 152 bytes they may use as they please. Currently the trampoline uses all 160 bytes, clobbering the backchain. This causes kernel panics when using __builtin_return_address() in functions called by the trampoline. Fix by reducing the usage of the caller-reserved stack area by 8 bytes in the trampoline. Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()") Reported-by: Song Liu <song@kernel.org> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231010203512.385819-2-iii@linux.ibm.com
2023-10-05 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 1 | -1/+1
Cross-merge networking fixes after downstream PR. No conflicts (or adjacent changes of note). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-09-22 | s390/bpf: Implement signed division | Ilya Leoshkevich | 1 | -47/+125
Implement the cpuv4 signed division. It is encoded as unsigned division, but with off field set to 1. s390x has the necessary instructions: dsgfr, dsgf and dsgr. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-9-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22 | s390/bpf: Implement unconditional jump with 32-bit offset | Ilya Leoshkevich | 1 | -3/+9
Implement the cpuv4 unconditional jump with 32-bit offset, which is encoded as BPF_JMP32 | BPF_JA and stores the offset in the imm field. Reuse the existing BPF_JMP | BPF_JA logic. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-8-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22 | s390/bpf: Implement unconditional byte swap | Ilya Leoshkevich | 1 | -0/+1
Implement the cpuv4 unconditional byte swap, which is encoded as BPF_ALU64 | BPF_END | BPF_FROM_LE. Since s390x is big-endian, it's the same as the existing BPF_ALU | BPF_END | BPF_FROM_LE. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-7-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22 | s390/bpf: Implement BPF_MEMSX | Ilya Leoshkevich | 1 | -5/+27
Implement the cpuv4 load with sign-extension, which is encoded as BPF_MEMSX (and, for internal use cases only, BPF_PROBE_MEMSX). This is the same as BPF_MEM and BPF_PROBE_MEM, but with sign extension instead of zero extension, and s390x has the necessary instructions: lgb, lgh and lgf. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-6-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-22 | s390/bpf: Implement BPF_MOV | BPF_X with sign-extension | Ilya Leoshkevich | 1 | -8/+40
Implement the cpuv4 register-to-register move with sign extension. It is distinguished from the normal moves by non-zero values in insn->off, which determine the source size. s390x has instructions to deal with all of them: lbr, lhr, lgbr, lghr and lgfr. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919101336.2223655-5-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
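The semantics that the five instructions implement can be sketched in C. The helper names `mov64_sx()` and `mov32_sx()` are hypothetical, and the sketch assumes that `off` takes the values 8, 16 or 32 (source width in bits), that 0 denotes a plain move, and that 32-bit moves zero the upper half of the destination, as the BPF instruction set describes. The casts rely on the two's-complement narrowing that GCC and Clang implement.

```c
#include <stdint.h>

/* 64-bit move: sign-extend the low `off` bits of src to 64 bits. */
static int64_t mov64_sx(int64_t src, int off)
{
	switch (off) {
	case 8:
		return (int8_t)src;	/* corresponds to lgbr */
	case 16:
		return (int16_t)src;	/* corresponds to lghr */
	case 32:
		return (int32_t)src;	/* corresponds to lgfr */
	default:
		return src;		/* off == 0: plain move */
	}
}

/* 32-bit move: sign-extend to 32 bits, upper 32 bits end up zero. */
static uint64_t mov32_sx(uint64_t src, int off)
{
	switch (off) {
	case 8:
		return (uint32_t)(int32_t)(int8_t)src;	/* lbr */
	case 16:
		return (uint32_t)(int32_t)(int16_t)src;	/* lhr */
	default:
		return (uint32_t)src;			/* plain 32-bit move */
	}
}
```

For instance, `mov64_sx(0xff, 8)` gives -1, whereas `mov32_sx(0xff, 8)` gives 0xffffffff with a zero upper half.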
2023-09-19 | s390/bpf: Let arch_prepare_bpf_trampoline return program size | Song Liu | 1 | -1/+1
arch_prepare_bpf_trampoline() for s390 currently returns 0 on success. This is not a problem for regular trampoline. However, struct_ops relies on the return value to advance "image" pointer:

    bpf_struct_ops_map_update_elem() {
        ...
        for_each_member(i, t, member) {
            ...
            err = bpf_struct_ops_prepare_trampoline();
            ...
            image += err;
        }
    }

When arch_prepare_bpf_trampoline returns 0 on success, all members of the struct_ops will point to the same trampoline (the last one).

Fix this by returning the program size in arch_prepare_bpf_trampoline (on success). This is the same behavior as other architectures.

Signed-off-by: Song Liu <song@kernel.org> Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()") Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230919060258.3237176-2-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
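The consequence of returning 0 can be reproduced with a toy model of the loop in the commit message. Everything here is hypothetical scaffolding rather than kernel code: `fake_prepare_trampoline()` stands in for the real function, `TRAMP_SIZE` is an arbitrary size, and `count_distinct_slots()` just counts how many different addresses the members end up with.

```c
#define NR_MEMBERS 3
#define TRAMP_SIZE 64	/* arbitrary size, for illustration only */

/* Stand-in for the arch hook: returns the size written, or 0 (old s390). */
static int fake_prepare_trampoline(int return_size)
{
	return return_size ? TRAMP_SIZE : 0;
}

/* Run the struct_ops-style loop; count distinct trampoline addresses. */
static int count_distinct_slots(int return_size)
{
	static char image_buf[NR_MEMBERS * TRAMP_SIZE];
	char *image = image_buf;
	char *slots[NR_MEMBERS];
	int distinct = 0;

	for (int i = 0; i < NR_MEMBERS; i++) {
		slots[i] = image;	/* member i points at the current image */
		image += fake_prepare_trampoline(return_size);
	}

	for (int i = 0; i < NR_MEMBERS; i++) {
		int duplicate = 0;

		for (int j = 0; j < i; j++)
			if (slots[j] == slots[i])
				duplicate = 1;
		if (!duplicate)
			distinct++;
	}
	return distinct;
}
```

In this model `count_distinct_slots(0)` reports a single shared address for all three members, while `count_distinct_slots(1)` reports three separate ones.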
2023-09-16 | bpf: Use bpf_is_subprog to check for subprogs | Kumar Kartikeya Dwivedi | 1 | -1/+1
We would like to know whether a bpf_prog corresponds to the main prog or one of the subprogs. The current JIT implementations simply check this using the func_idx in bpf_prog->aux->func_idx. When the index is 0, it belongs to the main program, otherwise it corresponds to some subprogram. This will also be necessary to halt exception propagation while walking the stack when an exception is thrown, so we add a simple helper function to check this, named bpf_is_subprog, and convert existing JIT implementations to also make use of it. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230912233214.1518551-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-06 | s390/bpf: Pass through tail call counter in trampolines | Ilya Leoshkevich | 1 | -0/+10
s390x eBPF programs use the following extension to the s390x calling convention: tail call counter is passed on stack at offset STK_OFF_TCCNT, which callees otherwise use as scratch space. Currently trampoline does not respect this and clobbers tail call counter. This breaks enforcing tail call limits in eBPF programs, which have trampolines attached to them. Fix by forwarding a copy of the tail call counter to the original eBPF program in the trampoline (for fexit), and by restoring it at the end of the trampoline (for fentry). Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()") Reported-by: Leon Hwang <hffilwlqm@gmail.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230906004448.111674-1-iii@linux.ibm.com
2023-06-28 | s390: consistently use .balign instead of .align | Heiko Carstens | 1 | -2/+2
The .align directive has inconsistent behavior across architectures. Use .balign instead everywhere. This is a no-op for s390, but with this there is no mix in using .align and .balign anymore. Future code is supposed to use only .balign. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-04-22 | Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | Jakub Kicinski | 1 | -0/+5
Daniel Borkmann says:

====================
pull-request: bpf-next 2023-04-21

We've added 71 non-merge commits during the last 8 day(s) which contain a total of 116 files changed, 13397 insertions(+), 8896 deletions(-).

The main changes are:

1) Add a new BPF netfilter program type and minimal support to hook BPF programs to netfilter hooks such as prerouting or forward, from Florian Westphal.
2) Fix race between btf_put and btf_idr walk which caused a deadlock, from Alexei Starovoitov.
3) Second big batch to migrate test_verifier unit tests into test_progs for ease of readability and debugging, from Eduard Zingerman.
4) Add support for refcounted local kptrs to the verifier for allowing shared ownership, useful for adding a node to both the BPF list and rbtree, from Dave Marchevsky.
5) Migrate bpf_for(), bpf_for_each() and bpf_repeat() macros from BPF selftests into libbpf-provided bpf_helpers.h header and improve kfunc handling, from Andrii Nakryiko.
6) Support 64-bit pointers to kfuncs needed for archs like s390x, from Ilya Leoshkevich.
7) Support BPF progs under getsockopt with a NULL optval, from Stanislav Fomichev.
8) Improve verifier u32 scalar equality checking in order to enable LLVM transformations which earlier had to be disabled specifically for BPF backend, from Yonghong Song.
9) Extend bpftool's struct_ops object loading to support links, from Kui-Feng Lee.
10) Add xsk selftest follow-up fixes for hugepage allocated umem, from Magnus Karlsson.
11) Support BPF redirects from tc BPF to ifb devices, from Daniel Borkmann.
12) Add BPF support for integer type when accessing variable length arrays, from Feng Zhou.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (71 commits)
  selftests/bpf: verifier/value_ptr_arith converted to inline assembly
  selftests/bpf: verifier/value_illegal_alu converted to inline assembly
  selftests/bpf: verifier/unpriv converted to inline assembly
  selftests/bpf: verifier/subreg converted to inline assembly
  selftests/bpf: verifier/spin_lock converted to inline assembly
  selftests/bpf: verifier/sock converted to inline assembly
  selftests/bpf: verifier/search_pruning converted to inline assembly
  selftests/bpf: verifier/runtime_jit converted to inline assembly
  selftests/bpf: verifier/regalloc converted to inline assembly
  selftests/bpf: verifier/ref_tracking converted to inline assembly
  selftests/bpf: verifier/map_ptr_mixing converted to inline assembly
  selftests/bpf: verifier/map_in_map converted to inline assembly
  selftests/bpf: verifier/lwt converted to inline assembly
  selftests/bpf: verifier/loops1 converted to inline assembly
  selftests/bpf: verifier/jeq_infer_not_null converted to inline assembly
  selftests/bpf: verifier/direct_packet_access converted to inline assembly
  selftests/bpf: verifier/d_path converted to inline assembly
  selftests/bpf: verifier/ctx converted to inline assembly
  selftests/bpf: verifier/btf_ctx_access converted to inline assembly
  selftests/bpf: verifier/bpf_get_stack converted to inline assembly
  ...
====================

Link: https://lore.kernel.org/r/20230421211035.9111-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-14 | s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL | Ilya Leoshkevich | 1 | -3/+8
Thomas Richter reported a crash in linux-next with a backtrace similar to the following one:

    [<0000000000000000>] 0x0
    ([<000000000031a182>] bpf_trace_run4+0xc2/0x218)
    [<00000000001d59f4>] __bpf_trace_sched_switch+0x1c/0x28
    [<0000000000c44a3a>] __schedule+0x43a/0x890
    [<0000000000c44ef8>] schedule+0x68/0x110
    [<0000000000c4e5ca>] do_nanosleep+0xa2/0x168
    [<000000000026e7fe>] hrtimer_nanosleep+0xf6/0x1c0
    [<000000000026eb6e>] __s390x_sys_nanosleep+0xb6/0xf0
    [<0000000000c3b81c>] __do_syscall+0x1e4/0x208
    [<0000000000c50510>] system_call+0x70/0x98
    Last Breaking-Event-Address:
    [<000003ff7fda1814>] bpf_prog_65e887c70a835bbf_on_switch+0x1a4/0x1f0

The problem is that bpf_arch_text_poke() with new_addr == NULL is susceptible to the following race condition:

    T1                 T2
    -----------------  -------------------
    plt.target = NULL
                       entry: brcl 0xf,plt
    entry.mask = 0
                       lgrl %r1,plt.target
                       br %r1

Fix by setting PLT target to the instruction following `brcl 0xf,plt` instead of 0. This way T2 will simply resume the execution of the eBPF program, which is the desired effect of passing new_addr == NULL.

Fixes: f1d5df84cd8c ("s390/bpf: Implement bpf_arch_text_poke()") Reported-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/bpf/20230414154755.184502-1-iii@linux.ibm.com
2023-04-14 | bpf: Support 64-bit pointers to kfuncs | Ilya Leoshkevich | 1 | -0/+5
test_ksyms_module fails to emit a kfunc call targeting a module on s390x, because the verifier stores the difference between kfunc address and __bpf_call_base in bpf_insn.imm, which is s32, and modules are roughly (1 << 42) bytes away from the kernel on s390x.

Fix by keeping BTF id in bpf_insn.imm for BPF_PSEUDO_KFUNC_CALLs, and storing the absolute address in bpf_kfunc_desc.

Introduce bpf_jit_supports_far_kfunc_call() in order to limit this new behavior to the s390x JIT. Otherwise other JITs need to be modified, which is not desired.

Introduce bpf_get_kfunc_addr() instead of exposing both find_kfunc_desc() and struct bpf_kfunc_desc.

In addition to sorting kfuncs by imm, also sort them by offset, in order to handle conflicting imms from different modules. Do this on all architectures in order to simplify code.

Factor out resolving specialized kfuncs (XDP and dynptr) from fixup_kfunc_call(). This was required in the first place, because fixup_kfunc_call() uses find_kfunc_desc(), which returns a const pointer, so it's not possible to modify kfunc addr without stripping const, which is not nice. It also removes repetition of code like:

    if (bpf_jit_supports_far_kfunc_call())
        desc->addr = func;
    else
        insn->imm = BPF_CALL_IMM(func);

and separates kfunc_desc_tab fixups from kfunc_call fixups.

Suggested-by: Jiri Olsa <olsajiri@gmail.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230412230632.885985-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
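The range problem behind this commit is plain arithmetic and can be checked with a short C sketch. The helper `fits_in_s32()` is hypothetical; it only tests whether a 64-bit distance could be stored in a signed 32-bit field such as bpf_insn.imm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Could this kfunc-to-call-base distance be stored in an s32 immediate? */
static bool fits_in_s32(int64_t delta)
{
	return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

A distance of `(int64_t)1 << 42`, the rough figure quoted for s390x modules, fails this check by a wide margin, since an s32 covers only about ±2 GiB (1 << 31).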
2023-01-30 | s390/bpf: Implement bpf_jit_supports_kfunc_call() | Ilya Leoshkevich | 1 | -2/+23
Implement calling kernel functions from eBPF. In general, the eBPF ABI is fairly close to that of s390x, with one important difference: on s390x callers should sign-extend signed arguments. Handle that by using information returned by bpf_jit_find_kfunc_model().

Here is an example of how sign extension works. Suppose we need to call the following function from BPF:

    ; long noinline bpf_kfunc_call_test4(signed char a, short b, int c, long d)
    0000000000936a78 <bpf_kfunc_call_test4>:
    936a78:       c0 04 00 00 00 00       jgnop bpf_kfunc_call_test4
    ;     return (long)a + (long)b + (long)c + d;
    936a7e:       b9 08 00 45     agr     %r4,%r5
    936a82:       b9 08 00 43     agr     %r4,%r3
    936a86:       b9 08 00 24     agr     %r2,%r4
    936a8a:       c0 f4 00 1e 3b 27       jg      <__s390_indirect_jump_r14>

As per the s390x ABI, bpf_kfunc_call_test4() has the right to assume that a, b and c are sign-extended by the caller, which results in using 64-bit additions (agr) without any additional conversions.

Without sign extension we would have the following on the JITed code side:

    ; tmp = bpf_kfunc_call_test4(-3, -30, -200, -1000);
    ;        5:       b4 10 00 00 ff ff ff fd w1 = -3
    0x3ff7fdcdad4:        llilf   %r2,0xfffffffd
    ;        6:       b4 20 00 00 ff ff ff e2 w2 = -30
    0x3ff7fdcdada:        llilf   %r3,0xffffffe2
    ;        7:       b4 30 00 00 ff ff ff 38 w3 = -200
    0x3ff7fdcdae0:        llilf   %r4,0xffffff38
    ;        8:       b7 40 00 00 ff ff fc 18 r4 = -1000
    0x3ff7fdcdae6:        lgfi    %r5,-1000
    0x3ff7fdcdaec:        mvc     64(4,%r15),160(%r15)
    0x3ff7fdcdaf2:        lgrl    %r1,bpf_kfunc_call_test4@GOT
    0x3ff7fdcdaf8:        brasl   %r14,__s390_indirect_jump_r1

The first three llilfs are 32-bit loads whose results need to be sign-extended to 64 bits.

Note: at the moment bpf_jit_find_kfunc_model() does not seem to play nicely with XDP metadata functions: add_kfunc_call() adds an "abstract" bpf_*() version to kfunc_btf_tab, but then fixup_kfunc_call() puts the concrete version into insn->imm, which bpf_jit_find_kfunc_model() cannot find. But this seems to be a common code problem.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-7-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
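The effect of the missing sign extension in the kfunc call example above can be reproduced with a user-space model. This is only a sketch: the helper names are hypothetical, all three narrow arguments are modelled as 32-bit values (as in the w1/w2/w3 loads of the JITed code), and add4_raw() stands in for the callee's 64-bit agr additions over raw register contents.

```c
#include <stdint.h>

/* Models llilf: load a 32-bit immediate and clear the upper 32 bits. */
uint64_t zero_extend32(int32_t v)
{
	return (uint64_t)(uint32_t)v;
}

/* Models what the s390x callee expects: the value sign-extended to 64 bits. */
uint64_t sign_extend32(int32_t v)
{
	return (uint64_t)(int64_t)v;
}

/* Models the callee: plain 64-bit additions, no conversions of its own. */
int64_t add4_raw(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
	return (int64_t)(a + b + c + d);
}
```

With sign-extended arguments, add4_raw() over (-3, -30, -200, -1000) yields -1233. With zero-extended arguments the three upper halves are lost and the sum becomes 3 * 2^32 - 1233, which is the kind of wrong result the JIT has to avoid.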
2023-01-30s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()Ilya Leoshkevich1-10/+27
Allow mixing subprogs and tail calls by passing the current tail call count to subprogs. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-6-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-30s390/bpf: Implement arch_prepare_bpf_trampoline()Ilya Leoshkevich1-22/+520
arch_prepare_bpf_trampoline() is used for direct attachment of eBPF programs to various places, bypassing kprobes. It's responsible for calling a number of eBPF programs before, instead of, and/or after whatever they are attached to.

Add an s390x implementation, paying attention to the following:

- Reuse the existing JIT infrastructure, where possible.
- Like the existing JIT, prefer making multiple passes instead of backpatching. Currently 2 passes are enough. If a literal pool is introduced, this needs to be raised to 3. However, at the moment adding a literal pool only makes the code larger. If branch shortening is introduced, the number of passes needs to be increased even further.
- Support both regular and ftrace calling conventions, depending on the trampoline flags.
- Use expolines for indirect calls.
- Handle the mismatch between the eBPF and the s390x ABIs.
- Sign-extend fmod_ret return values.

invoke_bpf_prog() produces about 120 bytes; it might be possible to slightly optimize this, but reaching 50 bytes, like on x86_64, looks unrealistic: just loading cookie, __bpf_prog_enter, bpf_func, insnsi and __bpf_prog_exit as literals already takes at least 5 * 12 = 60 bytes, and we can't use relative addressing for most of them. Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230129190501.1624747-5-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-30s390/bpf: Implement bpf_arch_text_poke()Ilya Leoshkevich1-0/+97
bpf_arch_text_poke() is used to hotpatch eBPF programs and trampolines. s390x has a very strict hotpatching restriction: the only thing that is allowed to be hotpatched is the conditional branch mask. Take the same approach as commit de5012b41e5c ("s390/ftrace: implement hotpatching"): create a conditional jump to a "plt", which loads the target address from memory and jumps to it; then first patch this address, and then the mask. Trampolines (introduced in the next patch) respect the ftrace calling convention: the return address is in %r0, and %r1 is clobbered. With that in mind, bpf_arch_text_poke() does not differentiate between jumps and calls. However, there is a simple optimization for jumps (for the epilogue_ip case): if a jump already points to the destination, then there is no "plt" and we can just flip the mask. For simplicity, the "plt" template is defined in assembly, and its size is used to define C arrays. There doesn't seem to be a way to convey this size to C as a constant, so it's hardcoded and double-checked at runtime. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-4-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-30s390/bpf: Add expoline to tail callsIlya Leoshkevich1-2/+10
All the indirect jumps in the eBPF JIT already use expolines, except for the tail call one. Fixes: de5cb6eb514e ("s390: use expoline thunks in the BPF JIT") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28s390/bpf: Fix a typo in a commentIlya Leoshkevich1-1/+1
"desription" should be "description". Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-27-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-23s390/bpf: Fix typo in commentJulia Lawall1-1/+1
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20220521111145.81697-84-Julia.Lawall@inria.fr
2022-03-10s390: raise minimum supported machine generation to z10Vasily Gorbik1-23/+8
Machine generations up to z9 (released in May 2006) have been officially out of service for several years now (z9 end of service - January 31, 2019). No distributions build kernels supporting those old machine generations anymore, except Debian, which seems to pick the oldest supported generation. The team supporting Debian on s390 has been notified about the change. Raising the minimum supported machine generation to z10 helps to reduce maintenance cost and effectively removes code that is not getting enough testing coverage due to the lack of older hardware and distribution support. Besides that, this unblocks some optimization opportunities and allows using a wider instruction set in asm files when implementing future features. Due to this change, spectre mitigation and usercopy implementations could be drastically simplified and many newer instructions could be converted from ".insn" encoding to instruction names. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08s390/bpf: encode register within extable entryHeiko Carstens1-11/+5
Instead of decoding the instruction that faulted to get the register which needs to be zeroed, simply encode its number into the extable entries during code generation. This gets rid of a bit of code, and is also what other architectures are doing. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08s390/extable: convert to relative table with dataHeiko Carstens1-3/+2
Follow arm64, riscv, and x86 and change the extable layout to the common "relative table with data". This makes it possible to get rid of s390-specific code in sorttable.c. The main difference from before is that extable entries do not contain a relative function pointer anymore. Instead data and type fields are added. The type field is used to indicate which exception handler needs to be called, while the data field is currently unused. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08s390/extable: move EX_TABLE define to asm-extable.hHeiko Carstens1-0/+1
Follow arm64 and riscv and move the EX_TABLE define to asm-extable.h which is a lot less generic than the current linkage.h. Also make sure that all files which contain EX_TABLE usages actually include the new header file. This should make sure that the files always compile and there won't be any random compile breakage due to other header file dependencies. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-01s390: always use the packed stack layoutVasily Gorbik1-1/+0
-mpacked-stack option has been supported by both minimum gcc and clang versions for a while. With commit e2bc3e91d91e ("scripts/min-tool-version.sh: Raise minimum clang version to 13.0.0 for s390") minimum clang version now also supports a combination of flags -mpacked-stack -mbackchain -pg -mfentry and fulfills all requirements to always enable the packed stack layout. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-11-16bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33Tiezhu Yang1-3/+3
In the current code, the actual max tail call count is 33 which is greater than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance. We can see the historical evolution from commit 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs") and commit f9dabe016b63 ("bpf: Undo off-by-one in interpreter tail call count limit"). In order to avoid changing existing behavior, the actual limit is kept at 33, which is reasonable.

After commit 874be05f525e ("bpf, tests: Add tail call test suite"), we can see there is a failing testcase.

On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:

    # echo 0 > /proc/sys/net/core/bpf_jit_enable
    # modprobe test_bpf
    # dmesg | grep -w FAIL
    Tail call error path, max count reached jited:0 ret 34 != 33 FAIL

On some archs:

    # echo 1 > /proc/sys/net/core/bpf_jit_enable
    # modprobe test_bpf
    # dmesg | grep -w FAIL
    Tail call error path, max count reached jited:1 ret 34 != 33 FAIL

Although the above failed testcase has been fixed in commit 18935a72eb25 ("bpf/tests: Fix error in tail call limit tests"), it would still be good to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code more readable.

The 32-bit x86 JIT was using a limit of 32; just fix the wrong comments and limit it to 33 tail calls now that the constant MAX_TAIL_CALL_CNT is updated. For the mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh. For the riscv JIT, use RV_REG_TCC directly to save one register move as suggested by Björn Töpel. For the other implementations there are no functional changes: the current limit of 33 is unchanged, the new value of MAX_TAIL_CALL_CNT reflects the actual max tail call count, and the related tail call testcases in the test_bpf module and selftests work for both the interpreter and the JIT.

Here are the test results on x86_64:

    # uname -m
    x86_64
    # echo 0 > /proc/sys/net/core/bpf_jit_enable
    # modprobe test_bpf test_suite=test_tail_calls
    # dmesg | tail -1
    test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
    # rmmod test_bpf
    # echo 1 > /proc/sys/net/core/bpf_jit_enable
    # modprobe test_bpf test_suite=test_tail_calls
    # dmesg | tail -1
    test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
    # rmmod test_bpf
    # ./test_progs -t tailcalls
    #142 tailcalls:OK
    Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn
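The off-by-one described in the MAX_TAIL_CALL_CNT commit above comes down to which comparison guards the counter. The following is a simplified user-space model, not the interpreter or any JIT: allowed_tail_calls() is a hypothetical function that counts how many tail calls pass the limit check before one is rejected, under the assumption that the counter starts at 0 and is incremented after each successful check.

```c
/*
 * Hypothetical model: count the tail calls that succeed before the limit
 * check rejects one. use_ge == 0 models a "cnt > limit" check,
 * use_ge != 0 models a "cnt >= limit" check.
 */
int allowed_tail_calls(int limit, int use_ge)
{
	int cnt = 0;

	for (;;) {
		if (use_ge ? cnt >= limit : cnt > limit)
			break;	/* limit reached, the tail call is refused */
		cnt++;		/* the tail call goes through */
	}
	return cnt;
}
```

In this model a "greater than" check against 32 and a "greater or equal" check against 33 both permit 33 tail calls, so raising the constant while tightening the comparison keeps the behavior and makes the constant match the real limit.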
2021-10-26s390: introduce nospec_uses_trampoline()Sven Schnelle1-3/+3
and replace all of the "__is_defined(CC_USING_EXPOLINE) && !nospec_disable" occurrences. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-04bpf, s390: Fix potential memory leak about jit_dataTiezhu Yang1-1/+1
Make sure to free jit_data through kfree() in the error path. Fixes: 1c8f9b91c456 ("bpf: s390: add JIT support for multi-function programs") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-09-16s390/bpf: Fix optimizing out zero-extensionsIlya Leoshkevich1-28/+30
Currently the JIT completely removes things like `reg32 += 0`. However, the BPF_ALU semantics require the target register to be zero-extended in such cases. Fix by optimizing out only the arithmetic operation, but not the subsequent zero-extension. Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
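The zero-extension requirement from the commit above can be modelled in user space. This is a sketch under stated assumptions, not the JIT code: all three function names are hypothetical, and a 64-bit integer stands in for a BPF register on which a 32-bit ALU add with an immediate is performed.

```c
#include <stdint.h>

/* Reference semantics: 32-bit add, result zero-extended to 64 bits. */
uint64_t alu32_add_imm(uint64_t reg, int32_t imm)
{
	return (uint64_t)((uint32_t)reg + (uint32_t)imm);
}

/* Model of the bug: for imm == 0 the whole operation is dropped. */
uint64_t alu32_add_imm_buggy(uint64_t reg, int32_t imm)
{
	if (imm == 0)
		return reg;	/* upper 32 bits are left untouched */
	return alu32_add_imm(reg, imm);
}

/* Model of the fix: skip only the addition, keep the zero-extension. */
uint64_t alu32_add_imm_fixed(uint64_t reg, int32_t imm)
{
	uint32_t lo = (uint32_t)reg;

	if (imm != 0)
		lo += (uint32_t)imm;
	return (uint64_t)lo;
}
```

For a register whose upper half is non-zero, the buggy variant returns the value unchanged on `+= 0`, while the fixed variant agrees with the reference semantics and clears the upper 32 bits.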