summaryrefslogtreecommitdiff
path: root/arch/riscv
AgeCommit message (Collapse)AuthorFilesLines
2024-05-13riscv, bpf: make some atomic operations fully orderedPuranjay Mohan1-10/+10
The BPF atomic operations with the BPF_FETCH modifier along with BPF_XCHG and BPF_CMPXCHG are fully ordered but the RISC-V JIT implements all atomic operations except BPF_CMPXCHG with relaxed ordering. Section 8.1 of the "The RISC-V Instruction Set Manual Volume I: Unprivileged ISA" [1], titled, "Specifying Ordering of Atomic Instructions" says: | To provide more efficient support for release consistency [5], each | atomic instruction has two bits, aq and rl, used to specify additional | memory ordering constraints as viewed by other RISC-V harts. and | If only the aq bit is set, the atomic memory operation is treated as | an acquire access. | If only the rl bit is set, the atomic memory operation is treated as a | release access. | | If both the aq and rl bits are set, the atomic memory operation is | sequentially consistent. Fix this by setting both aq and rl bits as 1 for operations with BPF_FETCH and BPF_XCHG. [1] https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64") Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/r/20240505201633.123115-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-13riscv, bpf: Fix typo in commentXiao Wang1-2/+2
We can use either "instruction" or "insn" in the comment. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/r/20240507111618.437121-1-xiao.w.wang@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-13riscv, bpf: inline bpf_get_smp_processor_id()Puranjay Mohan1-0/+26
Inline the calls to bpf_get_smp_processor_id() in the riscv bpf jit. RISCV saves the pointer to the CPU's task_struct in the TP (thread pointer) register. This makes it trivial to get the CPU's processor id. As thread_info is the first member of task_struct, we can read the processor id from TP + offsetof(struct thread_info, cpu). RISCV64 JIT output for `call bpf_get_smp_processor_id` ====================================================== Before After -------- ------- auipc t1,0x848c ld a5,32(tp) jalr 604(t1) mv a5,a0 Benchmark using [1] on Qemu. ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+------------------+------------------+--------------+ | Name | Before | After | % change | |---------------+------------------+------------------+--------------| | glob-arr-inc | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s | + 24.04% | | arr-inc | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s | + 23.56% | | hash-inc | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s | + 32.18% | +---------------+------------------+------------------+--------------+ NOTE: This benchmark includes changes from this patch and the previous patch that implemented the per-cpu insn. [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-13riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrsPuranjay Mohan1-0/+24
Support an instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. RISC-V uses generic per-cpu implementation where the offsets for CPUs are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores the address of the task_struct in TP register. The first element in task_struct is struct thread_info, and we can get the cpu number by reading from the TP register + offsetof(struct thread_info, cpu). Once we have the cpu number in a register we read the offset for that cpu from address: &__per_cpu_offset + cpu_number << 3. Then we add this offset to the destination register. To measure the improvement from this change, the benchmark in [1] was used on Qemu: Before: glob-arr-inc : 1.127 ± 0.013M/s arr-inc : 1.121 ± 0.004M/s hash-inc : 0.681 ± 0.052M/s After: glob-arr-inc : 1.138 ± 0.011M/s arr-inc : 1.366 ± 0.006M/s hash-inc : 0.676 ± 0.001M/s [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240502151854.9810-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-29Merge tag 'for-netdev' of ↵Jakub Kicinski3-2/+205
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-04-29 We've added 147 non-merge commits during the last 32 day(s) which contain a total of 158 files changed, 9400 insertions(+), 2213 deletions(-). The main changes are: 1) Add an internal-only BPF per-CPU instruction for resolving per-CPU memory addresses and implement support in x86 BPF JIT. This allows inlining per-CPU array and hashmap lookups and the bpf_get_smp_processor_id() helper, from Andrii Nakryiko. 2) Add BPF link support for sk_msg and sk_skb programs, from Yonghong Song. 3) Optimize x86 BPF JIT's emit_mov_imm64, and add support for various atomics in bpf_arena which can be JITed as a single x86 instruction, from Alexei Starovoitov. 4) Add support for passing mark with bpf_fib_lookup helper, from Anton Protopopov. 5) Add a new bpf_wq API for deferring events and refactor sleepable bpf_timer code to keep common code where possible, from Benjamin Tissoires. 6) Fix BPF_PROG_TEST_RUN infra with regards to bpf_dummy_struct_ops programs to check when NULL is passed for non-NULLable parameters, from Eduard Zingerman. 7) Harden the BPF verifier's and/or/xor value tracking, from Harishankar Vishwanathan. 8) Introduce crypto kfuncs to make BPF programs able to utilize the kernel crypto subsystem, from Vadim Fedorenko. 9) Various improvements to the BPF instruction set standardization doc, from Dave Thaler. 10) Extend libbpf APIs to partially consume items from the BPF ringbuffer, from Andrea Righi. 11) Bigger batch of BPF selftests refactoring to use common network helpers and to drop duplicate code, from Geliang Tang. 12) Support bpf_tail_call_static() helper for BPF programs with GCC 13, from Jose E. Marchesi. 13) Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF program to have code sections where preemption is disabled, from Kumar Kartikeya Dwivedi. 14) Allow invoking BPF kfuncs from BPF_PROG_TYPE_SYSCALL programs, from David Vernet. 15) Extend the BPF verifier to allow different input maps for a given bpf_for_each_map_elem() helper call in a BPF program, from Philo Lu. 16) Add support for PROBE_MEM32 and bpf_addr_space_cast instructions for riscv64 and arm64 JITs to enable BPF Arena, from Puranjay Mohan. 17) Shut up a false-positive KMSAN splat in interpreter mode by unpoison the stack memory, from Martin KaFai Lau. 18) Improve xsk selftest coverage with new tests on maximum and minimum hardware ring size configurations, from Tushar Vyavahare. 19) Various ReST man pages fixes as well as documentation and bash completion improvements for bpftool, from Rameez Rehman & Quentin Monnet. 20) Fix libbpf with regards to dumping subsequent char arrays, from Quentin Deslandes. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (147 commits) bpf, docs: Clarify PC use in instruction-set.rst bpf_helpers.h: Define bpf_tail_call_static when building with GCC bpf, docs: Add introduction for use in the ISA Internet Draft selftests/bpf: extend BPF_SOCK_OPS_RTT_CB test for srtt and mrtt_us bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args selftests/bpf: dummy_st_ops should reject 0 for non-nullable params bpf: check bpf_dummy_struct_ops program params for test runs selftests/bpf: do not pass NULL for non-nullable params in dummy_st_ops selftests/bpf: adjust dummy_st_ops_success to detect additional error bpf: mark bpf_dummy_struct_ops.test_1 parameter as nullable selftests/bpf: Add ring_buffer__consume_n test. bpf: Add bpf_guard_preempt() convenience macro selftests: bpf: crypto: add benchmark for crypto functions selftests: bpf: crypto skcipher algo selftests bpf: crypto: add skcipher to bpf crypto bpf: make common crypto API for TC/XDP programs bpf: update the comment for BTF_FIELDS_MAX selftests/bpf: Fix wq test. selftests/bpf: Use make_sockaddr in test_sock_addr selftests/bpf: Use connect_to_addr in test_sock_addr ... ==================== Link: https://lore.kernel.org/r/20240429131657.19423-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-05Merge tag 'riscv-for-linus-6.9-rc3' of ↵Linus Torvalds12-20/+34
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix for an __{get,put}_kernel_nofault to avoid an uninitialized value causing spurious failures - compat_vdso.so.dbg is now installed to the standard install location - A fix to avoid initializing PERF_SAMPLE_BRANCH_*-related events, as they aren't supported and will just later fail - A fix to make AT_VECTOR_SIZE_ARCH correct now that we're providing AT_MINSIGSTKSZ - pgprot_nx() is now implemented, which fixes vmap W^X protection - A fix for the vector save/restore code, which at least manifests as corrupted vector state when a signal is taken - A fix for a race condition in instruction patching - A fix to avoid leaking the kernel-mode GP to userspace, which is a kernel pointer leak that can be used to defeat KASLR in various ways - A handful of smaller fixes to build warnings, an overzealous printk, and some missing tracing annotations * tag 'riscv-for-linus-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: process: Fix kernel gp leakage riscv: Disable preemption when using patch_map() riscv: Fix warning by declaring arch_cpu_idle() as noinstr riscv: use KERN_INFO in do_trap riscv: Fix vector state restore in rt_sigreturn() riscv: mm: implement pgprot_nx riscv: compat_vdso: align VDSOAS build log RISC-V: Update AT_VECTOR_SIZE_ARCH for new AT_MINSIGSTKSZ riscv: Mark __se_sys_* functions __used drivers/perf: riscv: Disable PERF_SAMPLE_BRANCH_* while not supported riscv: compat_vdso: install compat_vdso.so.dbg to /lib/modules/*/vdso/ riscv: hwprobe: do not produce frtace relocation riscv: Fix spurious errors from __get/put_kernel_nofault riscv: mm: Fix prototype to avoid discarding const
2024-04-04riscv: process: Fix kernel gp leakageStefan O'Rear1-3/+0
childregs represents the registers which are active for the new thread in user context. For a kernel thread, childregs->gp is never used since the kernel gp is not touched by switch_to. For a user mode helper, the gp value can be observed in user space after execve or possibly by other means. [From the email thread] The /* Kernel thread */ comment is somewhat inaccurate in that it is also used for user_mode_helper threads, which exec a user process, e.g. /sbin/init or when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have PF_KTHREAD set and are valid targets for ptrace etc. even before they exec. childregs is the *user* context during syscall execution and it is observable from userspace in at least five ways: 1. kernel_execve does not currently clear integer registers, so the starting register state for PID 1 and other user processes started by the kernel has sp = user stack, gp = kernel __global_pointer$, all other integer registers zeroed by the memset in the patch comment. This is a bug in its own right, but I'm unwilling to bet that it is the only way to exploit the issue addressed by this patch. 2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread before it execs, but ptrace requires SIGSTOP to be delivered which can only happen at user/kernel boundaries. 3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for user_mode_helpers before the exec completes, but gp is not one of the registers it returns. 4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses are also exposed via PERF_SAMPLE_REGS_USER which is permitted under LOCKDOWN_PERF. I have not attempted to write exploit code. 5. Much of the tracing infrastructure allows access to user registers. I have not attempted to determine which forms of tracing allow access to user registers without already allowing access to kernel registers. Fixes: 7db91e57a0ac ("RISC-V: Task implementation") Cc: stable@vger.kernel.org Signed-off-by: Stefan O'Rear <sorear@fastmail.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240327061258.2370291-1-sorear@fastmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-04riscv: Disable preemption when using patch_map()Alexandre Ghiti1-0/+8
patch_map() uses fixmap mappings to circumvent the non-writability of the kernel text mapping. The __set_fixmap() function only flushes the current cpu tlb, it does not emit an IPI so we must make sure that while we use a fixmap mapping, the current task is not migrated on another cpu which could miss the newly introduced fixmap mapping. So in order to avoid any task migration, disable the preemption. Reported-by: Andrea Parri <andrea@rivosinc.com> Closes: https://lore.kernel.org/all/ZcS+GAaM25LXsBOl@andrea/ Reported-by: Andy Chiu <andy.chiu@sifive.com> Closes: https://lore.kernel.org/linux-riscv/CABgGipUMz3Sffu-CkmeUB1dKVwVQ73+7=sgC45-m0AE9RCjOZg@mail.gmail.com/ Fixes: cad539baa48f ("riscv: implement a memset like function for text") Fixes: 0ff7c3b33127 ("riscv: Use text_mutex instead of patch_lock") Co-developed-by: Andy Chiu <andy.chiu@sifive.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Acked-by: Puranjay Mohan <puranjay12@gmail.com> Link: https://lore.kernel.org/r/20240326203017.310422-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-04riscv: Fix warning by declaring arch_cpu_idle() as noinstrAlexandre Ghiti1-1/+1
The following warning appears when using ftrace: [89855.443413] RCU not on for: arch_cpu_idle+0x0/0x1c [89855.445640] WARNING: CPU: 5 PID: 0 at include/linux/trace_recursion.h:162 arch_ftrace_ops_list_func+0x208/0x228 [89855.445824] Modules linked in: xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) xt_addrtype(E) nft_compat(E) nf_tables(E) nfnetlink(E) br_netfilter(E) cfg80211(E) nls_iso8859_1(E) ofpart(E) redboot(E) cmdlinepart(E) cfi_cmdset_0001(E) virtio_net(E) cfi_probe(E) cfi_util(E) 9pnet_virtio(E) gen_probe(E) net_failover(E) virtio_rng(E) failover(E) 9pnet(E) physmap(E) map_funcs(E) chipreg(E) mtd(E) uio_pdrv_genirq(E) uio(E) dm_multipath(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) drm(E) efi_pstore(E) backlight(E) ip_tables(E) x_tables(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) xor(E) async_tx(E) raid6_pq(E) raid1(E) raid0(E) virtio_blk(E) [89855.451563] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G E 6.8.0-rc6ubuntu-defconfig #2 [89855.451726] Hardware name: riscv-virtio,qemu (DT) [89855.451899] epc : arch_ftrace_ops_list_func+0x208/0x228 [89855.452016] ra : arch_ftrace_ops_list_func+0x208/0x228 [89855.452119] epc : ffffffff8016b216 ra : ffffffff8016b216 sp : ffffaf808090fdb0 [89855.452171] gp : ffffffff827c7680 tp : ffffaf808089ad40 t0 : ffffffff800c0dd8 [89855.452216] t1 : 0000000000000001 t2 : 0000000000000000 s0 : ffffaf808090fe30 [89855.452306] s1 : 0000000000000000 a0 : 0000000000000026 a1 : ffffffff82cd6ac8 [89855.452423] a2 : ffffffff800458c8 a3 : ffffaf80b1870640 a4 : 0000000000000000 [89855.452646] a5 : 0000000000000000 a6 : 00000000ffffffff a7 : ffffffffffffffff [89855.452698] s2 : ffffffff82766872 s3 : ffffffff80004caa s4 : ffffffff80ebea90 [89855.452743] s5 : ffffaf808089bd40 s6 : 8000000a00006e00 s7 : 0000000000000008 [89855.452787] s8 : 0000000000002000 s9 : 0000000080043700 s10: 0000000000000000 [89855.452831] s11: 0000000000000000 t3 : 0000000000100000 t4 : 0000000000000064 [89855.452874] t5 : 000000000000000c t6 : ffffaf80b182dbfc [89855.452929] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [89855.453053] [<ffffffff8016b216>] arch_ftrace_ops_list_func+0x208/0x228 [89855.453191] [<ffffffff8000e082>] ftrace_call+0x8/0x22 [89855.453265] [<ffffffff800a149c>] do_idle+0x24c/0x2ca [89855.453357] [<ffffffff8000da54>] return_to_handler+0x0/0x26 [89855.453429] [<ffffffff8000b716>] smp_callin+0x92/0xb6 [89855.453785] ---[ end trace 0000000000000000 ]--- To fix this, mark arch_cpu_idle() as noinstr, like it is done in commit a9cbc1b471d2 ("s390/idle: mark arch_cpu_idle() noinstr"). Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Closes: https://lore.kernel.org/linux-riscv/51f21b87-ebed-4411-afbc-c00d3dea2bab@yadro.com/ Fixes: cfbc4f81c9d0 ("riscv: Select ARCH_WANTS_NO_INSTR") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Andy Chiu <andy.chiu@sifive.com> Acked-by: Puranjay Mohan <puranjay12@gmail.com> Link: https://lore.kernel.org/r/20240326203017.310422-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-04riscv: use KERN_INFO in do_trapAndreas Schwab1-1/+1
Print the instruction dump with info instead of emergency level. The unhandled signal message is only for informational purpose. Fixes: b8a03a634129 ("riscv: add userland instruction dump to RISC-V splats") Signed-off-by: Andreas Schwab <schwab@suse.de> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://lore.kernel.org/r/mvmy1aegrhm.fsf@suse.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-04bpf, riscv: Implement bpf_addr_space_cast instructionPuranjay Mohan3-0/+16
LLVM generates bpf_addr_space_cast instruction while translating pointers between native (zero) address space and __attribute__((address_space(N))). The addr_space=0 is reserved as bpf_arena address space. rY = addr_space_cast(rX, 0, 1) is processed by the verifier and converted to normal 32-bit move: wX = wY rY = addr_space_cast(rX, 1, 0) has to be converted by JIT. Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Pu Lehui <pulehui@huawei.com> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20240404114203.105970-3-puranjay12@gmail.com
2024-04-04bpf, riscv: Implement PROBE_MEM32 pseudo instructionsPuranjay Mohan3-2/+189
Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW] instructions. They are similar to PROBE_MEM instructions with the following differences: - PROBE_MEM32 supports store. - PROBE_MEM32 relies on the verifier to clear upper 32-bit of the src/dst register - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in S7 in the prologue). Due to bpf_arena constructions such S7 + reg + off16 access is guaranteed to be within arena virtual range, so no address check at run-time. - S11 is a free callee-saved register, so it is used to store kern_vm_start - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When LDX faults the destination register is zeroed. To support these on riscv, we do tmp = S7 + src/dst reg and then use tmp2 as the new src/dst register. This allows us to reuse most of the code for normal [LDX | STX | ST]. Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Pu Lehui <pulehui@huawei.com> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20240404114203.105970-2-puranjay12@gmail.com
2024-04-04riscv: Fix vector state restore in rt_sigreturn()Björn Töpel1-7/+8
The RISC-V Vector specification states in "Appendix D: Calling Convention for Vector State" [1] that "Executing a system call causes all caller-saved vector registers (v0-v31, vl, vtype) and vstart to become unspecified.". In the RISC-V kernel this is called "discarding the vstate". Returning from a signal handler via the rt_sigreturn() syscall, vector discard is also performed. However, this is not an issue since the vector state should be restored from the sigcontext, and therefore not care about the vector discard. The "live state" is the actual vector register in the running context, and the "vstate" is the vector state of the task. A dirty live state, means that the vstate and live state are not in synch. When vectorized user_from_copy() was introduced, an bug sneaked in at the restoration code, related to the discard of the live state. An example when this go wrong: 1. A userland application is executing vector code 2. The application receives a signal, and the signal handler is entered. 3. The application returns from the signal handler, using the rt_sigreturn() syscall. 4. The live vector state is discarded upon entering the rt_sigreturn(), and the live state is marked as "dirty", indicating that the live state need to be synchronized with the current vstate. 5. rt_sigreturn() restores the vstate, except the Vector registers, from the sigcontext 6. rt_sigreturn() restores the Vector registers, from the sigcontext, and now the vectorized user_from_copy() is used. The dirty live state from the discard is saved to the vstate, making the vstate corrupt. 7. rt_sigreturn() returns to the application, which crashes due to corrupted vstate. Note that the vectorized user_from_copy() is invoked depending on the value of CONFIG_RISCV_ISA_V_UCOPY_THRESHOLD. Default is 768, which means that vlen has to be larger than 128b for this bug to trigger. The fix is simply to mark the live state as non-dirty/clean prior performing the vstate restore. Link: https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-8abdb41-2024-03-26/unpriv-isa-asciidoc.pdf # [1] Reported-by: Charlie Jenkins <charlie@rivosinc.com> Reported-by: Vineet Gupta <vgupta@kernel.org> Fixes: c2a658d41924 ("riscv: lib: vectorize copy_to_user/copy_from_user") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Vineet Gupta <vineetg@rivosinc.com> Link: https://lore.kernel.org/r/20240403072638.567446-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-7/+32
Pull KVM fixes from Paolo Bonzini: "ARM: - Ensure perf events programmed to count during guest execution are actually enabled before entering the guest in the nVHE configuration - Restore out-of-range handler for stage-2 translation faults - Several fixes to stage-2 TLB invalidations to avoid stale translations, possibly including partial walk caches - Fix early handling of architectural VHE-only systems to ensure E2H is appropriately set - Correct a format specifier warning in the arch_timer selftest - Make the KVM banner message correctly handle all of the possible configurations RISC-V: - Remove redundant semicolon in num_isa_ext_regs() - Fix APLIC setipnum_le/be write emulation - Fix APLIC in_clrip[x] read emulation x86: - Fix a bug in KVM_SET_CPUID{2,} where KVM looks at the wrong CPUID entries (old vs. new) and ultimately neglects to clear PV_UNHALT from vCPUs with HLT-exiting disabled - Documentation fixes for SEV - Fix compat ABI for KVM_MEMORY_ENCRYPT_OP - Fix a 14-year-old goof in a declaration shared by host and guest; the enabled field used by Linux when running as a guest pushes the size of "struct kvm_vcpu_pv_apf_data" from 64 to 68 bytes. This is really unconsequential because KVM never consumes anything beyond the first 64 bytes, but the resulting struct does not match the documentation Selftests: - Fix spelling mistake in arch_timer selftest" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits) KVM: arm64: Rationalise KVM banner output arm64: Fix early handling of FEAT_E2H0 not being implemented KVM: arm64: Ensure target address is granule-aligned for range TLBI KVM: arm64: Use TLBI_TTL_UNKNOWN in __kvm_tlb_flush_vmid_range() KVM: arm64: Don't pass a TLBI level hint when zapping table entries KVM: arm64: Don't defer TLB invalidation when zapping table entries KVM: selftests: Fix __GUEST_ASSERT() format warnings in ARM's arch timer test KVM: arm64: Fix out-of-IPA space translation fault handling KVM: arm64: Fix host-programmed guest events in nVHE RISC-V: KVM: Fix APLIC in_clrip[x] read emulation RISC-V: KVM: Fix APLIC setipnum_le/be write emulation RISC-V: KVM: Remove second semicolon KVM: selftests: Fix spelling mistake "trigged" -> "triggered" Documentation: kvm/sev: clarify usage of KVM_MEMORY_ENCRYPT_OP Documentation: kvm/sev: separate description of firmware KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP KVM: selftests: Check that PV_UNHALT is cleared when HLT exiting is disabled KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT KVM: x86: Introduce __kvm_get_hypervisor_cpuid() helper KVM: SVM: Return -EINVAL instead of -EBUSY on attempt to re-init SEV/SEV-ES ...
2024-03-28Merge tag 'net-6.9-rc2' of ↵Linus Torvalds1-0/+16
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf, WiFi and netfilter. Current release - regressions: - ipv6: fix address dump when IPv6 is disabled on an interface Current release - new code bugs: - bpf: temporarily disable atomic operations in BPF arena - nexthop: fix uninitialized variable in nla_put_nh_group_stats() Previous releases - regressions: - bpf: protect against int overflow for stack access size - hsr: fix the promiscuous mode in offload mode - wifi: don't always use FW dump trig - tls: adjust recv return with async crypto and failed copy to userspace - tcp: properly terminate timers for kernel sockets - ice: fix memory corruption bug with suspend and rebuild - at803x: fix kernel panic with at8031_probe - qeth: handle deferred cc1 Previous releases - always broken: - bpf: fix bug in BPF_LDX_MEMSX - netfilter: reject table flag and netdev basechain updates - inet_defrag: prevent sk release while still in use - wifi: pick the version of SESSION_PROTECTION_NOTIF - wwan: t7xx: split 64bit accesses to fix alignment issues - mlxbf_gige: call request_irq() after NAPI initialized - hns3: fix kernel crash when devlink reload during pf initialization" * tag 'net-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits) inet: inet_defrag: prevent sk release while still in use Octeontx2-af: fix pause frame configuration in GMP mode net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips net: bcmasp: Remove phy_{suspend/resume} net: bcmasp: Bring up unimac after PHY link up net: phy: qcom: at803x: fix kernel panic with at8031_probe netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_tables.c netfilter: nf_tables: skip netdev hook unregistration if table is dormant netfilter: nf_tables: reject table flag and netdev basechain updates netfilter: nf_tables: reject destroy command to remove basechain hooks bpf: update BPF LSM designated reviewer list bpf: Protect against int overflow for stack access size bpf: Check bloom filter map value size bpf: fix warning for crash_kexec selftests: netdevsim: set test timeout to 10 minutes net: wan: framer: Add missing static inline qualifiers mlxbf_gige: call request_irq() after NAPI initialized tls: get psock ref after taking rxlock to avoid leak selftests: tls: add test with a partially invalid iov tls: adjust recv return with async crypto and failed copy to userspace ...
2024-03-27riscv: mm: implement pgprot_nxJisheng Zhang1-0/+6
commit cca98e9f8b5e ("mm: enforce that vmap can't map pages executable") enforces the W^X protection by not allowing remapping existing pages as executable. Add riscv bits so that riscv can benefit the same protection. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Tested-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231121160637.3856-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-27riscv: compat_vdso: align VDSOAS build logMasahiro Yamada1-1/+1
Add one more space after "VDSOAS" for better alignment in the build log. [Before] LDS arch/riscv/kernel/compat_vdso/compat_vdso.lds VDSOAS arch/riscv/kernel/compat_vdso/rt_sigreturn.o VDSOAS arch/riscv/kernel/compat_vdso/getcpu.o VDSOAS arch/riscv/kernel/compat_vdso/flush_icache.o VDSOAS arch/riscv/kernel/compat_vdso/note.o VDSOLD arch/riscv/kernel/compat_vdso/compat_vdso.so.dbg VDSOSYM include/generated/compat_vdso-offsets.h [After] LDS arch/riscv/kernel/compat_vdso/compat_vdso.lds VDSOAS arch/riscv/kernel/compat_vdso/rt_sigreturn.o VDSOAS arch/riscv/kernel/compat_vdso/getcpu.o VDSOAS arch/riscv/kernel/compat_vdso/flush_icache.o VDSOAS arch/riscv/kernel/compat_vdso/note.o VDSOLD arch/riscv/kernel/compat_vdso/compat_vdso.so.dbg VDSOSYM include/generated/compat_vdso-offsets.h Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231117125843.1058553-1-masahiroy@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-27RISC-V: Update AT_VECTOR_SIZE_ARCH for new AT_MINSIGSTKSZVictor Isaev1-1/+1
"riscv: signal: Report signal frame size to userspace via auxv" (e92f469) has added new constant AT_MINSIGSTKSZ but failed to increment the size of auxv, keeping AT_VECTOR_SIZE_ARCH at 9. This fix correctly increments AT_VECTOR_SIZE_ARCH to 10, following the approach in the commit 94b07c1 ("arm64: signal: Report signal frame size to userspace via auxv"). Link: https://lore.kernel.org/r/73883406.20231215232720@torrio.net Link: https://lore.kernel.org/all/20240102133617.3649-1-victor@torrio.net/ Reported-by: Ivan Komarov <ivan.komarov@dfyz.info> Closes: https://lore.kernel.org/linux-riscv/CY3Z02NYV1C4.11BLB9PLVW9G1@fedora/ Fixes: e92f469b0771 ("riscv: signal: Report signal frame size to userspace via auxv") Signed-off-by: Victor Isaev <isv@google.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-27riscv: Mark __se_sys_* functions __usedSami Tolvanen1-1/+2
Clang doesn't think ___se_sys_* functions used even though they are aliased to __se_sys_*, resulting in -Wunused-function warnings when building rv32. For example: mm/oom_kill.c:1195:1: warning: unused function '___se_sys_process_mrelease' [-Wunused-function] 1195 | SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/syscalls.h:221:36: note: expanded from macro 'SYSCALL_DEFINE2' 221 | #define SYSCALL_DEFINE2(name, ...) SYSCALL_DEFINEx(2, _##name, __VA_ARGS__) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/syscalls.h:231:2: note: expanded from macro 'SYSCALL_DEFINEx' 231 | __SYSCALL_DEFINEx(x, sname, __VA_ARGS__) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/riscv/include/asm/syscall_wrapper.h:81:2: note: expanded from macro '__SYSCALL_DEFINEx' 81 | __SYSCALL_SE_DEFINEx(x, sys, name, __VA_ARGS__) \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/riscv/include/asm/syscall_wrapper.h:40:14: note: expanded from macro '__SYSCALL_SE_DEFINEx' 40 | static long ___se_##prefix##name(__MAP(x,__SC_LONG,__VA_ARGS__)) | ^~~~~~~~~~~~~~~~~~~~ <scratch space>:30:1: note: expanded from here 30 | ___se_sys_process_mrelease | ^~~~~~~~~~~~~~~~~~~~~~~~~~ 1 warning generated. Mark the functions __used explicitly to fix the Clang warnings. Fixes: a9ad73295cc1 ("riscv: Fix syscall wrapper for >word-size arguments") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240326153712.1839482-2-samitolvanen@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-27riscv: compat_vdso: install compat_vdso.so.dbg to /lib/modules/*/vdso/Masahiro Yamada1-1/+1
'make vdso_install' installs debug vdso files to /lib/modules/*/vdso/. Only for the compat vdso on riscv, the installation destination differs; compat_vdso.so.dbg is installed to /lib/module/*/compat_vdso/. To follow the standard install destination and simplify the vdso_install logic, change the install destination to standard /lib/modules/*/vdso/. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231117125807.1058477-1-masahiroy@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-27riscv: hwprobe: do not produce frtace relocationVladimir Isaev1-0/+1
Such relocation causes crash of android linker similar to one described in commit e05d57dcb8c7 ("riscv: Fixup __vdso_gettimeofday broke dynamic ftrace"). Looks like this relocation is added by CONFIG_DYNAMIC_FTRACE which is disabled in the default android kernel. Before: readelf -rW arch/riscv/kernel/vdso/vdso.so: Relocation section '.rela.dyn' at offset 0xd00 contains 1 entry: Offset Info Type 0000000000000d20 0000000000000003 R_RISCV_RELATIVE objdump: 0000000000000c86 <__vdso_riscv_hwprobe@@LINUX_4.15>: c86: 0001 nop c88: 0001 nop c8a: 0001 nop c8c: 0001 nop c8e: e211 bnez a2,c92 <__vdso_riscv_hwprobe... After: readelf -rW arch/riscv/kernel/vdso/vdso.so: There are no relocations in this file. objdump: 0000000000000c86 <__vdso_riscv_hwprobe@@LINUX_4.15>: c86: e211 bnez a2,c8a <__vdso_riscv_hwprobe... c88: c6b9 beqz a3,cd6 <__vdso_riscv_hwprobe... c8a: e739 bnez a4,cd8 <__vdso_riscv_hwprobe... c8c: ffffd797 auipc a5,0xffffd Also disable SCS since it also should not be available in vdso. Fixes: aa5af0aa90ba ("RISC-V: Add hwprobe vDSO function and data") Signed-off-by: Roman Artemev <roman.artemev@syntacore.com> Signed-off-by: Vladimir Isaev <vladimir.isaev@syntacore.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20240313085843.17661-1-vladimir.isaev@syntacore.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-26riscv: Fix spurious errors from __get/put_kernel_nofaultSamuel Holland1-2/+2
These macros did not initialize __kr_err, so they could fail even if the access did not fault. Cc: stable@vger.kernel.org Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240312022030.320789-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-26riscv: mm: Fix prototype to avoid discarding constSamuel Holland1-2/+2
__flush_tlb_range() does not modify the provided cpumask, so its cmask parameter can be pointer-to-const. This avoids the unsafe cast of cpu_online_mask. Fixes: 54d7431af73e ("riscv: Add support for BATCHED_UNMAP_TLB_FLUSH") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240301201837.2826172-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-26Merge tag 'for-netdev' of ↵Paolo Abeni1-0/+16
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-03-25 The following pull-request contains BPF updates for your *net* tree. We've added 17 non-merge commits during the last 12 day(s) which contain a total of 19 files changed, 184 insertions(+), 61 deletions(-). The main changes are: 1) Fix an arm64 BPF JIT bug in BPF_LDX_MEMSX implementation's offset handling found via test_bpf module, from Puranjay Mohan. 2) Various fixups to the BPF arena code in particular in the BPF verifier and around BPF selftests to match latest corresponding LLVM implementation, from Puranjay Mohan and Alexei Starovoitov. 3) Fix xsk to not assume that metadata is always requested in TX completion, from Stanislav Fomichev. 4) Fix riscv BPF JIT's kfunc parameter incompatibility between BPF and the riscv ABI which requires sign-extension on int/uint, from Pu Lehui. 5) Fix s390x BPF JIT's bpf_plt pointer arithmetic which triggered a crash when testing struct_ops, from Ilya Leoshkevich. 6) Fix libbpf's arena mmap handling which had incorrect u64-to-pointer cast on 32-bit architectures, from Andrii Nakryiko. 7) Fix libbpf to define MFD_CLOEXEC when not available, from Arnaldo Carvalho de Melo. 8) Fix arm64 BPF JIT implementation for 32bit unconditional bswap which resulted in an incorrect swap as indicated by test_bpf, from Artem Savkov. 9) Fix BPF man page build script to use silent mode, from Hangbin Liu. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: riscv, bpf: Fix kfunc parameters incompatibility between bpf and riscv abi bpf: verifier: reject addr_space_cast insn without arena selftests/bpf: verifier_arena: fix mmap address for arm64 bpf: verifier: fix addr_space_cast from as(1) to as(0) libbpf: Define MFD_CLOEXEC if not available arm64: bpf: fix 32bit unconditional bswap bpf, arm64: fix bug in BPF_LDX_MEMSX libbpf: fix u64-to-pointer cast on 32-bit arches s390/bpf: Fix bpf_plt pointer arithmetic xsk: Don't assume metadata is always requested in TX completion selftests/bpf: Add arena test case for 4Gbyte corner case selftests/bpf: Remove hard coded PAGE_SIZE macro. libbpf, selftests/bpf: Adjust libbpf, bpftool, selftests to match LLVM bpf: Clarify bpf_arena comments. MAINTAINERS: Update email address for Quentin Monnet scripts/bpf_doc: Use silent mode when exec make cmd bpf: Temporarily disable atomic operations in BPF arena ==================== Link: https://lore.kernel.org/r/20240325213520.26688-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-26RISC-V: KVM: Fix APLIC in_clrip[x] read emulationAnup Patel1-3/+18
The reads to APLIC in_clrip[x] registers returns rectified input values of the interrupt sources. A rectified input value of an interrupt source is defined by the section "4.5.2 Source configurations (sourcecfg[1]–sourcecfg[1023])" of the RISC-V AIA specification as: rectified input value = (incoming wire value) XOR (source is inverted) Update the riscv_aplic_input() implementation to match the above. Cc: stable@vger.kernel.org Fixes: 74967aa208e2 ("RISC-V: KVM: Add in-kernel emulation of AIA APLIC") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240321085041.1955293-3-apatel@ventanamicro.com
2024-03-25riscv, bpf: Fix kfunc parameters incompatibility between bpf and riscv abiPu Lehui1-0/+16
We encountered a failing case when running selftest in no_alu32 mode: The failure case is `kfunc_call/kfunc_call_test4` and its source code is like bellow: ``` long bpf_kfunc_call_test4(signed char a, short b, int c, long d) __ksym; int kfunc_call_test4(struct __sk_buff *skb) { ... tmp = bpf_kfunc_call_test4(-3, -30, -200, -1000); ... } ``` And its corresponding asm code is: ``` 0: r1 = -3 1: r2 = -30 2: r3 = 0xffffff38 # opcode: 18 03 00 00 38 ff ff ff 00 00 00 00 00 00 00 00 4: r4 = -1000 5: call bpf_kfunc_call_test4 ``` insn 2 is parsed to ld_imm64 insn to emit 0x00000000ffffff38 imm, and converted to int type and then send to bpf_kfunc_call_test4. But since it is zero-extended in the bpf calling convention, riscv jit will directly treat it as an unsigned 32-bit int value, and then fails with the message "actual 4294966063 != expected -1234". The reason is the incompatibility between bpf and riscv abi, that is, bpf will do zero-extension on uint, but riscv64 requires sign-extension on int or uint. We can solve this problem by sign extending the 32-bit parameters in kfunc. The issue is related to [0], and thanks to Yonghong and Alexei. Link: https://github.com/llvm/llvm-project/pull/84874 [0] Fixes: d40c3847b485 ("riscv, bpf: Add kfunc support for RV64") Signed-off-by: Pu Lehui <pulehui@huawei.com> Tested-by: Puranjay Mohan <puranjay12@gmail.com> Reviewed-by: Puranjay Mohan <puranjay12@gmail.com> Link: https://lore.kernel.org/r/20240324103306.2202954-1-pulehui@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-25RISC-V: KVM: Fix APLIC setipnum_le/be write emulationAnup Patel1-3/+13
The writes to setipnum_le/be register for APLIC in MSI-mode have special consideration for level-triggered interrupts as-per the section "4.9.2 Special consideration for level-sensitive interrupt sources" of the RISC-V AIA specification. Particularly, the below text from the RISC-V AIA specification defines the behaviour of writes to setipnum_le/be register for level-triggered interrupts: "A second option is for the interrupt service routine to write the APLIC’s source identity number for the interrupt to the domain’s setipnum register just before exiting. This will cause the interrupt’s pending bit to be set to one again if the source is still asserting an interrupt, but not if the source is not asserting an interrupt." Fix setipnum_le/be write emulation for in-kernel APLIC by implementing the above behaviour in aplic_write_pending() function. Cc: stable@vger.kernel.org Fixes: 74967aa208e2 ("RISC-V: KVM: Add in-kernel emulation of AIA APLIC") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240321085041.1955293-2-apatel@ventanamicro.com
2024-03-25RISC-V: KVM: Remove second semicolonColin Ian King1-1/+1
There is a statement with two semicolons. Remove the second one, it is redundant. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240315092914.2431214-1-colin.i.king@gmail.com
2024-03-22Merge tag 'riscv-for-linus-6.9-mw2' of ↵Linus Torvalds68-550/+4316
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for various vector-accelerated crypto routines - Hibernation is now enabled for portable kernel builds - mmap_rnd_bits_max is larger on systems with larger VAs - Support for fast GUP - Support for membarrier-based instruction cache synchronization - Support for the Andes hart-level interrupt controller and PMU - Some cleanups around unaligned access speed probing and Kconfig settings - Support for ACPI LPI and CPPC - Various cleanus related to barriers - A handful of fixes * tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits) riscv: Fix syscall wrapper for >word-size arguments crypto: riscv - add vector crypto accelerated AES-CBC-CTS crypto: riscv - parallelize AES-CBC decryption riscv: Only flush the mm icache when setting an exec pte riscv: Use kcalloc() instead of kzalloc() riscv/barrier: Add missing space after ',' riscv/barrier: Consolidate fence definitions riscv/barrier: Define RISCV_FULL_BARRIER riscv/barrier: Define __{mb,rmb,wmb} RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ cpufreq: Move CPPC configs to common Kconfig and add RISC-V ACPI: RISC-V: Add CPPC driver ACPI: Enable ACPI_PROCESSOR for RISC-V ACPI: RISC-V: Add LPI driver cpuidle: RISC-V: Move few functions to arch/riscv riscv: Introduce set_compat_task() in asm/compat.h riscv: Introduce is_compat_thread() into compat.h riscv: add compile-time test into is_compat_task() riscv: Replace direct thread flag check with is_compat_task() riscv: Improve arch_get_mmap_end() macro ...
2024-03-22Merge tag 'kbuild-v6.9' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Generate a list of built DTB files (arch/*/boot/dts/dtbs-list) - Use more threads when building Debian packages in parallel - Fix warnings shown during the RPM kernel package uninstallation - Change OBJECT_FILES_NON_STANDARD_*.o etc. to take a relative path to Makefile - Support GCC's -fmin-function-alignment flag - Fix a null pointer dereference bug in modpost - Add the DTB support to the RPM package - Various fixes and cleanups in Kconfig * tag 'kbuild-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (67 commits) kconfig: tests: test dependency after shuffling choices kconfig: tests: add a test for randconfig with dependent choices kconfig: tests: support KCONFIG_SEED for the randconfig runner kbuild: rpm-pkg: add dtb files in kernel rpm kconfig: remove unneeded menu_is_visible() call in conf_write_defconfig() kconfig: check prompt for choice while parsing kconfig: lxdialog: remove unused dialog colors kconfig: lxdialog: fix button color for blackbg theme modpost: fix null pointer dereference kbuild: remove GCC's default -Wpacked-bitfield-compat flag kbuild: unexport abs_srctree and abs_objtree kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1 kconfig: remove named choice support kconfig: use linked list in get_symbol_str() to iterate over menus kconfig: link menus to a symbol kbuild: fix inconsistent indentation in top Makefile kbuild: Use -fmin-function-alignment when available alpha: merge two entries for CONFIG_ALPHA_GAMMA alpha: merge two entries for CONFIG_ALPHA_EV4 kbuild: change DTC_FLAGS_<basetarget>.o to take the path relative to $(obj) ...
2024-03-20riscv: Fix syscall wrapper for >word-size argumentsSami Tolvanen1-14/+39
The current syscall wrapper macros break 64-bit arguments on rv32 because they only guarantee the first N input registers are passed to syscalls that accept N arguments. According to the calling convention, values twice the word size reside in register pairs and as a result, syscall arguments don't always have a direct register mapping on rv32. Instead of using `__MAP(x,__SC_LONG,__VA_ARGS__)` to declare the type of the `__se(_compat)_sys_*` functions on rv32, change the function declarations to accept `ulong` arguments and alias them to the actual syscall implementations, similarly to the existing macros in include/linux/syscalls.h. This matches previous behavior and ensures registers are passed to syscalls as-is, no matter which argument types they expect. Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers") Reported-by: Khem Raj <raj.khem@gmail.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20240311193143.2981310-2-samitolvanen@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20Merge patch series "riscv/barrier: tidying up barrier-related macro"Palmer Dabbelt7-32/+36
Eric Chan <ericchancf@google.com> says: This series makes barrier-related macro more neat and clear. This is a follow-up to [0-3], change to multiple patches, for readability, create new message thread. [0](v1/v2) https://lore.kernel.org/lkml/20240209125048.4078639-1-ericchancf@google.com/ [1] (v3) https://lore.kernel.org/lkml/20240213142856.2416073-1-ericchancf@google.com/ [2] (v4) https://lore.kernel.org/lkml/20240213200923.2547570-1-ericchancf@google.com/ [4] (v5) https://lore.kernel.org/lkml/20240213223810.2595804-1-ericchancf@google.com/ * b4-shazam-merge: riscv/barrier: Add missing space after ',' riscv/barrier: Consolidate fence definitions riscv/barrier: Define RISCV_FULL_BARRIER riscv/barrier: Define __{mb,rmb,wmb} Link: https://lore.kernel.org/r/20240217131206.3667544-1-ericchancf@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20crypto: riscv - add vector crypto accelerated AES-CBC-CTSEric Biggers3-5/+245
Add an implementation of cts(cbc(aes)) accelerated using the Zvkned RISC-V vector crypto extension. This is mainly useful for fscrypt, where cts(cbc(aes)) is the "default" filenames encryption algorithm. In that use case, typically most messages are short and are block-aligned. The CBC-CTS variant implemented is CS3; this is the variant Linux uses. To perform well on short messages, the new implementation processes the full message in one call to the assembly function if the data is contiguous. Otherwise it falls back to CBC operations followed by CTS at the end. For decryption, to further improve performance on short messages, especially block-aligned messages, the CBC-CTS assembly function parallelizes the AES decryption of all full blocks. This improves on the arm64 implementation of cts(cbc(aes)), which always splits the CBC part(s) from the CTS part, doing the AES decryptions for the last two blocks serially and usually loading the round keys twice. Tested in QEMU with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240213055442.35954-1-ebiggers@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20crypto: riscv - parallelize AES-CBC decryptionEric Biggers1-9/+15
Since CBC decryption is parallelizable, make the RISC-V implementation of AES-CBC decryption process multiple blocks at a time, instead of processing the blocks one by one. This should improve performance. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240208060851.154129-1-ebiggers@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20Merge patch series "RISC-V: ACPI: Enable CPPC based cpufreq support"Palmer Dabbelt1-0/+1
Sunil V L <sunilvl@ventanamicro.com> says: This series enables the support for "Collaborative Processor Performance Control (CPPC) on ACPI based RISC-V platforms. It depends on the encoding of CPPC registers as defined in RISC-V FFH spec [2]. CPPC is described in the ACPI spec [1]. RISC-V FFH spec required to enable this, is available at [2]. [1] - https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#collaborative-processor-performance-control [2] - https://github.com/riscv-non-isa/riscv-acpi-ffh/releases/download/v1.0.0/riscv-ffh.pdf * b4-shazam-merge: RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ cpufreq: Move CPPC configs to common Kconfig and add RISC-V ACPI: RISC-V: Add CPPC driver Link: https://lore.kernel.org/r/20240208034414.22579-1-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv: Only flush the mm icache when setting an exec pteAlexandre Ghiti3-10/+10
We used to emit a flush_icache_all() whenever a dirty executable mapping is set in the page table but we can instead call flush_icache_mm() which will only send IPIs to cores that currently run this mm and add a deferred icache flush to the others. The number of calls to sbi_remote_fence_i() (tested without IPI support): With a simple buildroot rootfs: * Before: ~5k * After : 4 (!) Tested on HW, the boot to login is ~4.5% faster. With an ubuntu rootfs: * Before: ~24k * After : ~13k Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240202124711.256146-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv: Use kcalloc() instead of kzalloc()Erick Archer1-2/+1
As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. So, use the purpose specific kcalloc() function instead of the argument count * size in the kzalloc() function. Also, it is preferred to use sizeof(*pointer) instead of sizeof(type) due to the type of the variable can change and one needs not change the former (unlike the latter). Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] Link: https://github.com/KSPP/linux/issues/162 Signed-off-by: Erick Archer <erick.archer@gmx.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20240120135400.4710-1-erick.archer@gmx.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20Merge patch series "RISC-V: ACPI: Add LPI support"Palmer Dabbelt2-0/+52
Sunil V L <sunilvl@ventanamicro.com> says: This series adds support for Low Power Idle (LPI) on ACPI based platforms. LPI is described in the ACPI spec [1]. RISC-V FFH spec required to enable this is available at [2]. [1] - https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#lpi-low-power-idle-states [2] - https://github.com/riscv-non-isa/riscv-acpi-ffh/releases/download/v/riscv-ffh.pdf * b4-shazam-merge: ACPI: Enable ACPI_PROCESSOR for RISC-V ACPI: RISC-V: Add LPI driver cpuidle: RISC-V: Move few functions to arch/riscv Link: https://lore.kernel.org/r/20240118062930.245937-1-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20Merge patch series "riscv: Introduce compat-mode helpers & improve ↵Palmer Dabbelt5-24/+34
arch_get_mmap_end()" Leonardo Bras <leobras@redhat.com> says: I just saw the opportunity of optimizing the helper is_compat_task() by introducing a compile-time test, and it made possible to remove some #ifdef's without any loss of performance. I also saw the possibility of removing the direct check of task flags from general code, and concentrated it in asm/compat.h by creating a few more helpers, which in the end helped optimize code. arch_get_mmap_end() just got a simple improvement and some extra docs. * b4-shazam-merge: riscv: Introduce set_compat_task() in asm/compat.h riscv: Introduce is_compat_thread() into compat.h riscv: add compile-time test into is_compat_task() riscv: Replace direct thread flag check with is_compat_task() riscv: Improve arch_get_mmap_end() macro Link: https://lore.kernel.org/r/20240103160024.70305-2-leobras@redhat.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv/barrier: Add missing space after ','Eric Chan1-6/+6
The past form of RISCV_FENCE would cause checkpatch.pl to issue error messages, the example is as follows: ERROR: space required after that ',' (ctx:VxV) 26: FILE: arch/riscv/include/asm/barrier.h:27: +#define __smp_mb() RISCV_FENCE(rw,rw) ^ fix the remaining of RISCV_FENCE. Signed-off-by: Eric Chan <ericchancf@google.com> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Tested-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240217131328.3669364-1-ericchancf@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv/barrier: Consolidate fence definitionsEric Chan7-14/+16
Disparate fence implementations are consolidated into fence.h. Also introduce RISCV_FENCE_ASM to make fence macro more reusable. Signed-off-by: Eric Chan <ericchancf@google.com> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Tested-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240217131316.3668927-1-ericchancf@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv/barrier: Define RISCV_FULL_BARRIEREric Chan3-10/+12
Introduce RISCV_FULL_BARRIER and use in arch_atomic* function. like RISCV_ACQUIRE_BARRIER and RISCV_RELEASE_BARRIER, the fence instruction can be eliminated When SMP is not enabled. Signed-off-by: Eric Chan <ericchancf@google.com> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Tested-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240217131302.3668481-1-ericchancf@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv/barrier: Define __{mb,rmb,wmb}Eric Chan1-3/+3
Introduce __{mb,rmb,wmb}, and rely on the generic definitions for {mb,rmb,wmb}. Although KCSAN is not supported yet, the definitions can be made more consistent with generic instrumentation. Also add a space to make the changes pass check by checkpatch.pl. Without the space, the error message is as below: ERROR: space required after that ',' (ctx:VxV) 26: FILE: arch/riscv/include/asm/barrier.h:23: +#define __mb() RISCV_FENCE(iorw,iorw) ^ Signed-off-by: Eric Chan <ericchancf@google.com> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Tested-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240217131249.3668103-1-ericchancf@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQSunil V L1-0/+1
CONFIG_ACPI_CPPC_CPUFREQ is required to enable CPPC for RISC-V. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20240208034414.22579-4-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20cpuidle: RISC-V: Move few functions to arch/riscvSunil V L2-0/+52
To support ACPI Low Power Idle (LPI), few functions are required which are currently static functions in the DT based cpuidle driver. Hence, move them under arch/riscv so that ACPI driver also can use them. Since they are no longer static functions, append "riscv_" prefix to the function name. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20240118062930.245937-2-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv: Introduce set_compat_task() in asm/compat.hLeonardo Bras2-4/+9
In order to have all task compat bit access directly in compat.h, introduce set_compat_task() to set/reset those when needed. Also, since it's only used on an if/else scenario, simplify the macro using it. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20240103160024.70305-7-leobras@redhat.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv: Introduce is_compat_thread() into compat.hLeonardo Bras2-3/+11
task_user_regset_view() makes use of a function very similar to is_compat_task(), but pointing to a any thread. In arm64 asm/compat.h there is a function very similar to that: is_compat_thread(struct thread_info *thread) Copy this function to riscv asm/compat.h and make use of it into task_user_regset_view(). Also, introduce a compile-time test for CONFIG_COMPAT and simplify the function code by removing the #ifdef. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20240103160024.70305-6-leobras@redhat.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv: add compile-time test into is_compat_task()Leonardo Bras4-12/+5
Currently several places will test for CONFIG_COMPAT before testing is_compat_task(), probably in order to avoid a run-time test into the task structure. Since is_compat_task() is an inlined function, it would be helpful to add a compile-time test of CONFIG_COMPAT, making sure it always returns zero when the option is not enabled during the kernel build. With this, the compiler is able to understand in build-time that is_compat_task() will always return 0, and optimize-out some of the extra code introduced by the option. This will also allow removing a lot #ifdefs that were introduced, and make the code more clean. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20240103160024.70305-5-leobras@redhat.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv: Replace direct thread flag check with is_compat_task()Leonardo Bras2-2/+2
There is some code that detects compat mode into a task by checking the flag directly, and other code that check using the helper is_compat_task(). Since the helper already exists, use it instead of checking the flags directly. Signed-off-by: Leonardo Bras <leobras@redhat.com> Link: https://lore.kernel.org/r/20240103160024.70305-4-leobras@redhat.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-20riscv: Improve arch_get_mmap_end() macroLeonardo Bras1-3/+9
This macro caused me some confusion, which took some reviewer's time to make it clear, so I propose adding a short comment in code to avoid confusion in the future. Also, added some improvements to the macro, such as removing the assumption of VA_USER_SV57 being the largest address space. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20240103160024.70305-3-leobras@redhat.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>