summaryrefslogtreecommitdiff
path: root/arch/riscv
AgeCommit message (Collapse)AuthorFilesLines
2024-04-30Merge patch series "riscv: Create and document PR_RISCV_SET_ICACHE_FLUSH_CTX ↵Palmer Dabbelt6-9/+159
prctl" Charlie Jenkins <charlie@rivosinc.com> says: Improve the performance of icache flushing by creating a new prctl flag PR_RISCV_SET_ICACHE_FLUSH_CTX. The interface is left generic to allow for future expansions such as with the proposed J extension [1]. Documentation is also provided to explain the use case. Patch sent to add PR_RISCV_SET_ICACHE_FLUSH_CTX to man-pages [2]. [1] https://github.com/riscv/riscv-j-extension [2] https://lore.kernel.org/linux-man/20240124-fencei_prctl-v1-1-0bddafcef331@rivosinc.com * b4-shazam-merge: cpumask: Add assign cpu documentation: Document PR_RISCV_SET_ICACHE_FLUSH_CTX prctl riscv: Include riscv_set_icache_flush_ctx prctl riscv: Remove unnecessary irqflags processor.h include Link: https://lore.kernel.org/r/20240312-fencei-v13-0-4b6bdc2bbf32@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-30Merge patch series "riscv: fix patching with IPI"Palmer Dabbelt3-9/+53
Alexandre Ghiti <alexghiti@rivosinc.com> says: patch 1 removes a useless memory barrier and patch 2 actually fixes the issue with IPI in the patching code. * b4-shazam-merge: riscv: Fix text patching when IPI are used riscv: Remove superfluous smp_mb() Link: https://lore.kernel.org/r/20240229121056.203419-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29Merge tag 'for-netdev' of ↵Jakub Kicinski3-2/+205
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-04-29 We've added 147 non-merge commits during the last 32 day(s) which contain a total of 158 files changed, 9400 insertions(+), 2213 deletions(-). The main changes are: 1) Add an internal-only BPF per-CPU instruction for resolving per-CPU memory addresses and implement support in x86 BPF JIT. This allows inlining per-CPU array and hashmap lookups and the bpf_get_smp_processor_id() helper, from Andrii Nakryiko. 2) Add BPF link support for sk_msg and sk_skb programs, from Yonghong Song. 3) Optimize x86 BPF JIT's emit_mov_imm64, and add support for various atomics in bpf_arena which can be JITed as a single x86 instruction, from Alexei Starovoitov. 4) Add support for passing mark with bpf_fib_lookup helper, from Anton Protopopov. 5) Add a new bpf_wq API for deferring events and refactor sleepable bpf_timer code to keep common code where possible, from Benjamin Tissoires. 6) Fix BPF_PROG_TEST_RUN infra with regards to bpf_dummy_struct_ops programs to check when NULL is passed for non-NULLable parameters, from Eduard Zingerman. 7) Harden the BPF verifier's and/or/xor value tracking, from Harishankar Vishwanathan. 8) Introduce crypto kfuncs to make BPF programs able to utilize the kernel crypto subsystem, from Vadim Fedorenko. 9) Various improvements to the BPF instruction set standardization doc, from Dave Thaler. 10) Extend libbpf APIs to partially consume items from the BPF ringbuffer, from Andrea Righi. 11) Bigger batch of BPF selftests refactoring to use common network helpers and to drop duplicate code, from Geliang Tang. 12) Support bpf_tail_call_static() helper for BPF programs with GCC 13, from Jose E. Marchesi. 13) Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF program to have code sections where preemption is disabled, from Kumar Kartikeya Dwivedi. 14) Allow invoking BPF kfuncs from BPF_PROG_TYPE_SYSCALL programs, from David Vernet. 15) Extend the BPF verifier to allow different input maps for a given bpf_for_each_map_elem() helper call in a BPF program, from Philo Lu. 16) Add support for PROBE_MEM32 and bpf_addr_space_cast instructions for riscv64 and arm64 JITs to enable BPF Arena, from Puranjay Mohan. 17) Shut up a false-positive KMSAN splat in interpreter mode by unpoison the stack memory, from Martin KaFai Lau. 18) Improve xsk selftest coverage with new tests on maximum and minimum hardware ring size configurations, from Tushar Vyavahare. 19) Various ReST man pages fixes as well as documentation and bash completion improvements for bpftool, from Rameez Rehman & Quentin Monnet. 20) Fix libbpf with regards to dumping subsequent char arrays, from Quentin Deslandes. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (147 commits) bpf, docs: Clarify PC use in instruction-set.rst bpf_helpers.h: Define bpf_tail_call_static when building with GCC bpf, docs: Add introduction for use in the ISA Internet Draft selftests/bpf: extend BPF_SOCK_OPS_RTT_CB test for srtt and mrtt_us bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args selftests/bpf: dummy_st_ops should reject 0 for non-nullable params bpf: check bpf_dummy_struct_ops program params for test runs selftests/bpf: do not pass NULL for non-nullable params in dummy_st_ops selftests/bpf: adjust dummy_st_ops_success to detect additional error bpf: mark bpf_dummy_struct_ops.test_1 parameter as nullable selftests/bpf: Add ring_buffer__consume_n test. bpf: Add bpf_guard_preempt() convenience macro selftests: bpf: crypto: add benchmark for crypto functions selftests: bpf: crypto skcipher algo selftests bpf: crypto: add skcipher to bpf crypto bpf: make common crypto API for TC/XDP programs bpf: update the comment for BTF_FIELDS_MAX selftests/bpf: Fix wq test. selftests/bpf: Use make_sockaddr in test_sock_addr selftests/bpf: Use connect_to_addr in test_sock_addr ... ==================== Link: https://lore.kernel.org/r/20240429131657.19423-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-29riscv: mm: Always use an ASID to flush mm contextsSamuel Holland1-2/+1
Even if multiple ASIDs are not supported, using the single-ASID variant of the sfence.vma instruction preserves TLB entries for global (kernel) pages. So it is always more efficient to use the single-ASID code path. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240327045035.368512-14-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: mm: Preserve global TLB entries when switching contextsSamuel Holland1-1/+1
If the CPU does not support multiple ASIDs, all MM contexts use ASID 0. In this case, it is still beneficial to flush the TLB by ASID, as the single-ASID variant of the sfence.vma instruction preserves TLB entries for global (kernel) pages. This optimization is recommended by the RISC-V privileged specification: If the implementation does not provide ASIDs, or software chooses to always use ASID 0, then after every satp write, software should execute SFENCE.VMA with rs1=x0. In the common case that no global translations have been modified, rs2 should be set to a register other than x0 but which contains the value zero, so that global translations are not flushed. It is not possible to apply this optimization when using the ASID allocator, because that code must flush the TLB for all ASIDs at once when incrementing the version number. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240327045035.368512-13-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: mm: Make asid_bits a local variableSamuel Holland1-2/+1
This variable is only used inside asids_init(). Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240327045035.368512-12-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: mm: Use a fixed layout for the MM context IDSamuel Holland3-8/+4
Currently, the size of the ASID field in the MM context ID dynamically depends on the number of hardware-supported ASID bits. This requires reading a global variable to extract either field from the context ID. Instead, allocate the maximum possible number of bits to the ASID field, so the layout of the context ID is known at compile-time. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240327045035.368512-11-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: mm: Introduce cntx2asid/cntx2version helper macrosSamuel Holland3-7/+10
When using the ASID allocator, the MM context ID contains two values: the ASID in the lower bits, and the allocator version number in the remaining bits. Use macros to make this separation more obvious. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240327045035.368512-10-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: Avoid TLB flush loops when affected by SiFive CIP-1200Samuel Holland3-1/+8
Implementations affected by SiFive errata CIP-1200 have a bug which forces the kernel to always use the global variant of the sfence.vma instruction. When affected by this errata, do not attempt to flush a range of addresses; each iteration of the loop would actually flush the whole TLB instead. Instead, minimize the overall number of sfence.vma instructions. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://lore.kernel.org/r/20240327045035.368512-9-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vmaSamuel Holland3-25/+29
commit 3f1e782998cd ("riscv: add ASID-based tlbflushing methods") added calls to the sfence.vma instruction with rs2 != x0. These single-ASID instruction variants are also affected by SiFive errata CIP-1200. Until now, the errata workaround was not needed for the single-ASID sfence.vma variants, because they were only used when the ASID allocator was enabled, and the affected SiFive platforms do not support multiple ASIDs. However, we are going to start using those sfence.vma variants regardless of ASID support, so now we need alternatives covering them. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240327045035.368512-8-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: mm: Combine the SMP and UP TLB flush codeSamuel Holland3-33/+5
In SMP configurations, all TLB flushing narrower than flush_tlb_all() goes through __flush_tlb_range(). Do the same in UP configurations. This allows UP configurations to take advantage of recent improvements to the code in tlbflush.c, such as support for huge pages and flushing multiple-page ranges. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://lore.kernel.org/r/20240327045035.368512-7-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: Only send remote fences when some other CPU is onlineSamuel Holland2-2/+6
If no other CPU is online, a local cache or TLB flush is sufficient. These checks can be constant-folded when SMP is disabled. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240327045035.368512-6-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: mm: Broadcast kernel TLB flushes only when neededSamuel Holland1-13/+5
__flush_tlb_range() avoids broadcasting TLB flushes when an mm context is only active on the local CPU. Apply this same optimization to TLB flushes of kernel memory when only one CPU is online. This check can be constant-folded when SMP is disabled. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240327045035.368512-5-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: Use IPIs for remote cache/TLB flushes by defaultSamuel Holland7-47/+37
An IPI backend is always required in an SMP configuration, but an SBI implementation is not. For example, SBI will be unavailable when the kernel runs in M mode. For this reason, consider IPI delivery of cache and TLB flushes to be the base case, and any other implementation (such as the SBI remote fence extension) to be an optimization. Generally, if IPIs can be delivered without firmware assistance, they are assumed to be faster than SBI calls due to the SBI context switch overhead. However, when SBI is used as the IPI backend, then the context switch cost must be paid anyway, and performing the cache/TLB flush directly in the SBI implementation is more efficient than injecting an interrupt to S-mode. This is the only existing scenario where riscv_ipi_set_virq_range() is called with use_for_rfence set to false. sbi_ipi_init() already checks riscv_ipi_have_virq_range(), so it only calls riscv_ipi_set_virq_range() when no other IPI device is available. This allows moving the static key and dropping the use_for_rfence parameter. This decouples the static key from the irqchip driver probe order. Furthermore, the static branch only makes sense when CONFIG_RISCV_SBI is enabled. Optherwise, IPIs must be used. Add a fallback definition of riscv_use_sbi_for_rfence() which handles this case and removes the need to check CONFIG_RISCV_SBI elsewhere, such as in cacheflush.c. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240327045035.368512-4-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: Factor out page table TLB synchronizationSamuel Holland1-18/+13
The logic is the same for all page table levels. See commit 69be3fb111e7 ("riscv: enable MMU_GATHER_RCU_TABLE_FREE for SMP && MMU"). Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240327045035.368512-3-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: Flush the instruction cache during SMP bringupSamuel Holland1-3/+4
Instruction cache flush IPIs are sent only to CPUs in cpu_online_mask, so they will not target a CPU until it calls set_cpu_online() earlier in smp_callin(). As a result, if instruction memory is modified between the CPU coming out of reset and that point, then its instruction cache may contain stale data. Therefore, the instruction cache must be flushed after the set_cpu_online() synchronization point. Fixes: 08f051eda33b ("RISC-V: Flush I$ when making a dirty page executable") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240327045035.368512-2-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29Merge tag 'renesas-dts-for-v6.10-tag2' of ↵Arnd Bergmann2-16/+75
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into soc/dt Renesas DTS updates for v6.10 (take two) - Add external interrupt (IRQC) support for the RZ/Five SoC, - Add SPI (MSIOF), external interrupt (INTC-EX), and IOMMU support for the R-Car V4M SoC, - Miscellaneous fixes and improvements. * tag 'renesas-dts-for-v6.10-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel: arm64: dts: renesas: r8a779h0: Link IOMMU consumers arm64: dts: renesas: r8a779h0: Add IPMMU nodes arm64: dts: renesas: r8a779h0: Add INTC-EX node arm64: dts: renesas: r8a779h0: Add MSIOF nodes arm64: dts: renesas: rzg3s-smarc-som: Enable eMMC by default riscv: dts: renesas: rzfive-smarc-som: Drop deleting interrupt properties from ETH0/1 nodes arm64: dts: renesas: r9a07g043: Move interrupt-parent property to common DTSI riscv: dts: renesas: r9a07g043f: Add IRQC node to RZ/Five SoC DTSI arm64: dts: renesas: s4sk: Fix ethernet0 alias Link: https://lore.kernel.org/r/cover.1714116737.git.geert+renesas@glider.be Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-04-29riscv: hwprobe: export Zihintpause ISA extensionClément Léger2-0/+2
Export the Zihintpause ISA extension through hwprobe which allows using "pause" instructions. Some userspace applications (OpenJDK for instance) uses this to handle some locking back-off. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240221083108.1235311-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: misaligned: remove CONFIG_RISCV_M_MODE specific codeClément Léger1-89/+17
While reworking code to fix sparse errors, it appears that the RISCV_M_MODE specific could actually be removed and use the one for normal mode. Even though RISCV_M_MODE can do direct user memory access, using the user uaccess helpers is also going to work. Since there is no need anymore for specific accessors (load_u8()/store_u8()), we can directly use memcpy()/copy_{to/from}_user() and get rid of the copy loop entirely. __read_insn() is also fixed to use an unsigned long instead of a pointer which was cast in __user address space. The insn_addr parameter is now cast from unsigned lnog to the correct address space directly. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240206154104.896809-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29riscv: Do not save the scratch CSR during suspendSamuel Holland2-3/+1
While the processor is executing kernel code, the value of the scratch CSR is always zero, so there is no need to save the value. Continue to write the CSR during the resume flow, so we do not rely on firmware to initialize it. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240312195641.1830521-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29Merge patch series "riscv: 64-bit NOMMU fixes and enhancements"Palmer Dabbelt4-10/+13
Samuel Holland <samuel.holland@sifive.com> says: This series aims to improve support for NOMMU, specifically by making it easier to test NOMMU kernels in QEMU and on various widely-available hardware (errata permitting). After all, everything supports Svbare... After applying this series, a NOMMU kernel based on defconfig (changing only the three options below*) boots to userspace on QEMU when passed as -kernel. # CONFIG_RISCV_M_MODE is not set # CONFIG_MMU is not set CONFIG_NONPORTABLE=y *if you are using LLD, you must also disable BPF_SYSCALL and KALLSYMS, because LLD bails on out-of-range references to undefined weak symbols. * b4-shazam-merge: riscv: Allow NOMMU kernels to run in S-mode riscv: Remove MMU dependency from Zbb and Zicboz riscv: Fix loading 64-bit NOMMU kernels past the start of RAM riscv: Fix TASK_SIZE on 64-bit NOMMU Link: https://lore.kernel.org/r/20240227003630.3634533-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29RISC-V: enable building 64-bit kernels with rust supportMiguel Ojeda2-0/+8
The rust modules work on 64-bit RISC-V, with no twiddling required. Select HAVE_RUST and provide the required flags to kbuild so that the modules can be used. The Makefile and Kconfig changes are lifted from work done by Miguel in the Rust-for-Linux tree, hence his authorship. Following the rabbit hole, the Makefile changes originated in a script, created based on config files originally added by Gary, hence his co-authorship. 32-bit is broken in core rust code, so support is limited to 64-bit: ld.lld: error: undefined symbol: __udivdi3 As 64-bit RISC-V is now supported, add it to the arch support table. Co-developed-by: Gary Guo <gary@garyguo.net> Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Co-developed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240409-silencer-book-ce1320f06aab@spud Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-29Merge patch series "Rework & improve riscv cmpxchg.h and atomic.h"Palmer Dabbelt2-368/+200
Leonardo Bras <leobras@redhat.com> says: While studying riscv's cmpxchg.h file, I got really interested in understanding how RISCV asm implemented the different versions of {cmp,}xchg. When I understood the pattern, it made sense for me to remove the duplications and create macros to make it easier to understand what exactly changes between the versions: Instruction sufixes & barriers. Also, did the same kind of work on atomic.c. After that, I noted both cmpxchg and xchg only accept variables of size 4 and 8, compared to x86 and arm64 which do 1,2,4,8. Now that deduplication is done, it is quite direct to implement them for variable sizes 1 and 2, so I did it. Then Guo Ren already presented me some possible users :) I did compare the generated asm on a test.c that contained usage for every changed function, and could not detect any change on patches 1 + 2 + 3 compared with upstream. Pathes 4 & 5 were compiled-tested, merged with guoren/qspinlock_v11 and booted just fine with qemu -machine virt -append "qspinlock". (tree: https://gitlab.com/LeoBras/linux/-/commits/guo_qspinlock_v11) Latest tests happened based on this tree: https://github.com/guoren83/linux/tree/qspinlock_v12 * b4-shazam-lts: riscv/cmpxchg: Implement xchg for variables of size 1 and 2 riscv/cmpxchg: Implement cmpxchg for variables of size 1 and 2 riscv/atomic.h : Deduplicate arch_atomic.* riscv/cmpxchg: Deduplicate cmpxchg() asm and macros riscv/cmpxchg: Deduplicate xchg() asm functions Link: https://lore.kernel.org/r/20240103163203.72768-2-leobras@redhat.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-27Merge tag 'riscv-for-linus-6.9-rc6' of ↵Linus Torvalds7-27/+33
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel separation - A fix to avoid loading rv64/NOMMU kernel past the start of RAM - A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer overflow in the bitmask - The sud_test kselftest has been fixed to properly swizzle the syscall number into the return register, which are not the same on RISC-V - A fix for a build warning in the perf tools on rv32 - A fix for the CBO selftests, to avoid non-constants leaking into the inline asm - A pair of fixes for T-Head PBMT errata probing, which has been renamed MAE by the vendor * tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2 perf riscv: Fix the warning due to the incompatible type riscv: T-Head: Test availability bit before enabling MAE errata riscv: thead: Rename T-Head PBMT to MAE selftests: sud_test: return correct emulated syscall value on RISC-V riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN riscv: Fix loading 64-bit NOMMU kernels past the start of RAM riscv: Fix TASK_SIZE on 64-bit NOMMU
2024-04-27Merge tag 'for-netdev' of ↵Jakub Kicinski1-3/+3
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-04-26 We've added 12 non-merge commits during the last 22 day(s) which contain a total of 14 files changed, 168 insertions(+), 72 deletions(-). The main changes are: 1) Fix BPF_PROBE_MEM in verifier and JIT to skip loads from vsyscall page, from Puranjay Mohan. 2) Fix a crash in XDP with devmap broadcast redirect when the latter map is in process of being torn down, from Toke Høiland-Jørgensen. 3) Fix arm64 and riscv64 BPF JITs to properly clear start time for BPF program runtime stats, from Xu Kuohai. 4) Fix a sockmap KCSAN-reported data race in sk_psock_skb_ingress_enqueue, from Jason Xing. 5) Fix BPF verifier error message in resolve_pseudo_ldimm64, from Anton Protopopov. 6) Fix missing DEBUG_INFO_BTF_MODULES Kconfig menu item, from Andrii Nakryiko. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Test PROBE_MEM of VSYSCALL_ADDR on x86-64 bpf, x86: Fix PROBE_MEM runtime load check bpf: verifier: prevent userspace memory access xdp: use flags field to disambiguate broadcast redirect arm32, bpf: Reimplement sign-extension mov instruction riscv, bpf: Fix incorrect runtime stats bpf, arm64: Fix incorrect runtime stats bpf: Fix a verifier verbose message bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue MAINTAINERS: bpf: Add Lehui and Puranjay as riscv64 reviewers MAINTAINERS: Update email address for Puranjay Mohan bpf, kconfig: Fix DEBUG_INFO_BTF_MODULES Kconfig definition ==================== Link: https://lore.kernel.org/r/20240426224248.26197-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26Merge patch series "RISC-V: Test th.sxstatus.MAEE bit before enabling MAEE"Palmer Dabbelt3-23/+29
Christoph Müllner <christoph.muellner@vrull.eu> says: Currently, the Linux kernel suffers from a boot regression when running on the c906 QEMU emulation. Details have been reported here by Björn Töpel: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg04766.html The main issue is, that Linux enables XTheadMae for CPUs that have a T-Head mvendorid but QEMU maintainers don't want to emulate a CPU that uses reserved bits in PTEs. See also the following discussion for more context: https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg00775.html This series renames "T-Head PBMT" to "MAE"/"XTheadMae" and only enables it if the th.sxstatus.MAEE bit is set. The th.sxstatus CSR is documented here: https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadsxstatus.adoc XTheadMae is documented here: https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadmae.adoc The QEMU patch to emulate th.sxstatus with the MAEE bit not set is here: https://lore.kernel.org/all/20240329120427.684677-1-christoph.muellner@vrull.eu/ After applying the referenced QEMU patch, this patchset allows to successfully boot a C906 QEMU system emulation ("-cpu thead-c906"). * b4-shazam-lts: riscv: T-Head: Test availability bit before enabling MAE errata riscv: thead: Rename T-Head PBMT to MAE Link: https://lore.kernel.org/r/20240407213236.2121592-1-christoph.muellner@vrull.eu Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-26dma-mapping: Simplify arch_setup_dma_ops()Robin Murphy1-2/+1
The dma_base, size and iommu arguments are only used by ARM, and can now easily be deduced from the device itself, so there's no need to pass them through the callchain as well. Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Michael Kelley <mhklinux@outlook.com> # For Hyper-V Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/5291c2326eab405b1aa7693aa964e8d3cb7193de.1713523152.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26RISC-V: KVM: Improve firmware counter read functionAtish Patra3-3/+8
Rename the function to indicate that it is meant for firmware counter read. While at it, add a range sanity check for it as well. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240420151741.962500-17-atishp@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-26RISC-V: KVM: Support 64 bit firmware counters on RV32Atish Patra3-2/+52
The SBI v2.0 introduced a fw_read_hi function to read 64 bit firmware counters for RV32 based systems. Add infrastructure to support that. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240420151741.962500-16-atishp@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-26RISC-V: KVM: Add perf sampling support for guestsAtish Patra7-8/+93
KVM enables perf for guest via counter virtualization. However, the sampling can not be supported as there is no mechanism to enabled trap/emulate scountovf in ISA yet. Rely on the SBI PMU snapshot to provide the counter overflow data via the shared memory. In case of sampling event, the host first sets the guest's LCOFI interrupt and injects to the guest via irq filtering mechanism defined in AIA specification. Thus, ssaia must be enabled in the host in order to use perf sampling in the guest. No other AIA dependency w.r.t kernel is required. Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240420151741.962500-15-atishp@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-26RISC-V: KVM: Implement SBI PMU Snapshot featureAtish Patra3-1/+130
PMU Snapshot function allows to minimize the number of traps when the guest access configures/access the hpmcounters. If the snapshot feature is enabled, the hypervisor updates the shared memory with counter data and state of overflown counters. The guest can just read the shared memory instead of trap & emulate done by the hypervisor. This patch doesn't implement the counter overflow yet. Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240420151741.962500-14-atishp@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-26RISC-V: KVM: No need to exit to the user space if perf event failedAtish Patra2-8/+12
Currently, we return a linux error code if creating a perf event failed in kvm. That shouldn't be necessary as guest can continue to operate without perf profiling or profiling with firmware counters. Return appropriate SBI error code to indicate that PMU configuration failed. An error message in kvm already describes the reason for failure. Fixes: 0cb74b65d2e5 ("RISC-V: KVM: Implement perf support without sampling") Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240420151741.962500-13-atishp@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-26RISC-V: KVM: No need to update the counter value during resetAtish Patra1-6/+2
The virtual counter value is updated during pmu_ctr_read. There is no need to update it in reset case. Otherwise, it will be counted twice which is incorrect. Fixes: 0cb74b65d2e5 ("RISC-V: KVM: Implement perf support without sampling") Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240420151741.962500-12-atishp@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-26RISC-V: KVM: Fix the initial sample period valueAtish Patra1-1/+1
The initial sample period value when counter value is not assigned should be set to maximum value supported by the counter width. Otherwise, it may result in spurious interrupts. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240420151741.962500-11-atishp@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-26mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FASTDavid Hildenbrand1-1/+1
Nowadays, we call it "GUP-fast", the external interface includes functions like "get_user_pages_fast()", and we renamed all internal functions to reflect that as well. Let's make the config option reflect that. Link: https://lkml.kernel.org/r/20240402125516.223131-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26riscv: mm: accelerate pagefault when badaccessKefeng Wang1-1/+4
The access_error() of vma already checked under per-VMA lock, if it is a bad access, directly handle error, no need to retry with mmap_lock again. Since the page faut is handled under per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS. [wangkefeng.wang@huawei.com: use `cause' rather than SIGSEGV, per Alexandre] Link: https://lkml.kernel.org/r/ac978061-ce1a-40a4-8b0a-61883b42bea7@huawei.com Link: https://lkml.kernel.org/r/20240403083805.1818160-6-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26mm/arch: provide pud_pfn() fallbackPeter Xu1-0/+1
The comment in the code explains the reasons. We took a different approach comparing to pmd_pfn() by providing a fallback function. Another option is to provide some lower level config options (compare to HUGETLB_PAGE or THP) to identify which layer an arch can support for such huge mappings. However that can be an overkill. [peterx@redhat.com: fix loongson defconfig] Link: https://lkml.kernel.org/r/20240403013249.1418299-4-peterx@redhat.com Link: https://lkml.kernel.org/r/20240327152332.950956-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Ryan Roberts <ryan.roberts@arm.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Jones <andrew.jones@linux.dev> Cc: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Christoph Hellwig <hch@infradead.org> Cc: David Hildenbrand <david@redhat.com> Cc: James Houghton <jthoughton@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26mm: convert arch_clear_hugepage_flags to take a folioMatthew Wilcox (Oracle)1-3/+3
All implementations that aren't no-ops just set a bit in the flags, and we want to use the folio flags rather than the page flags for that. Rename it to arch_clear_hugetlb_flags() while we're touching it so nobody thinks it's used for THP. [willy@infradead.org: fix arm64 build] Link: https://lkml.kernel.org/r/ZgQvNKGdlDkwhQEX@casper.infradead.org Link: https://lkml.kernel.org/r/20240326171045.410737-8-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26fix missing vmalloc.h includesKent Overstreet2-0/+2
Patch series "Memory allocation profiling", v6. Overview: Low overhead [1] per-callsite memory allocation profiling. Not just for debug kernels, overhead low enough to be deployed in production. Example output: root@moria-kvm:~# sort -rn /proc/allocinfo 127664128 31168 mm/page_ext.c:270 func:alloc_page_ext 56373248 4737 mm/slub.c:2259 func:alloc_slab_page 14880768 3633 mm/readahead.c:247 func:page_cache_ra_unbounded 14417920 3520 mm/mm_init.c:2530 func:alloc_large_system_hash 13377536 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs 11718656 2861 mm/filemap.c:1919 func:__filemap_get_folio 9192960 2800 kernel/fork.c:307 func:alloc_thread_stack_node 4206592 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable 4136960 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start 3940352 962 mm/memory.c:4214 func:alloc_anon_folio 2894464 22613 fs/kernfs/dir.c:615 func:__kernfs_new_node ... Usage: kconfig options: - CONFIG_MEM_ALLOC_PROFILING - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT - CONFIG_MEM_ALLOC_PROFILING_DEBUG adds warnings for allocations that weren't accounted because of a missing annotation sysctl: /proc/sys/vm/mem_profiling Runtime info: /proc/allocinfo Notes: [1]: Overhead To measure the overhead we are comparing the following configurations: (1) Baseline with CONFIG_MEMCG_KMEM=n (2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n) (3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) (4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n && /proc/sys/vm/mem_profiling=1) (5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT (6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n) && CONFIG_MEMCG_KMEM=y (7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y Performance overhead: To evaluate performance we implemented an in-kernel test executing multiple get_free_page/free_page and kmalloc/kfree calls with allocation sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU affinity set to a specific CPU to minimize the noise. Below are results from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on 56 core Intel Xeon: kmalloc pgalloc (1 baseline) 6.764s 16.902s (2 default disabled) 6.793s (+0.43%) 17.007s (+0.62%) (3 default enabled) 7.197s (+6.40%) 23.666s (+40.02%) (4 runtime enabled) 7.405s (+9.48%) 23.901s (+41.41%) (5 memcg) 13.388s (+97.94%) 48.460s (+186.71%) (6 def disabled+memcg) 13.332s (+97.10%) 48.105s (+184.61%) (7 def enabled+memcg) 13.446s (+98.78%) 54.963s (+225.18%) Memory overhead: Kernel size: text data bss dec diff (1) 26515311 18890222 17018880 62424413 (2) 26524728 19423818 16740352 62688898 264485 (3) 26524724 19423818 16740352 62688894 264481 (4) 26524728 19423818 16740352 62688898 264485 (5) 26541782 18964374 16957440 62463596 39183 Memory consumption on a 56 core Intel CPU with 125GB of memory: Code tags: 192 kB PageExts: 262144 kB (256MB) SlabExts: 9876 kB (9.6MB) PcpuExts: 512 kB (0.5MB) Total overhead is 0.2% of total memory. Benchmarks: Hackbench tests run 100 times: hackbench -s 512 -l 200 -g 15 -f 25 -P baseline disabled profiling enabled profiling avg 0.3543 0.3559 (+0.0016) 0.3566 (+0.0023) stdev 0.0137 0.0188 0.0077 hackbench -l 10000 baseline disabled profiling enabled profiling avg 6.4218 6.4306 (+0.0088) 6.5077 (+0.0859) stdev 0.0933 0.0286 0.0489 stress-ng tests: stress-ng --class memory --seq 4 -t 60 stress-ng --class cpu --seq 4 -t 60 Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/ [2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/ This patch (of 37): The next patch drops vmalloc.h from a system header in order to fix a circular dependency; this adds it to all the files that were pulling it in implicitly. [kent.overstreet@linux.dev: fix arch/alpha/lib/memcpy.c] Link: https://lkml.kernel.org/r/20240327002152.3339937-1-kent.overstreet@linux.dev [surenb@google.com: fix arch/x86/mm/numa_32.c] Link: https://lkml.kernel.org/r/20240402180933.1663992-1-surenb@google.com [kent.overstreet@linux.dev: a few places were depending on sizes.h] Link: https://lkml.kernel.org/r/20240404034744.1664840-1-kent.overstreet@linux.dev [arnd@arndb.de: fix mm/kasan/hw_tags.c] Link: https://lkml.kernel.org/r/20240404124435.3121534-1-arnd@kernel.org [surenb@google.com: fix arc build] Link: https://lkml.kernel.org/r/20240405225115.431056-1-surenb@google.com Link: https://lkml.kernel.org/r/20240321163705.3067592-1-surenb@google.com Link: https://lkml.kernel.org/r/20240321163705.3067592-2-surenb@google.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Tested-by: Kees Cook <keescook@chromium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-26mm/treewide: remove pXd_huge()Peter Xu1-10/+0
This API is not used anymore, drop it for the whole tree. Link: https://lkml.kernel.org/r/20240318200404.448346-13-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bjorn Andersson <andersson@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Fabio Estevam <festevam@denx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Konrad Dybcio <konrad.dybcio@linaro.org> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Mark Salter <msalter@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25riscv: T-Head: Test availability bit before enabling MAE errataChristoph Müllner1-4/+10
T-Head's memory attribute extension (XTheadMae) (non-compatible equivalent of RVI's Svpbmt) is currently assumed for all T-Head harts. However, QEMU recently decided to drop acceptance of guests that write reserved bits in PTEs. As XTheadMae uses reserved bits in PTEs and Linux applies the MAE errata for all T-Head harts, this broke the Linux startup on QEMU emulations of the C906 emulation. This patch attempts to address this issue by testing the MAE-enable bit in the th.sxstatus CSR. This CSR is available in HW and can be emulated in QEMU. This patch also makes the XTheadMae probing mechanism reliable, because a test for the right combination of mvendorid, marchid, and mimpid is not sufficient to enable MAE. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Link: https://lore.kernel.org/r/20240407213236.2121592-3-christoph.muellner@vrull.eu Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-25riscv: thead: Rename T-Head PBMT to MAEChristoph Müllner3-19/+19
T-Head's vendor extension to set page attributes has the name MAE (memory attribute extension). Let's rename it, so it is clear what this referes to. Link: https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadmae.adoc Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> Link: https://lore.kernel.org/r/20240407213236.2121592-2-christoph.muellner@vrull.eu Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-24riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}Jisheng Zhang1-0/+18
After selecting ARCH_USE_CMPXCHG_LOCKREF, one straight futher optimization is implementing the arch_cmpxchg64_relaxed() because the lockref code does not need the cmpxchg to have barrier semantics. At the same time, implement arch_cmpxchg64_acquire and arch_cmpxchg64_release as well. However, on both TH1520 and JH7110 platforms, I didn't see obvious performance improvement with Linus' test case [1]. IMHO, this may be related with the fence and lr.d/sc.d hw implementations. In theory, lr/sc without fence could give performance improvement over lr/sc plus fence, so add the code here to leave performance improvement room on newer HW platforms. Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1] Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20240325111038.1700-3-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-24riscv: select ARCH_USE_CMPXCHG_LOCKREFJisheng Zhang1-0/+1
Select ARCH_USE_CMPXCHG_LOCKREF to enable the cmpxchg-based lockless lockref implementation for riscv. Using Linus' test case[1] on TH1520 platform, I see a 11.2% improvement. On JH7110 platform, I see 12.0% improvement. Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1] Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20240325111038.1700-2-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-23riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMINClément Léger1-1/+1
The current definition yields a negative 32bits signed value which result in a mask with is obviously incorrect. Replace it by using a 1ULL bit shift value to obtain a single set bit mask. Fixes: 5dadda5e6a59 ("riscv: hwprobe: export Zvfh[min] ISA extensions") Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240409143839.558784-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-23riscv: dts: sophgo: add reserved memory node for CV1800BInochi Amaoto2-3/+14
The original dts of CV1800B has a weird memory length as it contains reserved memory for coprocessor. Make this area a separate node so it can get the real memory length. Link: https://lore.kernel.org/r/IA1PR20MB49531F274753B04A5547DB59BB052@IA1PR20MB4953.namprd20.prod.outlook.com Signed-off-by: Inochi Amaoto <inochiama@outlook.com> Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
2024-04-22riscv: dts: renesas: rzfive-smarc-som: Drop deleting interrupt properties ↵Lad Prabhakar1-16/+0
from ETH0/1 nodes Now that we have enabled IRQC support for RZ/Five SoC switch to interrupt mode for ethernet0/1 PHYs instead of polling mode. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20240403203503.634465-6-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2024-04-22riscv: dts: renesas: r9a07g043f: Add IRQC node to RZ/Five SoC DTSILad Prabhakar1-0/+75
Add the IRQC node to RZ/Five (R9A07G043F) SoC DTSI. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20240403203503.634465-4-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2024-04-22RISC-V: Use the minor version mask while computing sbi versionAtish Patra1-2/+2
As per the SBI specification, minor version is encoded in the lower 24 bits only. Make sure that the SBI version is computed with the appropriate mask. Currently, there is no minor version in use. Thus, it doesn't change anything functionality but it is good to be compliant with the specification. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240420151741.962500-8-atishp@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-22RISC-V: KVM: Rename the SBI_STA_SHMEM_DISABLE to a generic nameAtish Patra3-6/+6
SBI_STA_SHMEM_DISABLE is a macro to invoke disable shared memory commands. As this can be invoked from other SBI extension context as well, rename it to more generic name as SBI_SHMEM_DISABLE. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240420151741.962500-7-atishp@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>