summaryrefslogtreecommitdiff
path: root/arch/loongarch
AgeCommit message (Collapse)AuthorFilesLines
2023-09-20LoongArch: Don't inline kasan_mem_to_shadow()/kasan_shadow_to_mem()Huacai Chen2-53/+57
As Linus suggested, kasan_mem_to_shadow()/kasan_shadow_to_mem() are not performance-critical and too big to inline. This is simply wrong so just define them out-of-line. If they really need to be inlined in future, such as the objtool / SMAP issue for X86, we should mark them __always_inline. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-20kasan: Cleanup the __HAVE_ARCH_SHADOW_MAP usageHuacai Chen1-2/+8
As Linus suggested, __HAVE_ARCH_XYZ is "stupid" and "having historical uses of it doesn't make it good". So migrate __HAVE_ARCH_SHADOW_MAP to separate macros named after the respective functions. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-20LoongArch: Set all reserved memblocks on Node#0 at initializationHuacai Chen1-1/+3
After commit 61167ad5fecdea ("mm: pass nid to reserve_bootmem_region()") we get a panic if DEFERRED_STRUCT_PAGE_INIT is enabled: [ 0.000000] CPU 0 Unable to handle kernel paging request at virtual address 0000000000002b82, era == 90000000040e3f28, ra == 90000000040e3f18 [ 0.000000] Oops[#1]: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0+ #733 [ 0.000000] pc 90000000040e3f28 ra 90000000040e3f18 tp 90000000046f4000 sp 90000000046f7c90 [ 0.000000] a0 0000000000000001 a1 0000000000200000 a2 0000000000000040 a3 90000000046f7ca0 [ 0.000000] a4 90000000046f7ca4 a5 0000000000000000 a6 90000000046f7c38 a7 0000000000000000 [ 0.000000] t0 0000000000000002 t1 9000000004b00ac8 t2 90000000040e3f18 t3 90000000040f0800 [ 0.000000] t4 00000000000f0000 t5 80000000ffffe07e t6 0000000000000003 t7 900000047fff5e20 [ 0.000000] t8 aaaaaaaaaaaaaaab u0 0000000000000018 s9 0000000000000000 s0 fffffefffe000000 [ 0.000000] s1 0000000000000000 s2 0000000000000080 s3 0000000000000040 s4 0000000000000000 [ 0.000000] s5 0000000000000000 s6 fffffefffe000000 s7 900000000470b740 s8 9000000004ad4000 [ 0.000000] ra: 90000000040e3f18 reserve_bootmem_region+0xec/0x21c [ 0.000000] ERA: 90000000040e3f28 reserve_bootmem_region+0xfc/0x21c [ 0.000000] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 0.000000] PRMD: 00000000 (PPLV0 -PIE -PWE) [ 0.000000] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) [ 0.000000] ECFG: 00070800 (LIE=11 VS=7) [ 0.000000] ESTAT: 00010800 [PIL] (IS=11 ECode=1 EsubCode=0) [ 0.000000] BADV: 0000000000002b82 [ 0.000000] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000) [ 0.000000] Modules linked in: [ 0.000000] Process swapper (pid: 0, threadinfo=(____ptrval____), task=(____ptrval____)) [ 0.000000] Stack : 0000000000000000 9000000002eb5430 0000003a00000020 90000000045ccd00 [ 0.000000] 900000000470e000 90000000002c1918 0000000000000000 9000000004110780 [ 0.000000] 00000000fe6c0000 0000000480000000 9000000004b4e368 9000000004110748 [ 0.000000] 0000000000000000 900000000421ca84 9000000004620000 9000000004564970 [ 0.000000] 90000000046f7d78 9000000002cc9f70 90000000002c1918 900000000470e000 [ 0.000000] 9000000004564970 90000000040bc0e0 90000000046f7d78 0000000000000000 [ 0.000000] 0000000000004000 90000000045ccd00 0000000000000000 90000000002c1918 [ 0.000000] 90000000002c1900 900000000470b700 9000000004b4df78 9000000004620000 [ 0.000000] 90000000046200a8 90000000046200a8 0000000000000000 9000000004218b2c [ 0.000000] 9000000004270008 0000000000000001 0000000000000000 90000000045ccd00 [ 0.000000] ... [ 0.000000] Call Trace: [ 0.000000] [<90000000040e3f28>] reserve_bootmem_region+0xfc/0x21c [ 0.000000] [<900000000421ca84>] memblock_free_all+0x114/0x350 [ 0.000000] [<9000000004218b2c>] mm_core_init+0x138/0x3cc [ 0.000000] [<9000000004200e38>] start_kernel+0x488/0x7a4 [ 0.000000] [<90000000040df0d8>] kernel_entry+0xd8/0xdc [ 0.000000] [ 0.000000] Code: 02eb21ad 00410f4c 380c31ac <262b818d> 6800b70d 02c1c196 0015001c 57fe4bb1 260002cd The reason is early memblock_reserve() in memblock_init() set node id to MAX_NUMNODES, making NODE_DATA(nid) a NULL dereference in the call chain reserve_bootmem_region() -> init_reserved_page(). After memblock_init(), those late calls of memblock_reserve() operate on subregions of memblock .memory regions. As a result, these reserved regions will be set to the correct node at the first iteration of memmap_init_reserved_pages(). So set all reserved memblocks on Node#0 at initialization can avoid this panic. Reported-by: WANG Xuerui <git@xen0n.name> Tested-by: WANG Xuerui <git@xen0n.name> Reviewed-by: WANG Xuerui <git@xen0n.name> # with nits addressed Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-20LoongArch: Remove dead code in relocate_new_kernelTiezhu Yang1-1/+0
The initial aim is to silence the following objtool warning: arch/loongarch/kernel/relocate_kernel.o: warning: objtool: relocate_new_kernel+0x74: unreachable instruction There are two adjacent "b" instructions, the second one is unreachable, it is dead code, just remove it. Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-20LoongArch: Use _UL() and _ULL()Andy Shevchenko1-6/+6
Use _UL() and _ULL() that are provided by const.h. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-20LoongArch: Fix some build warnings with W=1Bibo Mao16-39/+72
There are some building warnings when building LoongArch kernel with W=1 as following, this patch fixes them. arch/loongarch/kernel/acpi.c:284:13: warning: no previous prototype for ‘acpi_numa_arch_fixup’ [-Wmissing-prototypes] 284 | void __init acpi_numa_arch_fixup(void) {} | ^~~~~~~~~~~~~~~~~~~~ arch/loongarch/kernel/time.c:32:13: warning: no previous prototype for ‘constant_timer_interrupt’ [-Wmissing-prototypes] 32 | irqreturn_t constant_timer_interrupt(int irq, void *data) | ^~~~~~~~~~~~~~~~~~~~~~~~ arch/loongarch/kernel/traps.c:496:25: warning: no previous prototype for 'do_fpe' [-Wmissing-prototypes] 496 | asmlinkage void noinstr do_fpe(struct pt_regs *regs | ^~~~~~ arch/loongarch/kernel/traps.c:813:22: warning: variable ‘opcode’ set but not used [-Wunused-but-set-variable] 813 | unsigned int opcode; | ^~~~~~ arch/loongarch/kernel/signal.c:895:14: warning: no previous prototype for ‘get_sigframe’ [-Wmissing-prototypes] 895 | void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, | ^~~~~~~~~~~~ arch/loongarch/kernel/syscall.c:21:40: warning: initialized field overwritten [-Woverride-init] 21 | #define __SYSCALL(nr, call) [nr] = (call), | ^ arch/loongarch/kernel/syscall.c:40:14: warning: no previous prototype for ‘do_syscall’ [-Wmissing-prototypes] 40 | void noinstr do_syscall(struct pt_regs *regs) | ^~~~~~~~~~ arch/loongarch/kernel/smp.c:502:17: warning: no previous prototype for ‘start_secondary’ [-Wmissing-prototypes] 502 | asmlinkage void start_secondary(void) | ^~~~~~~~~~~~~~~ arch/loongarch/kernel/process.c:309:15: warning: no previous prototype for ‘arch_align_stack’ [-Wmissing-prototypes] 309 | unsigned long arch_align_stack(unsigned long sp) | ^~~~~~~~~~~~~~~~ arch/loongarch/kernel/topology.c:13:5: warning: no previous prototype for ‘arch_register_cpu’ [-Wmissing-prototypes] 13 | int arch_register_cpu(int cpu) | ^~~~~~~~~~~~~~~~~ arch/loongarch/kernel/topology.c:27:6: warning: no previous prototype for ‘arch_unregister_cpu’ [-Wmissing-prototypes] 27 | void arch_unregister_cpu(int cpu) | ^~~~~~~~~~~~~~~~~~~ arch/loongarch/kernel/module-sections.c:103:5: warning: no previous prototype for ‘module_frob_arch_sections’ [-Wmissing-prototypes] 103 | int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/loongarch/mm/hugetlbpage.c:56:5: warning: no previous prototype for ‘is_aligned_hugepage_range’ [-Wmissing-prototypes] 56 | int is_aligned_hugepage_range(unsigned long addr, unsigned long len) | ^~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-20LoongArch: Fix lockdep static memory detectionHelge Deller1-27/+28
Since commit 0a6b58c5cd0d ("lockdep: fix static memory detection even more") the lockdep code uses is_kernel_core_data(), is_kernel_rodata() and init_section_contains() to verify if a lock is located inside a kernel static data section. This change triggers a failure on LoongArch, for which the vmlinux.lds.S script misses to put the locks (as part of in the .data.rel symbols) into the Linux data section. This patch fixes the lockdep problem by moving *(.data.rel*) symbols into the kernel data section (from _sdata to _edata). Additionally, move other wrongly assigned symbols too: - altinstructions into the _initdata section, - PLT symbols behind the read-only section, and - *(.la_abs) into the data section. Cc: stable <stable@kernel.org> # v6.4+ Fixes: 0a6b58c5cd0d ("lockdep: fix static memory detection even more") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-08Merge tag 'loongarch-6.6' of ↵Linus Torvalds59-449/+2825
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Allow usage of LSX/LASX in the kernel, and use them for SIMD-optimized RAID5/RAID6 routines - Add Loongson Binary Translation (LBT) extension support - Add basic KGDB & KDB support - Add building with kcov coverage - Add KFENCE (Kernel Electric-Fence) support - Add KASAN (Kernel Address Sanitizer) support - Some bug fixes and other small changes - Update the default config file * tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits) LoongArch: Update Loongson-3 default config file LoongArch: Add KASAN (Kernel Address Sanitizer) support LoongArch: Simplify the processing of jumping new kernel for KASLR kasan: Add (pmd|pud)_init for LoongArch zero_(pud|p4d)_populate process kasan: Add __HAVE_ARCH_SHADOW_MAP to support arch specific mapping LoongArch: Add KFENCE (Kernel Electric-Fence) support LoongArch: Get partial stack information when providing regs parameter LoongArch: mm: Add page table mapped mode support for virt_to_page() kfence: Defer the assignment of the local variable addr LoongArch: Allow building with kcov coverage LoongArch: Provide kaslr_offset() to get kernel offset LoongArch: Add basic KGDB & KDB support LoongArch: Add Loongson Binary Translation (LBT) extension support raid6: Add LoongArch SIMD recovery implementation raid6: Add LoongArch SIMD syndrome calculation LoongArch: Add SIMD-optimized XOR routines LoongArch: Allow usage of LSX/LASX in the kernel LoongArch: Define symbol 'fault' as a local label in fpu.S LoongArch: Adjust {copy, clear}_user exception handler behavior LoongArch: Use static defined zero page rather than allocated ...
2023-09-07LoongArch: Update Loongson-3 default config fileHuacai Chen1-4/+70
1, Enable LSX and LASX. 2, Enable KASLR (CONFIG_RANDOMIZE_BASE). 3, Enable jump label (patching mechanism for static key). 4, Enable LoongArch CRC32(c) Acceleration. 5, Enable Loongson-specific drivers: I2C/RTC/DRM/SOC/CLK/PINCTRL/GPIO/SPI. 6, Enable EXFAT/NTFS3/JFS/GFS2/OCFS2/UBIFS/EROFS/CEPH file systems. 7, Enable WangXun NGBE/TXGBE NIC drivers. 8, Enable some IPVS options. 9, Remove CONFIG_SYSFS_DEPRECATED since it is removed in Kconfig. 10, Remove CONFIG_IP_NF_TARGET_CLUSTERIP since it is removed in Kconfig. 11, Remove CONFIG_NFT_OBJREF since it is removed in Kconfig. 12, Remove CONFIG_R8188EU since it is replaced by CONFIG_RTL8XXXU. Signed-off-by: Trevor Woerner <twoerner@gmail.com> Signed-off-by: Xuewen Wang <wangxuewen@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Add KASAN (Kernel Address Sanitizer) supportQing Zhang14-9/+451
1/8 of kernel addresses reserved for shadow memory. But for LoongArch, There are a lot of holes between different segments and valid address space (256T available) is insufficient to map all these segments to kasan shadow memory with the common formula provided by kasan core, saying (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET So LoongArch has a arch-specific mapping formula, different segments are mapped individually, and only limited space lengths of these specific segments are mapped to shadow. At early boot stage the whole shadow region populated with just one physical page (kasan_early_shadow_page). Later, this page is reused as readonly zero shadow for some memory that kasan currently don't track. After mapping the physical memory, pages for shadow memory are allocated and mapped. Functions like memset()/memcpy()/memmove() do a lot of memory accesses. If bad pointer passed to one of these function it is important to be caught. Compiler's instrumentation cannot do this since these functions are written in assembly. KASan replaces memory functions with manually instrumented variants. Original functions declared as weak symbols so strong definitions in mm/kasan/kasan.c could replace them. Original functions have aliases with '__' prefix in names, so we could call non-instrumented variant if needed. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Simplify the processing of jumping new kernel for KASLRQing Zhang3-12/+9
Modified relocate_kernel() doesn't return new kernel's entry point but the random_offset. In this way we share the start_kernel() processing with the normal kernel, which avoids calling 'jr a0' directly and allows some other operations (e.g, kasan_early_init) before start_kernel() when KASLR (CONFIG_RANDOMIZE_BASE) is turned on. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Add KFENCE (Kernel Electric-Fence) supportEnze Li4-9/+86
The LoongArch architecture is quite different from other architectures. When the allocating of KFENCE itself is done, it is mapped to the direct mapping configuration window [1] by default on LoongArch. It means that it is not possible to use the page table mapped mode which required by the KFENCE system and therefore it should be remapped to the appropriate region. This patch adds architecture specific implementation details for KFENCE. In particular, this implements the required interface in <asm/kfence.h>. Tested this patch by running the testcases and all passed. [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode Signed-off-by: Enze Li <lienze@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Get partial stack information when providing regs parameterEnze Li1-8/+10
Currently, arch_stack_walk() can only get the full stack information including NMI. This is because the implementation of arch_stack_walk() is forced to ignore the information passed by the regs parameter and use the current stack information instead. For some detection systems like KFENCE, only partial stack information is needed. In particular, the stack frame where the interrupt occurred. To support KFENCE, this patch modifies the implementation of the arch_stack_walk() function so that if this function is called with the regs argument passed, it retains all the stack information in regs and uses it to provide accurate information. Before this patch: [ 1.531195 ] ================================================================== [ 1.531442 ] BUG: KFENCE: out-of-bounds read in stack_trace_save_regs+0x48/0x6c [ 1.531442 ] [ 1.531900 ] Out-of-bounds read at 0xffff800012267fff (1B left of kfence-#12): [ 1.532046 ] stack_trace_save_regs+0x48/0x6c [ 1.532169 ] kfence_report_error+0xa4/0x528 [ 1.532276 ] kfence_handle_page_fault+0x124/0x270 [ 1.532388 ] no_context+0x50/0x94 [ 1.532453 ] do_page_fault+0x1a8/0x36c [ 1.532524 ] tlb_do_page_fault_0+0x118/0x1b4 [ 1.532623 ] test_out_of_bounds_read+0xa0/0x1d8 [ 1.532745 ] kunit_generic_run_threadfn_adapter+0x1c/0x28 [ 1.532854 ] kthread+0x124/0x130 [ 1.532922 ] ret_from_kernel_thread+0xc/0xa4 <snip> After this patch: [ 1.320220 ] ================================================================== [ 1.320401 ] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa8/0x1d8 [ 1.320401 ] [ 1.320898 ] Out-of-bounds read at 0xffff800012257fff (1B left of kfence-#10): [ 1.321134 ] test_out_of_bounds_read+0xa8/0x1d8 [ 1.321264 ] kunit_generic_run_threadfn_adapter+0x1c/0x28 [ 1.321392 ] kthread+0x124/0x130 [ 1.321459 ] ret_from_kernel_thread+0xc/0xa4 <snip> Suggested-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Enze Li <lienze@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: mm: Add page table mapped mode support for virt_to_page()Enze Li3-1/+21
According to LoongArch documentations, there are two types of address translation modes: direct mapped address translation mode (DMW mode) and page table mapped address translation mode (TLB mode). Currently, virt_to_page() only supports direct mapped mode. This patch determines which mode is used, and adds corresponding handling functions for both modes. For more details on the two modes, see [1]. [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode Signed-off-by: Enze Li <lienze@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Allow building with kcov coverageFeiyang Chen2-0/+4
Add ARCH_HAS_KCOV and HAVE_GCC_PLUGINS to the LoongArch Kconfig. And also disable instrumentation of vdso. Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Provide kaslr_offset() to get kernel offsetFeiyang Chen1-0/+6
Provide kaslr_offset() to get the kernel offset when KASLR is enabled. Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Add basic KGDB & KDB supportQing Zhang7-0/+844
KGDB is intended to be used as a source level debugger for the Linux kernel. It is used along with gdb to debug a Linux kernel. GDB can be used to "break in" to the kernel to inspect memory, variables and regs similar to the way an application developer would use GDB to debug an application. KDB is a frontend of KGDB which is similar to GDB. By now, in addition to the generic KGDB features, the LoongArch KGDB implements the following features: - Hardware breakpoints/watchpoints; - Software single-step support for KDB. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> # Framework & CoreFeature Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> # BreakPoint & SingleStep Signed-off-by: Hui Li <lihui@loongson.cn> # Some Minor Improvements Signed-off-by: Randy Dunlap <rdunlap@infradead.org> # Some Build Error Fixes Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Add Loongson Binary Translation (LBT) extension supportQi Hu19-31/+693
Loongson Binary Translation (LBT) is used to accelerate binary translation, which contains 4 scratch registers (scr0 to scr3), x86/ARM eflags (eflags) and x87 fpu stack pointer (ftop). This patch support kernel to save/restore these registers, handle the LBT exception and maintain sigcontext. Signed-off-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Add SIMD-optimized XOR routinesWANG Xuerui7-0/+417
Add LSX and LASX implementations of xor operations, operating on 64 bytes (one L1 cache line) at a time, for a balance between memory utilization and instruction mix. Huacai confirmed that all future LoongArch implementations by Loongson (that we care) will likely also feature 64-byte cache lines, and experiments show no throughput improvement with further unrolling. Performance numbers measured during system boot on a 3A5000 @ 2.5GHz: > 8regs : 12702 MB/sec > 8regs_prefetch : 10920 MB/sec > 32regs : 12686 MB/sec > 32regs_prefetch : 10918 MB/sec > lsx : 17589 MB/sec > lasx : 26116 MB/sec Acked-by: Song Liu <song@kernel.org> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Allow usage of LSX/LASX in the kernelHuacai Chen1-4/+51
Allow usage of LSX/LASX in the kernel by extending kernel_fpu_begin() and kernel_fpu_end(). Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Define symbol 'fault' as a local label in fpu.STiezhu Yang1-3/+2
The initial aim is to silence the following objtool warnings: arch/loongarch/kernel/fpu.o: warning: objtool: _save_fp_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _restore_fp_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _save_lsx_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _restore_lsx_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _save_lasx_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _restore_lasx_context() falls through to next function fault() Currently, SYM_FUNC_START()/SYM_FUNC_END() defines the symbol 'fault' as SYM_T_FUNC which is STT_FUNC, the objtool warnings are generated through the following code: tools/objtool/include/objtool/check.h: static inline struct symbol *insn_func(struct instruction *insn) { struct symbol *sym = insn->sym; if (sym && sym->type != STT_FUNC) sym = NULL; return sym; } tools/objtool/check.c: static int validate_branch(struct objtool_file *file, struct symbol *func, struct instruction *insn, struct insn_state state) { ... if (func && insn_func(insn) && func != insn_func(insn)->pfunc) { ... WARN("%s() falls through to next function %s()", func->name, insn_func(insn)->name); return 1; } ... } We can see that the fixup can be a local label in the following code: arch/loongarch/include/asm/asm-extable.h: .pushsection __ex_table, "a"; \ .balign 4; \ .long ((insn) - .); \ .long ((fixup) - .); \ .short (type); \ .short (data); \ .popsection; .macro _asm_extable, insn, fixup __ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_FIXUP, 0) .endm Like arch/loongarch/lib/*.S, just define the symbol 'fault' as a local label in fpu.S. Before: $ readelf -s arch/loongarch/kernel/fpu.o | awk -F: /fault/'{print $2}' 000000000000053c 8 FUNC GLOBAL DEFAULT 1 fault After: $ readelf -s arch/loongarch/kernel/fpu.o | awk -F: /fault/'{print $2}' 000000000000053c 0 NOTYPE LOCAL DEFAULT 1 .L_fpu_fault Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Adjust {copy, clear}_user exception handler behaviorWeihao Li2-121/+127
The {copy, clear}_user function should returns number of bytes that could not be {copied, cleared}. So, try to {copy, clear} byte by byte when ld.{d,w,h} and st.{d,w,h} trapped into an exception. Reviewed-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Weihao Li <liweihao@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Use static defined zero page rather than allocatedBibo Mao4-35/+3
On LoongArch system, there is only one page needed for zero page (no cache synonyms), and there is no COLOR_ZERO_PAGE, so zero_page_mask is useless and the macro __HAVE_COLOR_ZERO_PAGE is not necessary. Like other popular architectures, It is simpler to define the zero page in kernel BSS code segment rather than dynamically allocate. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: mm: Introduce unified function populate_kernel_pte()Bibo Mao3-57/+23
Function pcpu_populate_pte() and fixmap_pte() are similar, they populate one page from kernel address space. And there is confusion between pgd and p4d in the function fixmap_pte(), such as pgd_none() always returns zero. This patch introduces a unified function populate_kernel_pte() and then replaces pcpu_populate_pte() and fixmap_pte(). Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Code improvements in function pcpu_populate_pte()Bibo Mao1-13/+15
Do some code improvements in function pcpu_populate_pte(): 1. Add memory allocation failure handling; 2. Replace pgd_populate() with p4d_populate(), it will be useful if there are four-level page tables. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Remove shm_align_mask and use SHMLBA insteadHuacai Chen2-8/+6
Both shm_align_mask and SHMLBA want to avoid cache alias. But they are inconsistent: shm_align_mask is (PAGE_SIZE - 1) while SHMLBA is SZ_64K, but PAGE_SIZE is not always equal to SZ_64K. This may cause problems when shmat() twice. Fix this problem by removing shm_align_mask and using SHMLBA (strictly SHMLBA - 1) instead. Reported-by: Jiantao Shan <shanjiantao@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: mm: Add p?d_leaf() definitionsHongchen Zhang1-0/+3
When I do LTP test, LTP test case ksm06 caused panic at break_ksm_pmd_entry -> pmd_leaf (Huge page table but False) -> pte_present (panic) The reason is pmd_leaf() is not defined, So like commit 501b81046701 ("mips: mm: add p?d_leaf() definitions") add p?d_leaf() definition for LoongArch. Fixes: 09cfefb7fa70 ("LoongArch: Add memory management") Cc: stable@vger.kernel.org Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06LoongArch: Drop unused parse_r and parse_v macrosNathan Chancellor2-150/+0
When building with CONFIG_LTO_CLANG_FULL, there are several errors due to the way that parse_r is defined with an __asm__ statement in a header: ld.lld: error: ld-temp.o <inline asm>:105:1: macro 'parse_r' is already defined .macro parse_r var r ^ This was an issue for arch/mips as well, which was resolved by commit 67512a8cf5a7 ("MIPS: Avoid macro redefinitions"). However, parse_r is unused in arch/loongarch after commit 83d8b38967d2 ("LoongArch: Simplify the invtlb wrappers"), so doing the same change does not make much sense now. Just remove parse_r (and parse_v, which is also unused) to resolve the redefinition error. If it needs to be brought back due to an actual use, it should be brought back with the same changes as the aforementioned arch/mips commit. Closes: https://github.com/ClangBuiltLinux/linux/issues/1924 Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-01Merge tag 'tty-6.6-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver updates from Greg KH: "Here is the big set of tty and serial driver changes for 6.6-rc1. Lots of cleanups in here this cycle, and some driver updates. Short summary is: - Jiri's continued work to make the tty code and apis be a bit more sane with regards to modern kernel coding style and types - cpm_uart driver updates - n_gsm updates and fixes - meson driver updates - sc16is7xx driver updates - 8250 driver updates for different hardware types - qcom-geni driver fixes - tegra serial driver change - stm32 driver updates - synclink_gt driver cleanups - tty structure size reduction All of these have been in linux-next this week with no reported issues. The last bit of cleanups from Jiri and the tty structure size reduction came in last week, a bit late but as they were just style changes and size reductions, I figured they should get into this merge cycle so that others can work on top of them with no merge conflicts" * tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits) tty: shrink the size of struct tty_struct by 40 bytes tty: n_tty: deduplicate copy code in n_tty_receive_buf_real_raw() tty: n_tty: extract ECHO_OP processing to a separate function tty: n_tty: unify counts to size_t tty: n_tty: use u8 for chars and flags tty: n_tty: simplify chars_in_buffer() tty: n_tty: remove unsigned char casts from character constants tty: n_tty: move newline handling to a separate function tty: n_tty: move canon handling to a separate function tty: n_tty: use MASK() for masking out size bits tty: n_tty: make n_tty_data::num_overrun unsigned tty: n_tty: use time_is_before_jiffies() in n_tty_receive_overrun() tty: n_tty: use 'num' for writes' counts tty: n_tty: use output character directly tty: n_tty: make flow of n_tty_receive_buf_common() a bool Revert "tty: serial: meson: Add a earlycon for the T7 SoC" Documentation: devices.txt: Fix minors for ttyCPM* Documentation: devices.txt: Remove ttySIOC* Documentation: devices.txt: Remove ttyIOC* serial: 8250_bcm7271: improve bcm7271 8250 port ...
2023-08-31Merge tag 'x86_shstk_for_6.6-rc1' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 shadow stack support from Dave Hansen: "This is the long awaited x86 shadow stack support, part of Intel's Control-flow Enforcement Technology (CET). CET consists of two related security features: shadow stacks and indirect branch tracking. This series implements just the shadow stack part of this feature, and just for userspace. The main use case for shadow stack is providing protection against return oriented programming attacks. It works by maintaining a secondary (shadow) stack using a special memory type that has protections against modification. When executing a CALL instruction, the processor pushes the return address to both the normal stack and to the special permission shadow stack. Upon RET, the processor pops the shadow stack copy and compares it to the normal stack copy. For more information, refer to the links below for the earlier versions of this patch set" Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/ Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/ * tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits) x86/shstk: Change order of __user in type x86/ibt: Convert IBT selftest to asm x86/shstk: Don't retry vm_munmap() on -EINTR x86/kbuild: Fix Documentation/ reference x86/shstk: Move arch detail comment out of core mm x86/shstk: Add ARCH_SHSTK_STATUS x86/shstk: Add ARCH_SHSTK_UNLOCK x86: Add PTRACE interface for shadow stack selftests/x86: Add shadow stack test x86/cpufeatures: Enable CET CR4 bit for shadow stack x86/shstk: Wire in shadow stack interface x86: Expose thread features in /proc/$PID/status x86/shstk: Support WRSS for userspace x86/shstk: Introduce map_shadow_stack syscall x86/shstk: Check that signal frame is shadow stack mem x86/shstk: Check that SSP is aligned on sigreturn x86/shstk: Handle signals for shadow stack x86/shstk: Introduce routines modifying shstk x86/shstk: Handle thread shadow stack x86/shstk: Add user-mode shadow stack support ...
2023-08-30Merge tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-0/+1
Pull drm updates from Dave Airlie: "The drm core grew a new generic gpu virtual address manager, and new execution locking helpers. These are used by nouveau now to provide uAPI support for the userspace Vulkan driver. AMD had a bunch of new IP core support, loads of refactoring around fbdev, but mostly just the usual amount of stuff across the board. core: - fix gfp flags in drmm_kmalloc gpuva: - add new generic GPU VA manager (for nouveau initially) syncobj: - add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl dma-buf: - acquire resv lock for mmap() in exporters - support dma-buf self import automatically - docs fixes backlight: - fix fbdev interactions atomic: - improve logging prime: - remove struct gem_prim_mmap plus driver updates gem: - drm_exec: add locking over multiple GEM objects - fix lockdep checking fbdev: - make fbdev userspace interfaces optional - use linux device instead of fbdev device - use deferred i/o helper macros in various drivers - Make FB core selectable without drivers - Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT - Add helper macros and Kconfig tokens for DMA-allocated framebuffer ttm: - support init_on_free - swapout fixes panel: - panel-edp: Support AUO B116XAB01.4 - Support Visionox R66451 plus DT bindings - ld9040: - Backlight support - magic improved - Kconfig fix - Convert to of_device_get_match_data() - Fix Kconfig dependencies - simple: - Set bpc value to fix warning - Set connector type for AUO T215HVN01 - Support Innolux G156HCE-L01 plus DT bindings - ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings - startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings - sitronix-st7789v: - Support Inanbo T28CP45TN89 plus DT bindings - Support EDT ET028013DMA plus DT bindings - Various cleanups - edp: Add timings for N140HCA-EAC - Allow panels and touchscreens to power sequence together - Fix Innolux G156HCE-L01 LVDS clock bridge: - debugfs for chains support - dw-hdmi: - Improve support for YUV420 bus format - CEC suspend/resume - update EDID on HDMI detect - dw-mipi-dsi: Fix enable/disable of DSI controller - lt9611uxc: Use MODULE_FIRMWARE() - ps8640: Remove broken EDID code - samsung-dsim: Fix command transfer - tc358764: - Handle HS/VS polarity - Use BIT() macro - Various cleanups - adv7511: Fix low refresh rate - anx7625: - Switch to macros instead of hardcoded values - locking fixes - tc358767: fix hardware delays - sitronix-st7789v: - Support panel orientation - Support rotation property - Add support for Jasonic JT240MHQS-HWT-EK-E3 plus DT bindings amdgpu: - SDMA 6.1.0 support - HDP 6.1 support - SMUIO 14.0 support - PSP 14.0 support - IH 6.1 support - Lots of checkpatch cleanups - GFX 9.4.3 updates - Add USB PD and IFWI flashing documentation - GPUVM updates - RAS fixes - DRR fixes - FAMS fixes - Virtual display fixes - Soft IH fixes - SMU13 fixes - Rework PSP firmware loading for other IPs - Kernel doc fixes - DCN 3.0.1 fixes - LTTPR fixes - DP MST fixes - DCN 3.1.6 fixes - SMU 13.x fixes - PSP 13.x fixes - SubVP fixes - GC 9.4.3 fixes - Display bandwidth calculation fixes - VCN4 secure submission fixes - Allow building DC on RISC-V - Add visible FB info to bo_print_info - HBR3 fixes - GFX9 MCBP fix - GMC10 vmhub index fix - GMC11 vmhub index fix - Create a new doorbell manager - SR-IOV fixes - initial freesync panel replay support - revert zpos properly until igt regression is fixeed - use TTM to manage doorbell BAR - Expose both current and average power via hwmon if supported amdkfd: - Cleanup CRIU dma-buf handling - Use KIQ to unmap HIQ - GFX 9.4.3 debugger updates - GFX 9.4.2 debugger fixes - Enable cooperative groups fof gfx11 - SVM fixes - Convert older APUs to use dGPU path like newer APUs - Drop IOMMUv2 path as it is no longer used - TBA fix for aldebaran i915: - ICL+ DSI modeset sequence - HDCP improvements - MTL display fixes and cleanups - HSW/BDW PSR1 restored - Init DDI ports in VBT order - General display refactors - Start using plane scale factor for relative data rate - Use shmem for dpt objects - Expose RPS thresholds in sysfs - Apply GuC SLPC min frequency softlimit correctly - Extend Wa_14015795083 to TGL, RKL, DG1 and ADL - Fix a VMA UAF for multi-gt platform - Do not use stolen on MTL due to HW bug - Check HuC and GuC version compatibility on MTL - avoid infinite GPU waits due to premature release of request memory - Fixes and updates for GSC memory allocation - Display SDVO fixes - Take stolen handling out of FBC code - Make i915_coherent_map_type GT-centric - Simplify shmem_create_from_object map_type msm: - SM6125 MDSS support - DPU: SM6125 DPU support - DSI: runtime PM support, burst mode support - DSI PHY: SM6125 support in 14nm DSI PHY driver - GPU: prepare for a7xx - fix a690 firmware - disable relocs on a6xx and newer radeon: - Lots of checkpatch cleanups ast: - improve device-model detection - Represent BMV as virtual connector - Report DP connection status nouveau: - add new exec/bind interface to support Vulkan - document some getparam ioctls - improve VRAM detection - various fixes/cleanups - workraound DPCD issues ivpu: - MMU updates - debugfs support - Support vpu4 virtio: - add sync object support atmel-hlcdc: - Support inverted pixclock polarity etnaviv: - runtime PM cleanups - hang handling fixes exynos: - use fbdev DMA helpers - fix possible NULL ptr dereference komeda: - always attach encoder omapdrm: - use fbdev DMA helpers ingenic: - kconfig regmap fixes loongson: - support display controller mediatek: - Small mtk-dpi cleanups - DisplayPort: support eDP and aux-bus - Fix coverity issues - Fix potential memory leak if vmap() fail mgag200: - minor fixes mxsfb: - support disabling overlay planes panfrost: - fix sync in IRQ handling ssd130x: - Support per-controller default resolution plus DT bindings - Reduce memory-allocation overhead - Improve intermediate buffer size computation - Fix allocation of temporary buffers - Fix pitch computation - Fix shadow plane allocation tegra: - use fbdev DMA helpers - Convert to devm_platform_ioremap_resource() - support bridge/connector - enable PM tidss: - Support TI AM625 plus DT bindings - Implement new connector model plus driver updates vkms: - improve write back support - docs fixes - support gamma LUT zynqmp-dpsub: - misc fixes" * tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm: (1327 commits) drm/gpuva_mgr: remove unused prev pointer in __drm_gpuva_sm_map() drm/tests/drm_kunit_helpers: Place correct function name in the comment header drm/nouveau: uapi: don't pass NO_PREFETCH flag implicitly drm/nouveau: uvmm: fix unset region pointer on remap drm/nouveau: sched: avoid job races between entities drm/i915: Fix HPD polling, reenabling the output poll work as needed drm: Add an HPD poll helper to reschedule the poll work drm/i915: Fix TLB-Invalidation seqno store drm/ttm/tests: Fix type conversion in ttm_pool_test drm/msm/a6xx: Bail out early if setting GPU OOB fails drm/msm/a6xx: Move LLC accessors to the common header drm/msm/a6xx: Introduce a6xx_llc_read drm/ttm/tests: Require MMU when testing drm/panel: simple: Fix Innolux G156HCE-L01 LVDS clock Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0"" drm/amdgpu: Add memory vendor information drm/amd: flush any delayed gfxoff on suspend entry drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix drm/amdgpu: Remove gfxoff check in GFX v9.4.3 drm/amd/pm: Update pci link speed for smu v13.0.6 ...
2023-08-30Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of ↵Linus Torvalds3-22/+10
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - An extensive rework of kexec and crash Kconfig from Eric DeVolder ("refactor Kconfig to consolidate KEXEC and CRASH options") - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a couple of macros to args.h") - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper commands") - vsprintf inclusion rationalization from Andy Shevchenko ("lib/vsprintf: Rework header inclusions") - Switch the handling of kdump from a udev scheme to in-kernel handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory hot un/plug") - Many singleton patches to various parts of the tree * tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits) document while_each_thread(), change first_tid() to use for_each_thread() drivers/char/mem.c: shrink character device's devlist[] array x86/crash: optimize CPU changes crash: change crash_prepare_elf64_headers() to for_each_possible_cpu() crash: hotplug support for kexec_load() x86/crash: add x86 crash hotplug support crash: memory and CPU hotplug sysfs attributes kexec: exclude elfcorehdr from the segment digest crash: add generic infrastructure for crash hotplug support crash: move a few code bits to setup support of crash hotplug kstrtox: consistently use _tolower() kill do_each_thread() nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse scripts/bloat-o-meter: count weak symbol sizes treewide: drop CONFIG_EMBEDDED lockdep: fix static memory detection even more lib/vsprintf: declare no_hash_pointers in sprintf.h lib/vsprintf: split out sprintf() and friends kernel/fork: stop playing lockless games for exe_file replacement adfs: delete unused "union adfs_dirtail" definition ...
2023-08-30Merge tag 'mm-stable-2023-08-28-18-26' of ↵Linus Torvalds8-38/+42
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") - Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which reduces the special-case code for handling hugetlb pages in GUP. It also speeds up GUP handling of transparent hugepages. - Peng Zhang provides some maple tree speedups ("Optimize the fast path of mas_store()"). - Sergey Senozhatsky has improved te performance of zsmalloc during compaction (zsmalloc: small compaction improvements"). - Domenico Cerasuolo has developed additional selftest code for zswap ("selftests: cgroup: add zswap test program"). - xu xin has doe some work on KSM's handling of zero pages. These changes are mainly to enable the user to better understand the effectiveness of KSM's treatment of zero pages ("ksm: support tracking KSM-placed zero-pages"). - Jeff Xu has fixes the behaviour of memfd's MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED"). - David Howells has fixed an fscache optimization ("mm, netfs, fscache: Stop read optimisation when folio removed from pagecache"). - Axel Rasmussen has given userfaultfd the ability to simulate memory poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD"). - Miaohe Lin has contributed some routine maintenance work on the memory-failure code ("mm: memory-failure: remove unneeded PageHuge() check"). - Peng Zhang has contributed some maintenance work on the maple tree code ("Improve the validation for maple tree and some cleanup"). - Hugh Dickins has optimized the collapsing of shmem or file pages into THPs ("mm: free retracted page table by RCU"). - Jiaqi Yan has a patch series which permits us to use the healthy subpages within a hardware poisoned huge page for general purposes ("Improve hugetlbfs read on HWPOISON hugepages"). - Kemeng Shi has done some maintenance work on the pagetable-check code ("Remove unused parameters in page_table_check"). - More folioification work from Matthew Wilcox ("More filesystem folio conversions for 6.6"), ("Followup folio conversions for zswap"). And from ZhangPeng ("Convert several functions in page_io.c to use a folio"). - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext"). - Baoquan He has converted some architectures to use the GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take GENERIC_IOREMAP way"). - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support batched/deferred tlb shootdown during page reclamation/migration"). - Better maple tree lockdep checking from Liam Howlett ("More strict maple tree lockdep"). Liam also developed some efficiency improvements ("Reduce preallocations for maple tree"). - Cleanup and optimization to the secondary IOMMU TLB invalidation, from Alistair Popple ("Invalidate secondary IOMMU TLB on permission upgrade"). - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes for arm64"). - Kemeng Shi provides some maintenance work on the compaction code ("Two minor cleanups for compaction"). - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most file-backed faults under the VMA lock"). - Aneesh Kumar contributes code to use the vmemmap optimization for DAX on ppc64, under some circumstances ("Add support for DAX vmemmap optimization for ppc64"). - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client data in page_ext"), ("minor cleanups to page_ext header"). - Some zswap cleanups from Johannes Weiner ("mm: zswap: three cleanups"). - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan"). - VMA handling cleanups from Kefeng Wang ("mm: convert to vma_is_initial_heap/stack()"). - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes: implement DAMOS tried total bytes file"), ("Extend DAMOS filters for address ranges and DAMON monitoring targets"). - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction"). - Liam Howlett has improved the maple tree node replacement code ("maple_tree: Change replacement strategy"). - ZhangPeng has a general code cleanup - use the K() macro more widely ("cleanup with helper macro K()"). - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap on memory feature on ppc64"). - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list in page_alloc"), ("Two minor cleanups for get pageblock migratetype"). - Vishal Moola introduces a memory descriptor for page table tracking, "struct ptdesc" ("Split ptdesc from struct page"). - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups for vm.memfd_noexec"). - MM include file rationalization from Hugh Dickins ("arch: include asm/cacheflush.h in asm/hugetlb.h"). - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text output"). - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use object_cache instead of kmemleak_initialized"). - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor and _folio_order"). - A VMA locking scalability improvement from Suren Baghdasaryan ("Per-VMA lock support for swap and userfaults"). - pagetable handling cleanups from Matthew Wilcox ("New page table range API"). - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop using page->private on tail pages for THP_SWAP + cleanups"). - Cleanups and speedups to the hugetlb fault handling from Matthew Wilcox ("Change calling convention for ->huge_fault"). - Matthew Wilcox has also done some maintenance work on the MM subsystem documentation ("Improve mm documentation"). * tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits) maple_tree: shrink struct maple_tree maple_tree: clean up mas_wr_append() secretmem: convert page_is_secretmem() to folio_is_secretmem() nios2: fix flush_dcache_page() for usage from irq context hugetlb: add documentation for vma_kernel_pagesize() mm: add orphaned kernel-doc to the rst files. mm: fix clean_record_shared_mapping_range kernel-doc mm: fix get_mctgt_type() kernel-doc mm: fix kernel-doc warning from tlb_flush_rmaps() mm: remove enum page_entry_size mm: allow ->huge_fault() to be called without the mmap_lock held mm: move PMD_ORDER to pgtable.h mm: remove checks for pte_index memcg: remove duplication detection for mem_cgroup_uncharge_swap mm/huge_memory: work on folio->swap instead of page->private when splitting folio mm/swap: inline folio_set_swap_entry() and folio_swap_entry() mm/swap: use dedicated entry for swap in folio mm/swap: stop using page->private on tail pages for THP_SWAP selftests/mm: fix WARNING comparing pointer to 0 selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check ...
2023-08-29Merge tag 'perf-core-2023-08-28' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event updates from Ingo Molnar: - AMD IBS improvements - Intel PMU driver updates - Extend core perf facilities & the ARM PMU driver to better handle ARM big.LITTLE events - Micro-optimize software events and the ring-buffer code - Misc cleanups & fixes * tag 'perf-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/uncore: Remove unnecessary ?: operator around pcibios_err_to_errno() call perf/x86/intel: Add Crestmont PMU x86/cpu: Update Hybrids x86/cpu: Fix Crestmont uarch x86/cpu: Fix Gracemont uarch perf: Remove unused extern declaration arch_perf_get_page_size() perf: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability arm_pmu: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability perf/x86: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability perf/x86/ibs: Set mem_lvl_num, mem_remote and mem_hops for data_src perf/mem: Add PERF_MEM_LVLNUM_NA to PERF_MEM_NA perf/mem: Introduce PERF_MEM_LVLNUM_UNC perf/ring_buffer: Use local_try_cmpxchg in __perf_output_begin locking/arch: Avoid variable shadowing in local_try_cmpxchg() perf/core: Use local64_try_cmpxchg in perf_swevent_set_period perf/x86: Use local64_try_cmpxchg perf/amd: Prevent grouping of IBS events
2023-08-26LoongArch: Fix hw_breakpoint_control() for watchpointsHuacai Chen1-2/+1
In hw_breakpoint_control(), encode_ctrl_reg() has already encoded the MWPnCFG3_LoadEn/MWPnCFG3_StoreEn bits in info->ctrl. We don't need to add (1 << MWPnCFG3_LoadEn | 1 << MWPnCFG3_StoreEn) unconditionally. Otherwise we can't set read watchpoint and write watchpoint separately. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-26LoongArch: Ensure FP/SIMD registers in the core dump file is up to dateHuacai Chen2-4/+22
This is a port of commit 379eb01c21795edb4c ("riscv: Ensure the value of FP registers in the core dump file is up to date"). The values of FP/SIMD registers in the core dump file come from the thread.fpu. However, kernel saves the FP/SIMD registers only before scheduling out the process. If no process switch happens during the exception handling, kernel will not have a chance to save the latest values of FP/SIMD registers. So it may cause their values in the core dump file incorrect. To solve this problem, force fpr_get()/simd_get() to save the FP/SIMD registers into the thread.fpu if the target task equals the current task. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25LoongArch: Put the body of play_dead() into arch_cpu_idle_dead()Tiezhu Yang3-10/+1
The initial aim is to silence the following objtool warning: arch/loongarch/kernel/process.o: warning: objtool: arch_cpu_idle_dead() falls through to next function start_thread() According to tools/objtool/Documentation/objtool.txt, this is because the last instruction of arch_cpu_idle_dead() is a call to a noreturn function play_dead(). In order to silence the warning, one simple way is to add the noreturn function play_dead() to objtool's hard-coded global_noreturns array, that is to say, just put "NORETURN(play_dead)" into tools/objtool/noreturns.h, it works well. But I noticed that play_dead() is only defined once and only called by arch_cpu_idle_dead(), so put the body of play_dead() into the caller arch_cpu_idle_dead(), then remove the noreturn function play_dead() is an alternative way which can reduce the overhead of the function call at the same time. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25LoongArch: Add identifier names to arguments of die() declarationTiezhu Yang1-1/+1
Add identifier names to arguments of die() declaration in ptrace.h to fix the following checkpatch warnings: WARNING: function definition argument 'const char *' should also have an identifier name WARNING: function definition argument 'struct pt_regs *' should also have an identifier name Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25LoongArch: Return earlier in die() if notify_die() returns NOTIFY_STOPTiezhu Yang1-2/+4
After the call to oops_exit(), it should not panic or execute the crash kernel if the oops is to be suppressed. Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25LoongArch: Do not kill the task in die() if notify_die() returns NOTIFY_STOPTiezhu Yang2-7/+7
If notify_die() returns NOTIFY_STOP, honor the return value from the handler chain invocation in die() and return without killing the task as, through a debugger, the fault may have been fixed. It makes sense even if ignoring the event will make the system unstable: by allowing access through a debugger it has been compromised already anyway. It makes our port consistent with x86, arm64, riscv and csky. Commit 20c0d2d44029 ("[PATCH] i386: pass proper trap numbers to die chain handlers") may be the earliest of similar changes. Link: https://lore.kernel.org/r/43DDF02E.76F0.0078.0@novell.com/ Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25LoongArch: Remove <asm/export.h>Masahiro Yamada1-1/+0
All *.S files under arch/loongarch/ have been converted to include <linux/export.h> instead of <asm/export.h>. Remove <asm/export.h>. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25LoongArch: Replace #include <asm/export.h> with #include <linux/export.h>Masahiro Yamada8-8/+8
Commit ddb5cdbafaaad ("kbuild: generate KSYMTAB entries by modpost") deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>. Replace #include <asm/export.h> with #include <linux/export.h>. After all the <asm/export.h> lines are converted, <asm/export.h> and <asm-generic/export.h> will be removed. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25LoongArch: Remove unneeded #include <asm/export.h>Masahiro Yamada3-3/+0
There is no EXPORT_SYMBOL() line there, hence #include <asm/export.h> is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25LoongArch: Replace -ffreestanding with finer-grained -fno-builtin'sWANG Xuerui1-1/+1
As explained by Nick in the original issue: the kernel usually does a good job of providing library helpers that have similar semantics as their ordinary userspace libc equivalents, but -ffreestanding disables such libcall optimization and other related features in the compiler, which can lead to unexpected things such as CONFIG_FORTIFY_SOURCE not working (!). However, due to the desire for better control over unaligned accesses with respect to CONFIG_ARCH_STRICT_ALIGN, and also for avoiding the GCC bug https://gcc.gnu.org/PR109465, we do want to still disable optimizations for the memory libcalls (memcpy, memmove and memset for now). Use finer-grained -fno-builtin-* toggles to achieve this without losing source fortification and other libcall optimizations. Closes: https://github.com/ClangBuiltLinux/linux/issues/1897 Reported-by: Nathan Chancellor <nathan@kernel.org> Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25LoongArch: Remove redundant "source drivers/firmware/Kconfig"Xi Ruoyao1-2/+0
In drivers/Kconfig, drivers/firmware/Kconfig is sourced for all ports so there is no need to source it in the port-specific Kconfig file. And sourcing it here also caused the "Firmware Drivers" menu appeared two times: one in the "Device Drivers" menu, another in the toplevel menu. This is really puzzling so remove it. Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-08-25mm: rationalise flush_icache_pages() and flush_icache_page()Matthew Wilcox (Oracle)1-2/+0
Move the default (no-op) implementation of flush_icache_pages() to <linux/cacheflush.h> from <asm-generic/cacheflush.h>. Remove the flush_icache_page() wrapper from each architecture into <linux/cacheflush.h>. Link: https://lkml.kernel.org/r/20230802151406.3735276-32-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-25loongarch: implement the new page table range APIMatthew Wilcox (Oracle)5-19/+23
Add update_mmu_cache_range() and change _PFN_SHIFT to PFN_PTE_SHIFT. It would probably be more efficient to implement __update_tlb() by flushing the entire folio instead of calling __update_tlb() N times, but I'll leave that for someone who understands the architecture better. Link: https://lkml.kernel.org/r/20230802151406.3735276-15-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24BackMerge tag 'v6.5-rc7' into drm-nextDave Airlie8-16/+30
Linux 6.5-rc7 This is needed for the CI stuff and the msm pull has fixes in it. Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-08-21loongarch: convert various functions to use ptdescsVishal Moola (Oracle)2-15/+19
As part of the conversions to replace pgtable constructor/destructors with ptdesc equivalents, convert various page table functions to use ptdescs. Some of the functions use the *get*page*() helper functions. Convert these to use pagetable_alloc() and ptdesc_address() instead to help standardize page tables further. Link: https://lkml.kernel.org/r/20230807230513.102486-22-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Guo Ren <guoren@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matthew Wilcox <willy@infradead.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18nmi_backtrace: allow excluding an arbitrary CPUDouglas Anderson2-3/+3
The APIs that allow backtracing across CPUs have always had a way to exclude the current CPU. This convenience means callers didn't need to find a place to allocate a CPU mask just to handle the common case. Let's extend the API to take a CPU ID to exclude instead of just a boolean. This isn't any more complex for the API to handle and allows the hardlockup detector to exclude a different CPU (the one it already did a trace for) without needing to find space for a CPU mask. Arguably, this new API also encourages safer behavior. Specifically if the caller wants to avoid tracing the current CPU (maybe because they already traced the current CPU) this makes it more obvious to the caller that they need to make sure that the current CPU ID can't change. [akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub] Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: kernel test robot <lkp@intel.com> Cc: Lecopzer Chen <lecopzer.chen@mediatek.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Pingfan Liu <kernelfans@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>