summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)AuthorFilesLines
2024-05-14Merge tag 'timers-core-2024-05-12' of ↵Linus Torvalds1-15/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timers and timekeeping updates from Thomas Gleixner: "Core code: - Make timekeeping and VDSO time readouts resilent against math overflow: In guest context the kernel is prone to math overflow when the host defers the timer interrupt due to overload, malfunction or malice. This can be mitigated by checking the clocksource delta for the maximum deferrement which is readily available. If that value is exceeded then the code uses a slowpath function which can handle the multiplication overflow. This functionality is enabled unconditionally in the kernel, but made conditional in the VDSO code. The latter is conditional because it allows architectures to optimize the check so it is not causing performance regressions. On X86 this is achieved by reworking the existing check for negative TSC deltas as a negative delta obviously exceeds the maximum deferrement when it is evaluated as an unsigned value. That avoids two conditionals in the hotpath and allows to hide both the negative delta and the large delta handling in the same slow path. - Add an initial minimal ktime_t abstraction for Rust - The usual boring cleanups and enhancements Drivers: - Boring updates to device trees and trivial enhancements in various drivers" * tag 'timers-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) clocksource/drivers/arm_arch_timer: Mark hisi_161010101_oem_info const clocksource/drivers/timer-ti-dm: Remove an unused field in struct dmtimer clocksource/drivers/renesas-ostm: Avoid reprobe after successful early probe clocksource/drivers/renesas-ostm: Allow OSTM driver to reprobe for RZ/V2H(P) SoC dt-bindings: timer: renesas: ostm: Document Renesas RZ/V2H(P) SoC rust: time: doc: Add missing C header links clocksource: Make the int help prompt unit readable in ncurses hrtimer: Rename __hrtimer_hres_active() to hrtimer_hres_active() timerqueue: Remove never used function timerqueue_node_expires() rust: time: Add Ktime vdso: Fix powerpc build U64_MAX undeclared error clockevents: Convert s[n]printf() to sysfs_emit() clocksource: Convert s[n]printf() to sysfs_emit() clocksource: Make watchdog and suspend-timing multiplication overflow safe timekeeping: Let timekeeping_cycles_to_ns() handle both under and overflow timekeeping: Make delta calculation overflow safe timekeeping: Prepare timekeeping_cycles_to_ns() for overflow safety timekeeping: Fold in timekeeping_delta_to_ns() timekeeping: Consolidate timekeeping helpers timekeeping: Refactor timekeeping helpers ...
2024-05-14Makefile: remove redundant tool coverage variablesMasahiro Yamada2-11/+0
Now Kbuild provides reasonable defaults for objtool, sanitizers, and profilers. Remove redundant variables. Note: This commit changes the coverage for some objects: - include arch/mips/vdso/vdso-image.o into UBSAN, GCOV, KCOV - include arch/sparc/vdso/vdso-image-*.o into UBSAN - include arch/sparc/vdso/vma.o into UBSAN - include arch/x86/entry/vdso/extable.o into KASAN, KCSAN, UBSAN, GCOV, KCOV - include arch/x86/entry/vdso/vdso-image-*.o into KASAN, KCSAN, UBSAN, GCOV, KCOV - include arch/x86/entry/vdso/vdso32-setup.o into KASAN, KCSAN, UBSAN, GCOV, KCOV - include arch/x86/entry/vdso/vma.o into GCOV, KCOV - include arch/x86/um/vdso/vma.o into KASAN, GCOV, KCOV I believe these are positive effects because all of them are kernel space objects. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Roberto Sassu <roberto.sassu@huawei.com>
2024-05-14powerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where appropriateMike Rapoport (IBM)6-9/+9
There are places where CONFIG_MODULES guards the code that depends on memory allocation being done with module_alloc(). Replace CONFIG_MODULES with CONFIG_EXECMEM in such places. Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-05-14arch: make execmem setup available regardless of CONFIG_MODULESMike Rapoport (IBM)2-63/+64
execmem does not depend on modules, on the contrary modules use execmem. To make execmem available when CONFIG_MODULES=n, for instance for kprobes, split execmem_params initialization out from arch/*/kernel/module.c and compile it when CONFIG_EXECMEM=y Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-05-14powerpc: extend execmem_params for kprobes allocationsMike Rapoport (IBM)2-20/+7
powerpc overrides kprobes::alloc_insn_page() to remove writable permissions when STRICT_MODULE_RWX is on. Add definition of EXECMEM_KRPOBES to execmem_params to allow using the generic kprobes::alloc_insn_page() with the desired permissions. As powerpc uses breakpoint instructions to inject kprobes, it does not need to constrain kprobe allocations to the modules area and can use the entire vmalloc address space. Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-05-14mm/execmem, arch: convert remaining overrides of module_alloc to execmemMike Rapoport (IBM)1-21/+39
Extend execmem parameters to accommodate more complex overrides of module_alloc() by architectures. This includes specification of a fallback range required by arm, arm64 and powerpc, EXECMEM_MODULE_DATA type required by powerpc, support for allocation of KASAN shadow required by s390 and x86 and support for late initialization of execmem required by arm64. The core implementation of execmem_alloc() takes care of suppressing warnings when the initial allocation fails but there is a fallback range defined. Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Song Liu <song@kernel.org> Tested-by: Liviu Dudau <liviu@dudau.co.uk> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-05-14mm: introduce execmem_alloc() and execmem_free()Mike Rapoport (IBM)1-3/+3
module_alloc() is used everywhere as a mean to allocate memory for code. Beside being semantically wrong, this unnecessarily ties all subsystems that need to allocate code, such as ftrace, kprobes and BPF to modules and puts the burden of code allocation to the modules code. Several architectures override module_alloc() because of various constraints where the executable memory can be located and this causes additional obstacles for improvements of code allocation. Start splitting code allocation from modules by introducing execmem_alloc() and execmem_free() APIs. Initially, execmem_alloc() is a wrapper for module_alloc() and execmem_free() is a replacement of module_memfree() to allow updating all call sites to use the new APIs. Since architectures define different restrictions on placement, permissions, alignment and other parameters for memory that can be used by different subsystems that allocate executable memory, execmem_alloc() takes a type argument, that will be used to identify the calling subsystem and to allow architectures define parameters for ranges suitable for that subsystem. No functional changes. Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-05-14Merge tag 'sched-core-2024-05-13' of ↵Linus Torvalds3-14/+22
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Add cpufreq pressure feedback for the scheduler - Rework misfit load-balancing wrt affinity restrictions - Clean up and simplify the code around ::overutilized and ::overload access. - Simplify sched_balance_newidle() - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES handling that changed the output. - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch() - Reorganize, clean up and unify most of the higher level scheduler balancing function names around the sched_balance_*() prefix - Simplify the balancing flag code (sched_balance_running) - Miscellaneous cleanups & fixes * tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) sched/pelt: Remove shift of thermal clock sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() thermal/cpufreq: Remove arch_update_thermal_pressure() sched/cpufreq: Take cpufreq feedback into account cpufreq: Add a cpufreq pressure feedback for the scheduler sched/fair: Fix update of rd->sg_overutilized sched/vtime: Do not include <asm/vtime.h> header s390/irq,nmi: Include <asm/vtime.h> header directly s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover sched/vtime: Get rid of generic vtime_task_switch() implementation sched/vtime: Remove confusing arch_vtime_task_switch() declaration sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized() sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded() sched/fair: Rename root_domain::overload to ::overloaded sched/fair: Use helper functions to access root_domain::overload sched/fair: Check root_domain::overload value before update sched/fair: Combine EAS check with root_domain::overutilized access sched/fair: Simplify the continue_balancing logic in sched_balance_newidle() ...
2024-05-14Merge tag 'execve-6.10-rc1' of ↵Linus Torvalds2-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve updates from Kees Cook: - Provide knob to change (previously fixed) coredump NOTES size (Allen Pais) - Add sched_prepare_exec tracepoint (Marco Elver) - Make /proc/$pid/auxv work under binfmt_elf_fdpic (Max Filippov) - Convert ARCH_HAVE_EXTRA_ELF_NOTES to proper Kconfig (Vignesh Balasubramanian) - Leave a gap between .bss and brk * tag 'execve-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: fs/coredump: Enable dynamic configuration of max file note size binfmt_elf_fdpic: fix /proc/<pid>/auxv binfmt_elf: Leave a gap between .bss and brk Replace macro "ARCH_HAVE_EXTRA_ELF_NOTES" with kconfig tracing: Add sched_prepare_exec tracepoint
2024-05-13Merge branch 'topic/kdump-hotplug' into nextMichael Ellerman9-305/+653
Merge our topic branch containing kdump hotplug changes, more detail from the original cover letter: Commit 247262756121 ("crash: add generic infrastructure for crash hotplug support") added a generic infrastructure that allows architectures to selectively update the kdump image component during CPU or memory add/remove events within the kernel itself. This patch series adds crash hotplug handler for PowerPC and enable support to update the kdump image on CPU/Memory add/remove events. Among the 6 patches in this series, the first two patches make changes to the generic crash hotplug handler to assist PowerPC in adding support for this feature. The last four patches add support for this feature. The following section outlines the problem addressed by this patch series, along with the current solution, its shortcomings, and the proposed resolution. Problem: ======== Due to CPU/Memory hotplug or online/offline events the elfcorehdr (which describes the CPUs and memory of the crashed kernel) and FDT (Flattened Device Tree) of kdump image becomes outdated. Consequently, attempting dump collection with an outdated elfcorehdr or FDT can lead to failed or inaccurate dump collection. Going forward CPU hotplug or online/offline events are referred as CPU/Memory add/remove events. Existing solution and its shortcoming: ====================================== The current solution to address the above issue involves monitoring the CPU/memory add/remove events in userspace using udev rules and whenever there are changes in CPU and memory resources, the entire kdump image is loaded again. The kdump image includes kernel, initrd, elfcorehdr, FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to CPU/Memory add/remove events, reloading the entire kdump image is inefficient. More importantly, kdump remains inactive for a substantial amount of time until the kdump reload completes. Proposed solution: ================== Instead of initiating a full kdump image reload from userspace on CPU/Memory hotplug and online/offline events, the proposed solution aims to update only the necessary kdump image component within the kernel itself.
2024-05-13Merge branch 'topic/ppc-kvm' into nextMichael Ellerman3-7/+3
Merge our KVM topic branch.
2024-05-10Merge tag 'loongarch-kvm-6.10' of ↵Paolo Bonzini5-13/+25
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.10 1. Add ParaVirt IPI support. 2. Add software breakpoint support. 3. Add mmio trace events support.
2024-05-10powerpc/85xx: fix compile error without CONFIG_CRASH_DUMPHari Bathini1-3/+6
Since commit 5c4233cc0920 ("powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency"), crashing_cpu is not available without CONFIG_CRASH_DUMP. Fix compile error on 64-BIT 85xx owing to this change. Fixes: 5c4233cc0920 ("powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency") Cc: stable@vger.kernel.org # v6.9+ Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Closes: https://lore.kernel.org/all/fa247ae4-5825-4dbe-a737-d93b7ab4d4b9@xenosoft.de/ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240510080757.560159-1-hbathini@linux.ibm.com
2024-05-10powerpc/fadump: pass additional parameters when fadump is activeHari Bathini3-0/+40
Append the additional parameters passed/set in the dedicated parameter area (RTAS_FADUMP_PARAM_AREA) to bootargs in fadump capture kernel. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240509115755.519982-4-hbathini@linux.ibm.com
2024-05-10powerpc/fadump: setup additional parameters for dump capture kernelHari Bathini5-9/+133
For fadump case, passing additional parameters to dump capture kernel helps in minimizing the memory footprint for it and also provides the flexibility to disable components/modules, like hugepages, that are hindering the boot process of the special dump capture environment. Set up a dedicated parameter area to be passed to the capture kernel. This area type is defined as RTAS_FADUMP_PARAM_AREA. Sysfs attribute '/sys/kernel/fadump/bootargs_append' is exported to the userspace to specify the additional parameters to be passed to the capture kernel Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240509115755.519982-3-hbathini@linux.ibm.com
2024-05-10powerpc/pseries/fadump: add support for multiple boot memory regionsHari Bathini5-120/+197
Currently, fadump on pseries assumes a single boot memory region even though f/w supports more than one boot memory region. Add support for more boot memory regions to make the implementation flexible for any enhancements that introduce other region types. For this, rtas memory structure for fadump is updated to have multiple boot memory regions instead of just one. Additionally, methods responsible for creating the fadump memory structure during both the first and second kernel boot have been modified to take these multiple boot memory regions into account. Also, a new callback has been added to the fadump_ops structure to get the maximum boot memory regions supported by the platform. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240509115755.519982-2-hbathini@linux.ibm.com
2024-05-09kbuild: use $(src) instead of $(srctree)/$(src) for source directoryMasahiro Yamada4-9/+7
Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for checked-in source files. It is merely a convention without any functional difference. In fact, $(obj) and $(src) are exactly the same, as defined in scripts/Makefile.build: src := $(obj) When the kernel is built in a separate output directory, $(src) does not accurately reflect the source directory location. While Kbuild resolves this discrepancy by specifying VPATH=$(srctree) to search for source files, it does not cover all cases. For example, when adding a header search path for local headers, -I$(srctree)/$(src) is typically passed to the compiler. This introduces inconsistency between upstream and downstream Makefiles because $(src) is used instead of $(srctree)/$(src) for the latter. To address this inconsistency, this commit changes the semantics of $(src) so that it always points to the directory in the source tree. Going forward, the variables used in Makefiles will have the following meanings: $(obj) - directory in the object tree $(src) - directory in the source tree (changed by this commit) $(objtree) - the top of the kernel object tree $(srctree) - the top of the kernel source tree Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced with $(src). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-05-07mm: fix race between __split_huge_pmd_locked() and GUP-fastRyan Roberts1-0/+1
__split_huge_pmd_locked() can be called for a present THP, devmap or (non-present) migration entry. It calls pmdp_invalidate() unconditionally on the pmdp and only determines if it is present or not based on the returned old pmd. This is a problem for the migration entry case because pmd_mkinvalid(), called by pmdp_invalidate() must only be called for a present pmd. On arm64 at least, pmd_mkinvalid() will mark the pmd such that any future call to pmd_present() will return true. And therefore any lockless pgtable walker could see the migration entry pmd in this state and start interpretting the fields as if it were present, leading to BadThings (TM). GUP-fast appears to be one such lockless pgtable walker. x86 does not suffer the above problem, but instead pmd_mkinvalid() will corrupt the offset field of the swap entry within the swap pte. See link below for discussion of that problem. Fix all of this by only calling pmdp_invalidate() for a present pmd. And for good measure let's add a warning to all implementations of pmdp_invalidate[_ad](). I've manually reviewed all other pmdp_invalidate[_ad]() call sites and believe all others to be conformant. This is a theoretical bug found during code review. I don't have any test case to trigger it in practice. Link: https://lkml.kernel.org/r/20240501143310.1381675-1-ryan.roberts@arm.com Link: https://lore.kernel.org/all/0dd7827a-6334-439a-8fd0-43c98e6af22b@arm.com/ Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path") Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-07Merge tag 'kvm-riscv-6.10-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini1-2/+1
KVM/riscv changes for 6.10 - Support guest breakpoints using ebreak - Introduce per-VCPU mp_state_lock and reset_cntx_lock - Virtualize SBI PMU snapshot and counter overflow interrupts - New selftests for SBI PMU and Guest ebreak
2024-05-07KVM: PPC: Book3S HV nestedv2: Fix an error handling path in ↵Christophe JAILLET1-2/+2
gs_msg_ops_kvmhv_nestedv2_config_fill_info() The return value of kvmppc_gse_put_buff_info() is not assigned to 'rc' and 'rc' is uninitialized at this point. So the error handling can not work. Assign the expected value to 'rc' to fix the issue. Fixes: 19d31c5f1157 ("KVM: PPC: Add support for nestedv2 guests") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/a7ed4cc12e0a0bbd97fac44fe6c222d1c393ec95.1706441651.git.christophe.jaillet@wanadoo.fr
2024-05-07KVM: PPC: code cleanup for kvmppc_book3s_irqprio_deliverKunwu Chan1-4/+0
This part was commented from commit 2f4cf5e42d13 ("Add book3s.c") in about 14 years before. If there are no plans to enable this part code in the future, we can remove this dead code. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240125083348.533883-1-chentao@kylinos.cn
2024-05-07KVM: PPC: Book3S HV nestedv2: Cancel pending DEC exceptionVaibhav Jain1-1/+1
This reverts commit 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception") [1] which prevented canceling a pending HDEC exception for nestedv2 KVM guests. It was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'DEC expiry TB' register which was higher compared to handling extra decrementer exceptions. However recent benchmarks indicate that overhead of not handling 'DECR' expiry for Nested KVM Guest(L2) is higher and results in much larger exits to Pseries Host(L1) as indicated by the Unixbench-arithoh bench[2] Metric | Current upstream | Revert [1] | Difference % ======================================================================== arithoh-count (10) | 3244831634 | 3403089673 | +04.88% kvm_hv:kvm_guest_exit | 513558 | 152441 | -70.32% probe:kvmppc_gsb_recv | 28060 | 28110 | +00.18% N=1 As indicated by the data above that reverting [1] results in substantial reduction in number of L2->L1 exits with only slight increase in number of H_GUEST_GET_STATE hcalls to read the value of 'DEC expiry TB'. This results in an overall ~4% improvement of arithoh[2] throughput. [1] commit 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception") [2] https://github.com/kdlucas/byte-unixbench/ Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240415035731.103097-1-vaibhav@linux.ibm.com
2024-05-07powerpc/xmon: Check cpu id in commands "c#", "dp#" and "dx#"Greg Kurz1-3/+3
All these commands end up peeking into the PACA using the user originated cpu id as an index. Check the cpu id is valid in order to prevent xmon to crash. Instead of printing an error, this follows the same behavior as the "lp s #" command : ignore the buggy cpu id parameter and fall back to the #-less version of the command. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/161531347060.252863.10490063933688958044.stgit@bahia.lan
2024-05-07powerpc/code-patching: Use dedicated memory routines for patchingBenjamin Gray1-4/+27
The patching page set up as a writable alias may be in quadrant 0 (userspace) if the temporary mm path is used. This causes sanitiser failures if so. Sanitiser failures also occur on the non-mm path because the plain memset family is instrumented, and KASAN treats the patching window as poisoned. Introduce locally defined patch_* variants of memset that perform an uninstrumented lower level set, as well as detecting write errors like the original single patch variant does. copy_to_user() is not correct here, as the PTE makes it a proper kernel page (the EAA is privileged access only, RW). It just happens to be in quadrant 0 because that's the hardware's mechanism for using the current PID vs PID 0 in translations. Importantly, it's incorrect to allow user page accesses. Now that the patching memsets are used, we also propagate a failure up to the caller as the single patch variant does. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240325052815.854044-2-bgray@linux.ibm.com
2024-05-07powerpc/code-patching: Test patch_instructions() during bootBenjamin Gray1-0/+92
patch_instructions() introduces new behaviour with a couple of variations. Test each case of * a repeated 32-bit instruction, * a repeated 64-bit instruction (ppc64), and * a copied sequence of instructions for both on a single page and when it crosses a page boundary. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240325052815.854044-1-bgray@linux.ibm.com
2024-05-07powerpc64/kasan: Pass virtual addresses to kasan_init_phys_region()Benjamin Gray2-2/+2
The kasan_init_phys_region() function maps shadow pages necessary for the ranges of the linear map backed by physical pages. Currently kasan_init_phys_region() is being passed physical addresses, but kasan_mem_to_shadow() expects virtual addresses. It works right now because the lower bits (12:64) of the kasan_mem_to_shadow() calculation are the same for the real and virtual addresses, so the actual PTE value is the same in the end. But virtual addresses are the intended input, so fix it. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240212045020.70364-1-bgray@linux.ibm.com
2024-05-07powerpc: rename SPRN_HID2 define to SPRN_HID2_750FXMatthias Schiffer5-9/+13
This register number is hardware-specific, rename it for clarity. FIXME comments are added in a few places where it seems like the wrong register is used. As I can't test this, only the rename is done with no functional change. Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240124105031.45734-1-matthias.schiffer@ew.tq-group.com
2024-05-07powerpc: Fix typosBjorn Helgaas29-40/+40
Fix typos, most reported by "codespell arch/powerpc". Only touches comments, no code changes. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240103231605.1801364-8-helgaas@kernel.org
2024-05-07powerpc/eeh: Fix spelling of the word "auxillary" and update commentGhanshyam Agrawal2-4/+4
Fix spelling of the word "auxillary" in arch/powerpc/kernel/eeh_pe.c and arch/powerpc/include/asm/eeh.h. Also update the eeh_set_pe_aux_size() comment to include the units. Signed-off-by: Ghanshyam Agrawal <ghanshyam1898@gmail.com> [mpe: Squash into one commit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/2ab034609285b21c309cd8ab26c937c846d37ee7.1703756365.git.ghanshyam1898@gmail.com
2024-05-07powerpc/Makefile: Remove bits related to the previous use of -mcmodel=largeNaveen N Rao9-21/+1
All supported compilers today (gcc v5.1+ and clang v11+) have support for -mcmodel=medium. As such, NO_MINIMAL_TOC is no longer being set. Remove NO_MINIMAL_TOC as well as the fallback to -mminimal-toc. Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240110141237.3179199-1-naveen@kernel.org
2024-05-07powerpc/pseries/pci: Code cleanupKunwu Chan1-27/+0
This part was commented in about 19 years before. If there are no plans to enable this part code in the future, we can remove this dead code. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240126025030.577795-1-chentao@kylinos.cn
2024-05-07powerpc/cell: Code cleanup for spufs_mfc_flushKunwu Chan1-16/+4
This part was commented from commit a33a7d7309d7 ("[PATCH] spufs: implement mfc access for PPE-side DMA") in about 18 years before. If there are no plans to enable this part code in the future, we can remove this dead code. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240126021258.574916-1-chentao@kylinos.cn
2024-05-07powerpc/iommu: Code cleanup for cell/iommu.cKunwu Chan1-17/+0
This part was commented from commit 165785e5c0be ("[POWERPC] Cell iommu support") in about 17 years before. If there are no plans to enable this part code in the future, we can remove this dead code. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240125082637.532826-1-chentao@kylinos.cn
2024-05-07powerpc: Fix preserved memory size for int-vectorsGUO Zihua1-1/+9
The first 32k of memory is reserved for interrupt vectors, however for powerpc64 this might not be enough. Fix this by reserving the maximum size between 32k and the real size of interrupt vectors. Signed-off-by: GUO Zihua <guozihua@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240113080509.1598290-1-guozihua@huawei.com
2024-05-07powerpc: dts: fsl: rename ifc node name to be memory-controllerLi Yang5-5/+5
Update the node name to be align with binding document. Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240119203911.3143928-4-Frank.Li@nxp.com
2024-05-07powerpc: dts: mpc85xx: remove "simple-bus" compatible from ifc nodeLi Yang9-9/+9
Update dts to match dts binding document. Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240119203911.3143928-3-Frank.Li@nxp.com
2024-05-07powerpc: dts: p1010rdb: fix INTx interrupt issue on P1010RDB-PBXiaowei Bao3-16/+32
Due to the INTA is shared with the active-low PHY2 interrupt on P1010RDB-PA board, so configure P1010RDB-PA's INTA with polarity as active-low, the P1010RDB-PB board is used separately, so configure P1010RDB-PB's INTA with polarity as active-high. The INTX in P1010RDB-PB do not work because of the pcie@0 node fixup will be overwrited by p1010si-post.dtsi file, so we move the pcie@0 node fixup to p1010rdb-pb.dts and p1010rdb-pb_36b.dts. Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240119203911.3143928-2-Frank.Li@nxp.com
2024-05-07powerpc: dts: add power management nodes to FSL chipsRan Wang14-12/+83
Enable Power Management feature on device tree, including MPC8536, MPC8544, MPC8548, MPC8572, P1010, P1020, P1021, P1022, P2020, P2041, P3041, T104X, T1024. Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com> Signed-off-by: Ran Wang <ran.wang_1@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240119203911.3143928-1-Frank.Li@nxp.com
2024-05-07powerpc/rtas: Add kernel-doc comments to smp_startup_cpu()Yang Li1-0/+1
This commit adds kernel-doc style comments with complete parameter descriptions for the function smp_startup_cpu(). Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240408053109.96360-2-yang.lee@linux.alibaba.com
2024-05-07powerpc: Fix kernel-doc comments in fsl_gtm.cYang Li1-3/+3
Fix some function names in kernel-doc comments. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240408053109.96360-1-yang.lee@linux.alibaba.com
2024-05-07powerpc: boot: Fix kernel-doc param for partial_decompressYang Li1-1/+1
Fix the kernel-doc annotation for the 'skip' parameter in the partial_decompress() function by adding a missing underscore and colon. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240408083916.123369-1-yang.lee@linux.alibaba.com
2024-05-07powerpc: remove unused *_syscall_64.o variables in MakefileMasahiro Yamada1-3/+0
Commit ab1a517d55b0 ("powerpc/syscall: Rename syscall_64.c into interrupt.c") missed to update these three lines: GCOV_PROFILE_syscall_64.o := n KCOV_INSTRUMENT_syscall_64.o := n UBSAN_SANITIZE_syscall_64.o := n To restore the original behavior, we could replace them with: GCOV_PROFILE_interrupt.o := n KCOV_INSTRUMENT_interrupt.o := n UBSAN_SANITIZE_interrupt.o := n However, nobody has noticed the functional change in the past three years, so they were unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240216135517.2002749-1-masahiroy@kernel.org
2024-05-07powerpc/bpf/32: Fix failing test_bpf testsChristophe Leroy2-31/+110
Recent additions in BPF like cpu v4 instructions, test_bpf module exhibits the following failures: test_bpf: #82 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #83 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #84 ALU64_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #85 ALU64_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #86 ALU64_MOVSX | BPF_W jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #165 ALU_SDIV_X: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times) test_bpf: #166 ALU_SDIV_K: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times) test_bpf: #169 ALU_SMOD_X: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times) test_bpf: #170 ALU_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times) test_bpf: #172 ALU64_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times) test_bpf: #313 BSWAP 16: 0x0123456789abcdef -> 0xefcd eBPF filter opcode 00d7 (@2) unsupported jited:0 301 PASS test_bpf: #314 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 eBPF filter opcode 00d7 (@2) unsupported jited:0 555 PASS test_bpf: #315 BSWAP 64: 0x0123456789abcdef -> 0x67452301 eBPF filter opcode 00d7 (@2) unsupported jited:0 268 PASS test_bpf: #316 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89 eBPF filter opcode 00d7 (@2) unsupported jited:0 269 PASS test_bpf: #317 BSWAP 16: 0xfedcba9876543210 -> 0x1032 eBPF filter opcode 00d7 (@2) unsupported jited:0 460 PASS test_bpf: #318 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 eBPF filter opcode 00d7 (@2) unsupported jited:0 320 PASS test_bpf: #319 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe eBPF filter opcode 00d7 (@2) unsupported jited:0 222 PASS test_bpf: #320 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476 eBPF filter opcode 00d7 (@2) unsupported jited:0 273 PASS test_bpf: #344 BPF_LDX_MEMSX | BPF_B eBPF filter opcode 0091 (@5) unsupported jited:0 432 PASS test_bpf: #345 BPF_LDX_MEMSX | BPF_H eBPF filter opcode 0089 (@5) unsupported jited:0 381 PASS test_bpf: #346 BPF_LDX_MEMSX | BPF_W eBPF filter opcode 0081 (@5) unsupported jited:0 505 PASS test_bpf: #490 JMP32_JA: Unconditional jump: if (true) return 1 eBPF filter opcode 0006 (@1) unsupported jited:0 261 PASS test_bpf: Summary: 1040 PASSED, 10 FAILED, [924/1038 JIT'ed] Fix them by adding missing processing. Fixes: daabb2b098e0 ("bpf/tests: add tests for cpuv4 instructions") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/91de862dda99d170697eb79ffb478678af7e0b27.1709652689.git.christophe.leroy@csgroup.eu
2024-05-06printk: Remove redundant CONFIG_BASE_FULLYoann Congal5-5/+5
CONFIG_BASE_FULL is equivalent to !CONFIG_BASE_SMALL and is enabled by default: CONFIG_BASE_SMALL is the special case to take care of. So, remove CONFIG_BASE_FULL and move the config choice to CONFIG_BASE_SMALL (which defaults to 'n') For defconfigs explicitely disabling BASE_FULL, explicitely enable BASE_SMALL. For defconfigs explicitely enabling BASE_FULL, drop it as it is the default. Signed-off-by: Yoann Congal <yoann.congal@smile.fr> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240505080343.1471198-4-yoann.congal@smile.fr Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-05-06powerpc/64: Set _IO_BASE to POISON_POINTER_DELTA not 0 for CONFIG_PCI=nMichael Ellerman1-1/+1
There is code that builds with calls to IO accessors even when CONFIG_PCI=n, but the actual calls are guarded by runtime checks. If not those calls would be faulting, because the page at virtual address zero is (usually) not mapped into the kernel. As Arnd pointed out, it is possible a large port value could cause the address to be above mmap_min_addr which would then access userspace, which would be a bug. To avoid any such issues, set _IO_BASE to POISON_POINTER_DELTA. That is a value chosen to point into unmapped space between the kernel and userspace, so any access will always fault. Note that on 32-bit POISON_POINTER_DELTA is 0, so the patch only has an effect on 64-bit. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240503075619.394467-2-mpe@ellerman.id.au
2024-05-06powerpc/io: Avoid clang null pointer arithmetic warningsMichael Ellerman1-12/+12
With -Wextra clang warns about pointer arithmetic using a null pointer. When building with CONFIG_PCI=n, that triggers a warning in the IO accessors, eg: In file included from linux/arch/powerpc/include/asm/io.h:672: linux/arch/powerpc/include/asm/io-defs.h:23:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 23 | DEF_PCI_AC_RET(inb, u8, (unsigned long port), (port), pio, port) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ... linux/arch/powerpc/include/asm/io.h:591:53: note: expanded from macro '__do_inb' 591 | #define __do_inb(port) readb((PCI_IO_ADDR)_IO_BASE + port); | ~~~~~~~~~~~~~~~~~~~~~ ^ That is because when CONFIG_PCI=n, _IO_BASE is defined as 0. Although _IO_BASE is defined as plain 0, the cast (PCI_IO_ADDR) converts it to void * before the addition with port happens. Instead the addition can be done first, and then the cast. The resulting value will be the same, but avoids the warning, and also avoids void pointer arithmetic which is apparently non-standard. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Closes: https://lore.kernel.org/all/CA+G9fYtEh8zmq8k8wE-8RZwW-Qr927RLTn+KqGnq1F=ptaaNsA@mail.gmail.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240503075619.394467-1-mpe@ellerman.id.au
2024-05-06powerpc/bpf: enable kfunc callHari Bathini2-10/+61
Currently, bpf jit code on powerpc assumes all the bpf functions and helpers to be part of core kernel text. This is false for kfunc case, as function addresses may not be part of core kernel text area. So, add support for addresses that are not within core kernel text area too, to enable kfunc support. Emit instructions based on whether the function address is within core kernel text address or not, to retain optimized instruction sequence where possible. In case of PCREL, as a bpf function that is not within core kernel text area is likely to go out of range with relative addressing on kernel base, use PC relative addressing. If that goes out of range, load the full address with PPC_LI64(). With addresses that are not within core kernel text area supported, override bpf_jit_supports_kfunc_call() to enable kfunc support. Also, override bpf_jit_supports_far_kfunc_call() to enable 64-bit pointers, as an address offset can be more than 32-bit long on PPC64. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240502173205.142794-2-hbathini@linux.ibm.com
2024-05-06powerpc/64/bpf: fix tail calls for PCREL addressingHari Bathini1-14/+16
With PCREL addressing, there is no kernel TOC. So, it is not setup in prologue when PCREL addressing is used. But the number of instructions to skip on a tail call was not adjusted accordingly. That resulted in not so obvious failures while using tailcalls. 'tailcalls' selftest crashed the system with the below call trace: bpf_test_run+0xe8/0x3cc (unreliable) bpf_prog_test_run_skb+0x348/0x778 __sys_bpf+0xb04/0x2b00 sys_bpf+0x28/0x38 system_call_exception+0x168/0x340 system_call_vectored_common+0x15c/0x2ec Also, as bpf programs are always module addresses and a bpf helper in general is a core kernel text address, using PC relative addressing often fails with "out of range of pcrel address" error. Switch to using kernel base for relative addressing to handle this better. Fixes: 7e3a68be42e1 ("powerpc/64: vmlinux support building with PCREL addresing") Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240502173205.142794-1-hbathini@linux.ibm.com
2024-05-06powerpc/dexcr: Add DEXCR prctl interfaceBenjamin Gray2-0/+111
Now that we track a DEXCR on a per-task basis, individual tasks are free to configure it as they like. The interface is a pair of getter/setter prctl's that work on a single aspect at a time (multiple aspects at once is more difficult if there are different rules applied for each aspect, now or in future). The getter shows the current state of the process config, and the setter allows setting/clearing the aspect. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> [mpe: Account for PR_RISCV_SET_ICACHE_FLUSH_CTX, shrink some longs lines] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240417112325.728010-5-bgray@linux.ibm.com
2024-05-05Merge tag 'powerpc-6.9-4' of ↵Linus Torvalds3-8/+15
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix incorrect delay handling in the plpks (keystore) code - Fix a panic when an LPAR boots with a frozen PE Thanks to Andrew Donnellan, Gaurav Batra, Nageswara R Sastry, and Nayna Jain. * tag 'powerpc-6.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries/iommu: LPAR panics during boot up with a frozen PE powerpc/pseries: make max polling consistent for longer H_CALLs