summaryrefslogtreecommitdiff
path: root/arch/powerpc/platforms/powernv/idle.c
AgeCommit message (Collapse)AuthorFilesLines
2023-06-21powerpc: powernv: Fix KCSAN datarace warnings on idle_state contentionRohan McLure1-7/+9
The idle_state entry in the PACA on PowerNV features a bit which is atomically tested and set through ldarx/stdcx. to be used as a spinlock. This lock then guards access to other bit fields of idle_state. KCSAN cannot differentiate between any of these bitfield accesses as they all are implemented by 8-byte store/load instructions, thus cores contending on the bit-lock appear to data race with modifications to idle_state. Separate the bit-lock entry from the data guarded by the lock to avoid the possibility of data races being detected by KCSAN. Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230510033117.1395895-7-rmclure@linux.ibm.com
2023-03-17powerpc/powernv: move to use bus_get_dev_root()Greg Kroah-Hartman1-2/+7
Direct access to the struct bus_type dev_root pointer is going away soon so replace that with a call to bus_get_dev_root() instead, which is what it is there for. Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Joel Stanley <joel@jms.id.au> Cc: Liang He <windhl@126.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Julia Lawall <Julia.Lawall@inria.fr> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230313182918.1312597-13-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-05powerpc/powernv: Add missing of_node_put()sLiang He1-0/+1
In these driver init functions, there are two kinds of errors: (1) missing of_put_node() for of_find_compatible_node()'s returned pointer (refcount incremented) in fail path or when it is not used anymore. (2) missing of_put_node() for 'for_each_xxx' loop's break Signed-off-by: Liang He <windhl@126.com> [mpe: Use out_put_xxx goto label naming] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220620132553.4073863-1-windhl@126.com
2022-08-26powerpc: move from strlcpy with unused retval to strscpyWolfram Sang1-1/+1
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Link: https://lore.kernel.org/r/20220818205946.6336-1-wsa+renesas@sang-engineering.com
2022-05-05powerpc: fix typos in commentsJulia Lawall1-2/+2
Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr
2022-03-08powerpc: Move C prototypes out of asm-prototypes.hChristophe Leroy1-1/+0
We originally added asm-prototypes.h in commit 42f5b4cacd78 ("powerpc: Introduce asm-prototypes.h"). It's purpose was for prototypes of C functions that are only called from asm, in order to fix sparse warnings about missing prototypes. A few months later Nick added a different use case in commit 4efca4ed05cb ("kbuild: modversions for EXPORT_SYMBOL() for asm") for C prototypes for exported asm functions. This is basically the inverse of our original usage. Since then we've added various prototypes to asm-prototypes.h for both reasons, meaning we now need to unstitch it all. Dispatch prototypes of C functions into relevant headers and keep only the prototypes for functions defined in assembly. For the time being, leave prom_init() there because moving it into asm/prom.h or asm/setup.h conflicts with drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowrom.o This will be fixed later by untaggling asm/pci.h and asm/prom.h or by renaming the function in shadowrom.c Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/62d46904eca74042097acf4cb12c175e3067f3d1.1646413435.git.christophe.leroy@csgroup.eu
2021-12-23powerpc/powernv: Add __init attribute to eligible functionsNick Child1-3/+3
Some functions defined in 'arch/powerpc/platforms/powernv' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-12-nick.child@ibm.com
2021-12-09powerpc/64s: Move hash MMU support code under CONFIG_PPC_64S_HASH_MMUNicholas Piggin1-0/+2
Compiling out hash support code when CONFIG_PPC_64S_HASH_MMU=n saves 128kB kernel image size (90kB text) on powernv_defconfig minus KVM, 350kB on pseries_defconfig minus KVM, 40kB on a tiny config. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fixup defined(ARCH_HAS_MEMREMAP_COMPAT_ALIGN), which needs CONFIG. Fix radix_enabled() use in setup_initial_memory_limit(). Add some stubs to reduce number of ifdefs.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-18-npiggin@gmail.com
2021-11-29powerpc: remove cpu_online_cores_map functionNicholas Piggin1-5/+5
This function builds the cores online map with on-stack cpumasks which can cause high stack usage with large NR_CPUS. It is not used in any performance sensitive paths, so instead just check for first thread sibling. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105035042.1398309-1-npiggin@gmail.com
2021-11-24powerpc/64s: Keep AMOR SPR a constant ~0 at runtimeNicholas Piggin1-5/+3
This register controls supervisor SPR modifications, and as such is only relevant for KVM. KVM always sets AMOR to ~0 on guest entry, and never restores it coming back out to the host, so it can be kept constant and avoid the mtSPR in KVM guest entry. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-10-npiggin@gmail.com
2021-11-24powerpc/64s: Remove WORT SPR from POWER9/10 (take 2)Nicholas Piggin1-1/+0
This removes a missed remnant of the WORT SPR. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-2-npiggin@gmail.com
2021-08-26Merge branch 'topic/ppc-kvm' into nextMichael Ellerman1-2/+0
Merge some KVM patches we are keeping in a topic branch in case there are any merge conflicts that need resolving.
2021-08-25powerpc/64s: Remove WORT SPR from POWER9/10Nicholas Piggin1-2/+0
This register is not architected and not implemented in POWER9 or 10, it just reads back zeroes for compatibility. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Link: https://lore.kernel.org/r/20210811160134.904987-11-npiggin@gmail.com
2021-08-10powerpc: Replace deprecated CPU-hotplug functions.Sebastian Andrzej Siewior1-2/+2
The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210803141621.780504-4-bigeasy@linutronix.de
2021-06-10KVM: PPC: Book3S HV: remove ISA v3.0 and v3.1 support from P7/8 pathNicholas Piggin1-44/+8
POWER9 and later processors always go via the P9 guest entry path now. Remove the remaining support from the P7/8 path. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210528090752.3542186-33-npiggin@gmail.com
2021-02-08powerpc: convert interrupt handlers to use wrappersNicholas Piggin1-0/+1
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-29-npiggin@gmail.com
2020-12-07powerpc/powernv/idle: Restore CIABR after idle for Power9Jordan Niethe1-0/+3
On Power9, CIABR is lost after idle. This means that instruction breakpoints set by xmon which use CIABR do not work. Fix this by restoring CIABR after idle. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201207010519.15597-2-jniethe5@gmail.com
2020-09-15powerpc/powernv/idle: add a basic stop 0-3 driver for POWER10Nicholas Piggin1-93/+209
This driver does not restore stop > 3 state, so it limits itself to states which do not lose full state or TB. The POWER10 SPRs are sufficiently different from P9 that it seems easier to split out the P10 code. The POWER10 deep sleep code (e.g., the BHRB restore) has been taken out, but it can be re-added when stop > 3 support is added. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Pratik Rajesh Sampat<psampat@linux.ibm.com> Tested-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Reviewed-by: Pratik Rajesh Sampat<psampat@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200819094700.493399-1-npiggin@gmail.com
2020-08-27Revert "powerpc/powernv/idle: Replace CPU feature check with PVR check"Pratik Rajesh Sampat1-1/+1
cpuidle stop state implementation has minor optimizations for P10 where hardware preserves more SPR registers compared to P9. The current P9 driver works for P10, although does few extra save-restores. P9 driver can provide the required power management features like SMT thread folding and core level power savings on a P10 platform. Until the P10 stop driver is available, revert the commit which allows for only P9 systems to utilize cpuidle and blocks all idle stop states for P10. CPU idle states are enabled and tested on the P10 platform with this fix. This reverts commit 8747bf36f312356f8a295a0c39ff092d65ce75ae. Fixes: 8747bf36f312 ("powerpc/powernv/idle: Replace CPU feature check with PVR check") Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200826082918.89306-1-psampat@linux.ibm.com
2020-07-23powerpc/powernv/idle: Exclude mfspr on HID1, 4, 5 on P9 and abovePratik Rajesh Sampat1-3/+3
POWER9 onwards the support for the registers HID1, HID4, HID5 has been receded. Although mfspr on the above registers worked in Power9, In Power10 simulator is unrecognized. Moving their assignment under the check for machines lower than Power9 Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200721153708.89057-4-psampat@linux.ibm.com
2020-07-23powerpc/powernv/idle: Rename pnv_first_spr_loss_level variablePratik Rajesh Sampat1-9/+9
Replace the variable name from using "pnv_first_spr_loss_level" to "deep_spr_loss_state". pnv_first_spr_loss_level is supposed to be the earliest state that has OPAL_PM_LOSE_FULL_CONTEXT set, in other places the kernel uses the "deep" states as terminology. Hence renaming the variable to be coherent to its semantics. Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200721153708.89057-3-psampat@linux.ibm.com
2020-07-23powerpc/powernv/idle: Replace CPU feature check with PVR checkPratik Rajesh Sampat1-1/+1
The POWER9 idle driver contains implementation-specific details that means it is not suitable to run on any processor that implements ISA v3.0 (e.g., POWER10), so only init the driver when running on a POWER9. Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> [mpe: Use updated change log from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200721153708.89057-2-psampat@linux.ibm.com
2020-07-22powerpc/perf: BHRB control to disable BHRB logic when not usedAthira Rajeev1-2/+20
PowerISA v3.1 has few updates for the Branch History Rolling Buffer(BHRB). BHRB disable is controlled via Monitor Mode Control Register A (MMCRA) bit, namely "BHRB Recording Disable (BHRBRD)". This field controls whether BHRB entries are written when BHRB recording is enabled by other bits. This patch implements support for this BHRB disable bit. By setting 0b1 to this bit will disable the BHRB and by setting 0b0 to this bit will have BHRB enabled. This addresses backward compatibility (for older OS), since this bit will be cleared and hardware will be writing to BHRB by default. This patch addresses changes to set MMCRA (BHRBRD) at boot for power10 (there by the core will run faster) and enable this feature only on runtime ie, on explicit need from user. Also save/restore MMCRA in the restore path of state-loss idle state to make sure we keep BHRB disabled if it was not enabled on request at runtime. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1594996707-3727-12-git-send-email-atrajeev@linux.vnet.ibm.com
2020-05-11powerpc/powernv: Fix a warning messageChristophe JAILLET1-1/+1
Fix a cut'n'paste error in a warning message. This should be 'cpu-idle-state-residency-ns' to match the property searched in the previous 'of_property_read_u32_array()' Fixes: 9c7b185ab2fe ("powernv/cpuidle: Parse dt idle properties into global structure") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200502115949.139000-1-christophe.jaillet@wanadoo.fr
2019-08-30powerpc/powernv: Access LDBAR only if ultravisor disabledClaudio Carvalho1-2/+4
LDBAR is a per-thread SPR populated and used by the thread-imc pmu driver to dump the data counter into memory. It contains memory along with few other configuration bits. LDBAR is populated and enabled only when any of the thread imc pmu events are monitored. In ultravisor enabled systems, LDBAR becomes ultravisor privileged and an attempt to write to it will cause a Hypervisor Emulation Assistance interrupt. In ultravisor enabled systems, the ultravisor is responsible to maintain the LDBAR (e.g. save and restore it). This restricts LDBAR access to only when ultravisor is disabled. Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Ryan Grimm <grimm@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190822034838.27876-7-cclaudio@linux.ibm.com
2019-07-14Merge tag 'powerpc-5.3-1' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - Removal of the NPU DMA code, used by the out-of-tree Nvidia driver, as well as some other functions only used by drivers that haven't (yet?) made it upstream. - A fix for a bug in our handling of hardware watchpoints (eg. perf record -e mem: ...) which could lead to register corruption and kernel crashes. - Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for vmalloc when using the Radix MMU. - A large but incremental rewrite of our exception handling code to use gas macros rather than multiple levels of nested CPP macros. And the usual small fixes, cleanups and improvements. Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Cédric Le Goater, Christian Lamparter, Christophe Leroy, Christophe Lombard, Christoph Hellwig, Daniel Axtens, Denis Efremov, Enrico Weigelt, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg Kurz, Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi Bangoria, Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher Boessenkool, Shaokun Zhang, Shawn Anastasio, Stewart Smith, Suraj Jitindar Singh, Thiago Jung Bauermann, YueHaibing" * tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (163 commits) powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state. powerpc/eeh: Handle hugepages in ioremap space ocxl: Update for AFU descriptor template version 1.1 powerpc/boot: pass CONFIG options in a simpler and more robust way powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h powerpc/irq: Don't WARN continuously in arch_local_irq_restore() powerpc/module64: Use symbolic instructions names. powerpc/module32: Use symbolic instructions names. powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h powerpc/module64: Fix comment in R_PPC64_ENTRY handling powerpc/boot: Add lzo support for uImage powerpc/boot: Add lzma support for uImage powerpc/boot: don't force gzipped uImage powerpc/8xx: Add microcode patch to move SMC parameter RAM. powerpc/8xx: Use IO accessors in microcode programming. powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c powerpc/8xx: refactor programming of microcode CPM params. powerpc/8xx: refactor printing of microcode patch name. powerpc/8xx: Refactor microcode write powerpc/8xx: refactor writing of CPM microcode arrays ...
2019-07-12powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state.Athira Rajeev1-1/+1
commit 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C") reimplemented book3S code to pltform/powernv/idle.c. But when doing so missed to add the per-thread LDBAR update in the core_woken path of the power9_idle_stop(). Patch fixes the same. Fixes: 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190702105836.26695-1-maddy@linux.vnet.ibm.com
2019-07-03powerpc/64s: Rename PPC_INVALIDATE_ERAT to PPC_ISA_3_0_INVALIDATE_ERATNicholas Piggin1-1/+1
This makes it clear to the caller that it can only be used on POWER9 and later CPUs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Use "ISA_3_0" rather than "ARCH_300"] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-19powerpc/64s: Fix misleading SPR and timebase informationShaokun Zhang1-2/+2
pr_info shows SPR and timebase as a decimal value with a '0x' prefix, which is somewhat misleading. Fix it to print hexadecimal, as was intended. Fixes: 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C") Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner1-5/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-30powerpc/powernv/idle: Restore AMR/UAMOR/AMOR/IAMR after idleMichael Ellerman1-6/+46
This is an implementation of commits 53a712bae5dd ("powerpc/powernv/idle: Restore AMR/UAMOR/AMOR after idle") and a3f3072db6ca ("powerpc/powernv/idle: Restore IAMR after idle") using the new C-based idle code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Extract from Nick's patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-04-30powerpc/64s: Reimplement book3s idle code in CNicholas Piggin1-173/+689
Reimplement Book3S idle code in C, moving POWER7/8/9 implementation speific HV idle code to the powernv platform code. Book3S assembly stubs are kept in common code and used only to save the stack frame and non-volatile GPRs before executing architected idle instructions, and restoring the stack and reloading GPRs then returning to C after waking from idle. The complex logic dealing with threads and subcores, locking, SPRs, HMIs, timebase resync, etc., is all done in C which makes it more maintainable. This is not a strict translation to C code, there are some significant differences: - Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs, but saves and restores them itself. - The optimisation where EC=ESL=0 idle modes did not have to save GPRs or change MSR is restored, because it's now simple to do. ESL=1 sleeps that do not lose GPRs can use this optimization too. - KVM secondary entry and cede is now more of a call/return style rather than branchy. nap_state_lost is not required because KVM always returns via NVGPR restoring path. - KVM secondary wakeup from offline sequence is moved entirely into the offline wakeup, which avoids a hwsync in the normal idle wakeup path. Performance measured with context switch ping-pong on different threads or cores, is possibly improved a small amount, 1-3% depending on stop state and core vs thread test for shallow states. Deep states it's in the noise compared with other latencies. KVM improvements: - Idle sleepers now always return to caller rather than branch out to KVM first. - This allows optimisations like very fast return to caller when no state has been lost. - KVM no longer requires nap_state_lost because it controls NVGPR save/restore itself on the way in and out. - The heavy idle wakeup KVM request check can be moved out of the normal host idle code and into the not-performance-critical offline code. - KVM nap code now returns from where it is called, which makes the flow a bit easier to follow. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Squash the KVM changes in] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-21powerpc/powernv: Don't reprogram SLW image on every KVM guest entry/exitPaul Mackerras1-25/+2
Commit 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug", 2017-07-21) added two calls to opal_slw_set_reg() inside pnv_cpu_offline(), with the aim of changing the LPCR value in the SLW image to disable wakeups from the decrementer while a CPU is offline. However, pnv_cpu_offline() gets called each time a secondary CPU thread is woken up to participate in running a KVM guest, that is, not just when a CPU is offlined. Since opal_slw_set_reg() is a very slow operation (with observed execution times around 20 milliseconds), this means that an offline secondary CPU can often be busy doing the opal_slw_set_reg() call when the primary CPU wants to grab all the secondary threads so that it can run a KVM guest. This leads to messages like "KVM: couldn't grab CPU n" being printed and guest execution failing. There is no need to reprogram the SLW image on every KVM guest entry and exit. So that we do it only when a CPU is really transitioning between online and offline, this moves the calls to pnv_program_cpu_hotplug_lpcr() into pnv_smp_cpu_kill_self(). Fixes: 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-10powerpc/powernv/idle: Fix build errorAneesh Kumar K.V1-1/+1
Fix the below build error using strlcpy instead of strncpy In function 'pnv_parse_cpuidle_dt', inlined from 'pnv_init_idle_states' at arch/powerpc/platforms/powernv/idle.c:840:7, inlined from '__machine_initcall_powernv_pnv_init_idle_states' at arch/powerpc/platforms/powernv/idle.c:870:1: arch/powerpc/platforms/powernv/idle.c:820:3: error: 'strncpy' specified bound 16 equals destination size [-Werror=stringop-truncation] strncpy(pnv_idle_states[i].name, temp_string[i], ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ PNV_IDLE_NAME_LEN); Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-03powernv/cpuidle: Fix idle states all being marked invalidNicholas Piggin1-1/+2
Commit 9c7b185ab2fe ("powernv/cpuidle: Parse dt idle properties into global structure") parses dt idle states into structs, but never marks them valid. This results in all idle states being lost. Fixes: 9c7b185ab2fe ("powernv/cpuidle: Parse dt idle properties into global structure") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-31powernv/cpuidle: Parse dt idle properties into global structureAkshay Adiga1-78/+138
Device-tree parsing happens twice, once while deciding idle state to be used for hotplug and once during cpuidle init. Hence, parsing the device tree and caching it will reduce code duplication. Parsing code has been moved to pnv_parse_cpuidle_dt() from pnv_probe_idle_states(). In addition to the properties in the device tree the number of available states is also required. Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-16powerpc/64s: Remove POWER9 DD1 supportNicholas Piggin1-28/+0
POWER9 DD1 was never a product. It is no longer supported by upstream firmware, and it is not effectively supported in Linux due to lack of testing. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [mpe: Remove arch_make_huge_pte() entirely] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-05-28powerpc/powernv/cpuidle: Init all present cpus for deep statesAkshay Adiga1-2/+2
Init all present cpus for deep states instead of "all possible" cpus. Init fails if a possible cpu is guarded. Resulting in making only non-deep states available for cpuidle/hotplug. Stewart says, this means that for single threaded workloads, if you guard out a CPU core you'll not get WoF (Workload Optimised Frequency), which means that performance goes down when you wouldn't expect it to. Fixes: 77b54e9f213f ("powernv/powerpc: Add winkle support for offline cpus") Cc: stable@vger.kernel.org # v3.19+ Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-03powerpc/powernv: Fix SMT4 forcing idle codeNicholas Piggin1-4/+0
The PSSCR value is not stored to PACA_REQ_PSSCR if the CPU does not have the XER[SO] bug. Fix this by storing up-front, outside the workaround code. The initial test is not required because it is a slow path. The workaround is made to depend on CONFIG_KVM_BOOK3S_HV_POSSIBLE, to match pnv_power9_force_smt4_catch() where it is used. Drop the comment on pnv_power9_force_smt4_catch() as it's no longer true. Fixes: 7672691a08c8 ("powerpc/powernv: Provide a way to force a core into SMT4 mode") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31powerpc/64s/idle: POWER9 implement a separate idle stop function for hotplugNicholas Piggin1-1/+1
Implement a new function to invoke stop, power9_offline_stop, which is like power9_idle_stop but used by the cpu hotplug code. Move KVM secondary state manipulation code to the offline case. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31Merge branch 'topic/paca' into nextMichael Ellerman1-15/+12
Bring in yet another series that touches KVM code, and might need to be merged into the kvm-ppc branch to resolve conflicts. This required some changes in pnv_power9_force_smt4_catch/release() due to the paca array becomming an array of pointers.
2018-03-30powerpc/64: Use array of paca pointers and allocate pacas individuallyNicholas Piggin1-6/+7
Change the paca array into an array of pointers to pacas. Allocate pacas individually. This allows flexibility in where the PACAs are allocated. Future work will allocate them node-local. Platforms that don't have address limits on PACAs would be able to defer PACA allocations until later in boot rather than allocate all possible ones up-front then freeing unused. This is slightly more overhead (one additional indirection) for cross CPU paca references, but those aren't too common. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/powernv: Provide a way to force a core into SMT4 modePaul Mackerras1-0/+81
POWER9 processors up to and including "Nimbus" v2.2 have hardware bugs relating to transactional memory and thread reconfiguration. One of these bugs has a workaround which is to get the core into SMT4 state temporarily. This workaround is only needed when running bare-metal. This patch provides a function which gets the core into SMT4 mode by preventing threads from going to a stop state, and waking up those which are already in a stop state. Once at least 3 threads are not in a stop state, the core will be in SMT4 and we can continue. To do this, we add a "dont_stop" flag to the paca to tell the thread not to go into a stop state. If this flag is set, power9_idle_stop() just returns immediately with a return value of 0. The pnv_power9_force_smt4_catch() function does the following: 1. Set the dont_stop flag for each thread in the core, except ourselves (in fact we use an atomic_inc() in case more than one thread is calling this function concurrently). 2. See how many threads are awake, indicated by their requested_psscr field in the paca being 0. If this is at least 3, skip to step 5. 3. Send a doorbell interrupt to each thread that was seen as being in a stop state in step 2. 4. Until at least 3 threads are awake, scan the threads to which we sent a doorbell interrupt and check if they are awake now. This relies on the following properties: - Once dont_stop is non-zero, requested_psccr can't go from zero to non-zero, except transiently (and without the thread doing stop). - requested_psscr being zero guarantees that the thread isn't in a state-losing stop state where thread reconfiguration could occur. - Doing stop with a PSSCR value of 0 won't be a state-losing stop and thus won't allow thread reconfiguration. - Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core must be in SMT4 mode, since SMT modes are powers of 2. This does add a sync to power9_idle_stop(), which is necessary to provide the correct ordering between setting requested_psscr and checking dont_stop. The overhead of the sync should be unnoticeable compared to the latency of going into and out of a stop state. Because some objected to incurring this extra latency on systems where the XER[SO] bug is not relevant, I have put the test in power9_idle_stop inside a feature section. This means that pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will probably hang the system. In order to cater for uses where the caller has an operation that has to be done while the core is in SMT4, the core continues to be kept in SMT4 after pnv_power9_force_smt4_catch() function returns, until the pnv_power9_force_smt4_release() function is called. It undoes the effect of step 1 above and allows the other threads to go into a stop state. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-20powerpc/powernv: Clear LPCR[PECE1] via stop-api only for deep state offlineGautham R. Shenoy1-1/+7
Commit 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug") clears the PECE1 bit of the LPCR via stop-api during CPU-Hotplug to prevent wakeup due to a decrementer on an offlined CPU which is in a deep stop state. In the case where the stop-api support is found to be lacking, the commit 785a12afdb4a ("powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails") disables deep states that lose hypervisor context. Thus in this case, the offlined CPU will be put to some shallow idle state. However, we currently unconditionally clear the PECE1 in LPCR via stop-api during CPU-Hotplug even when deep states are disabled due to stop-api failure. Fix this by clearing PECE1 of LPCR via stop-api during CPU-Hotplug *only* when the offlined CPU will be put to a deep state that loses hypervisor context. Fixes: 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug") Reported-by: Pavithra Prakash <pavirampu@linux.vnet.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Pavithra Prakash <pavrampu@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-23Merge branch 'fixes' into nextMichael Ellerman1-3/+38
There's a non-trivial dependency between some commits we want to put in next and the KVM prefetch work around that went into fixes. So merge fixes into next.
2017-08-08powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api failsGautham R. Shenoy1-3/+38
Currently, we use the opal call opal_slw_set_reg() to inform the Sleep-Winkle Engine (SLW) to restore the contents of some of the Hypervisor state on wakeup from deep idle states that lose full hypervisor context (characterized by the flag OPAL_PM_LOSE_FULL_CONTEXT). However, the current code has a bug in that if opal_slw_set_reg() fails, we don't disable the use of these deep states (winkle on POWER8, stop4 onwards on POWER9). This patch fixes this bug by ensuring that if programing the sleep-winkle engine to restore the hypervisor states in pnv_save_sprs_for_deep_states() fails, then we exclude such states by clearing the OPAL_PM_LOSE_FULL_CONTEXT flag from supported_cpuidle_states. As a result POWER8 will be prevented from using winkle for CPU-Hotplug, and POWER9 will put the offlined CPUs to the default stop state when available. Further, we ensure in the initialization of the cpuidle-powernv driver to only include those states whose flags are present in supported_cpuidle_states, thereby skipping OPAL_PM_LOSE_FULL_CONTEXT states when they have been disabled due to stop-api failure. Fixes: 1e1601b38e6 ("powerpc/powernv/idle: Restore SPRs for deep idle states via stop API.") Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-01powerpc/powernv: Clear PECE1 in LPCR via stop-api only on HotplugGautham R. Shenoy1-1/+33
Currently we use the stop-api provided by the firmware to program the SLW engine to restore the values of hypervisor resources that get lost on deeper idle states (such as winkle). Since the deep states were only used for CPU-Hotplug on POWER8 systems, we would program the LPCR to have the PECE1 bit since Hotplugged CPUs shouldn't be spuriously woken up by decrementer. On POWER9, some of the deep platform idle states such as stop4 can be used in cpuidle as well. In this case, we want the CPU in stop4 to be woken up by the decrementer when some timer on the CPU expires. In this patch, we program the stop-api for LPCR with PECE1 bit cleared only when we are offlining the CPU and set it back once the CPU is online. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19powerpc/64s/idle: Run latch switch is done with MSR[EE]=0Nicholas Piggin1-6/+6
In the idle sleep/wake code we know that MSR[EE] is clear, so we can avoid 2 x mfmsr and 2 x mtmsr by calling the double-underscore versions of the run latch routines which assume interrupts are already disabled. Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19powerpc/64s/idle: Process interrupts from system reset wakeupNicholas Piggin1-2/+8
When the CPU wakes from low power state, it begins at the system reset interrupt with the exception that caused the wakeup encoded in SRR1. Today, powernv idle wakeup ignores the wakeup reason (except a special case for HMI), and the regular interrupt corresponding to the exception will fire after the idle wakeup exits. Change this to replay the interrupt from the idle wakeup before interrupts are hard-enabled. Test on POWER8 of context_switch selftests benchmark with polling idle disabled (e.g., always nap, giving cross-CPU IPIs) gives the following results: original wakeup direct Different threads, same core: 315k/s 264k/s Different cores: 235k/s 242k/s There is a slowdown for doorbell IPI (same core) case because system reset wakeup does not clear the message and the doorbell interrupt fires again needlessly. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-19powerpc/powernv: Simplify lazy IRQ handling in CPU offlineNicholas Piggin1-8/+15
Rather than concern ourselves with any soft-mask logic in the CPU hotplug handler, just hard disable interrupts. This ensures there are no lazy-irqs pending, which means we can call directly to idle instruction in order to sleep. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>