summaryrefslogtreecommitdiff
path: root/drivers/perf
AgeCommit message (Collapse)AuthorFilesLines
2024-03-01Merge tag 'riscv-for-linus-6.8-rc7' of ↵Linus Torvalds3-18/+18
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - detect ".option arch" support on not-yet-released LLVM builds - fix missing TLB flush when modifying non-leaf PTEs - fixes for T-Head custom extensions - fix for systems with the legacy PMU, that manifests as a crash on kernels built without SBI PMU support - fix for systems that clear *envcfg on suspend, which manifests as cbo.zero trapping after resume - fixes for Svnapot systems, including removing Svnapot support for huge vmalloc/vmap regions * tag 'riscv-for-linus-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Sparse-Memory/vmemmap out-of-bounds fix riscv: Fix pte_leaf_size() for NAPOT Revert "riscv: mm: support Svnapot in huge vmap" riscv: Save/restore envcfg CSR during CPU suspend riscv: Add a custom ISA extension for the [ms]envcfg CSR riscv: Fix enabling cbo.zero when running in M-mode perf: RISCV: Fix panic on pmu overflow handler MAINTAINERS: Update SiFive driver maintainers drivers: perf: ctr_get_width function for legacy is not defined drivers: perf: added capabilities for legacy PMU RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly riscv: add CALLER_ADDRx support RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH kbuild: Add -Wa,--fatal-warnings to as-instr invocation riscv: tlb: fix __p*d_free_tlb()
2024-02-29perf: RISCV: Fix panic on pmu overflow handlerFei Wu1-4/+4
(1 << idx) of int is not desired when setting bits in unsigned long overflowed_ctrs, use BIT() instead. This panic happens when running 'perf record -e branches' on sophgo sg2042. [ 273.311852] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098 [ 273.320851] Oops [#1] [ 273.323179] Modules linked in: [ 273.326303] CPU: 0 PID: 1475 Comm: perf Not tainted 6.6.0-rc3+ #9 [ 273.332521] Hardware name: Sophgo Mango (DT) [ 273.336878] epc : riscv_pmu_ctr_get_width_mask+0x8/0x62 [ 273.342291] ra : pmu_sbi_ovf_handler+0x2e0/0x34e [ 273.347091] epc : ffffffff80aecd98 ra : ffffffff80aee056 sp : fffffff6e36928b0 [ 273.354454] gp : ffffffff821f82d0 tp : ffffffd90c353200 t0 : 0000002ade4f9978 [ 273.361815] t1 : 0000000000504d55 t2 : ffffffff8016cd8c s0 : fffffff6e3692a70 [ 273.369180] s1 : 0000000000000020 a0 : 0000000000000000 a1 : 00001a8e81800000 [ 273.376540] a2 : 0000003c00070198 a3 : 0000003c00db75a4 a4 : 0000000000000015 [ 273.383901] a5 : ffffffd7ff8804b0 a6 : 0000000000000015 a7 : 000000000000002a [ 273.391327] s2 : 000000000000ffff s3 : 0000000000000000 s4 : ffffffd7ff8803b0 [ 273.398773] s5 : 0000000000504d55 s6 : ffffffd905069800 s7 : ffffffff821fe210 [ 273.406139] s8 : 000000007fffffff s9 : ffffffd7ff8803b0 s10: ffffffd903f29098 [ 273.413660] s11: 0000000080000000 t3 : 0000000000000003 t4 : ffffffff8017a0ca [ 273.421022] t5 : ffffffff8023cfc2 t6 : ffffffd9040780e8 [ 273.426437] status: 0000000200000100 badaddr: 0000000000000098 cause: 000000000000000d [ 273.434512] [<ffffffff80aecd98>] riscv_pmu_ctr_get_width_mask+0x8/0x62 [ 273.441169] [<ffffffff80076bd8>] handle_percpu_devid_irq+0x98/0x1ee [ 273.447562] [<ffffffff80071158>] generic_handle_domain_irq+0x28/0x36 [ 273.454151] [<ffffffff8047a99a>] riscv_intc_irq+0x36/0x4e [ 273.459659] [<ffffffff80c944de>] handle_riscv_irq+0x4a/0x74 [ 273.465442] [<ffffffff80c94c48>] do_irq+0x62/0x92 [ 273.470360] Code: 0420 60a2 6402 5529 0141 8082 0013 0000 0013 0000 (6d5c) b783 [ 273.477921] ---[ end trace 0000000000000000 ]--- [ 273.482630] Kernel panic - not syncing: Fatal exception in interrupt Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Fei Wu <fei2.wu@intel.com> Link: https://lore.kernel.org/r/20240228115425.2613856-1-fei2.wu@intel.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-27drivers: perf: ctr_get_width function for legacy is not definedVadim Shakirov2-14/+12
With parameters CONFIG_RISCV_PMU_LEGACY=y and CONFIG_RISCV_PMU_SBI=n linux kernel crashes when you try perf record: $ perf record ls [ 46.749286] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 46.750199] Oops [#1] [ 46.750342] Modules linked in: [ 46.750608] CPU: 0 PID: 107 Comm: perf-exec Not tainted 6.6.0 #2 [ 46.750906] Hardware name: riscv-virtio,qemu (DT) [ 46.751184] epc : 0x0 [ 46.751430] ra : arch_perf_update_userpage+0x54/0x13e [ 46.751680] epc : 0000000000000000 ra : ffffffff8072ee52 sp : ff2000000022b8f0 [ 46.751958] gp : ffffffff81505988 tp : ff6000000290d400 t0 : ff2000000022b9c0 [ 46.752229] t1 : 0000000000000001 t2 : 0000000000000003 s0 : ff2000000022b930 [ 46.752451] s1 : ff600000028fb000 a0 : 0000000000000000 a1 : ff600000028fb000 [ 46.752673] a2 : 0000000ae2751268 a3 : 00000000004fb708 a4 : 0000000000000004 [ 46.752895] a5 : 0000000000000000 a6 : 000000000017ffe3 a7 : 00000000000000d2 [ 46.753117] s2 : ff600000028fb000 s3 : 0000000ae2751268 s4 : 0000000000000000 [ 46.753338] s5 : ffffffff8153e290 s6 : ff600000863b9000 s7 : ff60000002961078 [ 46.753562] s8 : ff60000002961048 s9 : ff60000002961058 s10: 0000000000000001 [ 46.753783] s11: 0000000000000018 t3 : ffffffffffffffff t4 : ffffffffffffffff [ 46.754005] t5 : ff6000000292270c t6 : ff2000000022bb30 [ 46.754179] status: 0000000200000100 badaddr: 0000000000000000 cause: 000000000000000c [ 46.754653] Code: Unable to access instruction at 0xffffffffffffffec. [ 46.754939] ---[ end trace 0000000000000000 ]--- [ 46.755131] note: perf-exec[107] exited with irqs disabled [ 46.755546] note: perf-exec[107] exited with preempt_count 4 This happens because in the legacy case the ctr_get_width function was not defined, but it is used in arch_perf_update_userpage. Also remove extra check in riscv_pmu_ctr_get_width_mask Signed-off-by: Vadim Shakirov <vadim.shakirov@syntacore.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Fixes: cc4c07c89aad ("drivers: perf: Implement perf event mmap support in the SBI backend") Link: https://lore.kernel.org/r/20240227170002.188671-3-vadim.shakirov@syntacore.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-27drivers: perf: added capabilities for legacy PMUVadim Shakirov1-0/+2
Added the PERF_PMU_CAP_NO_INTERRUPT flag because the legacy pmu driver does not provide sampling capabilities Added the PERF_PMU_CAP_NO_EXCLUDE flag because the legacy pmu driver does not provide the ability to disable counter incrementation in different privilege modes Suggested-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Vadim Shakirov <vadim.shakirov@syntacore.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Fixes: 9b3e150e310e ("RISC-V: Add a simple platform driver for RISC-V legacy perf") Link: https://lore.kernel.org/r/20240227170002.188671-2-vadim.shakirov@syntacore.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-20perf: CXL: fix CPMU filter value mask lengthHojin Nam1-5/+5
CPMU filter value is described as 4B length in CXL r3.0 8.2.7.2.2. However, it is used as 2B length in code and comments. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Hojin Nam <hj96.nam@samsung.com> Link: https://lore.kernel.org/r/20240216014522.32321-1-hj96.nam@samsung.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)Ilkka Koskinen1-0/+11
AmpereOneX mesh implementation has a bug in HN-P nodes that makes them report incorrect child count. The failing crosspoints report 8 children while they only have two. When the driver tries to access the inexistent child nodes, it believes it has reached an invalid node type and probing fails. The workaround is to ignore those incorrect child nodes and continue normally. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> [ rm: rewrote simpler generalised version ] Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ce4b1442135fe03d0de41859b04b268c88c854a3.1707498577.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf: CXL: fix mismatched cpmu event opcodeHojin Nam1-1/+1
S2M NDR BI-ConflictAck opcode is described as 4 in the CXL r3.0 3.3.9 Table 3.43. However, it is defined as 3 in macro definition. Fixes: 5d7107c72796 ("perf: CXL Performance Monitoring Unit driver") Signed-off-by: Hojin Nam <hj96.nam@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240208013415epcms2p2904187c8a863f4d0d2adc980fb91a2dc@epcms2p2 Signed-off-by: Will Deacon <will@kernel.org>
2024-01-10Merge tag 'acpi-6.8-rc1' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "From the new features standpoint, the most significant change here is the addition of CSI-2 and MIPI DisCo for Imaging support to the ACPI device enumeration code that will allow MIPI cameras to be enumerated through the platform firmware on systems using ACPI. Also significant is the switch-over to threaded interrupt handlers for the ACPI SCI and the dedicated EC interrupt (on systems where the former is not used) which essentially allows all ACPI code to run with local interrupts enabled. That should improve responsiveness significantly on systems where multiple GPEs are enabled and the handling of one SCI involves many I/O address space accesses which previously had to be carried out in one go with disabled interrupts on the local CPU. Apart from the above, the ACPI thermal zone driver will use the Thermal fast Sampling Period (_TFP) object if available, which should allow temperature changes to be followed more accurately on some systems, the ACPI Notify () handlers can run on all CPUs (not just on CPU0), which should generally speed up the processing of events signaled through the ACPI SCI, and the ACPI power button driver will trigger wakeup key events via the input subsystem (on systems where it is a system wakeup device) In addition to that, there are the usual bunch of fixes and cleanups. Specifics: - Add CSI-2 and DisCo for Imaging support to the ACPI device enumeration code (Sakari Ailus, Rafael J. Wysocki) - Adjust the cpufreq thermal reduction algorithm in the ACPI processor driver for Tegra241 (Srikar Srimath Tirumala, Arnd Bergmann) - Make acpi_proc_quirk_mwait_check() x86-specific (Rafael J. Wysocki) - Switch over ACPI to using a threaded interrupt handler for the SCI (Rafael J. Wysocki) - Allow ACPI Notify () handlers to run on all CPUs and clean up the ACPI interface for deferred events processing (Rafael J. Wysocki) - Switch over the ACPI EC driver to using a threaded handler for the dedicated IRQ on systems without the EC GPE (Rafael J. Wysocki) - Adjust code using ACPICA spinlocks and the ACPI EC driver spinlock to keep local interrupts on (Rafael J. Wysocki) - Adjust the USB4 _OSC handshake to correctly handle cases in which certain types of OS control are denied by the platform (Mika Westerberg) - Correct and clean up the generic function for parsing ACPI data-only tables with array structure (Yuntao Wang) - Modify acpi_dev_uid_match() to support different types of its second argument and adjust its users accordingly (Raag Jadav) - Clean up code related to acpi_evaluate_reference() and ACPI device lists (Rafael J. Wysocki) - Use generic ACPI helpers for evaluating trip point temperature objects in the ACPI thermal zone driver (Rafael J. Wysockii, Arnd Bergmann) - Add Thermal fast Sampling Period (_TFP) support to the ACPI thermal zone driver (Jeff Brasen) - Modify the ACPI LPIT table handling code to avoid u32 multiplication overflows in state residency computations (Nikita Kiryushin) - Drop an unused helper function from the ACPI backlight (video) driver and add a clarifying comment to it (Hans de Goede) - Update the ACPI backlight driver to avoid using uninitialized memory in some cases (Nikita Kiryushin) - Add ACPI backlight quirk for the Colorful X15 AT 23 laptop (Yuluo Qiu) - Add support for vendor-defined error types to the ACPI APEI error injection code (Avadhut Naik) - Adjust APEI to properly set MF_ACTION_REQUIRED on synchronous memory failure events, so they are handled differently from the asynchronous ones (Shuai Xue) - Fix NULL pointer dereference check in the ACPI extlog driver (Prarit Bhargava) - Adjust the ACPI extlog driver to clear the Extended Error Log status when RAS_CEC handled the error (Tony Luck) - Add IRQ override quirks for some Infinity laptops and for TongFang GMxXGxx (David McFarland, Hans de Goede) - Clean up the ACPI NUMA code and fix it to ensure that fake_pxm is not the same as one of the real pxm values (Yuntao Wang) - Fix the fractional clock divider flags in the ACPI LPSS (Intel SoC) driver so as to prevent miscalculation of the values in the clock divider (Andy Shevchenko) - Adjust comments in the ACPI watchdog driver to prevent kernel-doc from complaining during documentation builds (Randy Dunlap) - Make the ACPI button driver send wakeup key events to user space in addition to power button events on systems that can be woken up by the power button (Ken Xue) - Adjust pnpacpi_parse_allocated_vendor() to use memcpy() on a full structure field (Dmitry Antipov)" * tag 'acpi-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (56 commits) ACPI: resource: Add Infinity laptops to irq1_edge_low_force_override ACPI: button: trigger wakeup key events ACPI: resource: Add another DMI match for the TongFang GMxXGxx ACPI: EC: Use a spin lock without disabing interrupts ACPI: EC: Use a threaded handler for dedicated IRQ ACPI: OSL: Use spin locks without disabling interrupts ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on synchronous events ACPI: utils: Introduce helper for _DEP list lookup ACPI: utils: Fix white space in struct acpi_handle_list definition ACPI: utils: Refine acpi_handle_list_equal() slightly ACPI: utils: Return bool from acpi_evaluate_reference() ACPI: utils: Rearrange in acpi_evaluate_reference() ACPI: arm64: export acpi_arch_thermal_cpufreq_pctg() ACPI: extlog: Clear Extended Error Log status when RAS_CEC handled the error ACPI: LPSS: Fix the fractional clock divider flags ACPI: NUMA: Fix the logic of getting the fake_pxm value ACPI: NUMA: Optimize the check for the availability of node values ACPI: NUMA: Remove unnecessary check in acpi_parse_gi_affinity() ACPI: watchdog: fix kernel-doc warnings ACPI: extlog: fix NULL pointer dereference check ...
2024-01-04Merge branch 'for-next/fixes' into for-next/coreWill Deacon1-1/+1
Merge in arm64 fixes queued for 6.7 so that kpti_install_ng_mappings() can be updated to use arm64_kernel_unmapped_at_el0() instead of checking the ARM64_UNMAP_KERNEL_AT_EL0 CPU capability directly. * for-next/fixes: arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify perf/arm-cmn: Fail DTC counter allocation correctly arm64: Avoid enabling KPTI unnecessarily
2024-01-04Merge branch 'acpi-utils'Rafael J. Wysocki1-3/+1
Merge ACPI utility functions updates for 6.8-rc1: - Modify acpi_dev_uid_match() to support different types of its second argument and adjust its users accordingly (Raag Jadav). - Clean up code related to acpi_evaluate_reference() and ACPI device lists (Rafael J. Wysocki). * acpi-utils: ACPI: utils: Introduce helper for _DEP list lookup ACPI: utils: Fix white space in struct acpi_handle_list definition ACPI: utils: Refine acpi_handle_list_equal() slightly ACPI: utils: Return bool from acpi_evaluate_reference() ACPI: utils: Rearrange in acpi_evaluate_reference() perf: arm_cspmu: drop redundant acpi_dev_uid_to_integer() efi: dev-path-parser: use acpi_dev_uid_match() for matching _UID ACPI: LPSS: use acpi_dev_uid_match() for matching _UID ACPI: bus: update acpi_dev_hid_uid_match() to support multiple types ACPI: bus: update acpi_dev_uid_match() to support multiple types
2023-12-13drivers/perf: add DesignWare PCIe PMU driverShuai Xue3-0/+800
This commit adds the PCIe Performance Monitoring Unit (PMU) driver support for T-Head Yitian SoC chip. Yitian is based on the Synopsys PCI Express Core controller IP which provides statistics feature. The PMU is a PCIe configuration space register block provided by each PCIe Root Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error injection, and Statistics). To facilitate collection of statistics the controller provides the following two features for each Root Port: - one 64-bit counter for Time Based Analysis (RX/TX data throughput and time spent in each low-power LTSSM state) and - one 32-bit counter for Event Counting (error and non-error events for a specified lane) Note: There is no interrupt for counter overflow. This driver adds PMU devices for each PCIe Root Port. And the PMU device is named based the BDF of Root Port. For example, 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) the PMU device name for this Root Port is dwc_rootport_3018. Example usage of counting PCIe RX TLP data payload (Units of bytes):: $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ average RX bandwidth can be calculated like this: PCIe TX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20231208025652.87192-5-xueshuai@linux.alibaba.com [will: Fix sparse error due to use of uninitialised 'vsec' symbol in dwc_pcie_match_des_cap()] Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"Will Deacon1-1/+21
This reverts commit a5f4ca68f348ac059efd6a3d7ad4040aed1c0818. Pulling in the Arm-specific 'linux/perf/arm_pmu.h' header breaks the allmodconfig build for x86: > In file included from drivers/perf/arm_dmc620_pmu.c:26: > include/linux/perf/arm_pmu.h:15:10: fatal error: asm/cputype.h: No such file or directory > 15 | #include <asm/cputype.h> > | ^~~~~~~~~~~~~~~ Just put things back like they were so that the driver can continue to be compile-tested on a variety of architectures. Link: https://lore.kernel.org/r/20231213100931.12d9d85e@canb.auug.org.au Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12perf/arm-cmn: Fail DTC counter allocation correctlyRobin Murphy1-1/+1
Calling arm_cmn_event_clear() before all DTC indices are allocated is wrong, and can lead to arm_cmn_event_add() erroneously clearing live counters from full DTCs where allocation fails. Since the DTC counters are only updated by arm_cmn_init_counter() after all DTC and DTM allocations succeed, nothing actually needs cleaning up in this case anyway, and it should just return directly as it did before. Fixes: 7633ec2c262f ("perf/arm-cmn: Rework DTC counters (again)") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/ed589c0d8e4130dc68b8ad1625226d28bdc185d4.1702322847.git.robin.murphy@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-12-12arm64: perf: Add support for event counting thresholdJames Clark1-1/+78
FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on events whose count meets a specified threshold condition. For example if PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or equal, count), and the threshold is set to 2, then the PMU counter will now only increment by 1 when an event would have previously incremented the PMU counter by 2 or more on a single processor cycle. Three new Perf event config fields, 'threshold', 'threshold_compare' and 'threshold_count' have been added to control the feature. threshold_compare maps to the upper two bits of PMEVTYPERn.TC and threshold_count maps to the first bit of TC. These separate attributes have been picked rather than enumerating all the possible combinations of the TC field as in the Arm ARM. The attributes would be used on a Perf command line like this: $ perf stat -e stall_slot/threshold=2,threshold_compare=2/ A new capability for reading out the maximum supported threshold value has also been added: $ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max 0x000000ff If a threshold higher than threshold_max is provided, then an error is generated. If FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is running, then threshold_max reads zero, and attempting to set a threshold value will also result in an error. The threshold is per PMU counter, and there are potentially different threshold_max values per PMU type on heterogeneous systems. Bits higher than 32 now need to be written into PMEVTYPER, so armv8pmu_write_evtype() has to be updated to take an unsigned long value rather than u32 which gives the correct behavior on both aarch32 and 64. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-11-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12arm: pmu: Move error message and -EOPNOTSUPP to individual PMUsJames Clark3-10/+13
-EPERM or -EINVAL always get converted to -EOPNOTSUPP, so replace them. This will allow __hw_perf_event_init() to return a different code or not print that particular message for a different error in the next commit. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-10-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12perf/arm_dmc620: Remove duplicate format attribute #definesJames Clark1-21/+1
These were copied from the SPE driver, but now they're in the arm_pmu.h header so delete them and use the header instead. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-8-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12arm: pmu: Share user ABI format mechanism with SPEJames Clark2-27/+16
This mechanism makes it much easier to define and read new attributes so move it to the arm_pmu.h header so that it can be shared. At the same time update the existing format attributes to use it. GENMASK has to be changed to GENMASK_ULL because the config fields are 64 bits even on arm32 where this will also be used now. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-7-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12arm64: perf: Include threshold control fields in PMEVTYPER maskJames Clark1-1/+8
FEAT_PMUv3_TH (Armv8.8) adds two new fields to PMEVTYPER, so include them in the mask. These aren't writable on 32 bit kernels as they are in the high part of the register, so only include them for arm64. It would be difficult to do this statically in the asm header files for each platform without resulting in circular includes or #ifdefs inline in the code. For that reason the ARMV8_PMU_EVTYPE_MASK definition has been removed and the mask is constructed programmatically. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-6-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12arm: perf: Convert remaining fields to use GENMASKJames Clark1-1/+1
Convert the remaining fields to use either GENMASK or be built from other fields. These all already started at bit 0 so don't need a code change for the lack of _SHIFT. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-5-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12arm: perf: Use GENMASK for PMMIR fieldsJames Clark1-5/+3
This is so that FIELD_GET and FIELD_PREP can be used and that the fields are in a consistent format to arm64/tools/sysreg Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-4-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_NJames Clark1-2/+2
This is so that FIELD_GET and FIELD_PREP can be used and that the fields are in a consistent format to arm64/tools/sysreg Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-3-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12arm: perf: Remove inlines from arm_pmuv3.cJames Clark1-23/+23
These are all static and in one compilation unit so the inline has no effect on the binary. Except if FTRACE is enabled, then 3 functions which were already not inlined now get the nops added which allows them to be traced. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-2-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12drivers/perf: arm_dsu_pmu: Remove kerneldoc-style comment syntaxWill Deacon1-3/+3
For some reason, the Arm DSU PMU driver uses kerneldoc-style comment syntax (i.e. /** ) for non-kerneldoc comments. This makes the robots very angry indeed, so just revert these to normal comments to stop the noise. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312092000.8ltwotjt-lkp@intel.com/ Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12drivers/perf: Remove usage of the deprecated ida_simple_xx() APIChristophe JAILLET1-3/+3
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). This is less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/85b0b73a1b2f743dd5db15d4765c7685100de27f.1702230488.git.christophe.jaillet@wanadoo.fr Signed-off-by: Will Deacon <will@kernel.org>
2023-12-06perf: arm_cspmu: drop redundant acpi_dev_uid_to_integer()Raag Jadav1-3/+1
Now that we have _UID matching support for integer types, we can use acpi_dev_hid_uid_match() for it. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-05perf: fsl_imx8_ddr: Add driver support for i.MX8DXL DDR PerfXu Yang1-0/+6
Add driver support for i.MX8DXL DDR Perf, which supports AXI ID PORT CHANNEL filter. Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20231120093317.2652866-4-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-05perf: fsl_imx8_ddr: Add AXI ID PORT CHANNEL filter supportXu Yang1-0/+39
This is the extension of AXI ID filter. Filter is defined with 2 configuration registers per counter 1-3 (counter 0 is not used for filtering and lacks these registers). * Counter N MASK COMP register - AXI_ID and AXI_MASKING. * Counter N MUX CNTL register - AXI CHANNEL and AXI PORT. -- 0: address channel -- 1: data channel This filter is exposed to userspace as an additional (channel, port) pair. The definition of axi_channel is inverted in userspace, and it will be reverted in driver automatically. AXI filter of Perf Monitor in DDR Subsystem, only a single port0 exist, so axi_port is reserved which should be 0. e.g. perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd perf stat -a -e imx8_ddr0/axid-write,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20231120093317.2652866-1-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-05drivers: perf: arm_pmu: Drop 'pmu_lock' element from 'struct pmu_hw_events'Anshuman Khandual1-1/+0
As 'pmu_lock' element is not being used in any ARM PMU implementation, just drop this from 'struct pmu_hw_events'. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20231115092805.737822-3-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-05drivers/perf: hisi: Fix some event id for HiSilicon UC pmuJunhao He1-2/+2
Some event id of HiSilicon uncore UC PMU driver is incorrect, fix them. Fixes: 312eca95e28d ("drivers/perf: hisi: Add support for HiSilicon UC PMU driver") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231204110425.20354-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-05drivers/perf: pmuv3: don't expose SW_INCR event in sysfsMark Rutland1-1/+5
The SW_INCR event is somewhat unusual, and depends on the specific HW counter that it is programmed into. When programmed into PMEVCNTR<n>, SW_INCR will count any writes to PMSWINC_EL0 with bit n set, ignoring writes to SW_INCR with bit n clear. Event rotation means that there's no fixed relationship between perf_events and HW counters, so this isn't all that useful. Further, we program PMUSERENR.{SW,EN}=={0,0}, which causes EL0 writes to PMSWINC_EL0 to be trapped and handled as UNDEFINED, resulting in a SIGILL to userspace. Given that, it's not a good idea to expose SW_INCR in sysfs. Hide it as we did for CHAIN back in commit: 4ba2578fa7b55701 ("arm64: perf: don't expose CHAIN event in sysfs") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20231204115847.2993026-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-05drivers: perf: arm_pmuv3: Add new macro PMUV3_INIT_MAP_EVENT()Anshuman Khandual1-41/+20
This further compacts all remaining PMU init procedures requiring specific map_event functions via a new macro PMUV3_INIT_MAP_EVENT(). While here, it also changes generated init function names to match to those generated via the other macro PMUV3_INIT_SIMPLE(). This does not cause functional change. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20231114061656.337231-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-12-05perf/arm-cmn: Fix HN-F class_occup_id eventsRobin Murphy1-1/+1
A subtle copy-paste error managed to slip through the reorganisation of these patches in development, and not only give some HN-F events the wrong type, but use that wrong type before the subsequent patch defined it. Too late to fix history, but we can at least fix the bug. Fixes: b1b7dc38e482 ("perf/arm-cmn: Refactor HN-F event selector macros") Reported-by: Jing Zhang <renyu.zj@linux.alibaba.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/5a22439de84ff188ef76674798052448eb03a3e1.1700740693.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-11-10Merge tag 'arm64-fixes' of ↵Linus Torvalds2-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: "Mostly PMU fixes and a reworking of the pseudo-NMI disabling on broken MediaTek firmware: - Move the MediaTek GIC quirk handling from irqchip to core. Before the merging window commit 44bd78dd2b88 ("irqchip/gic-v3: Disable pseudo NMIs on MediaTek devices w/ firmware issues") temporarily addressed this issue. Fixed now at a deeper level in the arch code - Reject events meant for other PMUs in the CoreSight PMU driver, otherwise some of the core PMU events would disappear - Fix the Armv8 PMUv3 driver driver to not truncate 64-bit registers, causing some events to be invisible - Remove duplicate declaration of __arm64_sys##name following the patch to avoid prototype warning for syscalls - Typos in the elf_hwcap documentation" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/syscall: Remove duplicate declaration Revert "arm64: smp: avoid NMI IPIs with broken MediaTek FW" arm64: Move MediaTek GIC quirk handling from irqchip to core arm64/arm: arm_pmuv3: perf: Don't truncate 64-bit registers perf: arm_cspmu: Reject events meant for other PMUs Documentation/arm64: Fix typos in elf_hwcaps
2023-11-10Merge tag 'riscv-for-linus-6.7-mw2' of ↵Linus Torvalds1-5/+8
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: - Support for handling misaligned accesses in S-mode - Probing for misaligned access support is now properly cached and handled in parallel - PTDUMP now reflects the SW reserved bits, as well as the PBMT and NAPOT extensions - Performance improvements for TLB flushing - Support for many new relocations in the module loader - Various bug fixes and cleanups * tag 'riscv-for-linus-6.7-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits) riscv: Optimize bitops with Zbb extension riscv: Rearrange hwcap.h and cpufeature.h drivers: perf: Do not broadcast to other cpus when starting a counter drivers: perf: Check find_first_bit() return value of: property: Add fw_devlink support for msi-parent RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings riscv: Don't use PGD entries for the linear mapping RISC-V: Probe misaligned access speed in parallel RISC-V: Remove __init on unaligned_emulation_finish() RISC-V: Show accurate per-hart isa in /proc/cpuinfo RISC-V: Don't rely on positional structure initialization riscv: Add tests for riscv module loading riscv: Add remaining module relocations riscv: Avoid unaligned access when relocating modules riscv: split cache ops out of dma-noncoherent.c riscv: Improve flush_tlb_kernel_range() riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb riscv: Improve flush_tlb_range() for hugetlb pages riscv: Improve tlb_flush() ...
2023-11-09riscv: Rearrange hwcap.h and cpufeature.hXiao Wang1-1/+1
Now hwcap.h and cpufeature.h are mutually including each other, and most of the variable/API declarations in hwcap.h are implemented in cpufeature.c, so, it's better to move them into cpufeature.h and leave only macros for ISA extension logical IDs in hwcap.h. BTW, the riscv_isa_extension_mask macro is not used now, so this patch removes it. Suggested-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20231031064553.2319688-2-xiao.w.wang@intel.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-09Merge patch "drivers: perf: Do not broadcast to other cpus when starting a ↵Palmer Dabbelt2-5/+8
counter" This is really just a single patch, but since the offending fix hasn't yet made it to my for-next I'm merging it here. Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-09drivers: perf: Do not broadcast to other cpus when starting a counterAlexandre Ghiti1-4/+2
This command: $ perf record -e cycles:k -e instructions:k -c 10000 -m 64M dd if=/dev/zero of=/dev/null count=1000 gives rise to this kernel warning: [ 444.364395] WARNING: CPU: 0 PID: 104 at kernel/smp.c:775 smp_call_function_many_cond+0x42c/0x436 [ 444.364515] Modules linked in: [ 444.364657] CPU: 0 PID: 104 Comm: perf-exec Not tainted 6.6.0-rc6-00051-g391df82e8ec3-dirty #73 [ 444.364771] Hardware name: riscv-virtio,qemu (DT) [ 444.364868] epc : smp_call_function_many_cond+0x42c/0x436 [ 444.364917] ra : on_each_cpu_cond_mask+0x20/0x32 [ 444.364948] epc : ffffffff8009f9e0 ra : ffffffff8009fa5a sp : ff20000000003800 [ 444.364966] gp : ffffffff81500aa0 tp : ff60000002b83000 t0 : ff200000000038c0 [ 444.364982] t1 : ffffffff815021f0 t2 : 000000000000001f s0 : ff200000000038b0 [ 444.364998] s1 : ff60000002c54d98 a0 : ff60000002a73940 a1 : 0000000000000000 [ 444.365013] a2 : 0000000000000000 a3 : 0000000000000003 a4 : 0000000000000100 [ 444.365029] a5 : 0000000000010100 a6 : 0000000000f00000 a7 : 0000000000000000 [ 444.365044] s2 : 0000000000000000 s3 : ffffffffffffffff s4 : ff60000002c54d98 [ 444.365060] s5 : ffffffff81539610 s6 : ffffffff80c20c48 s7 : 0000000000000000 [ 444.365075] s8 : 0000000000000000 s9 : 0000000000000001 s10: 0000000000000001 [ 444.365090] s11: ffffffff80099394 t3 : 0000000000000003 t4 : 00000000eac0c6e6 [ 444.365104] t5 : 0000000400000000 t6 : ff60000002e010d0 [ 444.365120] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [ 444.365226] [<ffffffff8009f9e0>] smp_call_function_many_cond+0x42c/0x436 [ 444.365295] [<ffffffff8009fa5a>] on_each_cpu_cond_mask+0x20/0x32 [ 444.365311] [<ffffffff806e90dc>] pmu_sbi_ctr_start+0x7a/0xaa [ 444.365327] [<ffffffff806e880c>] riscv_pmu_start+0x48/0x66 [ 444.365339] [<ffffffff8012111a>] perf_adjust_freq_unthr_context+0x196/0x1ac [ 444.365356] [<ffffffff801237aa>] perf_event_task_tick+0x78/0x8c [ 444.365368] [<ffffffff8003faf4>] scheduler_tick+0xe6/0x25e [ 444.365383] [<ffffffff8008a042>] update_process_times+0x80/0x96 [ 444.365398] [<ffffffff800991ec>] tick_sched_handle+0x26/0x52 [ 444.365410] [<ffffffff800993e4>] tick_sched_timer+0x50/0x98 [ 444.365422] [<ffffffff8008a6aa>] __hrtimer_run_queues+0x126/0x18a [ 444.365433] [<ffffffff8008b350>] hrtimer_interrupt+0xce/0x1da [ 444.365444] [<ffffffff806cdc60>] riscv_timer_interrupt+0x30/0x3a [ 444.365457] [<ffffffff8006afa6>] handle_percpu_devid_irq+0x80/0x114 [ 444.365470] [<ffffffff80065b82>] generic_handle_domain_irq+0x1c/0x2a [ 444.365483] [<ffffffff8045faec>] riscv_intc_irq+0x2e/0x46 [ 444.365497] [<ffffffff808a9c62>] handle_riscv_irq+0x4a/0x74 [ 444.365521] [<ffffffff808aa760>] do_irq+0x7c/0x7e [ 444.365796] ---[ end trace 0000000000000000 ]--- That's because the fix in commit 3fec323339a4 ("drivers: perf: Fix panic in riscv SBI mmap support") was wrong since there is no need to broadcast to other cpus when starting a counter, that's only needed in mmap when the counters could have already been started on other cpus, so simply remove this broadcast. Fixes: 3fec323339a4 ("drivers: perf: Fix panic in riscv SBI mmap support") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Clément Léger <cleger@rivosinc.com> Tested-by: Yu Chien Peter Lin <peterlin@andestech.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> #On Link: https://lore.kernel.org/r/20231026084010.11888-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-09drivers: perf: Check find_first_bit() return valueAlexandre Ghiti1-0/+5
We must check the return value of find_first_bit() before using the return value as an index array since it happens to overflow the array and then panic: [ 107.318430] Kernel BUG [#1] [ 107.319434] CPU: 3 PID: 1238 Comm: kill Tainted: G E 6.6.0-rc6ubuntu-defconfig #2 [ 107.319465] Hardware name: riscv-virtio,qemu (DT) [ 107.319551] epc : pmu_sbi_ovf_handler+0x3a4/0x3ae [ 107.319840] ra : pmu_sbi_ovf_handler+0x52/0x3ae [ 107.319868] epc : ffffffff80a0a77c ra : ffffffff80a0a42a sp : ffffaf83fecda350 [ 107.319884] gp : ffffffff823961a8 tp : ffffaf8083db1dc0 t0 : ffffaf83fecda480 [ 107.319899] t1 : ffffffff80cafe62 t2 : 000000000000ff00 s0 : ffffaf83fecda520 [ 107.319921] s1 : ffffaf83fecda380 a0 : 00000018fca29df0 a1 : ffffffffffffffff [ 107.319936] a2 : 0000000001073734 a3 : 0000000000000004 a4 : 0000000000000000 [ 107.319951] a5 : 0000000000000040 a6 : 000000001d1c8774 a7 : 0000000000504d55 [ 107.319965] s2 : ffffffff82451f10 s3 : ffffffff82724e70 s4 : 000000000000003f [ 107.319980] s5 : 0000000000000011 s6 : ffffaf8083db27c0 s7 : 0000000000000000 [ 107.319995] s8 : 0000000000000001 s9 : 00007fffb45d6558 s10: 00007fffb45d81a0 [ 107.320009] s11: ffffaf7ffff60000 t3 : 0000000000000004 t4 : 0000000000000000 [ 107.320023] t5 : ffffaf7f80000000 t6 : ffffaf8000000000 [ 107.320037] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [ 107.320081] [<ffffffff80a0a77c>] pmu_sbi_ovf_handler+0x3a4/0x3ae [ 107.320112] [<ffffffff800b42d0>] handle_percpu_devid_irq+0x9e/0x1a0 [ 107.320131] [<ffffffff800ad92c>] generic_handle_domain_irq+0x28/0x36 [ 107.320148] [<ffffffff8065f9f8>] riscv_intc_irq+0x36/0x4e [ 107.320166] [<ffffffff80caf4a0>] handle_riscv_irq+0x54/0x86 [ 107.320189] [<ffffffff80cb0036>] do_irq+0x64/0x96 [ 107.320271] Code: 85a6 855e b097 ff7f 80e7 9220 b709 9002 4501 bbd9 (9002) 6097 [ 107.320585] ---[ end trace 0000000000000000 ]--- [ 107.320704] Kernel panic - not syncing: Fatal exception in interrupt [ 107.320775] SMP: stopping secondary CPUs [ 107.321219] Kernel Offset: 0x0 from 0xffffffff80000000 [ 107.333051] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Fixes: 4905ec2fb7e6 ("RISC-V: Add sscofpmf extension support") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231109082128.40777-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-07arm64/arm: arm_pmuv3: perf: Don't truncate 64-bit registersIlkka Koskinen1-3/+3
The driver used to truncate several 64-bit registers such as PMCEID[n] registers used to describe whether architectural and microarchitectural events in range 0x4000-0x401f exist. Due to discarding the bits, the driver made the events invisible, even if they existed. Moreover, PMCCFILTR and PMCR registers have additional bits in the upper 32 bits. This patch makes them available although they aren't currently used. Finally, functions handling PMXEVCNTR and PMXEVTYPER registers are removed as they not being used at all. Fixes: df29ddf4f04b ("arm64: perf: Abstract system register accesses away") Reported-by: Carl Worth <carl@os.amperecomputing.com> Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Acked-by: Will Deacon <will@kernel.org> Closes: https://lore.kernel.org/.. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20231102183012.1251410-1-ilkka@os.amperecomputing.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-11-07perf: arm_cspmu: Reject events meant for other PMUsIlkka Koskinen1-0/+3
Coresight PMU driver didn't reject events meant for other PMUs. This caused some of the Core PMU events disappearing from the output of "perf list". In addition, trying to run e.g. $ perf stat -e r2 sleep 1 made Coresight PMU driver to handle the event instead of letting Core PMU driver to deal with it. Cc: stable@vger.kernel.org Fixes: e37dfd65731d ("perf: arm_cspmu: Add support for ARM CoreSight PMU driver") Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20231103001654.35565-1-ilkka@os.amperecomputing.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-11-02Merge tag 'sysctl-6.7-rc1' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull sysctl updates from Luis Chamberlain: "To help make the move of sysctls out of kernel/sysctl.c not incur a size penalty sysctl has been changed to allow us to not require the sentinel, the final empty element on the sysctl array. Joel Granados has been doing all this work. On the v6.6 kernel we got the major infrastructure changes required to support this. For v6.7-rc1 we have all arch/ and drivers/ modified to remove the sentinel. Both arch and driver changes have been on linux-next for a bit less than a month. It is worth re-iterating the value: - this helps reduce the overall build time size of the kernel and run time memory consumed by the kernel by about ~64 bytes per array - the extra 64-byte penalty is no longer inncurred now when we move sysctls out from kernel/sysctl.c to their own files For v6.8-rc1 expect removal of all the sentinels and also then the unneeded check for procname == NULL. The last two patches are fixes recently merged by Krister Johansen which allow us again to use softlockup_panic early on boot. This used to work but the alias work broke it. This is useful for folks who want to detect softlockups super early rather than wait and spend money on cloud solutions with nothing but an eventual hung kernel. Although this hadn't gone through linux-next it's also a stable fix, so we might as well roll through the fixes now" * tag 'sysctl-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (23 commits) watchdog: move softlockup_panic back to early_param proc: sysctl: prevent aliased sysctls from getting passed to init intel drm: Remove now superfluous sentinel element from ctl_table array Drivers: hv: Remove now superfluous sentinel element from ctl_table array raid: Remove now superfluous sentinel element from ctl_table array fw loader: Remove the now superfluous sentinel element from ctl_table array sgi-xp: Remove the now superfluous sentinel element from ctl_table array vrf: Remove the now superfluous sentinel element from ctl_table array char-misc: Remove the now superfluous sentinel element from ctl_table array infiniband: Remove the now superfluous sentinel element from ctl_table array macintosh: Remove the now superfluous sentinel element from ctl_table array parport: Remove the now superfluous sentinel element from ctl_table array scsi: Remove now superfluous sentinel element from ctl_table array tty: Remove now superfluous sentinel element from ctl_table array xen: Remove now superfluous sentinel element from ctl_table array hpet: Remove now superfluous sentinel element from ctl_table array c-sky: Remove now superfluous sentinel element from ctl_talbe array powerpc: Remove now superfluous sentinel element from ctl_table arrays riscv: Remove now superfluous sentinel element from ctl_table array x86/vdso: Remove now superfluous sentinel element from ctl_table array ...
2023-11-01Merge tag 'arm64-upstream' of ↵Linus Torvalds15-204/+642
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "No major architecture features this time around, just some new HWCAP definitions, support for the Ampere SoC PMUs and a few fixes/cleanups. The bulk of the changes is reworking of the CPU capability checking code (cpus_have_cap() etc). - Major refactoring of the CPU capability detection logic resulting in the removal of the cpus_have_const_cap() function and migrating the code to "alternative" branches where possible - Backtrace/kgdb: use IPIs and pseudo-NMI - Perf and PMU: - Add support for Ampere SoC PMUs - Multi-DTC improvements for larger CMN configurations with multiple Debug & Trace Controllers - Rework the Arm CoreSight PMU driver to allow separate registration of vendor backend modules - Fixes: add missing MODULE_DEVICE_TABLE to the amlogic perf driver; use device_get_match_data() in the xgene driver; fix NULL pointer dereference in the hisi driver caused by calling cpuhp_state_remove_instance(); use-after-free in the hisi driver - HWCAP updates: - FEAT_SVE_B16B16 (BFloat16) - FEAT_LRCPC3 (release consistency model) - FEAT_LSE128 (128-bit atomic instructions) - SVE: remove a couple of pseudo registers from the cpufeature code. There is logic in place already to detect mismatched SVE features - Miscellaneous: - Reduce the default swiotlb size (currently 64MB) if no ZONE_DMA bouncing is needed. The buffer is still required for small kmalloc() buffers - Fix module PLT counting with !RANDOMIZE_BASE - Restrict CPU_BIG_ENDIAN to LLVM IAS 15.x or newer move synchronisation code out of the set_ptes() loop - More compact cpufeature displaying enabled cores - Kselftest updates for the new CPU features" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits) arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer arm64: module: Fix PLT counting when CONFIG_RANDOMIZE_BASE=n arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper perf: hisi: Fix use-after-free when register pmu fails drivers/perf: hisi_pcie: Initialize event->cpu only on success drivers/perf: hisi_pcie: Check the type first in pmu::event_init() arm64: cpufeature: Change DBM to display enabled cores arm64: cpufeature: Display the set of cores with a feature perf/arm-cmn: Enable per-DTC counter allocation perf/arm-cmn: Rework DTC counters (again) perf/arm-cmn: Fix DTC domain detection drivers: perf: arm_pmuv3: Drop some unused arguments from armv8_pmu_init() drivers: perf: arm_pmuv3: Read PMMIR_EL1 unconditionally drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for hisi_hns3_pmu uninit process clocksource/drivers/arm_arch_timer: limit XGene-1 workaround arm64: Remove system_uses_lse_atomics() arm64: Mark the 'addr' argument to set_ptes() and __set_pte_at() as unused drivers/perf: xgene: Use device_get_match_data() perf/amlogic: add missing MODULE_DEVICE_TABLE arm64/mm: Hoist synchronization out of set_ptes() loop ...
2023-10-25perf: arm_cspmu: use acpi_dev_hid_uid_match() for matching _HID and _UIDRaag Jadav1-5/+3
Convert manual _UID references to use the standard ACPI helpers. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-24perf: hisi: Fix use-after-free when register pmu failsJunhao He2-4/+4
When we fail to register the uncore pmu, the pmu context may not been allocated. The error handing will call cpuhp_state_remove_instance() to call uncore pmu offline callback, which migrate the pmu context. Since that's liable to lead to some kind of use-after-free. Use cpuhp_state_remove_instance_nocalls() instead of cpuhp_state_remove_instance() so that the notifiers don't execute after the PMU device has been failed to register. Fixes: a0ab25cd82ee ("drivers/perf: hisi: Add support for HiSilicon PA PMU driver") FIxes: 3bf30882c3c7 ("drivers/perf: hisi: Add support for HiSilicon SLLC PMU driver") Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20231024113630.13472-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-24drivers/perf: hisi_pcie: Initialize event->cpu only on successYicong Yang1-2/+2
Initialize the event->cpu only on success. To be more reasonable and keep consistent with other PMUs. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231024092954.42297-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-24drivers/perf: hisi_pcie: Check the type first in pmu::event_init()Yicong Yang1-3/+4
Check whether the event type matches the PMU type firstly in pmu::event_init() before touching the event. Otherwise we'll change the events of others and lead to incorrect results. Since in perf_init_event() we may call every pmu's event_init() in a certain case, we should not modify the event if it's not ours. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231024092954.42297-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-23perf/arm-cmn: Enable per-DTC counter allocationRobin Murphy1-8/+10
Finally enable independent per-DTC-domain counter allocation, except on CMN-600 where we still need to cope with not knowing the domain topology and thus keep counter indices sychronised across domains. This allows users to simultaneously count up to 8 targeted events per domain, rather than 8 globally, for up to 4x wider coverage on maximum configurations. Even though this now looks deceptively simple, I stand by my previous assertion that it was a flippin' nightmare to implement; all the real head-scratchers are hidden in the foundations in the previous patch... Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/849f65566582cb102c6d0843d0f26e231180f8ac.1697824215.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-23perf/arm-cmn: Rework DTC counters (again)Robin Murphy1-62/+64
The bitmap-based scheme for tracking DTC counter usage turns out to be a complete dead-end for its imagined purpose, since by the time we have to keep track of a per-DTC counter index anyway, we already have enough information to make the bitmap itself redundant. Revert the remains of it back to almost the original scheme, but now expanded to track per-DTC indices, in preparation for making use of them in anger. Note that since cycle count events always use a dedicated counter on a single DTC, we reuse the field to encode their DTC index directly. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/5f6ade76b47f033836d7a36c03555da896dfb4a3.1697824215.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-23perf/arm-cmn: Fix DTC domain detectionRobin Murphy1-2/+14
It transpires that dtm_unit_info is another register which got shuffled in CMN-700 without me noticing. Fix that in a way which also proactively fixes the fragile laziness of its consumer, just in case any further fields ever get added alongside dtc_domain. Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/3076ee83d0554f6939fbb6ee49ab2bdb28d8c7ee.1697824215.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-20perf: qcom: use acpi_device_uid() for fetching _UIDRaag Jadav1-2/+2
Convert manual _UID references to use the standard ACPI helper. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>