summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel/perf_event.c
AgeCommit message (Collapse)AuthorFilesLines
2021-08-11arm64/perf: Replace '0xf' instances with ID_AA64DFR0_PMUVER_IMP_DEFAnshuman Khandual1-1/+1
ID_AA64DFR0_PMUVER_IMP_DEF, indicating an "implementation defined" PMU, never actually gets used although there are '0xf' instances scattered all around. Use the symbolic name instead of the raw hex constant. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/1628652427-24695-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-11arm64: perf: Simplify EVENT ATTR macro in perf_event.cQi Liu1-4/+1
Use common macro PMU_EVENT_ATTR_ID to simplify ARMV8_EVENT_ATTR Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1623220863-58233-8-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-03arm64: perf: Add more support on caps under sysfsShaokun Zhang1-0/+33
Armv8.7 has introduced BUS_SLOTS and BUS_WIDTH in PMMIR_EL1 register, add two entries in caps for bus_slots and bus_width under sysfs. It will return the true slots and width if the information is available, otherwise it will return 0. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1622704502-63951-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-01arm64: perf: Convert snprintf to sysfs_emitTian Tao1-1/+1
Use sysfs_emit instead of snprintf to avoid buf overrun,because in sysfs_emit it strictly checks whether buf is null or buf whether pagesize aligned, otherwise it returns an error. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Link: https://lore.kernel.org/r/1621497585-30887-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-04-01arm64: perf: Remove redundant initialization in perf_event.cQi Liu1-3/+2
The initialization of value in function armv8pmu_read_hw_counter() and armv8pmu_read_counter() seem redundant, as they are soon updated. So, We can remove them. Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1617275801-1980-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-10arm64: perf: Fix 64-bit event counter read truncationRob Herring1-1/+1
Commit 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection") changed armv8pmu_read_evcntr() to return a u32 instead of u64. The result is silent truncation of the event counter when using 64-bit counters. Given the offending commit appears to have passed thru several folks, it seems likely this was a bad rebase after v8.5 PMU 64-bit counters landed. Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: <stable@vger.kernel.org> Fixes: 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection") Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20210310004412.1450128-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-12Merge branch 'for-next/perf' into for-next/coreWill Deacon1-3/+10
Perf and PMU updates including support for Cortex-A78 and the v8.3 SPE extensions. * for-next/perf: drivers/perf: Replace spin_lock_irqsave to spin_lock dt-bindings: arm: add Cortex-A78 binding arm64: perf: add support for Cortex-A78 arm64: perf: Constify static attribute_group structs drivers/perf: Prevent forced unbinding of ARM_DMC620_PMU drivers perf/arm-cmn: Move IRQs when migrating context perf/arm-cmn: Fix PMU instance naming perf: Constify static struct attribute_group perf: hisi: Constify static struct attribute_group perf/imx_ddr: Constify static struct attribute_group perf: qcom: Constify static struct attribute_group drivers/perf: Add support for ARMv8.3-SPE
2021-02-04arm64: improve whitespaceZhiyuan Dai1-1/+1
In a few places we don't have whitespace between macro parameters, which makes them hard to read. This patch adds whitespace to clearly separate the parameters. In a few places we have unnecessary whitespace around unary operators, which is confusing, This patch removes the unnecessary whitespace. Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn> Link: https://lore.kernel.org/r/1612403029-5011-1-git-send-email-daizhiyuan@phytium.com.cn Signed-off-by: Will Deacon <will@kernel.org>
2021-02-03arm64: perf: add support for Cortex-A78Seiya Wang1-0/+7
Add support for Cortex-A78 using generic PMUv3 for now. Signed-off-by: Seiya Wang <seiya.wang@mediatek.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210203055348.4935-2-seiya.wang@mediatek.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-02arm64: perf: Constify static attribute_group structsRikard Falkeborn1-3/+3
The only usage of these is to put their addresses in an array of pointers to const attribute_group structs. Make them const to allow the compiler to put them in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Signed-off-by: Will Deacon <will@kernel.org>
2021-01-13Revert "arm64: Enable perf events based hard lockup detector"Will Deacon1-39/+2
This reverts commit 367c820ef08082e68df8a3bc12e62393af21e4b5. lockup_detector_init() makes heavy use of per-cpu variables and must be called with preemption disabled. Usually, it's handled early during boot in kernel_init_freeable(), before SMP has been initialised. Since we do not know whether or not our PMU interrupt can be signalled as an NMI until considerably later in the boot process, the Arm PMU driver attempts to re-initialise the lockup detector off the back of a device_initcall(). Unfortunately, this is called from preemptible context and results in the following splat: | BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1 | caller is debug_smp_processor_id+0x20/0x2c | CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #276 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x3c0 | show_stack+0x20/0x6c | dump_stack+0x2f0/0x42c | check_preemption_disabled+0x1cc/0x1dc | debug_smp_processor_id+0x20/0x2c | hardlockup_detector_event_create+0x34/0x18c | hardlockup_detector_perf_init+0x2c/0x134 | watchdog_nmi_probe+0x18/0x24 | lockup_detector_init+0x44/0xa8 | armv8_pmu_driver_init+0x54/0x78 | do_one_initcall+0x184/0x43c | kernel_init_freeable+0x368/0x380 | kernel_init+0x1c/0x1cc | ret_from_fork+0x10/0x30 Rather than bodge this with raw_smp_processor_id() or randomly disabling preemption, simply revert the culprit for now until we figure out how to do this properly. Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com> Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Sumit Garg <sumit.garg@linaro.org> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20201221162249.3119-1-lecopzer.chen@mediatek.com Link: https://lore.kernel.org/r/20210112221855.10666-1-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-25arm64: Enable perf events based hard lockup detectorSumit Garg1-2/+39
With the recent feature added to enable perf events to use pseudo NMIs as interrupts on platforms which support GICv3 or later, its now been possible to enable hard lockup detector (or NMI watchdog) on arm64 platforms. So enable corresponding support. One thing to note here is that normally lockup detector is initialized just after the early initcalls but PMU on arm64 comes up much later as device_initcall(). So we need to re-initialize lockup detection once PMU has been initialized. Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Acked-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/1602060704-10921-1-git-send-email-sumit.garg@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28arm64: perf: Defer irq_work to IPI_IRQ_WORKJulien Thierry1-9/+5
When handling events, armv8pmu_handle_irq() calls perf_event_overflow(), and subsequently calls irq_work_run() to handle any work queued by perf_event_overflow(). As perf_event_overflow() raises IPI_IRQ_WORK when queuing the work, this isn't strictly necessary and the work could be handled as part of the IPI_IRQ_WORK handler. In the common case the IPI handler will run immediately after the PMU IRQ handler, and where the PE is heavily loaded with interrupts other handlers may run first, widening the window where some counters are disabled. In practice this window is unlikely to be a significant issue, and removing the call to irq_work_run() would make the PMU IRQ handler NMI safe in addition to making it simpler, so let's do that. [Alexandru E.: Reworded commit message] Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-5-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28arm64: perf: Remove PMU lockingJulien Thierry1-28/+0
The PMU is disabled and enabled, and the counters are programmed from contexts where interrupts or preemption is disabled. The functions to toggle the PMU and to program the PMU counters access the registers directly and don't access data modified by the interrupt handler. That, and the fact that they're always called from non-preemptible contexts, means that we don't need to disable interrupts or use a spinlock. [Alexandru E.: Explained why locking is not needed, removed WARN_ONs] Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-4-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28arm64: perf: Avoid PMXEV* indirectionMark Rutland1-14/+85
Currently we access the counter registers and their respective type registers indirectly. This requires us to write to PMSELR, issue an ISB, then access the relevant PMXEV* registers. This is unfortunate, because: * Under virtualization, accessing one register requires two traps to the hypervisor, even though we could access the register directly with a single trap. * We have to issue an ISB which we could otherwise avoid the cost of. * When we use NMIs, the NMI handler will have to save/restore the select register in case the code it preempted was attempting to access a counter or its type register. We can avoid these issues by directly accessing the relevant registers. This patch adds helpers to do so. In armv8pmu_enable_event() we still need the ISB to prevent the PE from reordering the write to PMINTENSET_EL1 register. If the interrupt is enabled before we disable the counter and the new event is configured, we might get an interrupt triggered by the previously programmed event overflowing, but which we wrongly attribute to the event that we are enabling. Execute an ISB after we disable the counter. In the process, remove the comment that refers to the ARMv7 PMU. [Julien T.: Don't inline read/write functions to avoid big code-size increase, remove unused read_pmevtypern function, fix counter index issue.] [Alexandru E.: Removed comment, removed trailing semicolons in macros, added ISB] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-3-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28arm64: perf: Add missing ISB in armv8pmu_enable_counter()Alexandru Elisei1-0/+5
Writes to the PMXEVTYPER_EL0 register are not self-synchronising. In armv8pmu_enable_event(), the PE can reorder configuring the event type after we have enabled the counter and the interrupt. This can lead to an interrupt being asserted because of the previous event type that we were counting using the same counter, not the one that we've just configured. The same rationale applies to writes to the PMINTENSET_EL1 register. The PE can reorder enabling the interrupt at any point in the future after we have enabled the event. Prevent both situations from happening by adding an ISB just before we enable the event counter. Fixes: 030896885ade ("arm64: Performance counters support") Reported-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-2-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28arm64: perf: Add support caps under sysfsShaokun Zhang1-33/+70
ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events, like STALL_SLOT etc, are related to it. Let's add a caps directory to /sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from PMMIR_EL1 registers in this entry. The user programs can get the slots from sysfs directly. /sys/bus/event_source/devices/armv8_pmuv3_0/caps/slots is exposed under sysfs. Both ARMv8.4-PMU and STALL_SLOT event are implemented, it returns the slots from PMMIR_EL1, otherwise it will return 0. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1600754025-53535-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07arm64: perf: Remove unnecessary event_idx checkQi Liu1-18/+2
event_idx is obtained from armv8pmu_get_event_idx(), and this idx must be between ARMV8_IDX_CYCLE_COUNTER and cpu_pmu->num_events. So it's unnecessary to do this check. Let's remove it. Signed-off-by: Qi Liu <liuqi115@huawei.com> Link: https://lore.kernel.org/r/1599213458-28394-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07arm64: perf: Add general hardware LLC events for PMUv3Leo Yan1-0/+3
This patch is to add the general hardware last level cache (LLC) events for PMUv3: one event is for LLC access and another is for LLC miss. With this change, perf tool can support last level cache profiling, below is an example to demonstrate the usage on Arm64: $ perf stat -e LLC-load-misses -e LLC-loads -- \ perf bench mem memcpy -s 1024MB -l 100 -f default [...] Performance counter stats for 'perf bench mem memcpy -s 1024MB -l 100 -f default': 35,534,262 LLC-load-misses # 2.16% of all LL-cache hits 1,643,946,443 LLC-loads [...] Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200811053505.21223-1-leo.yan@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21arm64: perf: Expose some new events via sysfsShaokun Zhang1-0/+19
Some new PMU events can been detected by PMCEID1_EL0, but it can't be listed, Let's expose these through sysfs. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1595328573-12751-2-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20arm64: perf: Add cap_user_time_shortPeter Zijlstra1-5/+7
This completes the ARM64 cap_user_time support. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-7-leo.yan@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20arm64: perf: Only advertise cap_user_time for arch_timerPeter Zijlstra1-6/+13
When sched_clock is running on anything other than arch_timer, don't advertise cap_user_time*. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-5-leo.yan@linaro.org Requested-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20arm64: perf: Implement correct cap_user_timePeter Zijlstra1-9/+29
As reported by Leo; the existing implementation is broken when the clock and counter don't intersect at 0. Use the sched_clock's struct clock_read_data information to correctly implement cap_user_time and cap_user_time_zero. Note that the ARM64 counter is architecturally only guaranteed to be 56bit wide (implementations are allowed to be wider) and the existing perf ABI cannot deal with wrap-around. This implementation should also be faster than the old; seeing how we don't need to recompute mult and shift all the time. [leoyan: Use mul_u64_u32_shr() to convert cyc to ns to avoid overflow] Reported-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-4-leo.yan@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20arm64: perf: Correct the event index in sysfsShaokun Zhang1-5/+8
When PMU event ID is equal or greater than 0x4000, it will be reduced by 0x4000 and it is not the raw number in the sysfs. Let's correct it and obtain the raw event ID. Before this patch: cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed event=0x001 After this patch: cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed event=0x4001 Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1592487344-30555-3-git-send-email-zhangshaokun@hisilicon.com [will: fixed formatting of 'if' condition] Signed-off-by: Will Deacon <will@kernel.org>
2020-03-18arm64: perf: Add support for ARMv8.5-PMU 64-bit countersAndrew Murray1-16/+71
At present ARMv8 event counters are limited to 32-bits, though by using the CHAIN event it's possible to combine adjacent counters to achieve 64-bits. The perf config1:0 bit can be set to use such a configuration. With the introduction of ARMv8.5-PMU support, all event counters can now be used as 64-bit counters. Let's enable 64-bit event counters where support exists. Unless the user sets config1:0 we will adjust the counter value such that it overflows upon 32-bit overflow. This follows the same behaviour as the cycle counter which has always been (and remains) 64-bits. Signed-off-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> [Mark: fix ID field names, compare with 8.5 value] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-03-18arm64: perf: Clean up enable/disable callsRobin Murphy1-52/+35
Reading this code bordered on painful, what with all the repetition and pointless return values. More fundamentally, dribbling the hardware enables and disables in one bit at a time incurs needless system register overhead for chained events and on reset. We already use bitmask values for the KVM hooks, so consolidate all the register accesses to match, and make a reasonable saving in both source and object code. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-03-02arm64: perf: Support new DT compatiblesRobin Murphy1-0/+56
Add support for matching the new PMUs. For now, this just wires them up as generic PMUv3 such that people writing DTs for new SoCs can do the right thing, and at least have architectural and raw events be usable. We can come back and fill in event maps for sysfs and/or perf tools at a later date. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-03-02arm64: perf: Refactor PMU init callbacksRobin Murphy1-97/+27
The PMU init callbacks are already drowning in boilerplate, so before doubling the number of supported PMU models, give it a sensible refactor to significantly reduce the bloat, both in source and object code. Although nobody uses non-default sysfs attributes today, there's minimal impact to preserving the notion that maybe, some day, somebody might, so we may as well keep up appearances. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-11-01arm64: perf: Simplify the ARMv8 PMUv3 event attributesShaokun Zhang1-125/+66
For each PMU event, there is a ARMV8_EVENT_ATTR(xx, XX) and &armv8_event_attr_xx.attr.attr. Let's redefine the ARMV8_EVENT_ATTR to simplify the armv8_pmuv3_event_attrs. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> [will: Dropped unnecessary array syntax] Signed-off-by: Will Deacon <will@kernel.org>
2019-08-20arm64: perf_event: Add missing header needed for smp_processor_id()Raphael Gault1-0/+1
In perf_event.c we use smp_processor_id(), but we haven't included <linux/smp.h> where it is defined, and rely on this being pulled in via a transitive include. Let's make this more robust by including <linux.smp.h> explicitly. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Raphael Gault <raphael.gault@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-23arm64: perf: Remove unused macroShaokun Zhang1-1/+0
ARMV8_EVENT_ATTR_RESOLVE became unused after commit <4b1a9e6934ec> ("arm64/perf: Filter common events based on PMCEIDn_EL0"). Remove it. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234Thomas Gleixner1-12/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-9/+41
Pull KVM updates from Paolo Bonzini: "ARM: - support for SVE and Pointer Authentication in guests - PMU improvements POWER: - support for direct access to the POWER9 XIVE interrupt controller - memory and performance optimizations x86: - support for accessing memory not backed by struct page - fixes and refactoring Generic: - dirty page tracking improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (155 commits) kvm: fix compilation on aarch64 Revert "KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU" kvm: x86: Fix L1TF mitigation for shadow MMU KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible KVM: PPC: Book3S: Remove useless checks in 'release' method of KVM device KVM: PPC: Book3S HV: XIVE: Fix spelling mistake "acessing" -> "accessing" KVM: PPC: Book3S HV: Make sure to load LPID for radix VCPUs kvm: nVMX: Set nested_run_pending in vmx_set_nested_state after checks complete tests: kvm: Add tests for KVM_SET_NESTED_STATE KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS state before setting new state tests: kvm: Add tests for KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_CPU_ID tests: kvm: Add tests to .gitignore KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 KVM: Fix kvm_clear_dirty_log_protect off-by-(minus-)one KVM: Fix the bitmap range to copy during clear dirty KVM: arm64: Fix ptrauth ID register masking logic KVM: x86: use direct accessors for RIP and RSP KVM: VMX: Use accessors for GPRs outside of dedicated caching logic KVM: x86: Omit caching logic for always-available GPRs kvm, x86: Properly check whether a pfn is an MMIO or not ...
2019-04-24arm64: KVM: Enable VHE support for :G/:H perf event modifiersAndrew Murray1-1/+5
With VHE different exception levels are used between the host (EL2) and guest (EL1) with a shared exception level for userpace (EL0). We can take advantage of this and use the PMU's exception level filtering to avoid enabling/disabling counters in the world-switch code. Instead we just modify the counter type to include or exclude EL0 at vcpu_{load,put} time. We also ensure that trapped PMU system register writes do not re-enable EL0 when reconfiguring the backing perf events. This approach completely avoids blackout windows seen with !VHE. Suggested-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Andrew Murray <andrew.murray@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-24arm64: arm_pmu: Add !VHE support for exclude_host/exclude_guest attributesAndrew Murray1-7/+36
Add support for the :G and :H attributes in perf by handling the exclude_host/exclude_guest event attributes. We notify KVM of counters that we wish to be enabled or disabled on guest entry/exit and thus defer from starting or stopping events based on their event attributes. With !VHE we switch the counters between host/guest at EL2. We are able to eliminate counters counting host events on the boundaries of guest entry/exit when using :G by filtering out EL2 for exclude_host. When using !exclude_hv there is a small blackout window at the guest entry/exit where host events are not captured. Signed-off-by: Andrew Murray <andrew.murray@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-24arm64: arm_pmu: Remove unnecessary isb instructionAndrew Murray1-1/+0
The armv8pmu_enable_event_counter function issues an isb instruction after enabling a pair of counters - this doesn't provide any value and is inconsistent with the armv8pmu_disable_event_counter. In any case armv8pmu_enable_event_counter is always called with the PMU stopped. Starting the PMU with armv8pmu_start results in an isb instruction being issued prior to writing to PMCR_EL0. Let's remove the unnecessary isb instruction. Signed-off-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-11arm64: perf_event: Remove wrongfully used inlineRaphael Gault1-2/+2
The functions armv8pmu_read_counter() and armv8pmu_write_counter() are `static inline` while they are only referenced when assigned to a function pointer field in a `struct arm_pmu` instance. The inline keyword is thus counter intuitive and shouldn't be used. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Raphael Gault <raphael.gault@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-01-20arm64: perf: remove misleading commentAndrew Murray1-1/+1
The comment for the armv8pmu_set_event_filter function suggests that it only works for PMUv2 PMUs - this is incorrect. Let's remove the incorrect comment. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21arm64: perf: set suppress_bind_attrs flag to trueAnders Roxell1-0/+1
The armv8_pmuv3 driver doesn't have a remove function, and when the test 'CONFIG_DEBUG_TEST_DRIVER_REMOVE=y' is enabled, the following Call trace can be seen. [ 1.424287] Failed to register pmu: armv8_pmuv3, reason -17 [ 1.424870] WARNING: CPU: 0 PID: 1 at ../kernel/events/core.c:11771 perf_event_sysfs_init+0x98/0xdc [ 1.425220] Modules linked in: [ 1.425531] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.19.0-rc7-next-20181012-00003-ge7a97b1ad77b-dirty #35 [ 1.425951] Hardware name: linux,dummy-virt (DT) [ 1.426212] pstate: 80000005 (Nzcv daif -PAN -UAO) [ 1.426458] pc : perf_event_sysfs_init+0x98/0xdc [ 1.426720] lr : perf_event_sysfs_init+0x98/0xdc [ 1.426908] sp : ffff00000804bd50 [ 1.427077] x29: ffff00000804bd50 x28: ffff00000934e078 [ 1.427429] x27: ffff000009546000 x26: 0000000000000007 [ 1.427757] x25: ffff000009280710 x24: 00000000ffffffef [ 1.428086] x23: ffff000009408000 x22: 0000000000000000 [ 1.428415] x21: ffff000009136008 x20: ffff000009408730 [ 1.428744] x19: ffff80007b20b400 x18: 000000000000000a [ 1.429075] x17: 0000000000000000 x16: 0000000000000000 [ 1.429418] x15: 0000000000000400 x14: 2e79726f74636572 [ 1.429748] x13: 696420656d617320 x12: 656874206e692065 [ 1.430060] x11: 6d616e20656d6173 x10: 2065687420687469 [ 1.430335] x9 : ffff00000804bd50 x8 : 206e6f7361657220 [ 1.430610] x7 : 2c3376756d705f38 x6 : ffff00000954d7ce [ 1.430880] x5 : 0000000000000000 x4 : 0000000000000000 [ 1.431226] x3 : 0000000000000000 x2 : ffffffffffffffff [ 1.431554] x1 : 4d151327adc50b00 x0 : 0000000000000000 [ 1.431868] Call trace: [ 1.432102] perf_event_sysfs_init+0x98/0xdc [ 1.432382] do_one_initcall+0x6c/0x1a8 [ 1.432637] kernel_init_freeable+0x1bc/0x280 [ 1.432905] kernel_init+0x18/0x160 [ 1.433115] ret_from_fork+0x10/0x18 [ 1.433297] ---[ end trace 27fd415390eb9883 ]--- Rework to set suppress_bind_attrs flag to avoid removing the device when CONFIG_DEBUG_TEST_DRIVER_REMOVE=y, since there's no real reason to remove the armv8_pmuv3 driver. Cc: Arnd Bergmann <arnd@arndb.de> Co-developed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21arm64: perf: Fix typos in commentShaokun Zhang1-1/+1
Fix up one typos: Onl -> Only Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21arm64: perf: Hook up new eventsWill Deacon1-0/+24
There have been some additional events added to the PMU architecture since Armv8.0, so expose them via our sysfs infrastructure. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21arm64: perf: Move event definitions into perf_event.hWill Deacon1-144/+1
The PMU event numbers are split between perf_event.h and perf_event.c, which makes it difficult to spot any gaps in the numbers which may be allocated in the future. This patch sorts the events numerically, adds some missing events and moves the definitions into perf_event.h. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21arm64: perf: Remove duplicate generic cache eventsWill Deacon1-4/+0
We cannot distinguish reads from writes in our generic cache events, so drop the WRITE entries and leave the READ entries pointing to the combined read/write events, as is done by other CPUs and architectures. Reported-by: Ganapatrao Kulkarni <Ganapatrao.Kulkarni@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21arm64: perf: Add support for Armv8.1 PMCEID register formatWill Deacon1-7/+18
Armv8.1 allocated the upper 32-bits of the PMCEID registers to describe the common architectural and microarchitecture events beginning at 0x4000. Add support for these registers to our probing code, so that we can advertise the SPE events when they are supported by the CPU. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21arm64: perf: Terminate PMU assignment statements with semicolonsWill Deacon1-10/+10
As a hangover from when this code used a designated initialiser, we've been using commas to terminate the arm_pmu field assignments. Whilst harmless, it's also weird, so replace them with semicolons instead. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-10-12arm64: perf: Reject stand-alone CHAIN events for PMUv3Will Deacon1-0/+7
It doesn't make sense for a perf event to be configured as a CHAIN event in isolation, so extend the arm_pmu structure with a ->filter_match() function to allow the backend PMU implementation to reject CHAIN events early. Cc: <stable@vger.kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31arm64: perf: Add cap_user_time aarch64Michael O'Farrell1-0/+30
It is useful to get the running time of a thread. Doing so in an efficient manner can be important for performance of user applications. Avoiding system calls in `clock_gettime` when handling CLOCK_THREAD_CPUTIME_ID is important. Other clocks are handled in the VDSO, but CLOCK_THREAD_CPUTIME_ID falls back on the system call. CLOCK_THREAD_CPUTIME_ID is not handled in the VDSO since it would have costs associated with maintaining updated user space accessible time offsets. These offsets have to be updated everytime the a thread is scheduled/descheduled. However, for programs regularly checking the running time of a thread, this is a performance improvement. This patch takes a middle ground, and adds support for cap_user_time an optional feature of the perf_event API. This way costs are only incurred when the perf_event api is enabled. This is done the same way as it is in x86. Ultimately this allows calculating the thread running time in userspace on aarch64 as follows (adapted from perf_event_open manpage): u32 seq, time_mult, time_shift; u64 running, count, time_offset, quot, rem, delta; struct perf_event_mmap_page *pc; pc = buf; // buf is the perf event mmaped page as documented in the API. if (pc->cap_usr_time) { do { seq = pc->lock; barrier(); running = pc->time_running; count = readCNTVCT_EL0(); // Read ARM hardware clock. time_offset = pc->time_offset; time_mult = pc->time_mult; time_shift = pc->time_shift; barrier(); } while (pc->lock != seq); quot = (count >> time_shift); rem = count & (((u64)1 << time_shift) - 1); delta = time_offset + quot * time_mult + ((rem * time_mult) >> time_shift); running += delta; // running now has the current nanosecond level thread time. } Summary of changes in the patch: For aarch64 systems, make arch_perf_update_userpage update the timing information stored in the perf_event page. Requiring the following calculations: - Calculate the appropriate time_mult, and time_shift factors to convert ticks to nano seconds for the current clock frequency. - Adjust the mult and shift factors to avoid shift factors of 32 bits. (possibly unnecessary) - The time_offset userspace should apply when doing calculations: negative the current sched time (now), because time_running and time_enabled fields of the perf_event page have just been updated. Toggle bits to appropriate values: - Enable cap_user_time Signed-off-by: Michael O'Farrell <micpof@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-10arm64: perf: Add support for chaining event countersSuzuki K Poulose1-27/+158
Add support for 64bit event by using chained event counters and 64bit cycle counters. PMUv3 allows chaining a pair of adjacent 32-bit counters, effectively forming a 64-bit counter. The low/even counter is programmed to count the event of interest, and the high/odd counter is programmed to count the CHAIN event, taken when the low/even counter overflows. For CPU cycles, when 64bit mode is requested, the cycle counter is used in 64bit mode. If the cycle counter is not available, falls back to chaining. Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-10arm64: perf: Disable PMU while processing counter overflowsSuzuki K Poulose1-22/+28
The arm64 PMU updates the event counters and reprograms the counters in the overflow IRQ handler without disabling the PMU. This could potentially cause skews in for group counters, where the overflowed counters may potentially loose some event counts, while they are reprogrammed. To prevent this, disable the PMU while we process the counter overflows and enable it right back when we are done. This patch also moves the PMU stop/start routines to avoid a forward declaration. Suggested-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-10arm64: perf: Clean up armv8pmu_select_counterSuzuki K Poulose1-10/+19
armv8pmu_select_counter always returns the passed idx. So let us make that void and get rid of the pointless checks. Suggested-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>