From 27f850880192c19a26d89dab19e2bddcf276032f Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 5 Jan 2023 14:51:58 +0000 Subject: cpuidle: teo: Optionally skip polling states in teo_find_shallower_state() Add a no_poll flag to teo_find_shallower_state() that will let the function optionally not consider polling states. This allows the caller to guard against the function inadvertently resulting in TEO putting the CPU in a polling state when that behaviour is undesirable. Signed-off-by: Kajetan Puchalski Signed-off-by: Rafael J. Wysocki --- drivers/cpuidle/governors/teo.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) (limited to 'drivers/cpuidle') diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c index d9262db79cae..e2864474a98d 100644 --- a/drivers/cpuidle/governors/teo.c +++ b/drivers/cpuidle/governors/teo.c @@ -258,15 +258,17 @@ static s64 teo_middle_of_bin(int idx, struct cpuidle_driver *drv) * @dev: Target CPU. * @state_idx: Index of the capping idle state. * @duration_ns: Idle duration value to match. + * @no_poll: Don't consider polling states. */ static int teo_find_shallower_state(struct cpuidle_driver *drv, struct cpuidle_device *dev, int state_idx, - s64 duration_ns) + s64 duration_ns, bool no_poll) { int i; for (i = state_idx - 1; i >= 0; i--) { - if (dev->states_usage[i].disable) + if (dev->states_usage[i].disable || + (no_poll && drv->states[i].flags & CPUIDLE_FLAG_POLLING)) continue; state_idx = i; @@ -469,7 +471,7 @@ end: */ if (idx > idx0 && drv->states[idx].target_residency_ns > delta_tick) - idx = teo_find_shallower_state(drv, dev, idx, delta_tick); + idx = teo_find_shallower_state(drv, dev, idx, delta_tick, false); } return idx; -- cgit v1.2.3 From 9ce0f7c4bc64d820b02a1c53f7e8dba9539f942b Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 5 Jan 2023 14:51:59 +0000 Subject: cpuidle: teo: Introduce util-awareness MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Modern interactive systems, such as recent Android phones, tend to have power efficient shallow idle states. Selecting deeper idle states on a device while a latency-sensitive workload is running can adversely impact performance due to increased latency. Additionally, if the CPU wakes up from a deeper sleep before its target residency as is often the case, it results in a waste of energy on top of that. At the moment, none of the available idle governors take any scheduling information into account. They also tend to overestimate the idle duration quite often, which causes them to select excessively deep idle states, thus leading to increased wakeup latency and lower performance with no power saving. For 'menu' while web browsing on Android for instance, those types of wakeups ('too deep') account for over 24% of all wakeups. At the same time, on some platforms idle state 0 can be power efficient enough to warrant wanting to prefer it over idle state 1. This is because the power usage of the two states can be so close that sufficient amounts of too deep state 1 sleeps can completely offset the state 1 power saving to the point where it would've been more power efficient to just use state 0 instead. This is, of course, for systems where state 0 is not a polling state, such as arm-based devices. Sleeps that happened in state 0 while they could have used state 1 ('too shallow') only save less power than they otherwise could have. Too deep sleeps, on the other hand, harm performance and nullify the potential power saving from using state 1 in the first place. While taking this into account, it is clear that on balance it is preferable for an idle governor to have more too shallow sleeps instead of more too deep sleeps on those kinds of platforms. This patch specifically tunes TEO to prefer shallower idle states in order to reduce wakeup latency and achieve better performance. To this end, before selecting the next idle state it uses the avg_util signal of a CPU's runqueue in order to determine to what extent the CPU is being utilized. This util value is then compared to a threshold defined as a percentage of the CPU's capacity (capacity >> 6 ie. ~1.5% in the current implementation). If the util is above the threshold, the index of the idle state selected by TEO metrics will be reduced by 1, thus selecting a shallower state. If the util is below the threshold, the governor defaults to the TEO metrics mechanism to try to select the deepest available idle state based on the closest timer event and its own correctness. The main goal of this is to reduce latency and increase performance for some workloads. Under some workloads it will result in an increase in power usage (Geekbench 5) while for other workloads it will also result in a decrease in power usage compared to TEO (PCMark Web, Jankbench, Speedometer). It can provide drastically decreased latency and performance benefits in certain types of workloads that are sensitive to latency. Example test results: 1. GB5 (better score, latency & more power usage) | metric | menu | teo | teo-util-aware | | ------------------------------------- | -------------- | ----------------- | ----------------- | | gmean score | 2826.5 (0.0%) | 2764.8 (-2.18%) | 2865 (1.36%) | | gmean power usage [mW] | 2551.4 (0.0%) | 2606.8 (2.17%) | 2722.3 (6.7%) | | gmean too deep % | 14.99% | 9.65% | 4.02% | | gmean too shallow % | 2.5% | 5.96% | 14.59% | | gmean task wakeup latency (asynctask) | 78.16μs (0.0%) | 61.60μs (-21.19%) | 54.45μs (-30.34%) | 2. Jankbench (better score, latency & less power usage) | metric | menu | teo | teo-util-aware | | ------------------------------------- | -------------- | ----------------- | ----------------- | | gmean frame duration | 13.9 (0.0%) | 14.7 (6.0%) | 12.6 (-9.0%) | | gmean jank percentage | 1.5 (0.0%) | 2.1 (36.99%) | 1.3 (-17.37%) | | gmean power usage [mW] | 144.6 (0.0%) | 136.9 (-5.27%) | 121.3 (-16.08%) | | gmean too deep % | 26.00% | 11.00% | 2.54% | | gmean too shallow % | 4.74% | 11.89% | 21.93% | | gmean wakeup latency (RenderThread) | 139.5μs (0.0%) | 116.5μs (-16.49%) | 91.11μs (-34.7%) | | gmean wakeup latency (surfaceflinger) | 124.0μs (0.0%) | 151.9μs (22.47%) | 87.65μs (-29.33%) | Signed-off-by: Kajetan Puchalski [ rjw: Comment edits and white space adjustments ] Signed-off-by: Rafael J. Wysocki --- drivers/cpuidle/governors/teo.c | 94 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 93 insertions(+), 1 deletion(-) (limited to 'drivers/cpuidle') diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c index e2864474a98d..987fc5f3997d 100644 --- a/drivers/cpuidle/governors/teo.c +++ b/drivers/cpuidle/governors/teo.c @@ -2,8 +2,13 @@ /* * Timer events oriented CPU idle governor * + * TEO governor: * Copyright (C) 2018 - 2021 Intel Corporation * Author: Rafael J. Wysocki + * + * Util-awareness mechanism: + * Copyright (C) 2022 Arm Ltd. + * Author: Kajetan Puchalski */ /** @@ -99,14 +104,55 @@ * select the given idle state instead of the candidate one. * * 3. By default, select the candidate state. + * + * Util-awareness mechanism: + * + * The idea behind the util-awareness extension is that there are two distinct + * scenarios for the CPU which should result in two different approaches to idle + * state selection - utilized and not utilized. + * + * In this case, 'utilized' means that the average runqueue util of the CPU is + * above a certain threshold. + * + * When the CPU is utilized while going into idle, more likely than not it will + * be woken up to do more work soon and so a shallower idle state should be + * selected to minimise latency and maximise performance. When the CPU is not + * being utilized, the usual metrics-based approach to selecting the deepest + * available idle state should be preferred to take advantage of the power + * saving. + * + * In order to achieve this, the governor uses a utilization threshold. + * The threshold is computed per-CPU as a percentage of the CPU's capacity + * by bit shifting the capacity value. Based on testing, the shift of 6 (~1.56%) + * seems to be getting the best results. + * + * Before selecting the next idle state, the governor compares the current CPU + * util to the precomputed util threshold. If it's below, it defaults to the + * TEO metrics mechanism. If it's above, the closest shallower idle state will + * be selected instead, as long as is not a polling state. */ #include #include #include +#include #include +#include #include +/* + * The number of bits to shift the CPU's capacity by in order to determine + * the utilized threshold. + * + * 6 was chosen based on testing as the number that achieved the best balance + * of power and performance on average. + * + * The resulting threshold is high enough to not be triggered by background + * noise and low enough to react quickly when activity starts to ramp up. + */ +#define UTIL_THRESHOLD_SHIFT 6 + + /* * The PULSE value is added to metrics when they grow and the DECAY_SHIFT value * is used for decreasing metrics on a regular basis. @@ -137,9 +183,11 @@ struct teo_bin { * @time_span_ns: Time between idle state selection and post-wakeup update. * @sleep_length_ns: Time till the closest timer event (at the selection time). * @state_bins: Idle state data bins for this CPU. - * @total: Grand total of the "intercepts" and "hits" mertics for all bins. + * @total: Grand total of the "intercepts" and "hits" metrics for all bins. * @next_recent_idx: Index of the next @recent_idx entry to update. * @recent_idx: Indices of bins corresponding to recent "intercepts". + * @util_threshold: Threshold above which the CPU is considered utilized + * @utilized: Whether the last sleep on the CPU happened while utilized */ struct teo_cpu { s64 time_span_ns; @@ -148,10 +196,29 @@ struct teo_cpu { unsigned int total; int next_recent_idx; int recent_idx[NR_RECENT]; + unsigned long util_threshold; + bool utilized; }; static DEFINE_PER_CPU(struct teo_cpu, teo_cpus); +/** + * teo_cpu_is_utilized - Check if the CPU's util is above the threshold + * @cpu: Target CPU + * @cpu_data: Governor CPU data for the target CPU + */ +#ifdef CONFIG_SMP +static bool teo_cpu_is_utilized(int cpu, struct teo_cpu *cpu_data) +{ + return sched_cpu_util(cpu) > cpu_data->util_threshold; +} +#else +static bool teo_cpu_is_utilized(int cpu, struct teo_cpu *cpu_data) +{ + return false; +} +#endif + /** * teo_update - Update CPU metrics after wakeup. * @drv: cpuidle driver containing state data. @@ -323,6 +390,22 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, goto end; } + cpu_data->utilized = teo_cpu_is_utilized(dev->cpu, cpu_data); + /* + * If the CPU is being utilized over the threshold and there are only 2 + * states to choose from, the metrics need not be considered, so choose + * the shallowest non-polling state and exit. + */ + if (drv->state_count < 3 && cpu_data->utilized) { + for (i = 0; i < drv->state_count; ++i) { + if (!dev->states_usage[i].disable && + !(drv->states[i].flags & CPUIDLE_FLAG_POLLING)) { + idx = i; + goto end; + } + } + } + /* * Find the deepest idle state whose target residency does not exceed * the current sleep length and the deepest idle state not deeper than @@ -454,6 +537,13 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, if (idx > constraint_idx) idx = constraint_idx; + /* + * If the CPU is being utilized over the threshold, choose a shallower + * non-polling state to improve latency + */ + if (cpu_data->utilized) + idx = teo_find_shallower_state(drv, dev, idx, duration_ns, true); + end: /* * Don't stop the tick if the selected state is a polling one or if the @@ -510,9 +600,11 @@ static int teo_enable_device(struct cpuidle_driver *drv, struct cpuidle_device *dev) { struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu); + unsigned long max_capacity = arch_scale_cpu_capacity(dev->cpu); int i; memset(cpu_data, 0, sizeof(*cpu_data)); + cpu_data->util_threshold = max_capacity >> UTIL_THRESHOLD_SHIFT; for (i = 0; i < NR_RECENT; i++) cpu_data->recent_idx[i] = -1; -- cgit v1.2.3 From 4edc13ae891ab368a3026cde1dbf263331bc6e77 Mon Sep 17 00:00:00 2001 From: Li RongQing Date: Fri, 9 Dec 2022 17:25:23 +0800 Subject: cpuidle-haltpoll: select haltpoll governor The haltpoll cpuidle driver should select the haltpoll governor, so as to ensure that they work together. Signed-off-by: Li RongQing [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki --- drivers/cpuidle/Kconfig | 1 + 1 file changed, 1 insertion(+) (limited to 'drivers/cpuidle') diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig index ff71dd662880..cac5997dca50 100644 --- a/drivers/cpuidle/Kconfig +++ b/drivers/cpuidle/Kconfig @@ -74,6 +74,7 @@ endmenu config HALTPOLL_CPUIDLE tristate "Halt poll cpuidle driver" depends on X86 && KVM_GUEST + select CPU_IDLE_GOV_HALTPOLL default y help This option enables halt poll cpuidle driver, which allows to poll -- cgit v1.2.3 From 716ff71ae234fd3ef1286aac8d8a19a0ed2d509d Mon Sep 17 00:00:00 2001 From: Li RongQing Date: Fri, 6 Jan 2023 12:03:42 +0800 Subject: cpuidle-haltpoll: Replace default_idle() with arch_cpu_idle() When a KVM guest has MWAIT, mwait_idle() is used as the default idle function. However, the cpuidle-haltpoll driver calls default_idle() from default_enter_idle() directly and that one uses HLT instead of MWAIT, which may affect performance adversely, because MWAIT is preferred to HLT as explained by the changelog of commit aebef63cf7ff ("x86: Remove vendor checks from prefer_mwait_c1_over_halt"). Make default_enter_idle() call arch_cpu_idle(), which can use MWAIT, instead of default_idle() to address this issue. Suggested-by: Thomas Gleixner Suggested-by: Rafael J. Wysocki Signed-off-by: Li RongQing [ rjw: Changelog rewrite ] Signed-off-by: Rafael J. Wysocki --- arch/x86/kernel/process.c | 1 + drivers/cpuidle/cpuidle-haltpoll.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) (limited to 'drivers/cpuidle') diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 40d156a31676..00a831d56c50 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -721,6 +721,7 @@ void arch_cpu_idle(void) { x86_idle(); } +EXPORT_SYMBOL_GPL(arch_cpu_idle); /* * We use this if we don't have any better idle routine.. diff --git a/drivers/cpuidle/cpuidle-haltpoll.c b/drivers/cpuidle/cpuidle-haltpoll.c index 3a39a7f48b77..e66df22f9695 100644 --- a/drivers/cpuidle/cpuidle-haltpoll.c +++ b/drivers/cpuidle/cpuidle-haltpoll.c @@ -32,7 +32,7 @@ static int default_enter_idle(struct cpuidle_device *dev, local_irq_enable(); return index; } - default_idle(); + arch_cpu_idle(); return index; } -- cgit v1.2.3 From 7787943a3a8ade6594a68db28c166adbb1d3708c Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 6 Feb 2023 20:33:06 +0100 Subject: cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies Some ARMv4 processors don't support suspend, which leads to a build failure with the tegra and qualcomm cpuidle driver: WARNING: unmet direct dependencies detected for ARM_CPU_SUSPEND Depends on [n]: ARCH_SUSPEND_POSSIBLE [=n] Selected by [y]: - ARM_TEGRA_CPUIDLE [=y] && CPU_IDLE [=y] && (ARM [=y] || ARM64) && (ARCH_TEGRA [=n] || COMPILE_TEST [=y]) && !ARM64 && MMU [=y] arch/arm/kernel/sleep.o: in function `__cpu_suspend': (.text+0x68): undefined reference to `cpu_sa110_suspend_size' (.text+0x68): undefined reference to `cpu_fa526_suspend_size' Add an explicit dependency to make randconfig builds avoid this combination. Fixes: faae6c9f2e68 ("cpuidle: tegra: Enable compile testing") Fixes: a871be6b8eee ("cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver") Link: https://lore.kernel.org/all/20211013160125.772873-1-arnd@kernel.org/ Cc: All applicable Reviewed-by: Dmitry Osipenko Signed-off-by: Arnd Bergmann Acked-by: Thierry Reding Signed-off-by: Rafael J. Wysocki --- drivers/cpuidle/Kconfig.arm | 2 ++ 1 file changed, 2 insertions(+) (limited to 'drivers/cpuidle') diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm index 747aa537389b..f0714a32921e 100644 --- a/drivers/cpuidle/Kconfig.arm +++ b/drivers/cpuidle/Kconfig.arm @@ -102,6 +102,7 @@ config ARM_MVEBU_V7_CPUIDLE config ARM_TEGRA_CPUIDLE bool "CPU Idle Driver for NVIDIA Tegra SoCs" depends on (ARCH_TEGRA || COMPILE_TEST) && !ARM64 && MMU + depends on ARCH_SUSPEND_POSSIBLE select ARCH_NEEDS_CPU_IDLE_COUPLED if SMP select ARM_CPU_SUSPEND help @@ -110,6 +111,7 @@ config ARM_TEGRA_CPUIDLE config ARM_QCOM_SPM_CPUIDLE bool "CPU Idle Driver for Qualcomm Subsystem Power Manager (SPM)" depends on (ARCH_QCOM || COMPILE_TEST) && !ARM64 && MMU + depends on ARCH_SUSPEND_POSSIBLE select ARM_CPU_SUSPEND select CPU_IDLE_MULTIPLE_DRIVERS select DT_IDLE_STATES -- cgit v1.2.3 From e898b07deb3ce801b21113979a05f4b3166c7cb3 Mon Sep 17 00:00:00 2001 From: Thomas Weißschuh Date: Tue, 7 Feb 2023 19:55:19 +0000 Subject: cpuidle: sysfs: make kobj_type structures constant MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definitions to prevent modification at runtime. Signed-off-by: Thomas Weißschuh Signed-off-by: Rafael J. Wysocki --- drivers/cpuidle/sysfs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'drivers/cpuidle') diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c index 2b496a53cbca..48948b171749 100644 --- a/drivers/cpuidle/sysfs.c +++ b/drivers/cpuidle/sysfs.c @@ -200,7 +200,7 @@ static void cpuidle_sysfs_release(struct kobject *kobj) complete(&kdev->kobj_unregister); } -static struct kobj_type ktype_cpuidle = { +static const struct kobj_type ktype_cpuidle = { .sysfs_ops = &cpuidle_sysfs_ops, .release = cpuidle_sysfs_release, }; @@ -447,7 +447,7 @@ static void cpuidle_state_sysfs_release(struct kobject *kobj) complete(&state_obj->kobj_unregister); } -static struct kobj_type ktype_state_cpuidle = { +static const struct kobj_type ktype_state_cpuidle = { .sysfs_ops = &cpuidle_state_sysfs_ops, .default_groups = cpuidle_state_default_groups, .release = cpuidle_state_sysfs_release, @@ -594,7 +594,7 @@ static struct attribute *cpuidle_driver_default_attrs[] = { }; ATTRIBUTE_GROUPS(cpuidle_driver_default); -static struct kobj_type ktype_driver_cpuidle = { +static const struct kobj_type ktype_driver_cpuidle = { .sysfs_ops = &cpuidle_driver_sysfs_ops, .default_groups = cpuidle_driver_default_groups, .release = cpuidle_driver_sysfs_release, -- cgit v1.2.3 From 41204a607679ccca7eabff9f2871b969d6ef2ce3 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 7 Feb 2023 20:59:59 +0100 Subject: cpuidle: driver: Update microsecond values of state parameters as needed If the cpuidle driver provides the target residency and exit latency in nanoseconds, the corresponding values in microseconds need to be set to reflect the provided numbers in order for the sysfs interface to show them correctly, so make __cpuidle_driver_init() do that. Signed-off-by: Rafael J. Wysocki Tested-by: Artem Bityutskiy --- drivers/cpuidle/driver.c | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'drivers/cpuidle') diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c index f70aa17e2a8e..d9cda7f6ccb9 100644 --- a/drivers/cpuidle/driver.c +++ b/drivers/cpuidle/driver.c @@ -183,11 +183,15 @@ static void __cpuidle_driver_init(struct cpuidle_driver *drv) s->target_residency_ns = s->target_residency * NSEC_PER_USEC; else if (s->target_residency_ns < 0) s->target_residency_ns = 0; + else + s->target_residency = div_u64(s->target_residency_ns, NSEC_PER_USEC); if (s->exit_latency > 0) s->exit_latency_ns = s->exit_latency * NSEC_PER_USEC; else if (s->exit_latency_ns < 0) s->exit_latency_ns = 0; + else + s->exit_latency = div_u64(s->exit_latency_ns, NSEC_PER_USEC); } } -- cgit v1.2.3 From f9901f64536c39f699e6ed228c5e64e7e7ce8708 Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Wed, 25 Jan 2023 12:34:18 +0100 Subject: cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT The runtime Power Management of CPU topology is not compatible with PREEMPT_RT: 1. Core cpuidle path disables IRQs. 2. Core cpuidle calls cpuidle-psci. 3. cpuidle-psci in __psci_enter_domain_idle_state() calls pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use spinlocks (which are sleeping on PREEMPT_RT). Deep sleep modes are not a priority of Realtime kernels because the latencies might become unpredictable. On the other hand the PSCI CPU idle power domain is a parent of other devices and power domain controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250). Disable the idle callbacks in cpuidle-psci and mark the domain as always on. This is a trade-off between making PREEMPT_RT working and still having a proper power domain hierarchy in the system. Signed-off-by: Krzysztof Kozlowski Reviewed-by: Ulf Hansson Tested-by: Adrien Thierry Signed-off-by: Rafael J. Wysocki --- drivers/cpuidle/Kconfig.arm | 8 ++++++++ drivers/cpuidle/cpuidle-psci-domain.c | 7 +++++-- drivers/cpuidle/cpuidle-psci.c | 3 +++ 3 files changed, 16 insertions(+), 2 deletions(-) (limited to 'drivers/cpuidle') diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm index f0714a32921e..a1ee475d180d 100644 --- a/drivers/cpuidle/Kconfig.arm +++ b/drivers/cpuidle/Kconfig.arm @@ -24,6 +24,14 @@ config ARM_PSCI_CPUIDLE It provides an idle driver that is capable of detecting and managing idle states through the PSCI firmware interface. + The driver has limitations when used with PREEMPT_RT: + - If the idle states are described with the non-hierarchical layout, + all idle states are still available. + + - If the idle states are described with the hierarchical layout, + only the idle states defined per CPU are available, but not the ones + being shared among a group of CPUs (aka cluster idle states). + config ARM_PSCI_CPUIDLE_DOMAIN bool "PSCI CPU idle Domain" depends on ARM_PSCI_CPUIDLE diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c index c80cf9ddabd8..6ad2954948a5 100644 --- a/drivers/cpuidle/cpuidle-psci-domain.c +++ b/drivers/cpuidle/cpuidle-psci-domain.c @@ -64,8 +64,11 @@ static int psci_pd_init(struct device_node *np, bool use_osi) pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN; - /* Allow power off when OSI has been successfully enabled. */ - if (use_osi) + /* + * Allow power off when OSI has been successfully enabled. + * PREEMPT_RT is not yet ready to enter domain idle states. + */ + if (use_osi && !IS_ENABLED(CONFIG_PREEMPT_RT)) pd->power_off = psci_pd_power_off; else pd->flags |= GENPD_FLAG_ALWAYS_ON; diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c index 57bc3e3ae391..68467a325dcb 100644 --- a/drivers/cpuidle/cpuidle-psci.c +++ b/drivers/cpuidle/cpuidle-psci.c @@ -231,6 +231,9 @@ static int psci_dt_cpu_init_topology(struct cpuidle_driver *drv, if (!psci_has_osi_support()) return 0; + if (IS_ENABLED(CONFIG_PREEMPT_RT)) + return 0; + data->dev = psci_dt_attach_cpu(cpu); if (IS_ERR_OR_NULL(data->dev)) return PTR_ERR_OR_ZERO(data->dev); -- cgit v1.2.3