From 5b59c69ec54849f23b51d18b0a609c4f793bc35a Mon Sep 17 00:00:00 2001
From: Tony Camuso
Date: Fri, 25 Apr 2014 14:19:29 -0400
Subject: ACPI / PAD: call schedule() when need_resched() is true

The purpose of the acpi_pad driver is to implement the "processor power
aggregator" device as described in the ACPI 4.0 spec section 8.5. It
takes requests from the BIOS (via ACPI) to put a specified number of
CPUs into idle, in order to save power, until further notice.

It does this by creating high-priority threads that try to keep the
CPUs in a high C-state (using the monitor/mwait CPU instructions). The
mwait() call is in a loop that checks periodically if the thread should
end and a few other things.

It was discovered through testing that the power_saving threads were
causing the system to consume more power than the system was consuming
before the threads were created. A counter in the main loop of
power_saving_thread() revealed that it was spinning. The mwait()
instruction was keeping the CPU in a high C-state rarely, if at all.

Here is a simplification of the loop in function power_saving_thread()
in drivers/acpi/acpi_pad.c:

    while (!kthread_should_stop()) {
         :
         try_to_freeze()
         :
         while (!need_resched()) {
              :
              if (!need_resched())
                   __mwait(power_saving_mwait_eax, 1);
              :
              if (jiffies > expire_time) {
                   do_sleep = 1;
                   break;
              }
         }
    }

If need_resched() returns true, then mwait() is not called. It was
returning true because of things like timer interrupts, as in the
following sequence.

hrtimer_interrupt->__run_hrtimer->tick_sched_timer->
 update_process_times->
  rcu_check_callbacks->rcu_pending->__rcu_pending->set_need_resched

Kernels 3.5.0-rc2+ do not exhibit this problem, because a patch to
try_to_freeze() in include/linux/freezer.h introduces a call to
might_sleep(), which ultimately calls schedule() to clear the
reschedule flag and allows the loop to execute the call to mwait().

However, the changes to try_to_freeze() are unrelated to acpi_pad, and
it does not seem like a good idea to rely on an unrelated patch in a
function that could later be changed and reintroduce this bug.

Therefore, it seems better to make an explicit call to schedule() in
the outer loop when the need_resched flag is set.

Reported-and-tested-by: Stuart Hayes
Signed-off-by: Tony Camuso
Signed-off-by: Rafael J. Wysocki
---
 drivers/acpi/acpi_pad.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c
index 37d73024b82e..e20708f2b8e5 100644
--- a/drivers/acpi/acpi_pad.c
+++ b/drivers/acpi/acpi_pad.c
@@ -215,8 +215,15 @@ static int power_saving_thread(void *data)
 		 * borrow CPU time from this CPU and cause RT task use > 95%
 		 * CPU time. To make 'avoid starvation' work, takes a nap here.
 		 */
-		if (do_sleep)
+		if (unlikely(do_sleep))
 			schedule_timeout_killable(HZ * idle_pct / 100);
+
+		/* If an external event has set the need_resched flag, then
+		 * we need to deal with it, or this loop will continue to
+		 * spin without calling __mwait().
+		 */
+		if (unlikely(need_resched()))
+			schedule();
 	}
 
 	exit_round_robin(tsk_index);
-- 
cgit v1.2.3

From 247dba58a19a34f01c363b3aec4d2c21cfb87d8e Mon Sep 17 00:00:00 2001
From: Baoquan He
Date: Mon, 5 May 2014 12:48:25 +0800
Subject: ACPI / ia64: introduce variable acpi_lapic into ia64

This variable is already defined and assigned on x86, where it
indicates whether a LAPIC exists in the MADT. Introduce it on ia64 so
that the correct decision can be made later when getting information
for the ACPI processor.

Signed-off-by: Baoquan He
Signed-off-by: Rafael J. Wysocki
---
 arch/ia64/include/asm/acpi.h | 1 +
 arch/ia64/kernel/acpi.c      | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/ia64/include/asm/acpi.h b/arch/ia64/include/asm/acpi.h
index d651102a4d45..b47821931ca6 100644
--- a/arch/ia64/include/asm/acpi.h
+++ b/arch/ia64/include/asm/acpi.h
@@ -85,6 +85,7 @@ ia64_acpi_release_global_lock (unsigned int *lock)
 	((Acq) = ia64_acpi_release_global_lock(&facs->global_lock))
 
 #ifdef CONFIG_ACPI
+extern int acpi_lapic;
 #define acpi_disabled 0 /* ACPI always enabled on IA64 */
 #define acpi_noirq 0 /* ACPI always enabled on IA64 */
 #define acpi_pci_disabled 0 /* ACPI PCI always enabled on IA64 */
diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index 0d407b300762..615ef81def49 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -56,6 +56,7 @@
 
 #define PREFIX "ACPI: "
 
+int acpi_lapic;
 unsigned int acpi_cpei_override;
 unsigned int acpi_cpei_phys_cpuid;
 
@@ -676,6 +677,8 @@ int __init early_acpi_boot_init(void)
 	if (ret < 1)
 		printk(KERN_ERR PREFIX
 		       "Error parsing MADT - no LAPIC entries\n");
+	else
+		acpi_lapic = 1;
 
 #ifdef CONFIG_SMP
 	if (available_cpus == 0) {
-- 
cgit v1.2.3

From c401eb8ee374a5fc2b56042c0072ce51a0beb0dc Mon Sep 17 00:00:00 2001
From: Baoquan He
Date: Mon, 5 May 2014 12:48:26 +0800
Subject: ACPI / processor: Check if LAPIC is present during initialization

acpi_processor_get_info() initializes the ACPI processor information,
including the ID (that is, the CPU index). Currently, on a UP system
running an SMP kernel with no LAPIC in the MADT, cpu0_initialized is
checked to decide whether or not the CPU has been initialized.

However, this check may not be sufficient for kdump kernels. Most of
the time only 1 CPU is supported because of known problems in kdump
kernels. Say multiple CPUs are present in the boot kernel and a crash
happens on one specific CPU, for example CPU2. Then it jumps into the
kdump kernel with "nr_cpus=1" in the command line. In this situation,
the kdump kernel will reuse the ACPI resources from the crashed kernel
directly. That means all LAPIC instances are enabled in the MADT while
only one CPU is in use. In the kdump kernel, x86_cpu_to_apicid contains
the correct APIC ID and it is related to the CPU ID. If only
cpu0_initialized is checked, 0 will be used as the CPU index instead of
the index that matches that APIC ID, which is not correct.

In addition to checking cpu0_initialized, check acpi_lapic. If
acpi_lapic is 0, then no LAPIC is available from the MADT and the
system should be treated as a UP one without a LAPIC (that is, assign 0
to the CPU index). Otherwise, use the original (valid) CPU index.

Signed-off-by: Baoquan He
[rjw: Subject and changelog]
Signed-off-by: Rafael J. Wysocki
---
 drivers/acpi/acpi_processor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 52c81c49cc7d..1c085742644f 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -268,7 +268,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	pr->apic_id = apic_id;
 
 	cpu_index = acpi_map_cpuid(pr->apic_id, pr->acpi_id);
-	if (!cpu0_initialized) {
+	if (!cpu0_initialized && !acpi_lapic) {
 		cpu0_initialized = 1;
 		/* Handle UP system running SMP kernel, with no LAPIC in MADT */
 		if ((cpu_index == -1) && (num_online_cpus() == 1))
-- 
cgit v1.2.3

From 8da8373447d6a57a5a9f55233d35beb15d92d0d2 Mon Sep 17 00:00:00 2001
From: Toshi Kani
Date: Thu, 8 May 2014 07:58:59 -0600
Subject: ACPI / processor: Fix STARTING/DYING action in
 acpi_cpu_soft_notify()

During CPU online/offline testing on a large system, one of the
processors got stuck after the message "bad: scheduling from the idle
thread!".

The problem is that acpi_cpu_soft_notify() calls acpi_bus_get_device()
for all action types. CPU_STARTING and CPU_DYING do not allow the
notify handlers to sleep. However, acpi_bus_get_device() can sleep in
acpi_ut_acquire_mutex().

Change acpi_cpu_soft_notify() to return immediately for CPU_STARTING
and CPU_DYING as they have no action in this handler.

Signed-off-by: Toshi Kani
Signed-off-by: Rafael J. Wysocki
---
 drivers/acpi/processor_driver.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 7f70f3182d50..4fcbd670415c 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -121,6 +121,13 @@ static int acpi_cpu_soft_notify(struct notifier_block *nfb,
 	struct acpi_processor *pr = per_cpu(processors, cpu);
 	struct acpi_device *device;
 
+	/*
+	 * CPU_STARTING and CPU_DYING must not sleep. Return here since
+	 * acpi_bus_get_device() may sleep.
+	 */
+	if (action == CPU_STARTING || action == CPU_DYING)
+		return NOTIFY_DONE;
+
 	if (!pr || acpi_bus_get_device(pr->handle, &device))
 		return NOTIFY_DONE;
 
-- 
cgit v1.2.3

From 4ff248f3bf830ba22c988abb099e1836fbd3b1d0 Mon Sep 17 00:00:00 2001
From: Manuel Schölling
Date: Thu, 22 May 2014 22:56:36 +0200
Subject: ACPI / PAD: Use time_before() for time comparison
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

To be future-proof and for better readability, the time comparisons are
modified to use time_before() instead of plain, error-prone math.

Signed-off-by: Manuel Schölling
Signed-off-by: Rafael J. Wysocki
---
 drivers/acpi/acpi_pad.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c
index e20708f2b8e5..f148a0580e04 100644
--- a/drivers/acpi/acpi_pad.c
+++ b/drivers/acpi/acpi_pad.c
@@ -156,12 +156,13 @@ static int power_saving_thread(void *data)
 
 	while (!kthread_should_stop()) {
 		int cpu;
-		u64 expire_time;
+		unsigned long expire_time;
 
 		try_to_freeze();
 
 		/* round robin to cpus */
-		if (last_jiffies + round_robin_time * HZ < jiffies) {
+		expire_time = last_jiffies + round_robin_time * HZ;
+		if (time_before(expire_time, jiffies)) {
 			last_jiffies = jiffies;
 			round_robin_cpu(tsk_index);
 		}
@@ -200,7 +201,7 @@ static int power_saving_thread(void *data)
 					CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
 			local_irq_enable();
 
-			if (jiffies > expire_time) {
+			if (time_before(expire_time, jiffies)) {
 				do_sleep = 1;
 				break;
 			}
-- 
cgit v1.2.3
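
Note on the last patch: the reason time_before() is less error-prone
than a plain "<" or ">" on jiffies is counter wraparound. The following
is a minimal user-space sketch of that idea, not the kernel's code. The
names sketch_time_before(), naive_before() and sketch_expired() are
hypothetical and exist only for illustration; the comparison assumes
the usual two's-complement conversion from unsigned long to long.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a wrap-safe "a is before b" test.
 * The unsigned subtraction wraps modulo 2^N, so viewing the result
 * as signed yields a negative value whenever a is less than half the
 * counter range behind b, even if the counter wrapped in between.
 */
static inline bool sketch_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* The plain comparison the patch replaces; wrong across a wrap. */
static inline bool naive_before(unsigned long a, unsigned long b)
{
	return a < b;
}

/* Mirrors the patched check: has expire_time already passed at 'now'? */
static inline bool sketch_expired(unsigned long expire_time,
				  unsigned long now)
{
	return sketch_time_before(expire_time, now);
}
```

With expire_time just below ULONG_MAX and the current tick count just
past the wrap, sketch_expired() reports the deadline as passed, while
the naive comparison would claim it is still far in the future.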