path: root/drivers/thermal
Age | Commit message | Author | Files | Lines
2024-01-19 | thermal: loongson2: Replace of_device.h with explicit includes | Rob Herring | 1 | -1/+2
The DT of_device.h and of_platform.h date back to the separate of_platform_bus_type before it was merged into the regular platform bus. As part of that merge prepping Arm DT support 13 years ago, they "temporarily" include each other. They also include platform_device.h and of.h. of_device.h isn't needed, but mod_devicetable.h and property.h were implicitly included. Signed-off-by: Rob Herring <robh@kernel.org>
2024-01-18 | Merge tag 'thermal-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm | Linus Torvalds | 12 | -126/+1018
Pull more thermal control updates from Rafael Wysocki:

"These add support for debugfs-based diagnostics to the thermal core, simplify the thermal netlink API, fix system-wide PM support in the Intel HFI driver and clean up some code.

Specifics:

 - Add debugfs-based diagnostics support to the thermal core (Daniel Lezcano, Dan Carpenter)
 - Fix a power allocator thermal governor issue preventing it from resetting cooling devices sometimes (Di Shen)
 - Simplify the thermal netlink API and clean up related code (Rafael J. Wysocki)
 - Make the Intel HFI driver support hibernation and deep suspend properly (Ricardo Neri)"

* tag 'thermal-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up()
  thermal: intel: hfi: Add syscore callbacks for system-wide PM
  thermal: gov_power_allocator: avoid inability to reset a cdev
  thermal: helpers: Rearrange thermal_cdev_set_cur_state()
  thermal: netlink: Rework notify API for cooling devices
  thermal: core: Use kstrdup_const() during cooling device registration
  thermal/debugfs: Add thermal debugfs information for mitigation episodes
  thermal/debugfs: Add thermal cooling device debugfs information
  thermal: netlink: Pass thermal zone pointer to notify routines
  thermal: netlink: Drop thermal_notify_tz_trip_add/delete()
  thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()
  thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()
2024-01-16 | Merge branches 'thermal-core' and 'thermal-intel' | Rafael J. Wysocki | 12 | -126/+1018
Merge additional updates for 6.8-rc1 in the thermal core and in the Intel HFI thermal driver:

 - Add debugfs-based diagnostics support to the thermal core (Daniel Lezcano, Dan Carpenter).
 - Fix a power allocator thermal governor issue preventing it from resetting cooling devices sometimes (Di Shen).
 - Simplify the thermal netlink API and clean up related code (Rafael J. Wysocki).
 - Make the Intel HFI driver support hibernation and deep suspend properly (Ricardo Neri).

* thermal-core:
  thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up()
  thermal: gov_power_allocator: avoid inability to reset a cdev
  thermal: helpers: Rearrange thermal_cdev_set_cur_state()
  thermal: netlink: Rework notify API for cooling devices
  thermal: core: Use kstrdup_const() during cooling device registration
  thermal/debugfs: Add thermal debugfs information for mitigation episodes
  thermal/debugfs: Add thermal cooling device debugfs information
  thermal: netlink: Pass thermal zone pointer to notify routines
  thermal: netlink: Drop thermal_notify_tz_trip_add/delete()
  thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()
  thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()

* thermal-intel:
  thermal: intel: hfi: Add syscore callbacks for system-wide PM
2024-01-12 | thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up() | Dan Carpenter | 1 | -1/+2
Add a missing mutex_unlock(&thermal_dbg->lock) to this error path. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
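The bug class fixed here, an early return that skips the unlock, can be illustrated in user space. The following is a minimal sketch using a pthread mutex rather than the kernel mutex API; `struct dbg_state`, `record_event()` and the `simulate_alloc_failure` flag are hypothetical names invented for the illustration and are not taken from the thermal debugfs code.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for debug bookkeeping; not the kernel's structure. */
struct dbg_state {
	pthread_mutex_t lock;
	int nr_records;
};

/*
 * Record one event. Every exit taken after the lock is acquired goes
 * through the single "unlock" label, so a failed allocation cannot
 * leave the mutex held.
 */
int record_event(struct dbg_state *st, int simulate_alloc_failure)
{
	int ret = 0;
	int *rec;

	pthread_mutex_lock(&st->lock);

	rec = simulate_alloc_failure ? NULL : malloc(sizeof(*rec));
	if (!rec) {
		ret = -ENOMEM;
		goto unlock;	/* a bare "return ret;" here would leak the lock */
	}

	st->nr_records++;
	free(rec);

unlock:
	pthread_mutex_unlock(&st->lock);
	return ret;
}
```

After a failing call, the lock can still be taken by the next caller, which is the property the one-line fix restores.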
2024-01-12 | thermal: intel: hfi: Add syscore callbacks for system-wide PM | Ricardo Neri | 1 | -0/+28
The kernel allocates a memory buffer and provides its location to the hardware, which uses it to update the HFI table. This allocation occurs during boot and remains constant throughout runtime. When resuming from hibernation, the restore kernel allocates a second memory buffer and reprograms the HFI hardware with the new location as part of a normal boot. The location of the second memory buffer may differ from the one allocated by the image kernel. When the restore kernel transfers control to the image kernel, its HFI buffer becomes invalid, potentially leading to memory corruption if the hardware writes to it (the hardware continues to use the buffer from the restore kernel). It is also possible that the hardware "forgets" the address of the memory buffer when resuming from "deep" suspend. Memory corruption may also occur in such a scenario. To prevent the described memory corruption, disable HFI when preparing to suspend or hibernate. Enable it when resuming. Add syscore callbacks to handle the package of the boot CPU (packages of non-boot CPUs are handled via CPU offline). Syscore ops always run on the boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend and hibernation. Syscore ops only run in these cases. Cc: 6.1+ <stable@vger.kernel.org> # 6.1+ Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> [ rjw: Comment adjustment, subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 | thermal: gov_power_allocator: avoid inability to reset a cdev | Di Shen | 1 | -1/+1
Commit 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low") adds an update flag to avoid triggering a thermal event when there is no need, and the thermal cdev is updated once when the temperature is low.

But when the trips are writable, and switch_on_temp is set to a higher value, the cooling device state may not be reset to 0, because last_temperature is smaller than switch_on_temp.

For example:
  First:
    switch_on_temp=70 control_temp=85
  Then userspace changes the trip_temp:
    switch_on_temp=45 control_temp=55 cur_temp=54
  Then userspace resets the trip_temp:
    switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54

At this time, the cooling device state should be reset to 0. However, because

  cur_temp(57) < switch_on_temp(70)
  last_temp(54) < switch_on_temp(70) ----> update = false

update is false and the cooling device state cannot be reset.

Using the observation that tz->passive can also be regarded as the temperature status, set the update flag to the tz->passive value. When the temperature drops below switch_on for the first time, the states of cooling devices can be reset once, and tz->passive is updated to 0. In the next round, because tz->passive is 0, cdev->state will not be updated.

By using the tz->passive value as the "update" flag, the issue above can be solved, and the cooling devices can be updated only once when the temperature is low.

Fixes: 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low")
Cc: 5.13+ <stable@vger.kernel.org> # 5.13+
Suggested-by: Wei Wang <wvw@google.com>
Signed-off-by: Di Shen <di.shen@unisoc.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
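The difference between the two "update" rules can be replayed with the numbers from the example. This is a user-space sketch only: `struct zone_sketch` and the helper names are hypothetical, and the real governor does considerably more work around this decision.

```c
#include <stdbool.h>

/* Hypothetical, pared-down view of a thermal zone (not the kernel struct). */
struct zone_sketch {
	int temperature;	/* current reading */
	int last_temperature;	/* previous reading */
	int passive;		/* non-zero while mitigation is active */
};

/* Old rule: reset only if the previous reading was at or above switch_on. */
bool update_old(const struct zone_sketch *tz, int switch_on_temp)
{
	return tz->last_temperature >= switch_on_temp;
}

/* New rule: reset whenever mitigation was still marked active. */
bool update_new(const struct zone_sketch *tz)
{
	return tz->passive != 0;
}

/*
 * Handle a reading below switch_on. Returns whether the cooling devices
 * get reset, and clears tz->passive so the reset happens only once.
 */
bool below_switch_on(struct zone_sketch *tz, int switch_on_temp, bool use_passive)
{
	bool update = use_passive ? update_new(tz) : update_old(tz, switch_on_temp);

	tz->passive = 0;
	return update;
}
```

With cur_temp=57, last_temp=54, switch_on_temp=70 and mitigation still active, the old rule yields false and the new rule yields true; a second pass with the new rule yields false because `passive` has been cleared.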
2024-01-12 | thermal: helpers: Rearrange thermal_cdev_set_cur_state() | Rafael J. Wysocki | 1 | -6/+7
Change the code layout in thermal_cdev_set_cur_state() so it returns early on errors which is more consistent with what happens elsewhere. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12 | thermal: netlink: Rework notify API for cooling devices | Rafael J. Wysocki | 3 | -15/+18
In analogy with some previous thermal netlink API changes, redefine thermal_notify_cdev_state_update(), thermal_notify_cdev_add() and thermal_notify_cdev_delete() to take a const cdev pointer as their first argument and let them extract the requisite information from there by themselves. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12 | thermal: core: Use kstrdup_const() during cooling device registration | Christophe JAILLET | 1 | -3/+3
Some *thermal_cooling_device_register() calls pass a string literal as the 'type' parameter, so kstrdup_const() can be used instead of kstrdup() to avoid a memory allocation in such cases. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 | thermal/debugfs: Add thermal debugfs information for mitigation episodes | Daniel Lezcano | 3 | -4/+417
The mitigation episodes are recorded. A mitigation episode happens when the first trip point is crossed on the way up and then on the way down. During this episode other trip points can also be crossed, and they are accounted to this mitigation episode. The interesting information is the average temperature at the trip point, the undershoot and the overshoot. The standard deviation of the mitigated temperature will be added later.

The thermal debugfs directory structure tries to stay consistent with the sysfs one but in a very simplified way:

thermal/
 `-- thermal_zones
     |-- 0
     |   `-- mitigations
     `-- 1
         `-- mitigations

The content of the mitigations file has the following format:

,-Mitigation at 349988258us, duration=130136ms
| trip |     type | temp(°mC) | hyst(°mC) |  duration  |  avg(°mC) |  min(°mC) |  max(°mC) |
|    0 |  passive |     65000 |      2000 |     130136 |     68227 |     62500 |     75625 |
|    1 |  passive |     75000 |      2000 |     104209 |     74857 |     71666 |     77500 |
,-Mitigation at 272451637us, duration=75000ms
| trip |     type | temp(°mC) | hyst(°mC) |  duration  |  avg(°mC) |  min(°mC) |  max(°mC) |
|    0 |  passive |     65000 |      2000 |      75000 |     68561 |     62500 |     75000 |
|    1 |  passive |     75000 |      2000 |      60714 |     74820 |     70555 |     77500 |
,-Mitigation at 238184119us, duration=27316ms
| trip |     type | temp(°mC) | hyst(°mC) |  duration  |  avg(°mC) |  min(°mC) |  max(°mC) |
|    0 |  passive |     65000 |      2000 |      27316 |     73377 |     62500 |     75000 |
|    1 |  passive |     75000 |      2000 |      19468 |     75284 |     69444 |     77500 |
,-Mitigation at 39863713us, duration=136196ms
| trip |     type | temp(°mC) | hyst(°mC) |  duration  |  avg(°mC) |  min(°mC) |  max(°mC) |
|    0 |  passive |     65000 |      2000 |     136196 |     73922 |     62500 |     75000 |
|    1 |  passive |     75000 |      2000 |      91721 |     74386 |     69444 |     78125 |

More information for a better understanding of the thermal behavior will be added later. The idea is to give detailed statistics about the undershoots and overshoots, the temperature speed, etc.

As all the information in a single file is too much, the idea would be to create a directory named with the mitigation timestamp where all data could be added.

Please note this code is immune to trip ordering but not to a trip temperature change while a mitigation is happening. However, this situation should be extremely rare, perhaps never happening, and we might ask ourselves whether something should be done in the core framework for other components first.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
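The avg/min/max columns in the mitigations file can be maintained incrementally, one temperature sample at a time, without storing the samples. The sketch below shows one way to do that with an integer running average; `struct mitigation_stats` and both functions are hypothetical names, and whether the debugfs code uses exactly this update rule is an assumption.

```c
#include <limits.h>

/* Hypothetical per-trip statistics for one mitigation episode. */
struct mitigation_stats {
	int min;		/* lowest sample seen, in millidegrees Celsius */
	int max;		/* highest sample seen */
	int avg;		/* running integer average */
	unsigned int count;	/* number of samples folded in so far */
};

void stats_init(struct mitigation_stats *s)
{
	s->min = INT_MAX;
	s->max = INT_MIN;
	s->avg = 0;
	s->count = 0;
}

/* Fold one temperature sample into the statistics. */
void stats_update(struct mitigation_stats *s, int temp)
{
	if (temp < s->min)
		s->min = temp;
	if (temp > s->max)
		s->max = temp;

	/* Incremental mean: avg += (x - avg) / n, with integer truncation. */
	s->count++;
	s->avg += (temp - s->avg) / (int)s->count;
}
```

Feeding the samples 60000, 70000 and 80000 gives min 60000, max 80000 and avg 70000. With samples that do not divide evenly, integer truncation makes the result drift slightly from the exact mean.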
2024-01-12 | thermal/debugfs: Add thermal cooling device debugfs information | Daniel Lezcano | 7 | -6/+490
The thermal framework does not have any debug information except a sysfs stat, which is a bit controversial. That one allocates big chunks of memory for every cooling device with a high number of states and, on some production systems, could take several megabytes of memory for just a portion of it. As sysfs is limited to a page size, the output is unusable with large data arrays and gets truncated.

The patch provides the same information as sysfs, except that the transitions are dynamically allocated, so they won't show more events than the ones which actually occurred. There is no longer a size limitation, and it opens the way for more debugging information, which is what debugfs, not sysfs, is designed for.

The thermal debugfs directory structure tries to stay consistent with the sysfs one but in a very simplified way:

thermal/
 `-- cooling_devices
     |-- 0
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     |-- 1
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     |-- 2
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     |-- 3
     |   |-- clear
     |   |-- time_in_state_ms
     |   |-- total_trans
     |   `-- trans_table
     `-- 4
         |-- clear
         |-- time_in_state_ms
         |-- total_trans
         `-- trans_table

The content of the files in the cooling devices directory is the same as the sysfs one except for the trans_table, which has the following format:

Transition      Hits
1->0            246
0->1            246
2->1            632
1->2            632
3->2            98
2->3            98

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
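The key design point is that a transition entry exists only once that transition has actually happened, instead of pre-allocating a full states-by-states matrix. A minimal user-space sketch of that idea follows; the linked list, `struct transition` and the function names are all hypothetical and are not the data structures used by the patch.

```c
#include <stdlib.h>

/* One observed "from -> to" state change and how often it occurred. */
struct transition {
	int from;
	int to;
	unsigned long hits;
	struct transition *next;
};

/* Count a transition, allocating its entry on first occurrence only.
 * Returns 0 on success, -1 if the allocation fails. */
int record_transition(struct transition **head, int from, int to)
{
	struct transition *t;

	for (t = *head; t; t = t->next) {
		if (t->from == from && t->to == to) {
			t->hits++;
			return 0;
		}
	}

	t = malloc(sizeof(*t));
	if (!t)
		return -1;

	t->from = from;
	t->to = to;
	t->hits = 1;
	t->next = *head;
	*head = t;
	return 0;
}

/* Number of hits for a transition, or 0 if it never occurred. */
unsigned long transition_hits(const struct transition *head, int from, int to)
{
	for (; head; head = head->next)
		if (head->from == from && head->to == to)
			return head->hits;
	return 0;
}

/* Number of distinct transitions recorded so far. */
size_t transition_count(const struct transition *head)
{
	size_t n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}

/* Release all entries, in the spirit of the "clear" file. */
void free_transitions(struct transition *head)
{
	while (head) {
		struct transition *next = head->next;

		free(head);
		head = next;
	}
}
```

Memory use grows with the number of distinct transitions observed rather than with the square of the number of states, which is the trade-off the commit message describes.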
2024-01-10 | Merge tag 'thermal-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm | Linus Torvalds | 17 | -679/+876
Pull thermal control updates from Rafael Wysocki:

"These add support for the D1/T113s THS controller to the sun8i driver and a DT-based mechanism for platforms to indicate a preference to reboot (instead of shutting down) on crossing a critical trip point, fix issues, make other improvements (in the IPA governor, the Intel HFI driver, the exynos driver and the thermal netlink interface among other places) and clean up code.

One long-standing issue addressed here is that trip point crossing notifications sent to user space might be unreliable due to the incorrect handling of trip point hysteresis in the thermal core: multiple notifications might be sent for the same event or there might be events without any notification at all.

Specifics:

 - Add dynamic thresholds for trip point crossing detection to prevent trip point crossing notifications from being sent at incorrect times or not at all in some cases (Rafael J. Wysocki)
 - Fix synchronization issues related to the resume of thermal zones during a system-wide resume and allow thermal zones to be resumed concurrently (Rafael J. Wysocki)
 - Modify the thermal zone unregistration to wait for the given zone to go away completely before returning to the caller and rework the sysfs interface for trip points on top of that (Rafael J. Wysocki)
 - Fix a possible NULL pointer dereference in thermal zone registration error path (Rafael J. Wysocki)
 - Clean up the IPA thermal governor and modify it (with the help of a new governor callback) to avoid allocating and freeing memory every time its throttling callback is invoked (Lukasz Luba)
 - Make the IPA thermal governor handle thermal instance weight changes via sysfs correctly (Lukasz Luba)
 - Update the thermal netlink code to avoid sending messages if there are no recipients (Stanislaw Gruszka)
 - Convert Mediatek Thermal to the json-schema (Rafał Miłecki)
 - Fix thermal DT bindings issue on Loongson (Binbin Zhou)
 - Fix returning NULL instead of -ENODEV during thermal probe on Loongson (Binbin Zhou)
 - Add thermal DT binding for tsens on the SM8650 platform (Neil Armstrong)
 - Add reboot on the critical trip point crossing option feature (Fabio Estevam)
 - Use DEFINE_SIMPLE_DEV_PM_OPS to define PM functions for thermal suspend/resume on AmLogic (Uwe Kleine-König)
 - Add D1/T113s THS controller support to the Sun8i thermal control driver (Maxim Kiselev)
 - Fix example in the thermal DT binding for QCom SPMI (Johan Hovold)
 - Fix compilation warning in the tmon utility (Florian Eckert)
 - Add support for interrupt-based thermal configuration on Exynos along with a set of related cleanups (Mateusz Majewski)
 - Make the Intel HFI thermal driver enable an HFI instance (e.g. a processor package) from its first online CPU and disable it when the last CPU in it goes offline (Ricardo Neri)
 - Fix a kernel-doc warning and a spello in the cpuidle_cooling thermal driver (Randy Dunlap)
 - Move the .get_temp() thermal zone callback presence check to the thermal zone registration code (Daniel Lezcano)
 - Use the for_each_trip() macro for trip points table walks in a few places in the thermal core (Rafael J. Wysocki)
 - Make all trip point updates (via sysfs as well as from the platform firmware) trigger trip change notifications (Rafael J. Wysocki)
 - Drop redundant code from the thermal core and make one function in it take a const pointer argument (Rafael J. Wysocki)"

* tag 'thermal-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
  thermal: trip: Constify thermal zone argument of thermal_zone_trip_id()
  thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
  thermal: intel: hfi: Enable an HFI instance from its first online CPU
  thermal: intel: hfi: Refactor enabling code into helper functions
  thermal/drivers/exynos: Use set_trips ops
  thermal/drivers/exynos: Use BIT wherever possible
  thermal/drivers/exynos: Split initialization of TMU and the thermal zone
  thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210
  thermal/drivers/exynos: Simplify regulator (de)initialization
  thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly
  thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts
  thermal/drivers/exynos: Drop id field
  thermal/drivers/exynos: Remove an unnecessary field description
  tools/thermal/tmon: Fix compilation warning for wrong format
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names
  thermal/drivers/sun8i: Add D1/T113s THS controller support
  dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller
  thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
  thermal: amlogic: Make amlogic_thermal_disable() return void
  ...
2024-01-09 | thermal: netlink: Pass thermal zone pointer to notify routines | Rafael J. Wysocki | 3 | -28/+28
There are several routines in the thermal netlink API that take a thermal zone ID or a thermal zone type as their arguments, but from their callers' perspective it would be more convenient to pass a thermal zone pointer to them and let them extract the necessary data from the given thermal zone object by themselves. Modify the code accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09 | thermal: netlink: Drop thermal_notify_tz_trip_add/delete() | Rafael J. Wysocki | 2 | -46/+1
Because thermal_notify_tz_trip_add/delete() are never used, drop them entirely along with the related code. The addition or removal of trip points is not supported by the thermal core and is unlikely to be supported in the future, so it is also unlikely that these functions will ever be needed. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09 | thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down() | Rafael J. Wysocki | 3 | -14/+20
Instead of requiring the callers of thermal_notify_tz_trip_up/down() to provide specific values needed to populate struct param in them, make them extract those values from objects passed by the callers via const pointers. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09 | thermal: netlink: Pass pointers to thermal_notify_tz_trip_change() | Rafael J. Wysocki | 3 | -16/+17
Instead of requiring the caller of thermal_notify_tz_trip_change() to provide specific values needed to populate struct param in it, make it extract those values from objects passed to it by the caller via const pointers. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
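The shape of this API change can be shown with a small before/after sketch. All types and function names below are hypothetical simplifications; in particular, storing the trip index inside the trip object is an assumption made to keep the example short, and the real notify routines build a netlink message rather than returning a struct.

```c
/* Hypothetical, pared-down objects; the kernel structs carry far more state. */
struct tz_sketch {
	int id;
};

struct trip_point_sketch {
	int index;
	int temperature;
	int hysteresis;
};

/* The values that end up in the notification. */
struct notify_param {
	int tz_id;
	int trip_id;
	int trip_temp;
	int trip_hyst;
};

/* Old shape: every caller unpacks the objects and passes raw values. */
struct notify_param notify_trip_change_old(int tz_id, int trip_id,
					   int temp, int hyst)
{
	struct notify_param p = { tz_id, trip_id, temp, hyst };

	return p;
}

/* New shape: callers pass const pointers; extraction lives in one place. */
struct notify_param notify_trip_change_new(const struct tz_sketch *tz,
					   const struct trip_point_sketch *trip)
{
	struct notify_param p = {
		tz->id, trip->index, trip->temperature, trip->hysteresis
	};

	return p;
}
```

Both variants produce the same parameters, which matches the "no intentional functional impact" note; the gain is that call sites shrink and cannot pass mismatched values.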
2024-01-05 | Merge branch 'thermal-intel' | Rafael J. Wysocki | 1 | -26/+65
Merge changes in thermal control drivers for Intel platforms for 6.8-rc1:

 - Make the Intel HFI thermal driver enable an HFI instance (e.g. a processor package) from its first online CPU and disable it when the last CPU in it goes offline (Ricardo Neri).

* thermal-intel:
  thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
  thermal: intel: hfi: Enable an HFI instance from its first online CPU
  thermal: intel: hfi: Refactor enabling code into helper functions
2024-01-04 | thermal: trip: Constify thermal zone argument of thermal_zone_trip_id() | Rafael J. Wysocki | 2 | -2/+2
Because thermal_zone_trip_id() does not update the thermal zone object passed to it, its pointer argument representing the thermal zone can be const, so adjust its definition accordingly. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-01-03 | thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline | Ricardo Neri | 1 | -0/+35
In preparation to support hibernation, add functionality to disable an HFI instance during CPU offline. The last CPU of an instance that goes offline will disable that instance. The Intel Software Development Manual states that the operating system must wait for the hardware to set MSR_IA32_PACKAGE_THERM_STATUS[26] after disabling an HFI instance to ensure that it will no longer write to the HFI memory. Some processors, however, never set that bit. Wait a minimum of 2ms to give the hardware time to complete any pending memory writes. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-03 | thermal: intel: hfi: Enable an HFI instance from its first online CPU | Ricardo Neri | 1 | -7/+10
Previously, HFI instances were never disabled once enabled. A CPU in an instance only had to check during boot whether another CPU had previously initialized the instance and its corresponding data structure. A subsequent changeset will add functionality to disable instances to support hibernation. Such a change will also make it possible to disable an HFI instance during runtime via CPU hotplug. Enable an HFI instance from the first of its CPUs that comes online. This covers the boot, CPU hotplug, and resume-from-suspend cases. It also covers systems with one or more HFI instances (i.e., packages). Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
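The "first CPU online enables, last CPU offline disables" rule described by this commit and by the disable commit listed just above it amounts to reference counting per instance. The sketch below uses a plain counter for that; `struct hfi_instance_sketch` and both functions are hypothetical, and tracking CPUs with a counter rather than a CPU mask is an assumption made for brevity.

```c
#include <stdbool.h>

/* Hypothetical per-package state; not the driver's real structure. */
struct hfi_instance_sketch {
	unsigned int online_cpus;	/* CPUs of this instance currently online */
	bool enabled;			/* whether the hardware is enabled */
};

/* Called when a CPU of the instance comes online.
 * Returns true if this call is the one that enabled the instance. */
bool instance_cpu_online(struct hfi_instance_sketch *inst)
{
	inst->online_cpus++;
	if (inst->online_cpus == 1) {
		inst->enabled = true;	/* first CPU: program and enable */
		return true;
	}
	return false;
}

/* Called when a CPU of the instance goes offline.
 * Returns true if this call is the one that disabled the instance. */
bool instance_cpu_offline(struct hfi_instance_sketch *inst)
{
	if (inst->online_cpus == 0)
		return false;		/* unbalanced call: nothing to do */

	inst->online_cpus--;
	if (inst->online_cpus == 0) {
		inst->enabled = false;	/* last CPU: disable the hardware */
		return true;
	}
	return false;
}
```

Because the same pair of callbacks runs at boot, on hotplug and on resume, an instance that was disabled when its last CPU went offline is enabled again by whichever of its CPUs returns first.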
2024-01-03 | thermal: intel: hfi: Refactor enabling code into helper functions | Ricardo Neri | 1 | -21/+22
In preparation for the addition of a suspend notifier, wrap the logic to enable HFI and program its memory buffer into helper functions. Both the CPU hotplug callback and the suspend notifier will use them. This refactoring does not introduce functional changes. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-02 | Merge tag 'thermal-v6.8-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux into thermal | Rafael J. Wysocki | 7 | -278/+313
Merge thermal control material for 6.8-rc1 from Daniel Lezcano:

"- Converted Mediatek Thermal to the json-schema (Rafał Miłecki)
 - Fixed DT bindings issue on Loongson (Binbin Zhou)
 - Fixed returning NULL instead of -ENODEV on Loongson (Binbin Zhou)
 - Added the DT binding for the tsens on SM8650 platform (Neil Armstrong)
 - Added a reboot on critical option feature (Fabio Estevam)
 - Made usage of DEFINE_SIMPLE_DEV_PM_OPS on AmLogic (Uwe Kleine-König)
 - Added the D1/T113s THS controller support on Sun8i (Maxim Kiselev)
 - Fixed example in the DT binding for QCom SPMI (Johan Hovold)
 - Fixed compilation warning for the tmon utility (Florian Eckert)
 - Added interrupt based configuration on Exynos along with a set of related cleanups (Mateusz Majewski)"

* tag 'thermal-v6.8-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (24 commits)
  thermal/drivers/exynos: Use set_trips ops
  thermal/drivers/exynos: Use BIT wherever possible
  thermal/drivers/exynos: Split initialization of TMU and the thermal zone
  thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210
  thermal/drivers/exynos: Simplify regulator (de)initialization
  thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly
  thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts
  thermal/drivers/exynos: Drop id field
  thermal/drivers/exynos: Remove an unnecessary field description
  tools/thermal/tmon: Fix compilation warning for wrong format
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names
  thermal/drivers/sun8i: Add D1/T113s THS controller support
  dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller
  thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
  thermal: amlogic: Make amlogic_thermal_disable() return void
  thermal/thermal_of: Allow rebooting after critical temp
  reboot: Introduce thermal_zone_device_critical_reboot()
  thermal/core: Prepare for introduction of thermal reboot
  dt-bindings: thermal-zones: Document critical-action
  ...
2024-01-02 | thermal/drivers/exynos: Use set_trips ops | Mateusz Majewski | 1 | -180/+205
Currently, each trip point defined in the device tree corresponds to a single hardware interrupt. This commit instead switches to using two hardware interrupts, whose values are set dynamically using the set_trips callback. Additionally, the critical temperature threshold is handled specifically. Setting interrupts in this way also fixes a long-standing lockdep warning, which was caused by calling thermal_zone_get_trips with our lock being held. Do note that this requires TMU initialization to be split into two parts, as done by the parent commit: parts of the initialization call into the thermal_zone_device structure and so must be done after its registration, but the initialization is also responsible for setting up calibration, which must be done before thermal_zone_device registration, which will call set_trips for the first time; if the calibration is not done in time, the interrupt values will be silently wrong! Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-10-m.majewski2@samsung.com
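With set_trips, the driver is handed a low and a high temperature and programs one falling and one rising interrupt around the current reading. The sketch below shows how such a window could be derived from a trip table; it is loosely modeled on the general approach of picking the nearest threshold on each side, and `struct trip_entry`, `struct trip_window` and `trip_window()` are hypothetical names, not code from the Exynos driver or the thermal core.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical trip description, in millidegrees Celsius. */
struct trip_entry {
	int temperature;
	int hysteresis;
};

/* The two thresholds a set_trips-style callback would receive. */
struct trip_window {
	int low;	/* falling threshold, -INT_MAX if none applies */
	int high;	/* rising threshold, INT_MAX if none applies */
};

/*
 * Pick the closest threshold below and above the current temperature.
 * Falling thresholds account for hysteresis; rising ones do not.
 */
struct trip_window trip_window(const struct trip_entry *trips, size_t n, int temp)
{
	struct trip_window w = { -INT_MAX, INT_MAX };
	size_t i;

	for (i = 0; i < n; i++) {
		int trip_low = trips[i].temperature - trips[i].hysteresis;

		if (trip_low < temp && trip_low > w.low)
			w.low = trip_low;

		if (trips[i].temperature > temp && trips[i].temperature < w.high)
			w.high = trips[i].temperature;
	}

	return w;
}
```

For trips at 65000 and 75000 with 2000 of hysteresis each, a reading of 70000 yields a window of 63000 to 75000, so only two hardware interrupts are needed regardless of how many trips the device tree defines.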
2024-01-02 | thermal/drivers/exynos: Use BIT wherever possible | Mateusz Majewski | 1 | -12/+12
The original driver did not use that macro and it allows us to make our intentions slightly clearer. Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-9-m.majewski2@samsung.com
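For readers unfamiliar with the macro, the following user-space sketch shows what a BIT-style definition buys over open-coded shifts. The macro is redefined locally so the example stands alone, and the register bit names and positions are invented for illustration; they are not Exynos TMU register fields.

```c
/* Local stand-in for the kernel's BIT() helper. */
#define BIT(n)	(1UL << (n))

/* Hypothetical control-register bits, for illustration only. */
#define CTRL_CORE_EN	BIT(0)
#define CTRL_TRIP_EN	BIT(12)

/* Set both bits without disturbing the rest of the register value. */
unsigned long ctrl_enable(unsigned long reg)
{
	return reg | CTRL_CORE_EN | CTRL_TRIP_EN;
}

/* Clear the trip-enable bit only. */
unsigned long ctrl_trip_disable(unsigned long reg)
{
	return reg & ~CTRL_TRIP_EN;
}
```

Writing `BIT(12)` instead of `(1 << 12)` or `0x1000` makes it obvious that a single bit position is meant.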
2024-01-02 | thermal/drivers/exynos: Split initialization of TMU and the thermal zone | Mateusz Majewski | 1 | -34/+50
This will be needed in the future, as the thermal zone subsystem might call our callbacks right after devm_thermal_of_zone_register. Currently we just make get_temp return EAGAIN in such case, but this will not be possible with state-modifying callbacks, for instance set_trips. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-8-m.majewski2@samsung.com
2024-01-02 | thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210 | Mateusz Majewski | 1 | -14/+3
Exynos 4210 supports setting a base threshold value, which is added to all trip points. This might be useful, but is not really necessary in our usecase, so we always set it to 0 to simplify the code a bit. Additionally, this change makes it so that we convert the value to the calibrated one in a slightly different place. This is more correct morally, though it does not make any change when single-point calibration is being used (which is the case currently). Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-7-m.majewski2@samsung.com
2024-01-02 | thermal/drivers/exynos: Simplify regulator (de)initialization | Mateusz Majewski | 1 | -34/+15
We rewrite the initialization to enable the regulator as part of devm, which allows us to not handle the struct instance manually. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-6-m.majewski2@samsung.com
2024-01-02 | thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly | Mateusz Majewski | 1 | -2/+10
Currently, if a regulator is required in the SoC, but devm_regulator_get_optional fails for whatever reason, the execution will proceed without propagating the error. Meanwhile there is no reason to output the error in case of -ENODEV. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-5-m.majewski2@samsung.com
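The policy described here, treating "not present" as fine and silent while propagating every other failure, can be sketched with plain error codes. `check_optional_lookup()` is a hypothetical helper that takes the lookup result as an integer; it does not call the regulator API, and the driver's actual control flow may differ.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Classify the result of looking up an optional resource.
 *   err == 0        resource found, *present is set to 1
 *   err == -ENODEV  resource absent, which is acceptable: no message
 *   anything else   real failure: report it and propagate the error
 * Returns 0 if probing may continue, otherwise the negative error code.
 */
int check_optional_lookup(int err, int *present)
{
	*present = 0;

	if (err == 0) {
		*present = 1;
		return 0;
	}

	if (err == -ENODEV)
		return 0;

	fprintf(stderr, "failed to get regulator: %d\n", err);
	return err;
}
```

The two mistakes the commit message names map onto the last two branches: swallowing a real error, and printing a message for an absence that is expected on some boards.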
2024-01-02 | thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts | Mateusz Majewski | 1 | -20/+9
The workqueue boilerplate is mostly one-to-one what the threaded interrupts do. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-4-m.majewski2@samsung.com
2024-01-02 | thermal/drivers/exynos: Drop id field | Mateusz Majewski | 1 | -6/+0
We do not use the value, and only Exynos 7 defines this alias anyway. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-3-m.majewski2@samsung.com
2024-01-02 | thermal/drivers/exynos: Remove an unnecessary field description | Mateusz Majewski | 1 | -1/+0
It seems that the field has been removed in one of the previous commits, but the description has been forgotten. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-2-m.majewski2@samsung.com
2024-01-02 | thermal/drivers/sun8i: Add D1/T113s THS controller support | Maxim Kiselev | 1 | -0/+13
This patch adds a thermal sensor controller support for the D1/T113s, which is similar to the one on H6, but with only one sensor and different scale and offset values. Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231217210629.131486-3-bigunclemax@gmail.com
2024-01-02 | thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions | Uwe Kleine-König | 1 | -5/+6
This macro has the advantage over SIMPLE_DEV_PM_OPS that we don't have to care about when the functions are actually used, so the corresponding __maybe_unused can be dropped. Also make use of pm_ptr() to discard all PM related stuff if CONFIG_PM isn't enabled. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231116112633.668826-3-u.kleine-koenig@pengutronix.de
2024-01-02thermal: amlogic: Make amlogic_thermal_disable() return voidUwe Kleine-König1-4/+4
amlogic_thermal_disable() returned zero unconditionally and amlogic_thermal_remove() already ignores the return value. Make it return no value and modify amlogic_thermal_suspend to not check the value. This patch introduces no semantic changes, but makes it more obvious for a human reader that amlogic_thermal_suspend() cannot fail. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231116112633.668826-2-u.kleine-koenig@pengutronix.de
2024-01-02thermal/thermal_of: Allow rebooting after critical tempFabio Estevam1-0/+6
Currently, the default mechanism is to trigger a shutdown after the critical temperature is reached. In some embedded cases, such behavior is not suitable, as the board may be unattended in the field and rebooting may be a better approach. The bootloader may also check the temperature and only allow the boot to proceed when the temperature is below a certain threshold. Introduce support for allowing a reboot to be triggered after the critical temperature is reached. If the "critical-action" devicetree property is not found, fall back to the shutdown action to preserve the existing default behavior. If a custom ops->critical exists, then it takes precedence over critical-action. Tested on an i.MX8MM board with the following devicetree changes: thermal-zones { cpu-thermal { critical-action = "reboot"; }; }; Signed-off-by: Fabio Estevam <festevam@denx.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231129124330.519423-4-festevam@gmail.com
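The fallback rule described above can be sketched in plain C outside the kernel. This is only an illustration: the enum values and the `parse_critical_action()` helper are hypothetical names, a NULL argument stands in for a missing "critical-action" property, and treating unrecognized strings as shutdown is an assumption rather than something the commit message states.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; not the kernel's identifiers. */
enum critical_action {
	CRITICAL_ACTION_SHUTDOWN,
	CRITICAL_ACTION_REBOOT,
};

/*
 * Map the value of a "critical-action" property to an action.
 * NULL models a missing property and selects shutdown, the existing
 * default. Unrecognized strings also select shutdown (an assumption).
 */
static enum critical_action parse_critical_action(const char *value)
{
	if (value && strcmp(value, "reboot") == 0)
		return CRITICAL_ACTION_REBOOT;

	return CRITICAL_ACTION_SHUTDOWN;
}
```

With this sketch, `parse_critical_action("reboot")` selects the reboot action, while a NULL value keeps the shutdown default. The precedence of a custom ops->critical callback is not modeled here.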
2024-01-02reboot: Introduce thermal_zone_device_critical_reboot()Fabio Estevam2-0/+8
Introduce thermal_zone_device_critical_reboot() to trigger an emergency reboot. It is a counterpart of thermal_zone_device_critical() with the difference that it will force a reboot instead of a shutdown. The motivation for doing this is to allow the thermal subsystem to trigger a reboot when the temperature reaches the critical temperature. Signed-off-by: Fabio Estevam <festevam@denx.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231129124330.519423-3-festevam@gmail.com
2024-01-02thermal/core: Prepare for introduction of thermal rebootFabio Estevam1-4/+10
Add some helper functions to make it easier to introduce support for thermal reboot. No functional change. Signed-off-by: Fabio Estevam <festevam@denx.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231129124330.519423-2-festevam@gmail.com
2024-01-02drivers/thermal/loongson2_thermal: Fix incorrect PTR_ERR() judgmentBinbin Zhou1-1/+1
PTR_ERR() returns -ENODEV when thermal-zones are undefined, and we need -ENODEV as the right value for comparison. Otherwise, tz->type is NULL when thermal-zones is undefined, resulting in the following error:
[ 12.290030] CPU 1 Unable to handle kernel paging request at virtual address fffffffffffffff1, era == 900000000355f410, ra == 90000000031579b8
[ 12.302877] Oops[#1]:
[ 12.305190] CPU: 1 PID: 181 Comm: systemd-udevd Not tainted 6.6.0-rc7+ #5385
[ 12.312304] pc 900000000355f410 ra 90000000031579b8 tp 90000001069e8000 sp 90000001069eba10
[ 12.320739] a0 0000000000000000 a1 fffffffffffffff1 a2 0000000000000014 a3 0000000000000001
[ 12.329173] a4 90000001069eb990 a5 0000000000000001 a6 0000000000001001 a7 900000010003431c
[ 12.337606] t0 fffffffffffffff1 t1 54567fd5da9b4fd4 t2 900000010614ec40 t3 00000000000dc901
[ 12.346041] t4 0000000000000000 t5 0000000000000004 t6 900000010614ee20 t7 900000000d00b790
[ 12.354472] t8 00000000000dc901 u0 54567fd5da9b4fd4 s9 900000000402ae10 s0 900000010614ec40
[ 12.362916] s1 90000000039fced0 s2 ffffffffffffffed s3 ffffffffffffffed s4 9000000003acc000
[ 12.362931] s5 0000000000000004 s6 fffffffffffff000 s7 0000000000000490 s8 90000001028b2ec8
[ 12.362938] ra: 90000000031579b8 thermal_add_hwmon_sysfs+0x258/0x300
[ 12.386411] ERA: 900000000355f410 strscpy+0xf0/0x160
[ 12.391626] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 12.397898] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 12.403678] EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[ 12.409859] ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
[ 12.415882] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
[ 12.415907] BADV: fffffffffffffff1
[ 12.415911] PRID: 0014a000 (Loongson-64bit, Loongson-2K1000)
[ 12.415917] Modules linked in: loongson2_thermal(+) vfat fat uio_pdrv_genirq uio fuse zram zsmalloc
[ 12.415950] Process systemd-udevd (pid: 181, threadinfo=00000000358b9718, task=00000000ace72fe3)
[ 12.415961] Stack : 0000000000000dc0 54567fd5da9b4fd4 900000000402ae10 9000000002df9358
[ 12.415982] ffffffffffffffed 0000000000000004 9000000107a10aa8 90000001002a3410
[ 12.415999] ffffffffffffffed ffffffffffffffed 9000000107a11268 9000000003157ab0
[ 12.416016] 9000000107a10aa8 ffffff80020fc0c8 90000001002a3410 ffffffffffffffed
[ 12.416032] 0000000000000024 ffffff80020cc1e8 900000000402b2a0 9000000003acc000
[ 12.416048] 90000001002a3410 0000000000000000 ffffff80020f4030 90000001002a3410
[ 12.416065] 0000000000000000 9000000002df6808 90000001002a3410 0000000000000000
[ 12.416081] ffffff80020f4030 0000000000000000 90000001002a3410 9000000002df2ba8
[ 12.416097] 00000000000000b4 90000001002a34f4 90000001002a3410 0000000000000002
[ 12.416114] ffffff80020f4030 fffffffffffffff0 90000001002a3410 9000000002df2f30
[ 12.416131] ...
[ 12.416138] Call Trace:
[ 12.416142] [<900000000355f410>] strscpy+0xf0/0x160
[ 12.416167] [<90000000031579b8>] thermal_add_hwmon_sysfs+0x258/0x300
[ 12.416183] [<9000000003157ab0>] devm_thermal_add_hwmon_sysfs+0x50/0xe0
[ 12.416200] [<ffffff80020cc1e8>] loongson2_thermal_probe+0x128/0x200 [loongson2_thermal]
[ 12.416232] [<9000000002df6808>] platform_probe+0x68/0x140
[ 12.416249] [<9000000002df2ba8>] really_probe+0xc8/0x3c0
[ 12.416269] [<9000000002df2f30>] __driver_probe_device+0x90/0x180
[ 12.416286] [<9000000002df3058>] driver_probe_device+0x38/0x160
[ 12.416302] [<9000000002df33a8>] __driver_attach+0xa8/0x200
[ 12.416314] [<9000000002deffec>] bus_for_each_dev+0x8c/0x120
[ 12.416330] [<9000000002df198c>] bus_add_driver+0x10c/0x2a0
[ 12.416346] [<9000000002df46b4>] driver_register+0x74/0x160
[ 12.416358] [<90000000022201a4>] do_one_initcall+0x84/0x220
[ 12.416372] [<90000000022f3ab8>] do_init_module+0x58/0x2c0
[ 12.416386] [<90000000022f6538>] init_module_from_file+0x98/0x100
[ 12.416399] [<90000000022f67f0>] sys_finit_module+0x230/0x3c0
[ 12.416412] [<900000000358f7c8>] do_syscall+0x88/0xc0
[ 12.416431] [<900000000222137c>] handle_syscall+0xbc/0x158
Fixes: e7e3a7c35791 ("thermal/drivers/loongson-2: Add thermal management support") Cc: Yinbo Zhu <zhuyinbo@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/343c14de98216636a47b43e8bfd47b70d0a8e068.1700817227.git.zhoubinbin@loongson.cn
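The sign issue behind this fix can be shown with a small userspace sketch. The `ERR_PTR()` and `PTR_ERR()` definitions below are simplified stand-ins for the kernel helpers, and the two `is_missing_zone_*()` functions are hypothetical; the sketch assumes the faulty comparison used the positive constant, which the commit message implies but does not quote.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR() helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical check with the sign mismatch: never true for ERR_PTR(-ENODEV). */
static int is_missing_zone_buggy(const void *tz)
{
	return PTR_ERR(tz) == ENODEV;
}

/* Hypothetical corrected check: error pointers encode negative errno values. */
static int is_missing_zone_fixed(const void *tz)
{
	return PTR_ERR(tz) == -ENODEV;
}
```

Because the encoded error is negative, only the second comparison recognizes `ERR_PTR(-ENODEV)`; with the first one, the error pointer is not recognized and could later be dereferenced as if it were a valid thermal zone.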
2023-12-29thermal: gov_power_allocator: Support new update callback of weightsLukasz Luba1-6/+9
When the thermal instance's weight is updated from sysfs, the governor update_tz() callback is triggered. Implement a proper reaction to this event in the IPA, which saves CPU cycles spent in throttle(). This speeds up the main throttle() IPA function and cleans it up a bit. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29thermal/sysfs: Update governors when the 'weight' has changedLukasz Luba1-0/+5
Support updating governors when the thermal instance's weight has changed. This allows the governor to adjust its internal state. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> [ rjw: Add two empty code lines around the locking ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29thermal/sysfs: Update instance->weight under tz lockLukasz Luba1-0/+4
User space can change the weight of a thermal instance via sysfs while the .throttle() callback is running for a governor, because weight_store() does not use the zone lock. The IPA governor uses instance weight values for power calculations and caches the sum of them as total_weight, so it gets confused when one of them changes while its .throttle() callback is running. To prevent that from happening, use thermal zone locking in weight_store(). Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29thermal: gov_power_allocator: Simplify checks for valid power actorLukasz Luba1-23/+17
There is a need to check if the cooling device in the thermal zone supports the IPA callbacks and is set for the control trip point. Refactor the code which validates the power actor capabilities and make it more consistent in all places. No intentional functional impact. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29thermal: gov_power_allocator: Move memory allocation out of throttle()Lukasz Luba1-71/+136
The new thermal callback allows reacting to changes of cooling instances in the thermal zone. Move the memory allocation to that new callback and save CPU cycles in the throttle() code path. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29thermal: gov_power_allocator: Change trace functionsLukasz Luba2-23/+32
Change the trace event trace_thermal_power_allocator() to not use a dynamic array for requested power and granted power for all power actors. Instead, simplify the trace event and print other simple values. Add a new trace event to print power actor information of requested power and granted power. That trace event would be called in a loop for each power actor. The trace data would be easier to parse compared to the dynamic array implementation. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29thermal: gov_power_allocator: Refactor checks in divvy_up_power()Lukasz Luba1-10/+10
Simplify the code and remove one extra 'if' block. No intentional functional impact. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29thermal: gov_power_allocator: Refactor check_power_actors()Lukasz Luba1-4/+6
In preparation for a subsequent change, rearrange check_power_actors(). No intentional functional impact. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29thermal: core: Add governor callback for thermal zone changeLukasz Luba2-0/+16
Add a new callback to struct thermal_governor. It can be used for updating governors when there is a change in the thermal zone internals, e.g. a thermal cooling device is bound to the thermal zone. That makes it possible to move some heavy operations, like memory allocations related to the number of cooling instances, out of the throttle() callback. Both callback code paths (throttle() and update_tz()) are protected with the same thermal zone lock, which guarantees consistency. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
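A rough userspace sketch of the idea follows: allocation happens in an update callback that runs when the zone changes, so the hot throttle path only reuses the buffer. The `struct zone` and `struct governor` layouts, every `sketch_*` name, and the `demo_allocations_after()` helper are hypothetical simplifications, not the kernel's definitions. Locking is omitted; the caller is assumed to hold the zone lock around both callbacks.

```c
#include <stdlib.h>

/* Hypothetical, simplified model; not the kernel's struct definitions. */
struct zone {
	int num_instances;	/* cooling instances bound to the zone */
	int *granted;		/* per-instance scratch buffer */
	int allocations;	/* how many times the buffer was allocated */
};

struct governor {
	int (*update_tz)(struct zone *tz);	/* zone internals changed */
	int (*throttle)(struct zone *tz);	/* hot path, runs on every update */
};

/* Runs when a cooling device is bound or unbound: size the buffer once. */
static int sketch_update_tz(struct zone *tz)
{
	free(tz->granted);
	tz->granted = NULL;
	if (tz->num_instances == 0)
		return 0;

	tz->granted = calloc((size_t)tz->num_instances, sizeof(*tz->granted));
	if (!tz->granted)
		return -1;

	tz->allocations++;
	return 0;
}

/* Hot path: no allocation, only reuse of the preallocated buffer. */
static int sketch_throttle(struct zone *tz)
{
	int i;

	if (!tz->granted)
		return -1;
	for (i = 0; i < tz->num_instances; i++)
		tz->granted[i] = 100 / tz->num_instances;
	return 0;
}

static const struct governor sketch_governor = {
	.update_tz = sketch_update_tz,
	.throttle = sketch_throttle,
};

/* One bind event followed by several throttle passes; report allocations. */
static int demo_allocations_after(int throttle_passes)
{
	struct zone tz = { .num_instances = 4 };
	int i, count;

	if (sketch_governor.update_tz(&tz))
		return -1;
	for (i = 0; i < throttle_passes; i++)
		sketch_governor.throttle(&tz);

	count = tz.allocations;
	free(tz.granted);
	return count;
}
```

However many throttle passes run, the allocation count in this sketch stays at one per zone change, which is the saving the commit message describes.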
2023-12-28thermal: netlink: Add thermal_group_has_listeners() helperStanislaw Gruszka1-0/+11
Add a helper function to check if there are listeners for thermal_gnl_family multicast groups. For now use it to avoid unnecessary allocations and sending thermal genl messages when there are no recipients. In the future, in conjunction with (not yet implemented) notification of change in the netlink socket group membership, this helper can be used to open/close hardware interfaces based on the presence of user space subscribers. Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
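The early-exit pattern described above might look roughly like this in a userspace model. The `struct mcast_group` type and both functions are hypothetical; the real helper queries the generic netlink family rather than a counter, and the message here is a placeholder for a netlink buffer.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a multicast group; not the kernel's genl types. */
struct mcast_group {
	int listeners;	/* number of subscribed user space sockets */
};

static bool group_has_listeners(const struct mcast_group *grp)
{
	return grp->listeners > 0;
}

/*
 * Build and "send" an event.
 * Returns 1 if a message was built, 0 if skipped, -1 on allocation failure.
 */
static int notify(const struct mcast_group *grp, int event)
{
	int *msg;

	if (!group_has_listeners(grp))
		return 0;	/* nobody is subscribed: skip the allocation */

	msg = malloc(sizeof(*msg));
	if (!msg)
		return -1;

	*msg = event;
	/* A real implementation would multicast the message here. */
	free(msg);
	return 1;
}
```

Checking for listeners first means the no-subscriber case costs a single comparison instead of an allocation and a send attempt.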
2023-12-28thermal: netlink: Add enum for multicast groups indexesStanislaw Gruszka1-4/+9
Use enum instead of hard-coded numbers for indexing multicast groups. Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
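As a small illustration of replacing hard-coded indexes with an enum, the sketch below indexes a table of group names through enumerators. All identifiers and the group name strings are illustrative assumptions; the kernel's actual names may differ.

```c
#include <stddef.h>

/* Illustrative names; the kernel's actual identifiers may differ. */
enum sketch_mcast_group {
	SKETCH_SAMPLING_GROUP = 0,
	SKETCH_EVENT_GROUP = 1,
	SKETCH_GROUP_COUNT,
};

/* Designated initializers tie each entry to its enumerator, not a bare number. */
static const char *const sketch_group_names[SKETCH_GROUP_COUNT] = {
	[SKETCH_SAMPLING_GROUP] = "sampling",
	[SKETCH_EVENT_GROUP] = "event",
};

/* Return the group name, or NULL for an out-of-range index. */
static const char *sketch_group_name(enum sketch_mcast_group grp)
{
	if ((unsigned int)grp >= (unsigned int)SKETCH_GROUP_COUNT)
		return NULL;

	return sketch_group_names[grp];
}
```

Reordering or adding groups then only requires touching the enum and the table, since call sites refer to the enumerators.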
2023-12-28thermal: core: Resume thermal zones asynchronouslyRafael J. Wysocki1-4/+26
The resume of thermal zones in thermal_pm_notify() is carried out sequentially, which may be a problem if __thermal_zone_device_update() takes a significant time to run for some thermal zones, because other thermal zones may then need to wait for them to resume and, if any other PM notifiers are going to be invoked after the thermal one, they will need to wait for it as well. To address this, make thermal_pm_notify() switch the poll_queue delayed work over to a one-shot thermal_zone_device_resume() work function that will restore the original one during the thermal zone resume and queue up poll_queue without a delay for each thermal zone. Link: https://lore.kernel.org/linux-pm/20231120234015.3273143-1-radusolea@google.com/ Reported-by: Radu Solea <radusolea@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>