summaryrefslogtreecommitdiff
path: root/drivers/thermal
AgeCommit message (Collapse)AuthorFilesLines
2024-05-30thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()Rafael J. Wysocki3-6/+19
[ Upstream commit 31a0fa0019b022024cc082ae292951a596b06f8c ] If cdev_dt_seq_show() runs before the first state transition of a cooling device, it will not print any state residency information for it, even though it might be reasonably expected to print residency information for the initial state of the cooling device. For this reason, rearrange the code to get the initial state of a cooling device at the registration time and pass it to thermal_debug_cdev_add(), so that the latter can create a duration record for that state which will allow cdev_dt_seq_show() to print its residency information. Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") Reported-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30thermal/debugfs: Create records for cdev states as they get usedRafael J. Wysocki1-0/+8
[ Upstream commit f4ae18fcb652c6cccc834ded525ac37f91d5cdb1 ] Because thermal_debug_cdev_state_update() only creates a duration record for the old state of a cooling device, if its new state is used for the first time, there will be no record for it and cdev_dt_seq_show() will not print the duration information for it even though it contains code to compute the duration value in that case. Address this by making thermal_debug_cdev_state_update() create a duration record for the new state if there is none. Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") Reported-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30thermal/debugfs: Avoid excessive updates of trip point statisticsRafael J. Wysocki2-8/+2
[ Upstream commit 0a293c77580581c4b058eb40287acadac6ffd14a ] Since thermal_debug_update_temp() is called before invoking thermal_debug_tz_trip_down() for the trips that were crossed by the zone temperature on the way up, it updates the statistics for them as though the current zone temperature was above the low temperature of each of them. However, if a given trip has just been crossed on the way down, the zone temperature is in fact below its low temperature, but this is handled by thermal_debug_tz_trip_down() running after the update of the trip statistics. The remedy is to call thermal_debug_update_temp() after thermal_debug_tz_trip_down() has been invoked for all of the trips in question, but then thermal_debug_tz_trip_up() needs to be adjusted, so it does not update the statistics for the trips that has just been crossed on the way up, as that will be taken care of by thermal_debug_update_temp() down the road. Modify the code accordingly. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30thermal/drivers/tsens: Fix null pointer dereferenceAleksandr Mishin1-1/+1
[ Upstream commit d998ddc86a27c92140b9f7984ff41e3d1d07a48f ] compute_intercept_slope() is called from calibrate_8960() (in tsens-8960.c) as compute_intercept_slope(priv, p1, NULL, ONE_PT_CALIB) which lead to null pointer dereference (if DEBUG or DYNAMIC_DEBUG set). Fix this bug by adding null pointer check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: dfc1193d4dbd ("thermal/drivers/tsens: Replace custom 8960 apis with generic apis") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240411114021.12203-1-amishin@t-argos.ru Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-30thermal/drivers/mediatek/lvts_thermal: Add coeff for mt8192Hsin-Te Yuan1-0/+4
[ Upstream commit 7954c92ede882b0dfd52a5db90291a4151b44c1a ] In order for lvts_raw_to_temp to function properly on mt8192, temperature coefficients for mt8192 need to be added. Fixes: 288732242db4 ("thermal/drivers/mediatek/lvts_thermal: Add mt8192 support") Signed-off-by: Hsin-Te Yuan <yuanhsinte@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240416-lvts_thermal-v2-1-f8a36882cc53@chromium.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17thermal/debugfs: Prevent use-after-free from occurring after cdev removalRafael J. Wysocki1-3/+11
[ Upstream commit d351eb0ab04c3e8109895fc33250cebbce9c11da ] Since thermal_debug_cdev_remove() does not run under cdev->lock, it can run in parallel with thermal_debug_cdev_state_update() and it may free the struct thermal_debugfs object used by the latter after it has been checked against NULL. If that happens, thermal_debug_cdev_state_update() will access memory that has been freed already causing the kernel to crash. Address this by using cdev->lock in thermal_debug_cdev_remove() around the cdev->debugfs value check (in case the same cdev is removed at the same time in two different threads) and its reset to NULL. Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") Cc :6.8+ <stable@vger.kernel.org> # 6.8+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17thermal/debugfs: Fix two locking issues with thermal zone debugRafael J. Wysocki1-12/+22
[ Upstream commit c7f7c37271787a7f77d7eedc132b0b419a76b4c8 ] With the current thermal zone locking arrangement in the debugfs code, user space can open the "mitigations" file for a thermal zone before the zone's debugfs pointer is set which will result in a NULL pointer dereference in tze_seq_start(). Moreover, thermal_debug_tz_remove() is not called under the thermal zone lock, so it can run in parallel with the other functions accessing the thermal zone's struct thermal_debugfs object. Then, it may clear tz->debugfs after one of those functions has checked it and the struct thermal_debugfs object may be freed prematurely. To address the first problem, pass a pointer to the thermal zone's struct thermal_debugfs object to debugfs_create_file() in thermal_debug_tz_add() and make tze_seq_start(), tze_seq_next(), tze_seq_stop(), and tze_seq_show() retrieve it from s->private instead of a pointer to the thermal zone object. This will ensure that tz_debugfs will be valid across the "mitigations" file accesses until thermal_debugfs_remove_id() called by thermal_debug_tz_remove() removes that file. To address the second problem, use tz->lock in thermal_debug_tz_remove() around the tz->debugfs value check (in case the same thermal zone is removed at the same time in two different threads) and its reset to NULL. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Cc :6.8+ <stable@vger.kernel.org> # 6.8+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17thermal/debugfs: Free all thermal zone debug memory on zone removalRafael J. Wysocki1-0/+13
[ Upstream commit 72c1afffa4c645fe0e0f1c03e5f34395ed65b5f4 ] Because thermal_debug_tz_remove() does not free all memory allocated for thermal zone diagnostics, some of that memory becomes unreachable after freeing the thermal zone's struct thermal_debugfs object. Address this by making thermal_debug_tz_remove() free all of the memory in question. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Cc :6.8+ <stable@vger.kernel.org> # 6.8+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27thermal/debugfs: Add missing count increment to thermal_debug_tz_trip_up()Rafael J. Wysocki1-0/+1
[ Upstream commit b552f63cd43735048bbe9bfbb7a9dcfce166fbdd ] The count field in struct trip_stats, representing the number of times the zone temperature was above the trip point, needs to be incremented in thermal_debug_tz_trip_up(), for two reasons. First, if a trip point is crossed on the way up for the first time, thermal_debug_update_temp() called from update_temperature() does not see it because it has not been added to trips_crossed[] array in the thermal zone's struct tz_debugfs object yet. Therefore, when thermal_debug_tz_trip_up() is called after that, the trip point's count value is 0, and the attempt to divide by it during the average temperature computation leads to a divide error which causes the kernel to crash. Setting the count to 1 before the division by incrementing it fixes this problem. Second, if a trip point is crossed on the way up, but it has been crossed on the way up already before, its count value needs to be incremented to make a record of the fact that the zone temperature is above the trip now. Without doing that, if the mitigations applied after crossing the trip cause the zone temperature to drop below its threshold, the count will not be updated for this episode at all and the average temperature in the trip statistics record will be somewhat higher than it should be. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Cc :6.8+ <stable@vger.kernel.org> # 6.8+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-13thermal/of: Assume polling-delay(-passive) 0 when absentKonrad Dybcio1-4/+8
[ Upstream commit 488164006a281986d95abbc4b26e340c19c4c85b ] Currently, thermal zones associated with providers that have interrupts for signaling hot/critical trips are required to set a polling-delay of 0 to indicate no polling. This feels a bit backwards. Change the code such that "no polling delay" also means "no polling". Suggested-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240125-topic-thermal-v1-2-3c9d4dced138@linaro.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-10thermal: gov_power_allocator: Allow binding without trip pointsNikita Travkin1-8/+4
[ Upstream commit da781936e7c301e6197eb6513775748e79fb2575 ] IPA probe function was recently refactored to perform extra error checks and make sure the thermal zone has trip points necessary for the IPA operation. With this change, if a thermal zone is probed such that it has no trip points that IPA can use, IPA will fail and the TZ won't be created. This is the case if a platform defines a TZ without cooling devices and only with "hot"/"critical" trip points, often found on some Qualcomm devices [1]. Documentation across IPA code (notably get_governor_trips() kerneldoc) suggests that IPA is supposed to handle such TZ even if it won't actually do anything. This commit partially reverts the previous change to allow IPA to bind to such "empty" thermal zones. Fixes: e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip points earlier") Link: arch/arm64/boot/dts/qcom/sc7180.dtsi#n4776 # [1] Signed-off-by: Nikita Travkin <nikita@trvn.ru> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-10thermal: gov_power_allocator: Allow binding without cooling devicesNikita Travkin1-1/+1
[ Upstream commit 1057c4c36ef8b236a2e28edef301da0801338c5f ] IPA was recently refactored to split out memory allocation into a separate funciton. That funciton was made to return -EINVAL if there is zero power_actors and thus no memory to allocate. This causes IPA to fail probing when the thermal zone has no attached cooling devices. Since cooling devices can attach after the thermal zone is created and the governer is attached to it, failing probe due to the lack of cooling devices is incorrect. Change the allocate_actors_buffer() to return success when there is no cooling devices present. Fixes: 912e97c67cc3 ("thermal: gov_power_allocator: Move memory allocation out of throttle()") Signed-off-by: Nikita Travkin <nikita@trvn.ru> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-03Revert "thermal: core: Don't update trip points inside the hysteresis range"Daniel Lezcano1-17/+2
commit f67cf45deedb118af302534643627ce59074e8eb upstream. It has been reported the commit cf3986f8c01d3 introduced a regression when the temperature is wavering in the hysteresis region. The mitigation stops leading to an uncontrolled temperature increase until reaching the critical trip point. Here what happens: * 'throttle' is when the current temperature is greater than the trip point temperature * 'target' is the mitigation level * 'passive' is positive when there is a mitigation, zero otherwise * these values are computed in the step_wise governor Configuration: trip point 1: temp=95°C, hyst=5°C (passive) trip point 2: temp=115°C, hyst=0°C (critical) governor: step_wise 1. The temperature crosses the way up the trip point 1 at 95°C - trend=raising - throttle=1, target=1 - passive=1 - set_trips: low=90°C, high=115°C 2. The temperature decreases but stays in the hysteresis region at 93°C - trend=dropping - throttle=0, target=0 - passive=1 Before cf3986f8c01d3 - set_trips: low=90°C, high=95°C After cf3986f8c01d3 - set_trips: low=90°C, high=115°C 3. The temperature increases a bit but stays in the hysteresis region at 94°C (so below the trip point 1 temp 95°C) - trend=raising - throttle=0, target=0 - passive=1 Before cf3986f8c01d3 - set_trips: low=90°C, high=95°C After cf3986f8c01d3 - set_trips: low=90°C, high=115°C 4. The temperature decreases but stays in the hysteresis region at 93°C - trend=dropping - throttle=0, target=THERMAL_NO_TARGET - passive=0 Before cf3986f8c01d3 - set_trips: low=90°C, high=95°C After cf3986f8c01d3 - set_trips: low=90°C, high=115°C At this point, the 'passive' value is zero, there is no mitigation, the temperature is in the hysteresis region, the next trip point is 115°C. As 'passive' is zero, the timer to monitor the thermal zone is disabled. Consequently if the temperature continues to increase, no mitigation will happen and it will reach the 115°C trip point and reboot. Before the optimization, the high boundary would have been 95°C, thus triggering the mitigation again and rearming the polling timer. The optimization make sense but given the current implementation of the step_wise governor collaborating via this 'passive' flag with the core framework it can not work. From a higher perspective it seems like there is a problem between the governor which sets a variable to be used by the core framework. That sounds akward and it would make much more sense if the core framework controls the governor and not the opposite. But as the devil hides in the details, there are some subtilities to be addressed before. Elaborating those would be out of the scope this changelog. So let's stay simple and revert the change first to fixup all broken mobile platforms. This reverts commit cf3986f8c01d3 ("thermal: core: Don't update trip points inside the hysteresis range") and takes a conflict with commit 0c0c4740c9d26 ("0c0c4740c9d2 thermal: trip: Use for_each_trip() in __thermal_zone_set_trips()") in drivers/thermal/thermal_trip.c into account. Fixes: cf3986f8c01d3 ("thermal: core: Don't update trip points inside the hysteresis range") Reported-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Cc: 6.7+ <stable@vger.kernel.org> # 6.7+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-03thermal: devfreq_cooling: Fix perf state when calculate dfc res_utilYe Zhang1-1/+1
commit a26de34b3c77ae3a969654d94be49e433c947e3b upstream. The issue occurs when the devfreq cooling device uses the EM power model and the get_real_power() callback is provided by the driver. The EM power table is sorted ascending,can't index the table by cooling device state,so convert cooling state to performance state by dfc->max_state - dfc->capped_state. Fixes: 615510fe13bd ("thermal: devfreq_cooling: remove old power model and use EM") Cc: 5.11+ <stable@vger.kernel.org> # 5.11+ Signed-off-by: Ye Zhang <ye.zhang@rock-chips.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-03thermal/drivers/mediatek: Fix control buffer enablement on MT7896Frank Wunderlich1-0/+3
[ Upstream commit 371ed6263e2403068b359f0c07188548c2d70827 ] Reading thermal sensor on mt7986 devices returns invalid temperature: bpi-r3 ~ # cat /sys/class/thermal/thermal_zone0/temp -274000 Fix this by adding missing members in mtk_thermal_data struct which were used in mtk_thermal_turn_on_buffer after commit 33140e668b10. Cc: stable@vger.kernel.org Fixes: 33140e668b10 ("thermal/drivers/mediatek: Control buffer enablement tweaks") Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20230907112018.52811-1-linux@fw-web.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-03powercap: intel_rapl: Fix locking in TPMI RAPLZhang Rui1-4/+4
[ Upstream commit 1aa09b9379a7a644cd2f75ae0bac82b8783df600 ] The RAPL framework uses CPU hotplug locking to protect the rapl_packages list and rp->lead_cpu to guarantee that 1. the RAPL package device is not unprobed and freed 2. the cached rp->lead_cpu is always valid for operations like powercap sysfs accesses. Current RAPL APIs assume being called from CPU hotplug callbacks which hold the CPU hotplug lock, but TPMI RAPL driver invokes the APIs in the driver's .probe() function without acquiring the CPU hotplug lock. Fix the problem by providing both locked and lockless versions of RAPL APIs. Fixes: 9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.5+ <stable@vger.kernel.org> # 6.5+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-03thermal/intel: Fix intel_tcc_get_temp() to support negative CPU temperatureZhang Rui3-14/+14
[ Upstream commit 7251b9e8a007ddd834aa81f8c7ea338884629fec ] CPU temperature can be negative in some cases. Thus the negative CPU temperature should not be considered as a failure. Fix intel_tcc_get_temp() and its users to support negative CPU temperature. Fixes: a3c1f066e1c5 ("thermal/intel: Introduce Intel TCC library") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Cc: 6.3+ <stable@vger.kernel.org> # 6.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-27thermal/drivers/qoriq: Fix getting tmu rangePeng Fan1-4/+8
[ Upstream commit 4d0642074c67ed9928e9d68734ace439aa06e403 ] TMU Version 1 has 4 TTRCRs, while TMU Version >=2 has 16 TTRCRs. So limit the len to 4 will report "invalid range data" for i.MX93. This patch drop the local array with allocated ttrcr array and able to support larger tmu ranges. Fixes: f12d60c81fce ("thermal/drivers/qoriq: Support version 2.1") Tested-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240226003657.3012880-1-peng.fan@oss.nxp.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-27thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error ↵Christophe JAILLET1-1/+3
handling path [ Upstream commit ca93bf607a44c1f009283dac4af7df0d9ae5e357 ] If devm_krealloc() fails, then 'efuse' is leaking. So free it to avoid a leak. Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/481d345233862d58c3c305855a93d0dbc2bbae7e.1706431063.git.christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-22thermal: intel: powerclamp: Remove dead code for target mwait valueSrinivas Pandruvada1-32/+0
After conversion of this driver to use powercap idle_inject core, this driver doesn't use target_mwait value. So remove dead code. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-19thermal: loongson2: Replace of_device.h with explicit includesRob Herring1-1/+2
The DT of_device.h and of_platform.h date back to the separate of_platform_bus_type before it as merged into the regular platform bus. As part of that merge prepping Arm DT support 13 years ago, they "temporarily" include each other. They also include platform_device.h and of.h. of_device.h isn't needed, but mod_devicetable.h and property.h were implicitly included. Signed-off-by: Rob Herring <robh@kernel.org>
2024-01-18Merge tag 'thermal-6.8-rc1-2' of ↵Linus Torvalds12-126/+1018
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more thermal control updates from Rafael Wysocki: "These add support for debugfs-based diagnostics to the thermal core, simplify the thermal netlink API, fix system-wide PM support in the Intel HFI driver and clean up some code. Specifics: - Add debugfs-based diagnostics support to the thermal core (Daniel Lezcano, Dan Carpenter) - Fix a power allocator thermal governor issue preventing it from resetting cooling devices sometimes (Di Shen) - Simplify the thermal netlink API and clean up related code (Rafael J. Wysocki) - Make the Intel HFI driver support hibernation and deep suspend properly (Ricardo Neri)" * tag 'thermal-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up() thermal: intel: hfi: Add syscore callbacks for system-wide PM thermal: gov_power_allocator: avoid inability to reset a cdev thermal: helpers: Rearrange thermal_cdev_set_cur_state() thermal: netlink: Rework notify API for cooling devices thermal: core: Use kstrdup_const() during cooling device registration thermal/debugfs: Add thermal debugfs information for mitigation episodes thermal/debugfs: Add thermal cooling device debugfs information thermal: netlink: Pass thermal zone pointer to notify routines thermal: netlink: Drop thermal_notify_tz_trip_add/delete() thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down() thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()
2024-01-16Merge branches 'thermal-core' and 'thermal-intel'Rafael J. Wysocki12-126/+1018
Merge additional updates for 6.8-rc1 in the thermal core and in the Intel HFI thermal driver: - Add debugfs-based diagnostics support to the thermal core (Daniel Lezcano, Dan Carpenter). - Fix a power allocator thermal governor issue preventing it from resetting cooling devices sometimes (Di Shen). - Simplify the thermal netlink API and clean up related code (Rafael J. Wysocki). - Make the Intel HFI driver support hibernation and deep suspend properly (Ricardo Neri). * thermal-core: thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up() thermal: gov_power_allocator: avoid inability to reset a cdev thermal: helpers: Rearrange thermal_cdev_set_cur_state() thermal: netlink: Rework notify API for cooling devices thermal: core: Use kstrdup_const() during cooling device registration thermal/debugfs: Add thermal debugfs information for mitigation episodes thermal/debugfs: Add thermal cooling device debugfs information thermal: netlink: Pass thermal zone pointer to notify routines thermal: netlink: Drop thermal_notify_tz_trip_add/delete() thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down() thermal: netlink: Pass pointers to thermal_notify_tz_trip_change() * thermal-intel: thermal: intel: hfi: Add syscore callbacks for system-wide PM
2024-01-12thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up()Dan Carpenter1-1/+2
Add a missing mutex_unlock(&thermal_dbg->lock) to this error path. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12thermal: intel: hfi: Add syscore callbacks for system-wide PMRicardo Neri1-0/+28
The kernel allocates a memory buffer and provides its location to the hardware, which uses it to update the HFI table. This allocation occurs during boot and remains constant throughout runtime. When resuming from hibernation, the restore kernel allocates a second memory buffer and reprograms the HFI hardware with the new location as part of a normal boot. The location of the second memory buffer may differ from the one allocated by the image kernel. When the restore kernel transfers control to the image kernel, its HFI buffer becomes invalid, potentially leading to memory corruption if the hardware writes to it (the hardware continues to use the buffer from the restore kernel). It is also possible that the hardware "forgets" the address of the memory buffer when resuming from "deep" suspend. Memory corruption may also occur in such a scenario. To prevent the described memory corruption, disable HFI when preparing to suspend or hibernate. Enable it when resuming. Add syscore callbacks to handle the package of the boot CPU (packages of non-boot CPUs are handled via CPU offline). Syscore ops always run on the boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend and hibernation. Syscore ops only run in these cases. Cc: 6.1+ <stable@vger.kernel.org> # 6.1+ Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> [ rjw: Comment adjustment, subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12thermal: gov_power_allocator: avoid inability to reset a cdevDi Shen1-1/+1
Commit 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low") adds an update flag to avoid triggering a thermal event when there is no need, and the thermal cdev is updated once when the temperature is low. But when the trips are writable, and switch_on_temp is set to be a higher value, the cooling device state may not be reset to 0, because last_temperature is smaller than switch_on_temp. For example: First: switch_on_temp=70 control_temp=85; Then userspace change the trip_temp: switch_on_temp=45 control_temp=55 cur_temp=54 Then userspace reset the trip_temp: switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54 At this time, the cooling device state should be reset to 0. However, because cur_temp(57) < switch_on_temp(70) last_temp(54) < switch_on_temp(70) ----> update = false, update is false, the cooling device state can not be reset. Using the observation that tz->passive can also be regarded as the temperature status, set the update flag to the tz->passive value. When the temperature drops below switch_on for the first time, the states of cooling devices can be reset once, and tz->passive is updated to 0. In the next round, because tz->passive is 0, cdev->state will not be updated. By using the tz->passive value as the "update" flag, the issue above can be solved, and the cooling devices can be updated only once when the temperature is low. Fixes: 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low") Cc: 5.13+ <stable@vger.kernel.org> # 5.13+ Suggested-by: Wei Wang <wvw@google.com> Signed-off-by: Di Shen <di.shen@unisoc.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12thermal: helpers: Rearrange thermal_cdev_set_cur_state()Rafael J. Wysocki1-6/+7
Change the code layout in thermal_cdev_set_cur_state() so it returns early on errors which is more consistent with what happens elsewhere. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12thermal: netlink: Rework notify API for cooling devicesRafael J. Wysocki3-15/+18
In analogy with some previous thermal netlink API changes, redefine thermal_notify_cdev_state_update(), thermal_notify_cdev_add() and thermal_notify_cdev_delete() to take a const cdev pointer as their first argument and let them extract the requisite information from there by themselves. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12thermal: core: Use kstrdup_const() during cooling device registrationChristophe JAILLET1-3/+3
Some *thermal_cooling_device_register() calls pass a string literal as the 'type' parameter, so kstrdup_const() can be used instead of kstrdup() to avoid a memory allocation in such cases. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12thermal/debugfs: Add thermal debugfs information for mitigation episodesDaniel Lezcano3-4/+417
The mitigation episodes are recorded. A mitigation episode happens when the first trip point is crossed the way up and then the way down. During this episode other trip points can be crossed also and are accounted for this mitigation episode. The interesting information is the average temperature at the trip point, the undershot and the overshot. The standard deviation of the mitigated temperature will be added later. The thermal debugfs directory structure tries to stay consistent with the sysfs one but in a very simplified way: thermal/ `-- thermal_zones |-- 0 | `-- mitigations `-- 1 `-- mitigations The content of the mitigations file has the following format: ,-Mitigation at 349988258us, duration=130136ms | trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) | | 0 | passive | 65000 | 2000 | 130136 | 68227 | 62500 | 75625 | | 1 | passive | 75000 | 2000 | 104209 | 74857 | 71666 | 77500 | ,-Mitigation at 272451637us, duration=75000ms | trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) | | 0 | passive | 65000 | 2000 | 75000 | 68561 | 62500 | 75000 | | 1 | passive | 75000 | 2000 | 60714 | 74820 | 70555 | 77500 | ,-Mitigation at 238184119us, duration=27316ms | trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) | | 0 | passive | 65000 | 2000 | 27316 | 73377 | 62500 | 75000 | | 1 | passive | 75000 | 2000 | 19468 | 75284 | 69444 | 77500 | ,-Mitigation at 39863713us, duration=136196ms | trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) | | 0 | passive | 65000 | 2000 | 136196 | 73922 | 62500 | 75000 | | 1 | passive | 75000 | 2000 | 91721 | 74386 | 69444 | 78125 | More information for a better understanding of the thermal behavior will be added after. The idea is to give detailed statistics information about the undershots and overshots, the temperature speed, etc... As all the information in a single file is too much, the idea would be to create a directory named with the mitigation timestamp where all data could be added. Please note this code is immune against trip ordering but not against a trip temperature change while a mitigation is happening. However, this situation should be extremely rare, perhaps not happening and we might question ourselves if something should be done in the core framework for other components first. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: White space fixups, rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12thermal/debugfs: Add thermal cooling device debugfs informationDaniel Lezcano7-6/+490
The thermal framework does not have any debug information except a sysfs stat which is a bit controversial. This one allocates big chunks of memory for every cooling devices with a high number of states and could represent on some systems in production several megabytes of memory for just a portion of it. As the sysfs is limited to a page size, the output is not exploitable with large data array and gets truncated. The patch provides the same information than sysfs except the transitions are dynamically allocated, thus they won't show more events than the ones which actually occurred. There is no longer a size limitation and it opens the field for more debugging information where the debugfs is designed for, not sysfs. The thermal debugfs directory structure tries to stay consistent with the sysfs one but in a very simplified way: thermal/ -- cooling_devices |-- 0 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table |-- 1 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table |-- 2 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table |-- 3 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table `-- 4 |-- clear |-- time_in_state_ms |-- total_trans `-- trans_table The content of the files in the cooling devices directory is the same as the sysfs one except for the trans_table which has the following format: Transition Hits 1->0 246 0->1 246 2->1 632 1->2 632 3->2 98 2->3 98 Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: White space fixups, rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-10Merge tag 'thermal-6.8-rc1' of ↵Linus Torvalds17-679/+876
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "These add support for the D1/T113s THS controller to the sun8i driver and a DT-based mechanism for platforms to indicate a preference to reboot (instead of shutting down) on crossing a critical trip point, fix issues, make other improvements (in the IPA governor, the Intel HFI driver, the exynos driver and the thermal netlink interface among other places) and clean up code. One long-standing issue addressed here is that trip point crossing notifications sent to user space might be unreliable due to the incorrect handling of trip point hysteresis in the thermal core: multiple notifications might be sent for the same event or there might be events without any notification at all. Specifics: - Add dynamic thresholds for trip point crossing detection to prevent trip point crossing notifications from being sent at incorrect times or not at all in some cases (Rafael J. Wysocki) - Fix synchronization issues related to the resume of thermal zones during a system-wide resume and allow thermal zones to be resumed concurrently (Rafael J. Wysocki) - Modify the thermal zone unregistration to wait for the given zone to go away completely before returning to the caller and rework the sysfs interface for trip points on top of that (Rafael J. Wysocki) - Fix a possible NULL pointer dereference in thermal zone registration error path (Rafael J. Wysocki) - Clean up the IPA thermal governor and modify it (with the help of a new governor callback) to avoid allocating and freeing memory every time its throttling callback is invoked (Lukasz Luba) - Make the IPA thermal governor handle thermal instance weight changes via sysfs correctly (Lukasz Luba) - Update the thermal netlink code to avoid sending messages if there are no recipients (Stanislaw Gruszka) - Convert Mediatek Thermal to the json-schema (Rafał Miłecki) - Fix thermal DT bindings issue on Loongson (Binbin Zhou) - Fix returning NULL instead of -ENODEV during thermal probe on Loogsoon (Binbin Zhou) - Add thermal DT binding for tsens on the SM8650 platform (Neil Armstrong) - Add reboot on the critical trip point crossing option feature (Fabio Estevam) - Use DEFINE_SIMPLE_DEV_PM_OPS do define PM functions for thermal suspend/resume on AmLogic (Uwe Kleine-König) - Add D1/T113s THS controller support to the Sun8i thermal control driver (Maxim Kiselev) - Fix example in the thermal DT binding for QCom SPMI (Johan Hovold) - Fix compilation warning in the tmon utility (Florian Eckert) - Add support for interrupt-based thermal configuration on Exynos along with a set of related cleanups (Mateusz Majewski) - Make the Intel HFI thermal driver enable an HFI instance (eg. processor package) from its first online CPU and disable it when the last CPU in it goes offline (Ricardo Neri) - Fix a kernel-doc warning and a spello in the cpuidle_cooling thermal driver (Randy Dunlap) - Move the .get_temp() thermal zone callback presence check to the thermal zone registration code (Daniel Lezcano) - Use the for_each_trip() macro for trip points table walks in a few places in the thermal core (Rafael J. Wysocki) - Make all trip point updates (via sysfs as well as from the platform firmware) trigger trip change notifications (Rafael J. Wysocki) - Drop redundant code from the thermal core and make one function in it take a const pointer argument (Rafael J. Wysocki)" * tag 'thermal-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits) thermal: trip: Constify thermal zone argument of thermal_zone_trip_id() thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline thermal: intel: hfi: Enable an HFI instance from its first online CPU thermal: intel: hfi: Refactor enabling code into helper functions thermal/drivers/exynos: Use set_trips ops thermal/drivers/exynos: Use BIT wherever possible thermal/drivers/exynos: Split initialization of TMU and the thermal zone thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210 thermal/drivers/exynos: Simplify regulator (de)initialization thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts thermal/drivers/exynos: Drop id field thermal/drivers/exynos: Remove an unnecessary field description tools/thermal/tmon: Fix compilation warning for wrong format dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names thermal/drivers/sun8i: Add D1/T113s THS controller support dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions thermal: amlogic: Make amlogic_thermal_disable() return void ...
2024-01-09thermal: netlink: Pass thermal zone pointer to notify routinesRafael J. Wysocki3-28/+28
There are several rountines in the thermal netlink API that take a thermal zone ID or a thermal zone type as their arguments, but from their callers perspective it would be more convenient to pass a thermal zone pointer to them and let them extract the necessary data from the given thermal zone object by themselves. Modify the code accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09thermal: netlink: Drop thermal_notify_tz_trip_add/delete()Rafael J. Wysocki2-46/+1
Because thermal_notify_tz_trip_add/delete() are never used, drop them entirely along with the related code. The addition or removal of trip points is not supported by the thermal core and is unlikely to be supported in the future, so it is also unlikely that these functions will ever be needed. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()Rafael J. Wysocki3-14/+20
Instead of requiring the callers of thermal_notify_tz_trip_up/down() to provide specific values needed to populate struct param in them, make them extract those values from objects passed by the callers via const pointers. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()Rafael J. Wysocki3-16/+17
Instead of requiring the caller of thermal_notify_tz_trip_change() to provide specific values needed to populate struct param in it, make it extract those values from objects passed to it by the caller via const pointers. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-05Merge branch 'thermal-intel'Rafael J. Wysocki1-26/+65
Merge changes in thermal control drivers for Intel platforms for 6.8-rc1: - Make the Intel HFI thermal driver enable an HFI instance (eg. processor package) from its first online CPU and disable it when the last CPU in it goes offline (Ricardo Neri). * thermal-intel: thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline thermal: intel: hfi: Enable an HFI instance from its first online CPU thermal: intel: hfi: Refactor enabling code into helper functions
2024-01-04thermal: trip: Constify thermal zone argument of thermal_zone_trip_id()Rafael J. Wysocki2-2/+2
Because thermal_zone_trip_id() does not update the thermal zone object passed to it, its pointer argument representing the thermal zone can be const, so adjust its definition accordingly. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-01-03thermal: intel: hfi: Disable an HFI instance when all its CPUs go offlineRicardo Neri1-0/+35
In preparation to support hibernation, add functionality to disable an HFI instance during CPU offline. The last CPU of an instance that goes offline will disable such instance. The Intel Software Development Manual states that the operating system must wait for the hardware to set MSR_IA32_PACKAGE_THERM_STATUS[26] after disabling an HFI instance to ensure that it will no longer write on the HFI memory. Some processors, however, do not ever set such bit. Wait a minimum of 2ms to give time hardware to complete any pending memory writes. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-03thermal: intel: hfi: Enable an HFI instance from its first online CPURicardo Neri1-7/+10
Previously, HFI instances were never disabled once enabled. A CPU in an instance only had to check during boot whether another CPU had previously initialized the instance and its corresponding data structure. A subsequent changeset will add functionality to disable instances to support hibernation. Such change will also make possible to disable an HFI instance during runtime via CPU hotplug. Enable an HFI instance from the first of its CPUs that comes online. This covers the boot, CPU hotplug, and resume-from-suspend cases. It also covers systems with one or more HFI instances (i.e., packages). Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-03thermal: intel: hfi: Refactor enabling code into helper functionsRicardo Neri1-21/+22
In preparation for the addition of a suspend notifier, wrap the logic to enable HFI and program its memory buffer into helper functions. Both the CPU hotplug callback and the suspend notifier will use them. This refactoring does not introduce functional changes. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-02Merge tag 'thermal-v6.8-rc1' of ↵Rafael J. Wysocki7-278/+313
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux into thermal Merge thermal control material for 6.8-rc1 from Daniel Lezcano: "- Converted Mediatek Thermal to the json-schema (Rafał Miłecki) - Fixed DT bindings issue on Loongson (Binbin Zhou) - Fixed returning NULL instead of -ENODEV on Loogsoo (Binbin Zhou) - Added the DT binding for the tsens on SM8650 platform (Neil Armstrong) - Added a reboot on critical option feature (Fabio Estevam) - Made usage of DEFINE_SIMPLE_DEV_PM_OPS on AmLogic (Uwe Kleine-König) - Added the D1/T113s THS controller support on Sun8i (Maxim Kiselev) - Fixed example in the DT binding for QCom SPMI (Johan Hovold) - Fixed compilation warning for the tmon utility (Florian Eckert) - Added interrupt based configuration on Exynos along with a set of related cleanups (Mateusz Majewski)" * tag 'thermal-v6.8-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (24 commits) thermal/drivers/exynos: Use set_trips ops thermal/drivers/exynos: Use BIT wherever possible thermal/drivers/exynos: Split initialization of TMU and the thermal zone thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210 thermal/drivers/exynos: Simplify regulator (de)initialization thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts thermal/drivers/exynos: Drop id field thermal/drivers/exynos: Remove an unnecessary field description tools/thermal/tmon: Fix compilation warning for wrong format dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names thermal/drivers/sun8i: Add D1/T113s THS controller support dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions thermal: amlogic: Make amlogic_thermal_disable() return void thermal/thermal_of: Allow rebooting after critical temp reboot: Introduce thermal_zone_device_critical_reboot() thermal/core: Prepare for introduction of thermal reboot dt-bindings: thermal-zones: Document critical-action ...
2024-01-02thermal/drivers/exynos: Use set_trips opsMateusz Majewski1-180/+205
Currently, each trip point defined in the device tree corresponds to a single hardware interrupt. This commit instead switches to using two hardware interrupts, whose values are set dynamically using the set_trips callback. Additionally, the critical temperature threshold is handled specifically. Setting interrupts in this way also fixes a long-standing lockdep warning, which was caused by calling thermal_zone_get_trips with our lock being held. Do note that this requires TMU initialization to be split into two parts, as done by the parent commit: parts of the initialization call into the thermal_zone_device structure and so must be done after its registration, but the initialization is also responsible for setting up calibration, which must be done before thermal_zone_device registration, which will call set_trips for the first time; if the calibration is not done in time, the interrupt values will be silently wrong! Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-10-m.majewski2@samsung.com
2024-01-02thermal/drivers/exynos: Use BIT wherever possibleMateusz Majewski1-12/+12
The original driver did not use that macro and it allows us to make our intentions slightly clearer. Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-9-m.majewski2@samsung.com
2024-01-02thermal/drivers/exynos: Split initialization of TMU and the thermal zoneMateusz Majewski1-34/+50
This will be needed in the future, as the thermal zone subsystem might call our callbacks right after devm_thermal_of_zone_register. Currently we just make get_temp return EAGAIN in such case, but this will not be possible with state-modifying callbacks, for instance set_trips. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-8-m.majewski2@samsung.com
2024-01-02thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210Mateusz Majewski1-14/+3
Exynos 4210 supports setting a base threshold value, which is added to all trip points. This might be useful, but is not really necessary in our usecase, so we always set it to 0 to simplify the code a bit. Additionally, this change makes it so that we convert the value to the calibrated one in a slightly different place. This is more correct morally, though it does not make any change when single-point calibration is being used (which is the case currently). Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-7-m.majewski2@samsung.com
2024-01-02thermal/drivers/exynos: Simplify regulator (de)initializationMateusz Majewski1-34/+15
We rewrite the initialization to enable the regulator as part of devm, which allows us to not handle the struct instance manually. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-6-m.majewski2@samsung.com
2024-01-02thermal/drivers/exynos: Handle devm_regulator_get_optional return value ↵Mateusz Majewski1-2/+10
correctly Currently, if regulator is required in the SoC, but devm_regulator_get_optional fails for whatever reason, the execution will proceed without propagating the error. Meanwhile there is no reason to output the error in case of -ENODEV. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-5-m.majewski2@samsung.com
2024-01-02thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to ↵Mateusz Majewski1-20/+9
threaded interrupts The workqueue boilerplate is mostly one-to-one what the threaded interrupts do. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-4-m.majewski2@samsung.com
2024-01-02thermal/drivers/exynos: Drop id fieldMateusz Majewski1-6/+0
We do not use the value, and only Exynos 7 defines this alias anyway. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20231201095625.301884-3-m.majewski2@samsung.com