summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-04-27ALSA: hda/tas2781: correct the register for pow calibrated dataShenghao Ding1-2/+2
commit 0b6f0ff01a4a8c1b66c600263465976d57dcc1a3 upstream. Calibrated data was written into an incorrect register, which cause speaker protection sometimes malfuctions Fixes: 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver") Signed-off-by: Shenghao Ding <shenghao-ding@ti.com> Cc: <stable@vger.kernel.org> Message-ID: <20240406132010.341-1-shenghao-ding@ti.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27ALSA: seq: ump: Fix conversion from MIDI2 to MIDI1 UMP messagesTakashi Iwai1-1/+1
commit f25f17dc5c6a5e3f2014d44635f0c0db45224efe upstream. The conversion from MIDI2 to MIDI1 UMP messages had a leftover artifact (superfluous bit shift), and this resulted in the bogus type check, leading to empty outputs. Let's fix it. Fixes: e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events") Cc: <stable@vger.kernel.org> Link: https://github.com/alsa-project/alsa-utils/issues/262 Message-ID: <20240419100442.14806-1-tiwai@suse.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27net/mlx5: E-switch, store eswitch pointer before registering devlink_paramShay Drory2-6/+7
[ Upstream commit 0553e753ea9ee724acaf6b3dfc7354702af83567 ] Next patch will move devlink register to be first. Therefore, whenever mlx5 will register a param, the user will be notified. In order to notify the user, devlink is using the get() callback of the param. Hence, resources that are being used by the get() callback must be set before the devlink param is registered. Therefore, store eswitch pointer inside mdev before registering the param. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240409190820.227554-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27block: propagate partition scanning errors to the BLKRRPART ioctlChristoph Hellwig3-11/+23
[ Upstream commit 752863bddacab6b5c5164b1df8c8b2e3a175ee28 ] Commit 4601b4b130de ("block: reopen the device in blkdev_reread_part") lost the propagation of I/O errors from the low-level read of the partition table to the user space caller of the BLKRRPART. Apparently some user space relies on, so restore the propagation. This isn't exactly pretty as other block device open calls explicitly do not are about these errors, so add a new BLK_OPEN_STRICT_SCAN to opt into the error propagation. Fixes: 4601b4b130de ("block: reopen the device in blkdev_reread_part") Reported-by: Saranya Muruganandam <saranyamohan@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20240417144743.2277601-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27x86/cpufeatures: Fix dependencies for GFNI, VAES, and VPCLMULQDQEric Biggers1-3/+3
[ Upstream commit 9543f6e26634537997b6e909c20911b7bf4876de ] Fix cpuid_deps[] to list the correct dependencies for GFNI, VAES, and VPCLMULQDQ. These features don't depend on AVX512, and there exist CPUs that support these features but not AVX512. GFNI actually doesn't even depend on AVX. This prevents GFNI from being unnecessarily disabled if AVX is disabled to mitigate the GDS vulnerability. This also prevents all three features from being unnecessarily disabled if AVX512VL (or its dependency AVX512F) were to be disabled, but it looks like there isn't any case where this happens anyway. Fixes: c128dbfa0f87 ("x86/cpufeatures: Enable new SSE/AVX/AVX512 CPU features") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20240417060434.47101-1-ebiggers@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27x86/bugs: Fix BHI retpoline checkJosh Poimboeuf1-4/+7
[ Upstream commit 69129794d94c544810e68b2b4eaa7e44063f9bf2 ] Confusingly, X86_FEATURE_RETPOLINE doesn't mean retpolines are enabled, as it also includes the original "AMD retpoline" which isn't a retpoline at all. Also replace cpu_feature_enabled() with boot_cpu_has() because this is before alternatives are patched and cpu_feature_enabled()'s fallback path is slower than plain old boot_cpu_has(). Fixes: ec9404e40e8f ("x86/bhi: Add BHI mitigation knob") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/ad3807424a3953f0323c011a643405619f2a4927.1712944776.git.jpoimboe@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27selftests/powerpc/papr-vpd: Fix missing variable initializationNathan Lynch1-1/+1
[ Upstream commit 210cfef579260ed6c3b700e7baeae51a5e183f43 ] The "close handle without consuming VPD" testcase has inconsistent results because it fails to initialize the location code object it passes to ioctl() to create a VPD handle. Initialize the location code to the empty string as intended. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 9118c5d32bdd ("powerpc/selftests: Add test for papr-vpd") Reported-by: Geetika Moolchandani <geetika@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240404-papr-vpd-test-uninit-lc-v2-1-37bff46c65a5@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27clk: mediatek: mt7988-infracfg: fix clocks for 2nd PCIe portDaniel Golle1-1/+1
[ Upstream commit d3e8a91a848a5941e3c31ecebd6b2612b37e01a6 ] Due to what seems to be an undocumented oddity in MediaTek's MT7988 SoC design the CLK_INFRA_PCIE_PERI_26M_CK_P2 clock requires CLK_INFRA_PCIE_PERI_26M_CK_P3 to be enabled. This currently leads to PCIe port 2 not working in Linux. Reflect the apparent relationship in the clk driver to make sure PCIe port 2 of the MT7988 SoC works. Fixes: 4b4719437d85f ("clk: mediatek: add drivers for MT7988 SoC") Suggested-by: Sam Shih <sam.shih@mediatek.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://lore.kernel.org/r/1da2506a51f970706bf4ec9509dd04e0471065e5.1710367453.git.daniel@makrotopia.org Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27clk: mediatek: Do a runtime PM get on controllers during probePin-yen Lin1-0/+15
[ Upstream commit 2f7b1d8b5505efb0057cd1ab85fca206063ea4c3 ] mt8183-mfgcfg has a mutual dependency with genpd during the probing stage, which leads to a deadlock in the following call stack: CPU0: genpd_lock --> clk_prepare_lock genpd_power_off_work_fn() genpd_lock() generic_pm_domain::power_off() clk_unprepare() clk_prepare_lock() CPU1: clk_prepare_lock --> genpd_lock clk_register() __clk_core_init() clk_prepare_lock() clk_pm_runtime_get() genpd_lock() Do a runtime PM get at the probe function to make sure clk_register() won't acquire the genpd lock. Instead of only modifying mt8183-mfgcfg, do this on all mediatek clock controller probings because we don't believe this would cause any regression. Verified on MT8183 and MT8192 Chromebooks. Fixes: acddfc2c261b ("clk: mediatek: Add MT8183 clock support") Signed-off-by: Pin-yen Lin <treapking@chromium.org> Link: https://lore.kernel.org/r/20240312115249.3341654-1-treapking@chromium.org Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27clk: Get runtime PM before walking tree for clk_summaryStephen Boyd1-2/+12
[ Upstream commit 9d1e795f754db1ac3344528b7af0b17b8146f321 ] Similar to the previous commit, we should make sure that all devices are runtime resumed before printing the clk_summary through debugfs. Failure to do so would result in a deadlock if the thread is resuming a device to print clk state and that device is also runtime resuming in another thread, e.g the screen is turning on and the display driver is starting up. We remove the calls to clk_pm_runtime_{get,put}() in this path because they're superfluous now that we know the devices are runtime resumed. This also squashes a bug where the return value of clk_pm_runtime_get() wasn't checked, leading to an RPM count underflow on error paths. Fixes: 1bb294a7981c ("clk: Enable/Disable runtime PM for clk_summary") Cc: Taniya Das <quic_tdas@quicinc.com> Cc: Douglas Anderson <dianders@chromium.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240325184204.745706-6-sboyd@kernel.org Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27clk: Get runtime PM before walking tree during disable_unusedStephen Boyd1-12/+105
[ Upstream commit e581cf5d216289ef292d1a4036d53ce90e122469 ] Doug reported [1] the following hung task: INFO: task swapper/0:1 blocked for more than 122 seconds. Not tainted 5.15.149-21875-gf795ebc40eb8 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00000008 Call trace: __switch_to+0xf4/0x1f4 __schedule+0x418/0xb80 schedule+0x5c/0x10c rpm_resume+0xe0/0x52c rpm_resume+0x178/0x52c __pm_runtime_resume+0x58/0x98 clk_pm_runtime_get+0x30/0xb0 clk_disable_unused_subtree+0x58/0x208 clk_disable_unused_subtree+0x38/0x208 clk_disable_unused_subtree+0x38/0x208 clk_disable_unused_subtree+0x38/0x208 clk_disable_unused_subtree+0x38/0x208 clk_disable_unused+0x4c/0xe4 do_one_initcall+0xcc/0x2d8 do_initcall_level+0xa4/0x148 do_initcalls+0x5c/0x9c do_basic_setup+0x24/0x30 kernel_init_freeable+0xec/0x164 kernel_init+0x28/0x120 ret_from_fork+0x10/0x20 INFO: task kworker/u16:0:9 blocked for more than 122 seconds. Not tainted 5.15.149-21875-gf795ebc40eb8 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u16:0 state:D stack: 0 pid: 9 ppid: 2 flags:0x00000008 Workqueue: events_unbound deferred_probe_work_func Call trace: __switch_to+0xf4/0x1f4 __schedule+0x418/0xb80 schedule+0x5c/0x10c schedule_preempt_disabled+0x2c/0x48 __mutex_lock+0x238/0x488 __mutex_lock_slowpath+0x1c/0x28 mutex_lock+0x50/0x74 clk_prepare_lock+0x7c/0x9c clk_core_prepare_lock+0x20/0x44 clk_prepare+0x24/0x30 clk_bulk_prepare+0x40/0xb0 mdss_runtime_resume+0x54/0x1c8 pm_generic_runtime_resume+0x30/0x44 __genpd_runtime_resume+0x68/0x7c genpd_runtime_resume+0x108/0x1f4 __rpm_callback+0x84/0x144 rpm_callback+0x30/0x88 rpm_resume+0x1f4/0x52c rpm_resume+0x178/0x52c __pm_runtime_resume+0x58/0x98 __device_attach+0xe0/0x170 device_initial_probe+0x1c/0x28 bus_probe_device+0x3c/0x9c device_add+0x644/0x814 mipi_dsi_device_register_full+0xe4/0x170 devm_mipi_dsi_device_register_full+0x28/0x70 ti_sn_bridge_probe+0x1dc/0x2c0 auxiliary_bus_probe+0x4c/0x94 really_probe+0xcc/0x2c8 __driver_probe_device+0xa8/0x130 driver_probe_device+0x48/0x110 __device_attach_driver+0xa4/0xcc bus_for_each_drv+0x8c/0xd8 __device_attach+0xf8/0x170 device_initial_probe+0x1c/0x28 bus_probe_device+0x3c/0x9c deferred_probe_work_func+0x9c/0xd8 process_one_work+0x148/0x518 worker_thread+0x138/0x350 kthread+0x138/0x1e0 ret_from_fork+0x10/0x20 The first thread is walking the clk tree and calling clk_pm_runtime_get() to power on devices required to read the clk hardware via struct clk_ops::is_enabled(). This thread holds the clk prepare_lock, and is trying to runtime PM resume a device, when it finds that the device is in the process of resuming so the thread schedule()s away waiting for the device to finish resuming before continuing. The second thread is runtime PM resuming the same device, but the runtime resume callback is calling clk_prepare(), trying to grab the prepare_lock waiting on the first thread. This is a classic ABBA deadlock. To properly fix the deadlock, we must never runtime PM resume or suspend a device with the clk prepare_lock held. Actually doing that is near impossible today because the global prepare_lock would have to be dropped in the middle of the tree, the device runtime PM resumed/suspended, and then the prepare_lock grabbed again to ensure consistency of the clk tree topology. If anything changes with the clk tree in the meantime, we've lost and will need to start the operation all over again. Luckily, most of the time we're simply incrementing or decrementing the runtime PM count on an active device, so we don't have the chance to schedule away with the prepare_lock held. Let's fix this immediate problem that can be triggered more easily by simply booting on Qualcomm sc7180. Introduce a list of clk_core structures that have been registered, or are in the process of being registered, that require runtime PM to operate. Iterate this list and call clk_pm_runtime_get() on each of them without holding the prepare_lock during clk_disable_unused(). This way we can be certain that the runtime PM state of the devices will be active and resumed so we can't schedule away while walking the clk tree with the prepare_lock held. Similarly, call clk_pm_runtime_put() without the prepare_lock held to properly drop the runtime PM reference. We remove the calls to clk_pm_runtime_{get,put}() in this path because they're superfluous now that we know the devices are runtime resumed. Reported-by: Douglas Anderson <dianders@chromium.org> Closes: https://lore.kernel.org/all/20220922084322.RFC.2.I375b6b9e0a0a5348962f004beb3dafee6a12dfbb@changeid/ [1] Closes: https://issuetracker.google.com/328070191 Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Krzysztof Kozlowski <krzk@kernel.org> Fixes: 9a34b45397e5 ("clk: Add support for runtime PM") Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240325184204.745706-5-sboyd@kernel.org Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27clk: Initialize struct clk_core kref earlierStephen Boyd1-15/+13
[ Upstream commit 9d05ae531c2cff20d5d527f04e28d28e04379929 ] Initialize this kref once we allocate memory for the struct clk_core so that we can reuse the release function to free any memory associated with the structure. This mostly consolidates code, but also clarifies that the kref lifetime exists once the container structure (struct clk_core) is allocated instead of leaving it in a half-baked state for most of __clk_core_init(). Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240325184204.745706-4-sboyd@kernel.org Stable-dep-of: e581cf5d2162 ("clk: Get runtime PM before walking tree during disable_unused") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27clk: Remove prepare_lock hold assertion in __clk_release()Stephen Boyd1-2/+0
[ Upstream commit 8358a76cfb47c9a5af627a0c4e7168aa14fa25f6 ] Removing this assertion lets us move the kref_put() call outside the prepare_lock section. We don't need to hold the prepare_lock here to free memory and destroy the clk_core structure. We've already unlinked the clk from the clk tree and by the time the release function runs nothing holds a reference to the clk_core anymore so anything with the pointer can't access the memory that's being freed anyway. Way back in commit 496eadf821c2 ("clk: Use lockdep asserts to find missing hold of prepare_lock") we didn't need to have this assertion either. Fixes: 496eadf821c2 ("clk: Use lockdep asserts to find missing hold of prepare_lock") Cc: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240325184204.745706-2-sboyd@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27interconnect: Don't access req_list while it's being manipulatedMike Tipton1-0/+8
[ Upstream commit de1bf25b6d771abdb52d43546cf57ad775fb68a1 ] The icc_lock mutex was split into separate icc_lock and icc_bw_lock mutexes in [1] to avoid lockdep splats. However, this didn't adequately protect access to icc_node::req_list. The icc_set_bw() function will eventually iterate over req_list while only holding icc_bw_lock, but req_list can be modified while only holding icc_lock. This causes races between icc_set_bw(), of_icc_get(), and icc_put(). Example A: CPU0 CPU1 ---- ---- icc_set_bw(path_a) mutex_lock(&icc_bw_lock); icc_put(path_b) mutex_lock(&icc_lock); aggregate_requests() hlist_for_each_entry(r, ... hlist_del(... <r = invalid pointer> Example B: CPU0 CPU1 ---- ---- icc_set_bw(path_a) mutex_lock(&icc_bw_lock); path_b = of_icc_get() of_icc_get_by_index() mutex_lock(&icc_lock); path_find() path_init() aggregate_requests() hlist_for_each_entry(r, ... hlist_add_head(... <r = invalid pointer> Fix this by ensuring icc_bw_lock is always held before manipulating icc_node::req_list. The additional places icc_bw_lock is held don't perform any memory allocations, so we should still be safe from the original lockdep splats that motivated the separate locks. [1] commit af42269c3523 ("interconnect: Fix locking for runpm vs reclaim") Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com> Fixes: af42269c3523 ("interconnect: Fix locking for runpm vs reclaim") Reviewed-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20240305225652.22872-1-quic_mdtipton@quicinc.com Signed-off-by: Georgi Djakov <djakov@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27interconnect: qcom: x1e80100: Remove inexistent ACV_PERF BCMKonrad Dybcio1-26/+0
[ Upstream commit 59097a2a5ecadb0f025232c665fd11c8ae1e1f58 ] Booting the kernel on X1E results in a message like: [ 2.561524] qnoc-x1e80100 interconnect-0: ACV_PERF could not find RPMh address And indeed, taking a look at cmd-db, no such BCM exists. Remove it. Fixes: 9f196772841e ("interconnect: qcom: Add X1E80100 interconnect provider driver") Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Mike Tipton <quic_mdtipton@quicinc.com> Link: https://lore.kernel.org/r/20240302-topic-faux_bcm_x1e-v1-1-c40fab7c4bc5@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27platform/x86/amd/pmc: Extend Framework 13 quirk to more BIOSesMario Limonciello1-0/+9
[ Upstream commit f609e7b1b49e4d15cf107d2069673ee63860c398 ] BIOS 03.05 still hasn't fixed the spurious IRQ1 issue. As it's still being worked on there is still a possibility that it won't need to apply to future BIOS releases. Add a quirk for BIOS 03.05 as well. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240410141046.433-1-mario.limonciello@amd.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27thermal/debugfs: Add missing count increment to thermal_debug_tz_trip_up()Rafael J. Wysocki1-0/+1
[ Upstream commit b552f63cd43735048bbe9bfbb7a9dcfce166fbdd ] The count field in struct trip_stats, representing the number of times the zone temperature was above the trip point, needs to be incremented in thermal_debug_tz_trip_up(), for two reasons. First, if a trip point is crossed on the way up for the first time, thermal_debug_update_temp() called from update_temperature() does not see it because it has not been added to trips_crossed[] array in the thermal zone's struct tz_debugfs object yet. Therefore, when thermal_debug_tz_trip_up() is called after that, the trip point's count value is 0, and the attempt to divide by it during the average temperature computation leads to a divide error which causes the kernel to crash. Setting the count to 1 before the division by incrementing it fixes this problem. Second, if a trip point is crossed on the way up, but it has been crossed on the way up already before, its count value needs to be incremented to make a record of the fact that the zone temperature is above the trip now. Without doing that, if the mitigations applied after crossing the trip cause the zone temperature to drop below its threshold, the count will not be updated for this episode at all and the average temperature in the trip statistics record will be somewhat higher than it should be. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Cc :6.8+ <stable@vger.kernel.org> # 6.8+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27ALSA: hda/realtek: Fix volumn control of ThinkBook 16P Gen4Huayu Zhang1-2/+2
[ Upstream commit dca5f4dfa925b51becee65031869e917e6229620 ] change HDA & AMP configuration from ALC287_FIXUP_CS35L41_I2C_2 to ALC287_FIXUP_MG_RTKC_CSAMP_CS35L41_I2C_THINKPAD for ThinkBook 16P Gen4 models to fix volumn control issue (cannot fully mute). Signed-off-by: Huayu Zhang <zhanghuayu1233@qq.com> Fixes: 6214e24cae9b ("ALSA: hda/realtek: Add quirks for Lenovo Thinkbook 16P laptops") Message-ID: <tencent_37EB880C5E5BD99D21C16B288115C4545F06@qq.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27drm/radeon: make -fstrict-flex-arrays=3 happyAlex Deucher1-2/+6
[ Upstream commit 0ba753bc7e79e49556e81b0d09b2de1aa558553b ] The driver parses a union where the layout up through the first array is the same, however, the array has different sizes depending on the elements in the union. Be explicit to fix the UBSAN checker. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3323 Fixes: df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27drm/panel: visionox-rm69299: don't unregister DSI deviceDmitry Baryshkov1-2/+0
[ Upstream commit 9e4d3f4f34455abbaa9930bf6b7575a5cd081496 ] The DSI device for the panel was registered by the DSI host, so it is an error to unregister it from the panel driver. Drop the call to mipi_dsi_device_unregister(). Fixes: c7f66d32dd43 ("drm/panel: add support for rm69299 visionox panel") Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240404-drop-panel-unregister-v1-1-9f56953c5fb9@linaro.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27thunderbolt: Reset topology created by the boot firmwareSanath S5-18/+38
commit 59a54c5f3dbde00b8ad30aef27fe35b1fe07bf5c upstream. Boot firmware (typically BIOS) might have created tunnels of its own. The tunnel configuration that it does might be sub-optimal. For instance it may only support HBR2 monitors so the DisplayPort tunnels it created may limit Linux graphics drivers. In addition there is an issue on some AMD based systems where the BIOS does not allocate enough PCIe resources for future topology extension. By resetting the USB4 topology the PCIe links will be reset as well allowing Linux to re-allocate. This aligns the behavior with Windows Connection Manager. We already issued host router reset for USB4 v2 routers, now extend it to USB4 v1 routers as well. For pre-USB4 (that's Apple systems) we leave it as is and continue to discover the existing tunnels. Suggested-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Sanath S <Sanath.S@amd.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27thunderbolt: Make tb_switch_reset() support Thunderbolt 2, 3 and USB4 routersSanath S2-14/+111
commit ec8162b3f0683ae08a21f20517cf49272b07ee0b upstream. Currently tb_switch_reset() only did something for Thunderbolt 1 devices. Expand this to support all generations, including USB4, and both host and device routers. Signed-off-by: Sanath S <Sanath.S@amd.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27thunderbolt: Introduce tb_path_deactivate_hop()Sanath S2-0/+14
commit b35c1d7b11da8c08b14147bbe87c2c92f7a83f8b upstream. This function can be used to clear path config space of an adapter. Make it available for other files in this driver. Signed-off-by: Sanath S <Sanath.S@amd.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27thunderbolt: Introduce tb_port_reset()Sanath S5-0/+97
commit 01da6b99d49f60b1edead44e33569b1a2e9f49b7 upstream. Introduce a function that issues Downstream Port Reset to a USB4 port. This supports Thunderbolt 2, 3 and USB4 routers. Signed-off-by: Sanath S <Sanath.S@amd.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27userfaultfd: change src_folio after ensuring it's unpinned in UFFDIO_MOVELokesh Gidra1-3/+3
commit c0205eaf3af9f5db14d4b5ee4abacf4a583c3c50 upstream. Commit d7a08838ab74 ("mm: userfaultfd: fix unexpected change to src_folio when UFFDIO_MOVE fails") moved the src_folio->{mapping, index} changing to after clearing the page-table and ensuring that it's not pinned. This avoids failure of swapout+migration and possibly memory corruption. However, the commit missed fixing it in the huge-page case. Link: https://lkml.kernel.org/r/20240404171726.2302435-1-lokeshgidra@google.com Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27drm/v3d: Don't increment `enabled_ns` twiceMaíra Canal1-4/+0
[ Upstream commit 35f4f8c9fc972248055096d63b782060e473311b ] The commit 509433d8146c ("drm/v3d: Expose the total GPU usage stats on sysfs") introduced the calculation of global GPU stats. For the regards, it used the already existing infrastructure provided by commit 09a93cc4f7d1 ("drm/v3d: Implement show_fdinfo() callback for GPU usage stats"). While adding global GPU stats calculation ability, the author forgot to delete the existing one. Currently, the value of `enabled_ns` is incremented twice by the end of the job, when it should be added just once. Therefore, delete the leftovers from commit 509433d8146c ("drm/v3d: Expose the total GPU usage stats on sysfs"). Fixes: 509433d8146c ("drm/v3d: Expose the total GPU usage stats on sysfs") Reported-by: Tvrtko Ursulin <tursulin@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240403203517.731876-2-mcanal@igalia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27drm: nv04: Fix out of bounds accessMikhail Kobuk1-6/+7
[ Upstream commit cf92bb778eda7830e79452c6917efa8474a30c1e ] When Output Resource (dcb->or) value is assigned in fabricate_dcb_output(), there may be out of bounds access to dac_users array in case dcb->or is zero because ffs(dcb->or) is used as index there. The 'or' argument of fabricate_dcb_output() must be interpreted as a number of bit to set, not value. Utilize macros from 'enum nouveau_or' in calls instead of hardcoding. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 2e5702aff395 ("drm/nouveau: fabricate DCB encoder table for iMac G4") Fixes: 670820c0e6a9 ("drm/nouveau: Workaround incorrect DCB entry on a GeForce3 Ti 200.") Signed-off-by: Mikhail Kobuk <m.kobuk@ispras.ru> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240411110854.16701-1-m.kobuk@ispras.ru Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27iommufd: Add config needed for iommufd_fail_nthMuhammad Usama Anjum1-0/+2
[ Upstream commit 2760c51b8040d7cffedc337939e7475a17cc4b19 ] Add FAULT_INJECTION_DEBUG_FS and FAILSLAB configurations to the kconfig fragment for the iommfd selftests. These kconfigs are needed by the iommufd_fail_nth test. Fixes: a9af47e382a4 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP") Link: https://lore.kernel.org/r/20240325090048.1423908-1-usama.anjum@collabora.com Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27iommufd: Add missing IOMMUFD_DRIVER kconfig for the selftestJason Gunthorpe1-0/+1
[ Upstream commit 8541323285994528ad5be2c1bdc759e6c83b936e ] Some kconfigs don't automatically include this symbol which results in sub functions for some of the dirty tracking related things that are non-functional. Thus the test suite will fail. select IOMMUFD_DRIVER in the IOMMUFD_TEST kconfig to fix it. Fixes: a9af47e382a4 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP") Link: https://lore.kernel.org/r/20240327182050.GA1363414@ziepe.ca Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27s390/cio: fix race condition during online processingPeter Oberparleiter1-5/+8
[ Upstream commit 2d8527f2f911fab84aec04df4788c0c23af3df48 ] A race condition exists in ccw_device_set_online() that can cause the online process to fail, leaving the affected device in an inconsistent state. As a result, subsequent attempts to set that device online fail with return code ENODEV. The problem occurs when a path verification request arrives after a wait for final device state completed, but before the result state is evaluated. Fix this by ensuring that the CCW-device lock is held between determining final state and checking result state. Note that since: commit 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") path verification requests are much more likely to occur during boot, resulting in an increased chance of this race condition occurring. Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27s390/qdio: handle deferred cc1Peter Oberparleiter1-5/+23
[ Upstream commit 607638faf2ff1cede37458111496e7cc6c977f6f ] A deferred condition code 1 response indicates that I/O was not started and should be retried. The current QDIO implementation handles a cc1 response as I/O error, resulting in a failed QDIO setup. This can happen for example when a path verification request arrives at the same time as QDIO setup I/O is started. Fix this by retrying the QDIO setup I/O when a cc1 response is received. Note that since commit 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") commit 5ef1dc40ffa6 ("s390/cio: fix invalid -EBUSY on ccw_device_start") deferred cc1 responses are much more likely to occur. See the commit message of the latter for more background information. Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27perf lock contention: Add a missing NULL checkNamhyung Kim1-1/+4
[ Upstream commit f3408580bac8ce5cd76e7391e529c0a22e7c7eb2 ] I got a report for a failure in BPF verifier on a recent kernel with perf lock contention command. It checks task->sighand->siglock without checking if sighand is NULL or not. Let's add one. ; if (&curr->sighand->siglock == (void *)lock) 265: (79) r1 = *(u64 *)(r0 +2624) ; frame1: R0_w=trusted_ptr_task_struct(off=0,imm=0) ; R1_w=rcu_ptr_or_null_sighand_struct(off=0,imm=0) 266: (b7) r2 = 0 ; frame1: R2_w=0 267: (0f) r1 += r2 R1 pointer arithmetic on rcu_ptr_or_null_ prohibited, null-check it first processed 164 insns (limit 1000000) max_states_per_insn 1 total_states 15 peak_states 15 mark_read 5 -- END PROG LOAD LOG -- libbpf: prog 'contention_end': failed to load: -13 libbpf: failed to load object 'lock_contention_bpf' libbpf: failed to load BPF skeleton 'lock_contention_bpf': -13 Failed to load lock-contention BPF skeleton lock contention BPF setup failed lock contention did not detect any lock contention Fixes: 1811e82767dcc ("perf lock contention: Track and show siglock with address") Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240409225542.1870999-1-namhyung@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27perf annotate: Make sure to call symbol__annotate2() in TUINamhyung Kim2-1/+4
[ Upstream commit 2b8dbf69ec60faf6c7db49e57d7f316409ccec92 ] The symbol__annotate2() initializes some data structures needed by TUI. It has a logic to prevent calling it multiple times by checking if it has the annotated source. But data type profiling uses a different code (symbol__annotate) to allocate the annotated lines in advance. So TUI missed to call symbol__annotate2() when it shows the annotation browser. Make symbol__annotate() reentrant and handle that situation properly. This fixes a crash in the annotation browser started by perf report in TUI like below. $ perf report -s type,sym --tui # and press 'a' key and then move down Fixes: 81e57deec325 ("perf report: Support data type profiling") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240405211800.1412920-2-namhyung@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27RDMA/mlx5: Fix port number for counter query in multi-port configurationMichael Guralnik1-1/+2
[ Upstream commit be121ffb384f53e966ee7299ffccc6eeb61bc73d ] Set the correct port when querying PPCNT in multi-port configuration. Distinguish between cases where switchdev mode was enabled to multi-port configuration and don't overwrite the queried port to 1 in multi-port case. Fixes: 74b30b3ad5ce ("RDMA/mlx5: Set local port to one when accessing counters") Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://lore.kernel.org/r/9bfcc8ade958b760a51408c3ad654a01b11f7d76.1712134988.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27RDMA/cm: Print the old state when cm_destroy_id gets timeoutMark Zhang1-4/+7
[ Upstream commit b68e1acb5834ed1a2ad42d9d002815a8bae7c0b6 ] The old state is helpful for debugging, as the current state is always IB_CM_IDLE when timeout happens. Fixes: 96d9cbe2f2ff ("RDMA/cm: add timeout to cm_destroy_id wait") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/20240322112049.2022994-1-markzhang@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27RDMA/rxe: Fix the problem "mutex_destroy missing"Yanjun.Zhu1-0/+2
[ Upstream commit 481047d7e8391d3842ae59025806531cdad710d9 ] When a mutex lock is not used any more, the function mutex_destroy should be called to mark the mutex lock uninitialized. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yanjun.Zhu <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20240314065140.27468-1-yanjun.zhu@linux.dev Reviewed-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27NFSD: fix endianness issue in nfsd4_encode_fattr4Vasily Gorbik1-24/+23
[ Upstream commit f488138b526715c6d2568d7329c4477911be4210 ] The nfs4 mount fails with EIO on 64-bit big endian architectures since v6.7. The issue arises from employing a union in the nfsd4_encode_fattr4() function to overlay a 32-bit array with a 64-bit values based bitmap, which does not function as intended. Address the endianness issue by utilizing bitmap_from_arr32() to copy 32-bit attribute masks into a bitmap in an endianness-agnostic manner. Cc: stable@vger.kernel.org Fixes: fce7913b13d0 ("NFSD: Use a bitmask loop to encode FATTR4 results") Link: https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2060217 Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27net: ethernet: ti: am65-cpsw-nuss: cleanup DMA Channels before using themSiddharth Vadapalli1-0/+18
[ Upstream commit c24cd679b075b0e953ea167b0aa2b2d59e4eba7f ] The TX and RX DMA Channels used by the driver to exchange data with CPSW are not guaranteed to be in a clean state during driver initialization. The Bootloader could have used the same DMA Channels without cleaning them up in the event of failure. Thus, reset and disable the DMA Channels to ensure that they are in a clean state before using them. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Reported-by: Schuyler Patton <spatton@ti.com> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/20240417095425.2253876-1-s-vadapalli@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27net: ravb: Allow RX loop to move past DMA mapping errorsPaul Barker1-12/+13
[ Upstream commit a892493a343494bd6bab9d098593932077ff3c43 ] The RX loops in ravb_rx_gbeth() and ravb_rx_rcar() skip to the next loop iteration if a zero-length descriptor is seen (indicating a DMA mapping error). However, the current RX descriptor index `priv->cur_rx[q]` was incremented at the end of the loop and so would not be incremented when we skip to the next loop iteration. This would cause the loop to keep seeing the same zero-length descriptor instead of moving on to the next descriptor. As the loop counter `i` still increments, the loop would eventually terminate so there is no risk of being stuck here forever - but we should still fix this to avoid wasting cycles. To fix this, the RX descriptor index is incremented at the top of the loop, in the for statement itself. The assignments of `entry` and `desc` are brought into the loop to avoid the need for duplication. Fixes: d8b48911fd24 ("ravb: fix ring memory allocation") Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27net: ravb: Count packets instead of descriptors in R-Car RX pathPaul Barker1-13/+8
[ Upstream commit def52db470df28d6f43cacbd21137f03b9502073 ] The units of "work done" in the RX path should be packets instead of descriptors. Descriptors which are used by the hardware to record error conditions or are empty in the case of a DMA mapping error should not count towards our RX work budget. Also make the limit variable unsigned as it can never be negative. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27ravb: Group descriptor types used in Rx ringNiklas Söderlund2-30/+33
[ Upstream commit 4123c3fbf8632e5c553222bf1c10b3a3e0a8dc06 ] The Rx ring can either be made up of normal or extended descriptors, not a mix of the two at the same time. Make this explicit by grouping the two variables in a rx_ring union. The extension of the storage for more than one queue of normal descriptors from a single to NUM_RX_QUEUE queues have no practical effect. But aids in making the code readable as the code that uses it already piggyback on other members of struct ravb_private that are arrays of max length NUM_RX_QUEUE, e.g. rx_desc_dma. This will also make further refactoring easier. While at it, rename the normal descriptor Rx ring to make it clear it's not strictly related to the GbEthernet E-MAC IP found in RZ/G2L, normal descriptors could be used on R-Car SoCs too. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: def52db470df ("net: ravb: Count packets instead of descriptors in R-Car RX path") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27net: ethernet: mtk_eth_soc: fix WED + wifi resetFelix Fietkau1-5/+1
[ Upstream commit 94667949ec3bbb2218c46ad0a0e7274c8832e494 ] The WLAN + WED reset sequence relies on being able to receive interrupts from the card, in order to synchronize individual steps with the firmware. When WED is stopped, leave interrupts running and rely on the driver turning off unwanted ones. WED DMA also needs to be disabled before resetting. Fixes: f78cd9c783e0 ("net: ethernet: mtk_wed: update mtk_wed_stop") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20240416082330.82564-1-nbd@nbd.name Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27net/sched: Fix mirred deadlock on device recursionEric Dumazet3-0/+8
[ Upstream commit 0f022d32c3eca477fbf79a205243a6123ed0fe11 ] When the mirred action is used on a classful egress qdisc and a packet is mirrored or redirected to self we hit a qdisc lock deadlock. See trace below. [..... other info removed for brevity....] [ 82.890906] [ 82.890906] ============================================ [ 82.890906] WARNING: possible recursive locking detected [ 82.890906] 6.8.0-05205-g77fadd89fe2d-dirty #213 Tainted: G W [ 82.890906] -------------------------------------------- [ 82.890906] ping/418 is trying to acquire lock: [ 82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at: __dev_queue_xmit+0x1778/0x3550 [ 82.890906] [ 82.890906] but task is already holding lock: [ 82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at: __dev_queue_xmit+0x1778/0x3550 [ 82.890906] [ 82.890906] other info that might help us debug this: [ 82.890906] Possible unsafe locking scenario: [ 82.890906] [ 82.890906] CPU0 [ 82.890906] ---- [ 82.890906] lock(&sch->q.lock); [ 82.890906] lock(&sch->q.lock); [ 82.890906] [ 82.890906] *** DEADLOCK *** [ 82.890906] [..... other info removed for brevity....] Example setup (eth0->eth0) to recreate tc qdisc add dev eth0 root handle 1: htb default 30 tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \ action mirred egress redirect dev eth0 Another example(eth0->eth1->eth0) to recreate tc qdisc add dev eth0 root handle 1: htb default 30 tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \ action mirred egress redirect dev eth1 tc qdisc add dev eth1 root handle 1: htb default 30 tc filter add dev eth1 handle 1: protocol ip prio 2 matchall \ action mirred egress redirect dev eth0 We fix this by adding an owner field (CPU id) to struct Qdisc set after root qdisc is entered. When the softirq enters it a second time, if the qdisc owner is the same CPU, the packet is dropped to break the loop. Reported-by: Mingshuai Ren <renmingshuai@huawei.com> Closes: https://lore.kernel.org/netdev/20240314111713.5979-1-renmingshuai@huawei.com/ Fixes: 3bcb846ca4cf ("net: get rid of spin_trylock() in net_tx_action()") Fixes: e578d9c02587 ("net: sched: use counter to break reclassify loops") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20240415210728.36949-1-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27netfilter: nf_tables: fix memleak in map from abort pathPablo Neira Ayuso1-2/+14
[ Upstream commit 86a1471d7cde792941109b93b558b5dc078b9ee9 ] The delete set command does not rely on the transaction object for element removal, therefore, a combination of delete element + delete set from the abort path could result in restoring twice the refcount of the mapping. Check for inactive element in the next generation for the delete element command in the abort path, skip restoring state if next generation bit has been already cleared. This is similar to the activate logic using the set walk iterator. [ 6170.286929] ------------[ cut here ]------------ [ 6170.286939] WARNING: CPU: 6 PID: 790302 at net/netfilter/nf_tables_api.c:2086 nf_tables_chain_destroy+0x1f7/0x220 [nf_tables] [ 6170.287071] Modules linked in: [...] [ 6170.287633] CPU: 6 PID: 790302 Comm: kworker/6:2 Not tainted 6.9.0-rc3+ #365 [ 6170.287768] RIP: 0010:nf_tables_chain_destroy+0x1f7/0x220 [nf_tables] [ 6170.287886] Code: df 48 8d 7d 58 e8 69 2e 3b df 48 8b 7d 58 e8 80 1b 37 df 48 8d 7d 68 e8 57 2e 3b df 48 8b 7d 68 e8 6e 1b 37 df 48 89 ef eb c4 <0f> 0b 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 0f [ 6170.287895] RSP: 0018:ffff888134b8fd08 EFLAGS: 00010202 [ 6170.287904] RAX: 0000000000000001 RBX: ffff888125bffb28 RCX: dffffc0000000000 [ 6170.287912] RDX: 0000000000000003 RSI: ffffffffa20298ab RDI: ffff88811ebe4750 [ 6170.287919] RBP: ffff88811ebe4700 R08: ffff88838e812650 R09: fffffbfff0623a55 [ 6170.287926] R10: ffffffff8311d2af R11: 0000000000000001 R12: ffff888125bffb10 [ 6170.287933] R13: ffff888125bffb10 R14: dead000000000122 R15: dead000000000100 [ 6170.287940] FS: 0000000000000000(0000) GS:ffff888390b00000(0000) knlGS:0000000000000000 [ 6170.287948] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6170.287955] CR2: 00007fd31fc00710 CR3: 0000000133f60004 CR4: 00000000001706f0 [ 6170.287962] Call Trace: [ 6170.287967] <TASK> [ 6170.287973] ? __warn+0x9f/0x1a0 [ 6170.287986] ? nf_tables_chain_destroy+0x1f7/0x220 [nf_tables] [ 6170.288092] ? report_bug+0x1b1/0x1e0 [ 6170.287986] ? nf_tables_chain_destroy+0x1f7/0x220 [nf_tables] [ 6170.288092] ? report_bug+0x1b1/0x1e0 [ 6170.288104] ? handle_bug+0x3c/0x70 [ 6170.288112] ? exc_invalid_op+0x17/0x40 [ 6170.288120] ? asm_exc_invalid_op+0x1a/0x20 [ 6170.288132] ? nf_tables_chain_destroy+0x2b/0x220 [nf_tables] [ 6170.288243] ? nf_tables_chain_destroy+0x1f7/0x220 [nf_tables] [ 6170.288366] ? nf_tables_chain_destroy+0x2b/0x220 [nf_tables] [ 6170.288483] nf_tables_trans_destroy_work+0x588/0x590 [nf_tables] Fixes: 591054469b3e ("netfilter: nf_tables: revisit chain/object refcounting from elements") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27gpiolib: swnode: Remove wrong header inclusionAndy Shevchenko1-1/+0
[ Upstream commit 69ffed4b62523bbc85511f150500329d28aba356 ] The flags in the software node properties are supposed to be the GPIO lookup flags, which are provided by gpio/machine.h, as the software nodes are the kernel internal thing and doesn't need to rely to any of ABIs. Fixes: e7f9ff5dc90c ("gpiolib: add support for software nodes") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27netfilter: nf_tables: restore set elements when delete set failsPablo Neira Ayuso5-20/+45
[ Upstream commit e79b47a8615d42c68aaeb68971593333667382ed ] From abort path, nft_mapelem_activate() needs to restore refcounters to the original state. Currently, it uses the set->ops->walk() to iterate over these set elements. The existing set iterator skips inactive elements in the next generation, this does not work from the abort path to restore the original state since it has to skip active elements instead (not inactive ones). This patch moves the check for inactive elements to the set iterator callback, then it reverses the logic for the .activate case which needs to skip active elements. Toggle next generation bit for elements when delete set command is invoked and call nft_clear() from .activate (abort) path to restore the next generation bit. The splat below shows an object in mappings memleak: [43929.457523] ------------[ cut here ]------------ [43929.457532] WARNING: CPU: 0 PID: 1139 at include/net/netfilter/nf_tables.h:1237 nft_setelem_data_deactivate+0xe4/0xf0 [nf_tables] [...] [43929.458014] RIP: 0010:nft_setelem_data_deactivate+0xe4/0xf0 [nf_tables] [43929.458076] Code: 83 f8 01 77 ab 49 8d 7c 24 08 e8 37 5e d0 de 49 8b 6c 24 08 48 8d 7d 50 e8 e9 5c d0 de 8b 45 50 8d 50 ff 89 55 50 85 c0 75 86 <0f> 0b eb 82 0f 0b eb b3 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 [43929.458081] RSP: 0018:ffff888140f9f4b0 EFLAGS: 00010246 [43929.458086] RAX: 0000000000000000 RBX: ffff8881434f5288 RCX: dffffc0000000000 [43929.458090] RDX: 00000000ffffffff RSI: ffffffffa26d28a7 RDI: ffff88810ecc9550 [43929.458093] RBP: ffff88810ecc9500 R08: 0000000000000001 R09: ffffed10281f3e8f [43929.458096] R10: 0000000000000003 R11: ffff0000ffff0000 R12: ffff8881434f52a0 [43929.458100] R13: ffff888140f9f5f4 R14: ffff888151c7a800 R15: 0000000000000002 [43929.458103] FS: 00007f0c687c4740(0000) GS:ffff888390800000(0000) knlGS:0000000000000000 [43929.458107] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [43929.458111] CR2: 00007f58dbe5b008 CR3: 0000000123602005 CR4: 00000000001706f0 [43929.458114] Call Trace: [43929.458118] <TASK> [43929.458121] ? __warn+0x9f/0x1a0 [43929.458127] ? nft_setelem_data_deactivate+0xe4/0xf0 [nf_tables] [43929.458188] ? report_bug+0x1b1/0x1e0 [43929.458196] ? handle_bug+0x3c/0x70 [43929.458200] ? exc_invalid_op+0x17/0x40 [43929.458211] ? nft_setelem_data_deactivate+0xd7/0xf0 [nf_tables] [43929.458271] ? nft_setelem_data_deactivate+0xe4/0xf0 [nf_tables] [43929.458332] nft_mapelem_deactivate+0x24/0x30 [nf_tables] [43929.458392] nft_rhash_walk+0xdd/0x180 [nf_tables] [43929.458453] ? __pfx_nft_rhash_walk+0x10/0x10 [nf_tables] [43929.458512] ? rb_insert_color+0x2e/0x280 [43929.458520] nft_map_deactivate+0xdc/0x1e0 [nf_tables] [43929.458582] ? __pfx_nft_map_deactivate+0x10/0x10 [nf_tables] [43929.458642] ? __pfx_nft_mapelem_deactivate+0x10/0x10 [nf_tables] [43929.458701] ? __rcu_read_unlock+0x46/0x70 [43929.458709] nft_delset+0xff/0x110 [nf_tables] [43929.458769] nft_flush_table+0x16f/0x460 [nf_tables] [43929.458830] nf_tables_deltable+0x501/0x580 [nf_tables] Fixes: 628bd3e49cba ("netfilter: nf_tables: drop map element references from preparation phase") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27netfilter: nf_tables: missing iterator type in lookup walkPablo Neira Ayuso2-1/+3
[ Upstream commit efefd4f00c967d00ad7abe092554ffbb70c1a793 ] Add missing decorator type to lookup expression and tighten WARN_ON_ONCE check in pipapo to spot earlier that this is unset. Fixes: 29b359cf6d95 ("netfilter: nft_set_pipapo: walk over current view on netlink dump") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27s390/ism: Properly fix receive message buffer allocationGerd Bayer1-9/+28
[ Upstream commit 83781384a96b95e2b6403d3c8a002b2c89031770 ] Since [1], dma_alloc_coherent() does not accept requests for GFP_COMP anymore, even on archs that may be able to fulfill this. Functionality that relied on the receive buffer being a compound page broke at that point: The SMC-D protocol, that utilizes the ism device driver, passes receive buffers to the splice processor in a struct splice_pipe_desc with a single entry list of struct pages. As the buffer is no longer a compound page, the splice processor now rejects requests to handle more than a page worth of data. Replace dma_alloc_coherent() and allocate a buffer with folio_alloc and create a DMA map for it with dma_map_page(). Since only receive buffers on ISM devices use DMA, qualify the mapping as FROM_DEVICE. Since ISM devices are available on arch s390, only, and on that arch all DMA is coherent, there is no need to introduce and export some kind of dma_sync_to_cpu() method to be called by the SMC-D protocol layer. Analogously, replace dma_free_coherent by a two step dma_unmap_page, then folio_put to free the receive buffer. [1] https://lore.kernel.org/all/20221113163535.884299-1-hch@lst.de/ Fixes: c08004eede4b ("s390/ism: don't pass bogus GFP_ flags to dma_alloc_coherent") Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27net: dsa: mt7530: fix port mirroring for MT7988 SoC switchArınç ÜNAL1-4/+6
[ Upstream commit 2c606d138518cc69f09c35929abc414a99e3a28f ] The "MT7988A Wi-Fi 7 Generation Router Platform: Datasheet (Open Version) v0.1" document shows bits 16 to 18 as the MIRROR_PORT field of the CPU forward control register. Currently, the MT7530 DSA subdriver configures bits 0 to 2 of the CPU forward control register which breaks the port mirroring feature for the MT7988 SoC switch. Fix this by using the MT7531_MIRROR_PORT_GET() and MT7531_MIRROR_PORT_SET() macros which utilise the correct bits. Fixes: 110c18bfed41 ("net: dsa: mt7530: introduce driver for MT7988 built-in switch") Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Acked-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-27net: dsa: mt7530: fix mirroring frames received on local portArınç ÜNAL2-0/+10
[ Upstream commit d59cf049c8378677053703e724808836f180888e ] This switch intellectual property provides a bit on the ARL global control register which controls allowing mirroring frames which are received on the local port (monitor port). This bit is unset after reset. This ability must be enabled to fully support the port mirroring feature on this switch intellectual property. Therefore, this patch fixes the traffic not being reflected on a port, which would be configured like below: tc qdisc add dev swp0 clsact tc filter add dev swp0 ingress matchall skip_sw \ action mirred egress mirror dev swp0 As a side note, this configuration provides the hairpinning feature for a single port. Fixes: 37feab6076aa ("net: dsa: mt7530: add support for port mirroring") Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>