summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2024-03-28drm/i915: Pre-populate the cursor physical dma addressVille Syrjälä3-3/+12
Calling i915_gem_object_get_dma_address() from the vblank evade critical section triggers might_sleep(). While we know that we've already pinned the framebuffer and thus i915_gem_object_get_dma_address() will in fact not sleep in this case, it seems reasonable to keep the unconditional might_sleep() for maximum coverage. So let's instead pre-populate the dma address during fb pinning, which all happens before we enter the vblank evade critical section. We can use u32 for the dma address as this class of hardware doesn't support >32bit addresses. Cc: stable@vger.kernel.org Fixes: 0225a90981c8 ("drm/i915: Make cursor plane registers unlocked") Reported-by: Borislav Petkov <bp@alien8.de> Closes: https://lore.kernel.org/intel-gfx/20240227100342.GAZd2zfmYcPS_SndtO@fat_crate.local/ Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240325175738.3440-1-ville.syrjala@linux.intel.com Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> (cherry picked from commit c1289a5c3594cf04caa94ebf0edeb50c62009f1f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/gt: Reset queue_priority_hint on parkingChris Wilson2-3/+3
Originally, with strict in order execution, we could complete execution only when the queue was empty. Preempt-to-busy allows replacement of an active request that may complete before the preemption is processed by HW. If that happens, the request is retired from the queue, but the queue_priority_hint remains set, preventing direct submission until after the next CS interrupt is processed. This preempt-to-busy race can be triggered by the heartbeat, which will also act as the power-management barrier and upon completion allow us to idle the HW. We may process the completion of the heartbeat, and begin parking the engine before the CS event that restores the queue_priority_hint, causing us to fail the assertion that it is MIN. <3>[ 166.210729] __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 166.210781] Dumping ftrace buffer: <0>[ 166.210795] --------------------------------- ... <0>[ 167.302811] drm_fdin-1097 2..s1. 165741070us : trace_ports: 0000:00:02.0 rcs0: promote { ccid:20 1217:2 prio 0 } <0>[ 167.302861] drm_fdin-1097 2d.s2. 165741072us : execlists_submission_tasklet: 0000:00:02.0 rcs0: preempting last=1217:2, prio=0, hint=2147483646 <0>[ 167.302928] drm_fdin-1097 2d.s2. 165741072us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 1217:2, current 0 <0>[ 167.302992] drm_fdin-1097 2d.s2. 165741073us : __i915_request_submit: 0000:00:02.0 rcs0: fence 3:4660, current 4659 <0>[ 167.303044] drm_fdin-1097 2d.s1. 165741076us : execlists_submission_tasklet: 0000:00:02.0 rcs0: context:3 schedule-in, ccid:40 <0>[ 167.303095] drm_fdin-1097 2d.s1. 165741077us : trace_ports: 0000:00:02.0 rcs0: submit { ccid:40 3:4660* prio 2147483646 } <0>[ 167.303159] kworker/-89 11..... 165741139us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence c90:2, current 2 <0>[ 167.303208] kworker/-89 11..... 165741148us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:c90 unpin <0>[ 167.303272] kworker/-89 11..... 165741159us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 1217:2, current 2 <0>[ 167.303321] kworker/-89 11..... 165741166us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:1217 unpin <0>[ 167.303384] kworker/-89 11..... 165741170us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 3:4660, current 4660 <0>[ 167.303434] kworker/-89 11d..1. 165741172us : __intel_context_retire: 0000:00:02.0 rcs0: context:1216 retire runtime: { total:56028ns, avg:56028ns } <0>[ 167.303484] kworker/-89 11..... 165741198us : __engine_park: 0000:00:02.0 rcs0: parked <0>[ 167.303534] <idle>-0 5d.H3. 165741207us : execlists_irq_handler: 0000:00:02.0 rcs0: semaphore yield: 00000040 <0>[ 167.303583] kworker/-89 11..... 165741397us : __intel_context_retire: 0000:00:02.0 rcs0: context:1217 retire runtime: { total:325575ns, avg:0ns } <0>[ 167.303756] kworker/-89 11..... 165741777us : __intel_context_retire: 0000:00:02.0 rcs0: context:c90 retire runtime: { total:0ns, avg:0ns } <0>[ 167.303806] kworker/-89 11..... 165742017us : __engine_park: __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 167.303811] --------------------------------- <4>[ 167.304722] ------------[ cut here ]------------ <2>[ 167.304725] kernel BUG at drivers/gpu/drm/i915/gt/intel_engine_pm.c:283! <4>[ 167.304731] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 167.304734] CPU: 11 PID: 89 Comm: kworker/11:1 Tainted: G W 6.8.0-rc2-CI_DRM_14193-gc655e0fd2804+ #1 <4>[ 167.304736] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 <4>[ 167.304738] Workqueue: i915-unordered retire_work_handler [i915] <4>[ 167.304839] RIP: 0010:__engine_park+0x3fd/0x680 [i915] <4>[ 167.304937] Code: 00 48 c7 c2 b0 e5 86 a0 48 8d 3d 00 00 00 00 e8 79 48 d4 e0 bf 01 00 00 00 e8 ef 0a d4 e0 31 f6 bf 09 00 00 00 e8 03 49 c0 e0 <0f> 0b 0f 0b be 01 00 00 00 e8 f5 61 fd ff 31 c0 e9 34 fd ff ff 48 <4>[ 167.304940] RSP: 0018:ffffc9000059fce0 EFLAGS: 00010246 <4>[ 167.304942] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000006 <4>[ 167.304944] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009 <4>[ 167.304946] RBP: ffff8881330ca1b0 R08: 0000000000000001 R09: 0000000000000001 <4>[ 167.304947] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881330ca000 <4>[ 167.304948] R13: ffff888110f02aa0 R14: ffff88812d1d0205 R15: ffff88811277d4f0 <4>[ 167.304950] FS: 0000000000000000(0000) GS:ffff88844f780000(0000) knlGS:0000000000000000 <4>[ 167.304952] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 167.304953] CR2: 00007fc362200c40 CR3: 000000013306e003 CR4: 0000000000770ef0 <4>[ 167.304955] PKRU: 55555554 <4>[ 167.304957] Call Trace: <4>[ 167.304958] <TASK> <4>[ 167.305573] ____intel_wakeref_put_last+0x1d/0x80 [i915] <4>[ 167.305685] i915_request_retire.part.0+0x34f/0x600 [i915] <4>[ 167.305800] retire_requests+0x51/0x80 [i915] <4>[ 167.305892] intel_gt_retire_requests_timeout+0x27f/0x700 [i915] <4>[ 167.305985] process_scheduled_works+0x2db/0x530 <4>[ 167.305990] worker_thread+0x18c/0x350 <4>[ 167.305993] kthread+0xfe/0x130 <4>[ 167.305997] ret_from_fork+0x2c/0x50 <4>[ 167.306001] ret_from_fork_asm+0x1b/0x30 <4>[ 167.306004] </TASK> It is necessary for the queue_priority_hint to be lower than the next request submission upon waking up, as we rely on the hint to decide when to kick the tasklet to submit that first request. Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") Closes: https://gitlab.freedesktop.org/drm/intel/issues/10154 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318135906.716055-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit 98850e96cf811dc2d0a7d0af491caff9f5d49c1e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/vma: Fix UAF on destroy against retire raceJanusz Krzysztofik1-7/+43
Object debugging tools were sporadically reporting illegal attempts to free a still active i915 VMA object when parking a GT believed to be idle. [161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915] [161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0 ... [161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1 [161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 [161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915] [161.360592] RIP: 0010:debug_print_object+0x80/0xb0 ... [161.361347] debug_object_free+0xeb/0x110 [161.361362] i915_active_fini+0x14/0x130 [i915] [161.361866] release_references+0xfe/0x1f0 [i915] [161.362543] i915_vma_parked+0x1db/0x380 [i915] [161.363129] __gt_park+0x121/0x230 [i915] [161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915] That has been tracked down to be happening when another thread is deactivating the VMA inside __active_retire() helper, after the VMA's active counter has been already decremented to 0, but before deactivation of the VMA's object is reported to the object debugging tool. We could prevent from that race by serializing i915_active_fini() with __active_retire() via ref->tree_lock, but that wouldn't stop the VMA from being used, e.g. from __i915_vma_retire() called at the end of __active_retire(), after that VMA has been already freed by a concurrent i915_vma_destroy() on return from the i915_active_fini(). Then, we should rather fix the issue at the VMA level, not in i915_active. Since __i915_vma_parked() is called from __gt_park() on last put of the GT's wakeref, the issue could be addressed by holding the GT wakeref long enough for __active_retire() to complete before that wakeref is released and the GT parked. I believe the issue was introduced by commit d93939730347 ("drm/i915: Remove the vma refcount") which moved a call to i915_active_fini() from a dropped i915_vma_release(), called on last put of the removed VMA kref, to i915_vma_parked() processing path called on last put of a GT wakeref. However, its visibility to the object debugging tool was suppressed by a bug in i915_active that was fixed two weeks later with commit e92eb246feb9 ("drm/i915/active: Fix missing debug object activation"). A VMA associated with a request doesn't acquire a GT wakeref by itself. Instead, it depends on a wakeref held directly by the request's active intel_context for a GT associated with its VM, and indirectly on that intel_context's engine wakeref if the engine belongs to the same GT as the VMA's VM. Those wakerefs are released asynchronously to VMA deactivation. Fix the issue by getting a wakeref for the VMA's GT when activating it, and putting that wakeref only after the VMA is deactivated. However, exclude global GTT from that processing path, otherwise the GPU never goes idle. Since __i915_vma_retire() may be called from atomic contexts, use async variant of wakeref put. Also, to avoid circular locking dependency, take care of acquiring the wakeref before VM mutex when both are needed. v7: Add inline comments with justifications for: - using untracked variants of intel_gt_pm_get/put() (Nirmoy), - using async variant of _put(), - not getting the wakeref in case of a global GTT, - always getting the first wakeref outside vm->mutex. v6: Since __i915_vma_active/retire() callbacks are not serialized, storing a wakeref tracking handle inside struct i915_vma is not safe, and there is no other good place for that. Use untracked variants of intel_gt_pm_get/put_async(). v5: Replace "tile" with "GT" across commit description (Rodrigo), - avoid mentioning multi-GT case in commit description (Rodrigo), - explain why we need to take a temporary wakeref unconditionally inside i915_vma_pin_ww() (Rodrigo). v4: Refresh on top of commit 5e4e06e4087e ("drm/i915: Track gt pm wakerefs") (Andi), - for more easy backporting, split out removal of former insufficient workarounds and move them to separate patches (Nirmoy). - clean up commit message and description a bit. v3: Identify root cause more precisely, and a commit to blame, - identify and drop former workarounds, - update commit message and description. v2: Get the wakeref before VM mutex to avoid circular locking dependency, - drop questionable Fixes: tag. Fixes: d93939730347 ("drm/i915: Remove the vma refcount") Closes: https://gitlab.freedesktop.org/drm/intel/issues/8875 Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: stable@vger.kernel.org # v5.19+ Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240305143747.335367-6-janusz.krzysztofik@linux.intel.com (cherry picked from commit f3c71b2ded5c4367144a810ef25f998fd1d6c381) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915: Do not print 'pxp init failed with 0' when it succeedJosé Roberto de Souza1-1/+1
It is misleading, if the intention was to also print something in case it succeed it should have a different string. Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Fixes: 698e19da2914 ("drm/i915: Skip pxp init if gt is wedged") Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240320210547.71937-1-jose.souza@intel.com (cherry picked from commit d437099ab21cd4c6ce5d578b765df642d759c929) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915: Do not match JSL in ehl_combo_pll_div_frac_wa_needed()Jonathon Hall1-1/+1
Since commit 0c65dc062611 ("drm/i915/jsl: s/JSL/JASPERLAKE for platform/subplatform defines"), boot freezes on a Jasper Lake tablet (Librem 11), usually with graphical corruption on the eDP display, but sometimes just a black screen. This commit was included in 6.6 and later. That commit was intended to refactor EHL and JSL macros, but the change to ehl_combo_pll_div_frac_wa_needed() started matching JSL incorrectly when it was only intended to match EHL. It replaced: return ((IS_PLATFORM(i915, INTEL_ELKHARTLAKE) && IS_JSL_EHL_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) || with: return (((IS_ELKHARTLAKE(i915) || IS_JASPERLAKE(i915)) && IS_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) || Remove IS_JASPERLAKE() to fix the regression. Signed-off-by: Jonathon Hall <jonathon.hall@puri.sm> Cc: stable@vger.kernel.org Fixes: 0c65dc062611 ("drm/i915/jsl: s/JSL/JASPERLAKE for platform/subplatform defines") Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240313135424.3731410-1-jonathon.hall@puri.sm Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 1ef48859317b2a77672dea8682df133abf9c44ed) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/hwmon: Fix locking inversion in sysfs getterJanusz Krzysztofik1-18/+19
In i915 hwmon sysfs getter path we now take a hwmon_lock, then acquire an rpm wakeref. That results in lock inversion: <4> [197.079335] ====================================================== <4> [197.085473] WARNING: possible circular locking dependency detected <4> [197.091611] 6.8.0-rc7-Patchwork_129026v7-gc4dc92fb1152+ #1 Not tainted <4> [197.098096] ------------------------------------------------------ <4> [197.104231] prometheus-node/839 is trying to acquire lock: <4> [197.109680] ffffffff82764d80 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x9a/0x350 <4> [197.116939] but task is already holding lock: <4> [197.122730] ffff88811b772a40 (&hwmon->hwmon_lock){+.+.}-{3:3}, at: hwm_energy+0x4b/0x100 [i915] <4> [197.131543] which lock already depends on the new lock. ... <4> [197.507922] Chain exists of: fs_reclaim --> &gt->reset.mutex --> &hwmon->hwmon_lock <4> [197.518528] Possible unsafe locking scenario: <4> [197.524411] CPU0 CPU1 <4> [197.528916] ---- ---- <4> [197.533418] lock(&hwmon->hwmon_lock); <4> [197.537237] lock(&gt->reset.mutex); <4> [197.543376] lock(&hwmon->hwmon_lock); <4> [197.549682] lock(fs_reclaim); ... <4> [197.632548] Call Trace: <4> [197.634990] <TASK> <4> [197.637088] dump_stack_lvl+0x64/0xb0 <4> [197.640738] check_noncircular+0x15e/0x180 <4> [197.652968] check_prev_add+0xe9/0xce0 <4> [197.656705] __lock_acquire+0x179f/0x2300 <4> [197.660694] lock_acquire+0xd8/0x2d0 <4> [197.673009] fs_reclaim_acquire+0xa1/0xd0 <4> [197.680478] __kmalloc+0x9a/0x350 <4> [197.689063] acpi_ns_internalize_name.part.0+0x4a/0xb0 <4> [197.694170] acpi_ns_get_node_unlocked+0x60/0xf0 <4> [197.720608] acpi_ns_get_node+0x3b/0x60 <4> [197.724428] acpi_get_handle+0x57/0xb0 <4> [197.728164] acpi_has_method+0x20/0x50 <4> [197.731896] acpi_pci_set_power_state+0x43/0x120 <4> [197.736485] pci_power_up+0x24/0x1c0 <4> [197.740047] pci_pm_default_resume_early+0x9/0x30 <4> [197.744725] pci_pm_runtime_resume+0x2d/0x90 <4> [197.753911] __rpm_callback+0x3c/0x110 <4> [197.762586] rpm_callback+0x58/0x70 <4> [197.766064] rpm_resume+0x51e/0x730 <4> [197.769542] rpm_resume+0x267/0x730 <4> [197.773020] rpm_resume+0x267/0x730 <4> [197.776498] rpm_resume+0x267/0x730 <4> [197.779974] __pm_runtime_resume+0x49/0x90 <4> [197.784055] __intel_runtime_pm_get+0x19/0xa0 [i915] <4> [197.789070] hwm_energy+0x55/0x100 [i915] <4> [197.793183] hwm_read+0x9a/0x310 [i915] <4> [197.797124] hwmon_attr_show+0x36/0x120 <4> [197.800946] dev_attr_show+0x15/0x60 <4> [197.804509] sysfs_kf_seq_show+0xb5/0x100 Acquire the wakeref before the lock and hold it as long as the lock is also held. Follow that pattern across the whole source file where similar lock inversion can happen. v2: Keep hardware read under the lock so the whole operation of updating energy from hardware is still atomic (Guenter), - instead, acquire the rpm wakeref before the lock and hold it as long as the lock is held, - use the same aproach for other similar places across the i915_hwmon.c source file (Rodrigo). Fixes: 1b44019a93e2 ("drm/i915/guc: Disable PL1 power limit when loading GuC firmware") Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: <stable@vger.kernel.org> # v6.5+ Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240311203500.518675-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit 71b218771426ea84c0e0148a2b7ac52c1f76e792) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/dsb: Fix DSB vblank waits when using VRRVille Syrjälä1-0/+14
Looks like the undelayed vblank gets signalled exactly when the active period ends. That is a problem for DSB+VRR when we are already in vblank and expect DSB to start executing as soon as we send the push. Instead of starting, the DSB just keeps on waiting for the undelayed vblank which won't signal until the end of the next frame's active period, which is far too late. The end result is that DSB won't have even started executing by the time the flips/etc. have completed. We then wait for an extra 1ms, after which we terminate the DSB and report a timeout: [drm] *ERROR* [CRTC:80:pipe A] DSB 0 timed out waiting for idle (current head=0xfedf4000, head=0x0, tail=0x1080) To fix this let's configure DSB to use the so called VRR "safe window" instead of the undelayed vblank to trigger the DSB vblank logic, when VRR is enabled. Cc: stable@vger.kernel.org Fixes: 34d8311f4a1c ("drm/i915/dsb: Re-instate DSB for LUT updates") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9927 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306040806.21697-3-ville.syrjala@linux.intel.com Reviewed-by: Animesh Manna <animesh.manna@intel.com> (cherry picked from commit 41429d9b68367596eb3d6d5961e6295c284622a7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/vrr: Generate VRR "safe window" for DSBVille Syrjälä2-4/+5
Looks like TRANS_CHICKEN bit 31 means something totally different depending on the platform: TGL: generate VRR "safe window" for DSB ADL/DG2: make TRANS_SET_CONTEXT_LATENCY effective with VRR So far we've only set this on ADL/DG2, but when using DSB+VRR we also need to set it on TGL. And a quick test on MTL says it doesn't need this bit for either of those purposes, even though it's still documented as valid in bspec. Cc: stable@vger.kernel.org Fixes: 34d8311f4a1c ("drm/i915/dsb: Re-instate DSB for LUT updates") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9927 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306040806.21697-2-ville.syrjala@linux.intel.com Reviewed-by: Animesh Manna <animesh.manna@intel.com> (cherry picked from commit 810e4519a1b34b5a0ff0eab32e5b184f533c5ee9) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/display/debugfs: Fix duplicate checks in i915_drrs_statusBhanuprakash Modem1-3/+2
Remove duplicate checks for debugfs entry "DRRS capable:". Fixes: 20af10845864 ("drm/i915/display/debugfs: New entry "DRRS capable" to i915_drrs_status") Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Cc: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com> Signed-off-by: Bhanuprakash Modem <bhanuprakash.modem@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240227123833.2799647-2-bhanuprakash.modem@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 3d81fceb60f20fe2ceed2198636ee6dc9ef46775) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/drrs: Refactor CPU transcoder DRRS checkBhanuprakash Modem3-10/+14
Rename cpu_transcoder_has_drrs() to intel_cpu_transcoder_has_drrs() and move it to intel_drrs.[ch]. V2: - Move helpers to intel_drrs.[ch] (Jani) - Fix commit message (Jani) Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Cc: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com> Signed-off-by: Bhanuprakash Modem <bhanuprakash.modem@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228055502.2857819-1-bhanuprakash.modem@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 2d04f8158548103c082190c8dbf6a19097e2423e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/mtl: Update workaround 14018575942Tejas Upadhyay1-0/+1
Applying WA 14018575942 only on Compute engine has impact on some apps like chrome. Updating this WA to apply on Render engine as well as it is helping with performance on Chrome. Note: There is no concern from media team thus not applying WA on media engines. We will revisit if any issues reported from media team. V2(Matt): - Use correct WA number Fixes: 668f37e1ee11 ("drm/i915/mtl: Update workaround 14018778641") Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228103738.2018458-1-tejas.upadhyay@intel.com (cherry picked from commit 71271280175aa0ed6673e40cce7c01296bcd05f6) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/dsi: Go back to the previous INIT_OTP/DISPLAY_ON order, mostlyVille Syrjälä2-7/+39
Reinstate commit 88b065943cb5 ("drm/i915/dsi: Do display on sequence later on icl+"), for the most part. Turns out some machines (eg. Chuwi Minibook X) really do need that updated order. It is also the order the Windows driver uses. However we can't just undo the revert since that would again break Lenovo 82TQ. After staring at the VBT sequences for both machines I've concluded that the Lenovo 82TQ sequences look somewhat broken: - INIT_OTP is not present at all - what should be in INIT_OTP is found in DISPLAY_ON - what should be in DISPLAY_ON is found in BACKLIGHT_ON (along with the actual backlight stuff) The Chuwi Minibook X on the other hand has a full complement of sequences in its VBT. So let's try to deal with the broken sequences in the Lenovo 82TQ VBT by simply swapping the (non-existent) INIT_OTP sequence with the DISPLAY_ON sequence. Thus we execute DISPLAY_ON when intending to execute INIT_OTP, and execute nothing at all when intending to execute DISPLAY_ON. That should be 100% equivalent to the revert, for such broken VBTs. Cc: stable@vger.kernel.org Fixes: 6992eb815d08 ("Revert "drm/i915/dsi: Do display on sequence later on icl+"") References: https://gitlab.freedesktop.org/drm/intel/-/issues/10071 Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10334 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240305083659.8396-1-ville.syrjala@linux.intel.com Acked-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 94ae4612ea336bfc3c12b3fc68467c6711a4f39b) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915/display: Disable AuxCCS framebuffers if built for XeJuha-Pekka Heikkila1-0/+3
AuxCCS framebuffers don't work on Xe driver hence disable them from plane capabilities until they are fixed. FlatCCS framebuffers work and they are left enabled. CCS is left untouched for i915 driver. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/933 Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Tested-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Fixes: 44e694958b95 ("drm/xe/display: Implement display support") Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240228140225.858145-1-juhapekka.heikkila@gmail.com (cherry picked from commit b7232a730fbf043f54fb46fbf4a6e92936770e79) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28drm/i915: Stop doing double audio enable/disable on SDVO and g4x+ DPVille Syrjälä2-6/+0
Looks like I misplaced a few hunks when I moved the audio enable/disable out from the encoder enable/disable hooks. So we are now doing a double audio enable/disable on SDVO and g4x+ DP. Probably harmless as doing it twice shouldn't really change anything, but let's do it just once, as intended. Fixes: cff742cc6851 ("drm/i915: Hoist the encoder->audio_{enable,disable}() calls higher up") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240226193251.29619-1-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 315bd0a0825776d6c66d474bf572db64fa019ad8) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28x86/efistub: Reinstate soft limit for initrd loadingArd Biesheuvel1-0/+1
Commit 8117961d98fb2 ("x86/efi: Disregard setup header of loaded image") dropped the memcopy of the image's setup header into the boot_params struct provided to the core kernel, on the basis that EFI boot does not need it and should rely only on a single protocol to interface with the boot chain. It is also a prerequisite for being able to increase the section alignment to 4k, which is needed to enable memory protections when running in the boot services. So only the setup_header fields that matter to the core kernel are populated explicitly, and everything else is ignored. One thing was overlooked, though: the initrd_addr_max field in the setup_header is not used by the core kernel, but it is used by the EFI stub itself when it loads the initrd, where its default value of INT_MAX is used as the soft limit for memory allocation. This means that, in the old situation, the initrd was virtually always loaded in the lower 2G of memory, but now, due to initrd_addr_max being 0x0, the initrd may end up anywhere in memory. This should not be an issue principle, as most systems can deal with this fine. However, it does appear to tickle some problems in older UEFI implementations, where the memory ends up being corrupted, resulting in errors when unpacking the initramfs. So set the initrd_addr_max field to INT_MAX like it was before. Fixes: 8117961d98fb2 ("x86/efi: Disregard setup header of loaded image") Reported-by: Radek Podgorny <radek@podgorny.cz> Closes: https://lore.kernel.org/all/a99a831a-8ad5-4cb0-bff9-be637311f771@podgorny.cz Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-03-28efi/libstub: Cast away type warning in use of max()Ard Biesheuvel1-1/+1
Avoid a type mismatch warning in max() by switching to max_t() and providing the type explicitly. Fixes: 3cb4a4827596abc82e ("efi/libstub: fix efi_random_alloc() ...") Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-03-28drm/i915: Add includes for BUG_ON/BUILD_BUG_ON in i915_memcpy.cJoonas Lahtinen1-0/+2
Add standalone includes for BUG_ON and BUILD_BUG_ON to avoid build failure after linux-next include refactoring. Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tursulin@ursulin.net> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240308144643.137831-1-joonas.lahtinen@linux.intel.com (cherry picked from commit 4df6ac223cad36e7384ed00fe6efc114279f0df6) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28Octeontx2-af: fix pause frame configuration in GMP modeHariprasad Kelam1-0/+5
The Octeontx2 MAC block (CGX) has separate data paths (SMU and GMP) for different speeds, allowing for efficient data transfer. The previous patch which added pause frame configuration has a bug due to which pause frame feature is not working in GMP mode. This patch fixes the issue by configurating appropriate registers. Fixes: f7e086e754fe ("octeontx2-af: Pause frame configuration at cgx") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240326052720.4441-1-hkelam@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-28net: lan743x: Add set RFE read fifo threshold for PCI1x1x chipsRaju Lakkaraju2-0/+22
PCI11x1x Rev B0 devices might drop packets when receiving back to back frames at 2.5G link speed. Change the B0 Rev device's Receive filtering Engine FIFO threshold parameter from its hardware default of 4 to 3 dwords to prevent the problem. Rev C0 and later hardware already defaults to 3 dwords. Fixes: bb4f6bffe33c ("net: lan743x: Add PCI11010 / PCI11414 device IDs") Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240326065805.686128-1-Raju.Lakkaraju@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-28drm/qxl: remove unused variable from `qxl_process_single_command()`Miguel Ojeda1-3/+1
Clang 14 in an (essentially) defconfig loongarch64 build for next-20240327 reports [1]: drivers/gpu/drm/qxl/qxl_ioctl.c:148:14: error: variable 'num_relocs' set but not used [-Werror,-Wunused-but-set-variable] The variable was originally used in the `out_free_bos` label, but commit 74d9a6335dce ("drm/qxl: Simplify cleaning qxl processing command") removed the use that happened in that label. Thus remove the unused variable. Fixes: 74d9a6335dce ("drm/qxl: Simplify cleaning qxl processing command") Closes: https://lore.kernel.org/lkml/CANiq72kqqQfUxLkHJYqeBAhpc6YcX7bfR96gmmbF=j8hEOykqw@mail.gmail.com/ [1] Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/r/20240327175556.233126-2-ojeda@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28drm/qxl: remove unused `count` variable from `qxl_surface_id_alloc()`Miguel Ojeda1-2/+0
Clang 14 in an (essentially) defconfig loongarch64 build for next-20240326 reports [1]: drivers/gpu/drm/qxl/qxl_cmd.c:424:6: error: variable 'count' set but not used [-Werror,-Wunused-but-set-variable] The variable is already unused in the version that got into the tree. Thus remove the unused variable. Fixes: f64122c1f6ad ("drm: add new QXL driver. (v1.4)") Closes: https://lore.kernel.org/lkml/CANiq72mjc5t4n25SQvYSrOEhxxpXYPZ4pPzneSJHEnc3qApu2Q@mail.gmail.com/ [1] Closes: https://lore.kernel.org/all/20240327163331.GB1153323@dev-arch.thelio-3990X/ Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/r/20240327175556.233126-1-ojeda@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-03-28net: bcmasp: Remove phy_{suspend/resume}Justin Chen1-14/+1
phy_{suspend/resume} is redundant. It gets called from phy_{stop/start}. Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-28net: bcmasp: Bring up unimac after PHY link upJustin Chen1-9/+19
The unimac requires the PHY RX clk during reset or it may be put into a bad state. Bring up the unimac after link up to ensure the PHY RX clk exists. Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-28net: phy: qcom: at803x: fix kernel panic with at8031_probeChristian Marangi1-1/+3
On reworking and splitting the at803x driver, in splitting function of at803x PHYs it was added a NULL dereference bug where priv is referenced before it's actually allocated and then is tried to write to for the is_1000basex and is_fiber variables in the case of at8031, writing on the wrong address. Fix this by correctly setting priv local variable only after at803x_probe is called and actually allocates priv in the phydev struct. Reported-by: William Wortel <wwortel@dorpstraat.com> Cc: <stable@vger.kernel.org> Fixes: 25d2ba94005f ("net: phy: at803x: move specific at8031 probe mode check to dedicated probe") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240325190621.2665-1-ansuelsmth@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-28drm/i915: add bug.h include to i915_memcpy.cDave Airlie1-0/+1
This is stopping me building here for some reason, /home/airlied/devel/kernel/dim/drm-fixes/drivers/gpu/drm/i915/i915_memcpy.c: In function ‘i915_unaligned_memcpy_from_wc’: /home/airlied/devel/kernel/dim/drm-fixes/drivers/gpu/drm/i915/i915_memcpy.c:33:25: error: implicit declaration of function ‘BUG_ON’; did you mean ‘CI_BUG_ON’? [-Werror=implicit-function-declaration] 33 | #define CI_BUG_ON(expr) BUG_ON(expr) | ^~~~~~ /home/airlied/devel/kernel/dim/drm-fixes/drivers/gpu/drm/i915/i915_memcpy.c:144:9: note: in expansion of macro ‘CI_BUG_ON’ 144 | CI_BUG_ON(!i915_has_memcpy_from_wc()); | ^~~~~~~~~ engage maintainer overrides :-) Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-03-28iommu: Validate the PASID in iommu_attach_device_pasid()Jason Gunthorpe1-1/+10
The SVA code checks that the PASID is valid for the device when assigning the PASID to the MM, but the normal PAGING related path does not check it. Devices that don't support PASID or PASID values too large for the device should not invoke the driver callback. The drivers should rely on the core code for this enforcement. Fixes: 16603704559c7a68 ("iommu: Add attach/detach_dev_pasid iommu interfaces") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v1-460705442b30+659-iommu_check_pasid_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-03-28Merge tag 'amd-drm-fixes-6.9-2024-03-27' of ↵Dave Airlie35-272/+328
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.9-2024-03-27: amdgpu: - SMU 14.0.1 updates - DCN 3.5.x updates - VPE fix - eDP panel flickering fix - Suspend fix - PSR fix - DCN 3.0+ fix - VCN 4.0.6 updates - debugfs fix amdkfd: - DMA-Buf fix - GFX 9.4.2 TLB flush fix - CP interrupt fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328025342.8700-1-alexander.deucher@amd.com
2024-03-28Merge tag 'wireless-2024-03-27' of ↵Jakub Kicinski12-53/+105
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.9-rc2 The first fixes for v6.9. Ping-Ke Shih now maintains a separate tree for Realtek drivers, document that in the MAINTAINERS. Plenty of fixes for both to stack and iwlwifi. Our kunit tests were working only on um architecture but that's fixed now. * tag 'wireless-2024-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (21 commits) MAINTAINERS: wifi: mwifiex: add Francesco as reviewer kunit: fix wireless test dependencies wifi: iwlwifi: mvm: include link ID when releasing frames wifi: iwlwifi: mvm: handle debugfs names more carefully wifi: iwlwifi: mvm: guard against invalid STA ID on removal wifi: iwlwifi: read txq->read_ptr under lock wifi: iwlwifi: fw: don't always use FW dump trig wifi: iwlwifi: mvm: rfi: fix potential response leaks wifi: mac80211: correctly set active links upon TTLM wifi: iwlwifi: mvm: Configure the link mapping for non-MLD FW wifi: iwlwifi: mvm: consider having one active link wifi: iwlwifi: mvm: pick the version of SESSION_PROTECTION_NOTIF wifi: mac80211: fix prep_connection error path wifi: cfg80211: fix rdev_dump_mpp() arguments order wifi: iwlwifi: mvm: disable MLO for the time being wifi: cfg80211: add a flag to disable wireless extensions wifi: mac80211: fix ieee80211_bss_*_flags kernel-doc wifi: mac80211: check/clear fast rx for non-4addr sta VLAN changes wifi: mac80211: fix mlme_link_id_dbg() MAINTAINERS: wifi: add git tree for Realtek WiFi drivers ... ==================== Link: https://lore.kernel.org/r/20240327191346.1A1EAC433C7@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-27drm/vmwgfx: Create debugfs ttm_resource_manager entry only if neededJocelyn Falempe1-6/+9
The driver creates /sys/kernel/debug/dri/0/mob_ttm even when the corresponding ttm_resource_manager is not allocated. This leads to a crash when trying to read from this file. Add a check to create mob_ttm, system_mob_ttm, and gmr_ttm debug file only when the corresponding ttm_resource_manager is allocated. crash> bt PID: 3133409 TASK: ffff8fe4834a5000 CPU: 3 COMMAND: "grep" #0 [ffffb954506b3b20] machine_kexec at ffffffffb2a6bec3 #1 [ffffb954506b3b78] __crash_kexec at ffffffffb2bb598a #2 [ffffb954506b3c38] crash_kexec at ffffffffb2bb68c1 #3 [ffffb954506b3c50] oops_end at ffffffffb2a2a9b1 #4 [ffffb954506b3c70] no_context at ffffffffb2a7e913 #5 [ffffb954506b3cc8] __bad_area_nosemaphore at ffffffffb2a7ec8c #6 [ffffb954506b3d10] do_page_fault at ffffffffb2a7f887 #7 [ffffb954506b3d40] page_fault at ffffffffb360116e [exception RIP: ttm_resource_manager_debug+0x11] RIP: ffffffffc04afd11 RSP: ffffb954506b3df0 RFLAGS: 00010246 RAX: ffff8fe41a6d1200 RBX: 0000000000000000 RCX: 0000000000000940 RDX: 0000000000000000 RSI: ffffffffc04b4338 RDI: 0000000000000000 RBP: ffffb954506b3e08 R8: ffff8fee3ffad000 R9: 0000000000000000 R10: ffff8fe41a76a000 R11: 0000000000000001 R12: 00000000ffffffff R13: 0000000000000001 R14: ffff8fe5bb6f3900 R15: ffff8fe41a6d1200 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffb954506b3e00] ttm_resource_manager_show at ffffffffc04afde7 [ttm] #9 [ffffb954506b3e30] seq_read at ffffffffb2d8f9f3 RIP: 00007f4c4eda8985 RSP: 00007ffdbba9e9f8 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 000000000037e000 RCX: 00007f4c4eda8985 RDX: 000000000037e000 RSI: 00007f4c41573000 RDI: 0000000000000003 RBP: 000000000037e000 R8: 0000000000000000 R9: 000000000037fe30 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4c41573000 R13: 0000000000000003 R14: 00007f4c41572010 R15: 0000000000000003 ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Fixes: af4a25bbe5e7 ("drm/vmwgfx: Add debugfs entries for various ttm resource managers") Cc: <stable@vger.kernel.org> Reviewed-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240312093551.196609-1-jfalempe@redhat.com
2024-03-27Fix build errors due to new UIO_MEM_DMA_COHERENT messLinus Torvalds3-4/+4
Commit 576882ef5e7f ("uio: introduce UIO_MEM_DMA_COHERENT type") introduced a new use-case for 'struct uio_mem' where the 'mem' field now contains a kernel virtual address when 'memtype' is set to UIO_MEM_DMA_COHERENT. That in turn causes build errors, because 'mem' is of type 'phys_addr_t', and a virtual address is a pointer type. When the code just blindly uses cast to mix the two, it caused problems when phys_addr_t isn't the same size as a pointer - notably on 32-bit architectures with PHYS_ADDR_T_64BIT. The proper thing to do would probably be to use a union member, and not have any casts, and make the 'mem' member be a union of 'mem.physaddr' and 'mem.vaddr', based on 'memtype'. This is not that proper thing. This is just fixing the ugly casts to be even uglier, but at least not cause build errors on 32-bit platforms with 64-bit physical addresses. Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 576882ef5e7f ("uio: introduce UIO_MEM_DMA_COHERENT type") Fixes: 7722151e4651 ("uio_pruss: UIO_MEM_DMA_COHERENT conversion") Fixes: 019947805a8d ("uio_dmem_genirq: UIO_MEM_DMA_COHERENT conversion") Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Chris Leech <cleech@redhat.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linuxfoundation.org>
2024-03-27thermal: devfreq_cooling: Fix perf state when calculate dfc res_utilYe Zhang1-1/+1
The issue occurs when the devfreq cooling device uses the EM power model and the get_real_power() callback is provided by the driver. The EM power table is sorted ascending,can't index the table by cooling device state,so convert cooling state to performance state by dfc->max_state - dfc->capped_state. Fixes: 615510fe13bd ("thermal: devfreq_cooling: remove old power model and use EM") Cc: 5.11+ <stable@vger.kernel.org> # 5.11+ Signed-off-by: Ye Zhang <ye.zhang@rock-chips.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27drm/amdgpu: fix deadlock while reading mqd from debugfsJohannes Weiner1-17/+29
An errant disk backup on my desktop got into debugfs and triggered the following deadlock scenario in the amdgpu debugfs files. The machine also hard-resets immediately after those lines are printed (although I wasn't able to reproduce that part when reading by hand): [ 1318.016074][ T1082] ====================================================== [ 1318.016607][ T1082] WARNING: possible circular locking dependency detected [ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 Not tainted [ 1318.017598][ T1082] ------------------------------------------------------ [ 1318.018096][ T1082] tar/1082 is trying to acquire lock: [ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x80 [ 1318.019084][ T1082] [ 1318.019084][ T1082] but task is already holding lock: [ 1318.020052][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.020607][ T1082] [ 1318.020607][ T1082] which lock already depends on the new lock. [ 1318.020607][ T1082] [ 1318.022081][ T1082] [ 1318.022081][ T1082] the existing dependency chain (in reverse order) is: [ 1318.023083][ T1082] [ 1318.023083][ T1082] -> #2 (reservation_ww_class_mutex){+.+.}-{3:3}: [ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0 [ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90 [ 1318.025161][ T1082] dma_resv_lockdep+0x18a/0x330 [ 1318.025683][ T1082] do_one_initcall+0x6a/0x350 [ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.026728][ T1082] kernel_init+0x15/0x1a0 [ 1318.027242][ T1082] ret_from_fork+0x2c/0x40 [ 1318.027759][ T1082] ret_from_fork_asm+0x11/0x20 [ 1318.028281][ T1082] [ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}: [ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330 [ 1318.029790][ T1082] do_one_initcall+0x6a/0x350 [ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310 [ 1318.030722][ T1082] kernel_init+0x15/0x1a0 [ 1318.031168][ T1082] ret_from_fork+0x2c/0x40 [ 1318.031598][ T1082] ret_from_fork_asm+0x11/0x20 [ 1318.032011][ T1082] [ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}: [ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680 [ 1318.033141][ T1082] lock_acquire+0xcd/0x2c0 [ 1318.033487][ T1082] __might_fault+0x58/0x80 [ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu] [ 1318.034181][ T1082] full_proxy_read+0x55/0x80 [ 1318.034487][ T1082] vfs_read+0xa7/0x360 [ 1318.034788][ T1082] ksys_read+0x70/0xf0 [ 1318.035085][ T1082] do_syscall_64+0x94/0x180 [ 1318.035375][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e [ 1318.035664][ T1082] [ 1318.035664][ T1082] other info that might help us debug this: [ 1318.035664][ T1082] [ 1318.036487][ T1082] Chain exists of: [ 1318.036487][ T1082] &mm->mmap_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex [ 1318.036487][ T1082] [ 1318.037310][ T1082] Possible unsafe locking scenario: [ 1318.037310][ T1082] [ 1318.037838][ T1082] CPU0 CPU1 [ 1318.038101][ T1082] ---- ---- [ 1318.038350][ T1082] lock(reservation_ww_class_mutex); [ 1318.038590][ T1082] lock(reservation_ww_class_acquire); [ 1318.038839][ T1082] lock(reservation_ww_class_mutex); [ 1318.039083][ T1082] rlock(&mm->mmap_lock); [ 1318.039328][ T1082] [ 1318.039328][ T1082] *** DEADLOCK *** [ 1318.039328][ T1082] [ 1318.040029][ T1082] 1 lock held by tar/1082: [ 1318.040259][ T1082] #0: ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] [ 1318.040560][ T1082] [ 1318.040560][ T1082] stack backtrace: [ 1318.041053][ T1082] CPU: 22 PID: 1082 Comm: tar Not tainted 6.8.0-rc7-00015-ge0c8221b72c0 #17 3316c85d50e282c5643b075d1f01a4f6365e39c2 [ 1318.041329][ T1082] Hardware name: Gigabyte Technology Co., Ltd. B650 AORUS PRO AX/B650 AORUS PRO AX, BIOS F20 12/14/2023 [ 1318.041614][ T1082] Call Trace: [ 1318.041895][ T1082] <TASK> [ 1318.042175][ T1082] dump_stack_lvl+0x4a/0x80 [ 1318.042460][ T1082] check_noncircular+0x145/0x160 [ 1318.042743][ T1082] __lock_acquire+0x14bf/0x2680 [ 1318.043022][ T1082] lock_acquire+0xcd/0x2c0 [ 1318.043301][ T1082] ? __might_fault+0x40/0x80 [ 1318.043580][ T1082] ? __might_fault+0x40/0x80 [ 1318.043856][ T1082] __might_fault+0x58/0x80 [ 1318.044131][ T1082] ? __might_fault+0x40/0x80 [ 1318.044408][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu 8fe2afaa910cbd7654c8cab23563a94d6caebaab] [ 1318.044749][ T1082] full_proxy_read+0x55/0x80 [ 1318.045042][ T1082] vfs_read+0xa7/0x360 [ 1318.045333][ T1082] ksys_read+0x70/0xf0 [ 1318.045623][ T1082] do_syscall_64+0x94/0x180 [ 1318.045913][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.046201][ T1082] ? lockdep_hardirqs_on+0x7d/0x100 [ 1318.046487][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.046773][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.047057][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.047337][ T1082] ? do_syscall_64+0xa0/0x180 [ 1318.047611][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e [ 1318.047887][ T1082] RIP: 0033:0x7f480b70a39d [ 1318.048162][ T1082] Code: 91 ba 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb b2 e8 18 a3 01 00 0f 1f 84 00 00 00 00 00 80 3d a9 3c 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 53 48 83 [ 1318.048769][ T1082] RSP: 002b:00007ffde77f5c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 1318.049083][ T1082] RAX: ffffffffffffffda RBX: 0000000000000800 RCX: 00007f480b70a39d [ 1318.049392][ T1082] RDX: 0000000000000800 RSI: 000055c9f2120c00 RDI: 0000000000000008 [ 1318.049703][ T1082] RBP: 0000000000000800 R08: 000055c9f2120a94 R09: 0000000000000007 [ 1318.050011][ T1082] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c9f2120c00 [ 1318.050324][ T1082] R13: 0000000000000008 R14: 0000000000000008 R15: 0000000000000800 [ 1318.050638][ T1082] </TASK> amdgpu_debugfs_mqd_read() holds a reservation when it calls put_user(), which may fault and acquire the mmap_sem. This violates the established locking order. Bounce the mqd data through a kernel buffer to get put_user() out of the illegal section. Fixes: 445d85e3c1df ("drm/amdgpu: add debugfs interface for reading MQDs") Cc: stable@vger.kernel.org # v6.5+ Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amdgpu: enable UMSCH 4.0.6Lang Yu3-4/+16
Share same codes with 4.0.5 and enable collaborate mode for VPE. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amdgpu/umsch: update UMSCH 4.0 FW interfaceLang Yu2-12/+21
Align with FW changes. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Set DCN351 BB and IP the same as DCN35Xi Liu1-5/+1
[WHY & HOW] DCN351 and DCN35 should use the same bounding box and IP settings. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Jun Lei <jun.lei@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Xi Liu <xi.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Fix bounds check for dcn35 DcfClocksRoman Li1-1/+1
[Why] NumFclkLevelsEnabled is used for DcfClocks bounds check instead of designated NumDcfClkLevelsEnabled. That can cause array index out-of-bounds access. [How] Use designated variable for dcn35 DcfClocks bounds check. Fixes: a8edc9cc0b14 ("drm/amd/display: Fix array-index-out-of-bounds in dcn35_clkmgr") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Remove MPC rate control logic from DCN30 and aboveGeorge Shen6-155/+41
[Why] MPC flow rate control is not needed for DCN30 and above. Current logic that uses it can result in underflow for certain edge cases (such as DSC N422 + ODM combine + 422 left edge pixel). [How] Remove MPC flow rate control logic and programming for DCN30 and above. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: fix a dereference of a NULL pointerWenjing Liu1-2/+4
[why&how] In some platform out_transfer_func may not be popualted. We need to check for null before dereferencing it. Fixes: d2dea1f14038 ("drm/amd/display: Generalize new minimal transition path") Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Send DTBCLK disable message on first commitTaimur Hassan1-0/+5
[Why] Previous patch to allow DTBCLK disable didn't address boot case. Driver thinks DTBCLK is disabled by default, so we don't send disable message to PMFW. DTBCLK is then enabled at idle desktop on boot, burning power. [How] Set dtbclk_en to true on boot so that disable message is sent during first commit. Fixes: 27750e176a4f ("drm/amd/display: Allow DTBCLK disable for DCN35") Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <syed.hassan@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Update dcn351 to latest dcn35 configSung Joon Kim3-7/+15
[why & how] There were some fixes in dcn35 that need to be ported over to dcn351 to prevent any regression. Signed-off-by: Sung Joon Kim <sungkim@amd.com> Reviewed-by: Liu, Xi (Alex) <xiliu102@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: fix IPX enablementHamza Mahfooz2-4/+6
We need to re-enable idle power optimizations after entering PSR. Since, we get kicked out of idle power optimizations before entering PSR (entering PSR requires us to write to DCN registers, which isn't allowed while we are in IPS). Fixes: a9b1a4f684b3 ("drm/amd/display: Add more checks for exiting idle in DC") Tested-by: Mark Broadworth <mark.broadworth@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd: Flush GFXOFF requests in prepare stageMario Limonciello1-0/+2
If the system hasn't entered GFXOFF when suspend starts it can cause hangs accessing GC and RLC during the suspend stage. Cc: <stable@vger.kernel.org> # 6.1.y: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback") Cc: <stable@vger.kernel.org> # 6.1.y: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks") Cc: <stable@vger.kernel.org> # 6.1.y: 2ceec37b0e3d ("drm/amd: Add missing kernel doc for prepare_suspend()") Cc: <stable@vger.kernel.org> # 6.1.y: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend") Cc: <stable@vger.kernel.org> # 6.6.y: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback") Cc: <stable@vger.kernel.org> # 6.6.y: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks") Cc: <stable@vger.kernel.org> # 6.6.y: 2ceec37b0e3d ("drm/amd: Add missing kernel doc for prepare_suspend()") Cc: <stable@vger.kernel.org> # 6.6.y: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend") Cc: <stable@vger.kernel.org> # 6.1+ Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3132 Fixes: ab4750332dbe ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amdkfd: range check cp bad op exception interruptsJonathan Kim3-3/+6
Due to a CP interrupt bug, bad packet garbage exception codes are raised. Do a range check so that the debugger and runtime do not receive garbage codes. Update the user api to guard exception code type checking as well. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Tested-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27Revert "drm/amd/display: Fix sending VSC (+ colorimetry) packets for DP/eDP ↵Harry Wentland2-13/+8
displays without PSR" This causes flicker on a bunch of eDP panels. The info_packet code also caused regressions on other OSes that we haven't' seen on Linux yet, but that is likely due to the fact that we haven't had a chance to test those environments on Linux. We'll need to revisit this. This reverts commit 202260f64519e591b5cd99626e441b6559f571a3. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3207 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3151 Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-03-27drm/amdkfd: fix TLB flush after unmap for GFX9.4.2Eric Huang1-1/+1
TLB flush after unmap accidentially was removed on gfx9.4.2. It is to add it back. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-03-27drm/amdgpu/vpe: power on vpe when hw_initPeyton Lee1-0/+6
To fix mode2 reset failure. Should power on VPE when hw_init. Signed-off-by: Peyton Lee <peytolee@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: increase bb clock for DCN351Xi Liu1-14/+76
[Why and how] Bounding box clocks for DCN351 should be increased as per request Reviewed-by: Swapnil Patel <swapnil.patel@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Xi Liu <xi.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Prevent crash when disable streamChris Park1-1/+2
[Why] Disabling stream encoder invokes a function that no longer exists. [How] Check if the function declaration is NULL in disable stream encoder. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Chris Park <chris.park@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amd/display: Increase Z8 watermark times.Natanel Roizenman2-4/+4
Increase Z8 watermark times from 210->250us and 320->350us. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Natanel Roizenman <natanel.roizenman@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-27drm/amdkfd: Check cgroup when returning DMABuf infoMukul Joshi1-2/+2
Check cgroup permissions when returning DMA-buf info and based on cgroup info return the GPU id of the GPU that have access to the BO. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>