summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2024-01-19Merge tag 'drm-xe-next-fixes-2024-01-16' of ↵Dave Airlie18-91/+136
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Driver Changes: - Fix for definition of wakeref_t - Fix for an error code aliasing - Fix for VM_UNBIND_ALL in the case there are no bound VMAs - Fixes for a number of __iomem address space mismatches reported by sparse - Fixes for the assignment of exec_queue priority - A Fix for skip_guc_pc not taking effect - Workaround for a build problem on GCC 11 - A couple of fixes for error paths - Fix a Flat CCS compression metadata copy issue - Fix a misplace array bounds checking - Don't have display support depend on EXPERT (as discussed on IRC) Signed-off-by: Dave Airlie <airlied@redhat.com> From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240116102204.106520-1-thomas.hellstrom@linux.intel.com
2024-01-19nouveau/vmm: don't set addr on the fail path to avoid warningDave Airlie1-0/+3
nvif_vmm_put gets called if addr is set, but if the allocation fails we don't need to call put, otherwise we get a warning like [523232.435671] ------------[ cut here ]------------ [523232.435674] WARNING: CPU: 8 PID: 1505697 at drivers/gpu/drm/nouveau/nvif/vmm.c:68 nvif_vmm_put+0x72/0x80 [nouveau] [523232.435795] Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep sunrpc binfmt_misc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common isst_if_common iwlmvm nfit libnvdimm vfat fat x86_pkg_temp_thermal intel_powerclamp mac80211 snd_soc_avs snd_soc_hda_codec coretemp snd_hda_ext_core snd_soc_core snd_hda_codec_realtek kvm_intel snd_hda_codec_hdmi snd_compress snd_hda_codec_generic ac97_bus snd_pcm_dmaengine snd_hda_intel libarc4 snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm iwlwifi snd_hda_core btusb snd_hwdep btrtl snd_seq btintel irqbypass btbcm rapl snd_seq_device eeepc_wmi btmtk intel_cstate iTCO_wdt cfg80211 snd_pcm asus_wmi bluetooth intel_pmc_bxt iTCO_vendor_support snd_timer ledtrig_audio pktcdvd snd mei_me [523232.435828] sparse_keymap intel_uncore i2c_i801 platform_profile wmi_bmof mei pcspkr ioatdma soundcore i2c_smbus rfkill idma64 dca joydev acpi_tad loop zram nouveau drm_ttm_helper ttm video drm_exec drm_gpuvm gpu_sched crct10dif_pclmul i2c_algo_bit nvme crc32_pclmul crc32c_intel drm_display_helper polyval_clmulni nvme_core polyval_generic e1000e mxm_wmi cec ghash_clmulni_intel r8169 sha512_ssse3 nvme_common wmi pinctrl_sunrisepoint uas usb_storage ip6_tables ip_tables fuse [523232.435849] CPU: 8 PID: 1505697 Comm: gnome-shell Tainted: G W 6.6.0-rc7-nvk-uapi+ #12 [523232.435851] Hardware name: System manufacturer System Product Name/ROG STRIX X299-E GAMING II, BIOS 1301 09/24/2021 [523232.435852] RIP: 0010:nvif_vmm_put+0x72/0x80 [nouveau] [523232.435934] Code: 00 00 48 89 e2 be 02 00 00 00 48 c7 04 24 00 00 00 00 48 89 44 24 08 e8 fc bf ff ff 85 c0 75 0a 48 c7 43 08 00 00 00 00 eb b3 <0f> 0b eb f2 e8 f5 c9 b2 e6 0f 1f 44 00 00 90 90 90 90 90 90 90 90 [523232.435936] RSP: 0018:ffffc900077ffbd8 EFLAGS: 00010282 [523232.435937] RAX: 00000000fffffffe RBX: ffffc900077ffc00 RCX: 0000000000000010 [523232.435938] RDX: 0000000000000010 RSI: ffffc900077ffb38 RDI: ffffc900077ffbd8 [523232.435940] RBP: ffff888e1c4f2140 R08: 0000000000000000 R09: 0000000000000000 [523232.435940] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888503811800 [523232.435941] R13: ffffc900077ffca0 R14: ffff888e1c4f2140 R15: ffff88810317e1e0 [523232.435942] FS: 00007f933a769640(0000) GS:ffff88905fa00000(0000) knlGS:0000000000000000 [523232.435943] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [523232.435944] CR2: 00007f930bef7000 CR3: 00000005d0322001 CR4: 00000000003706e0 [523232.435945] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [523232.435946] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [523232.435964] Call Trace: [523232.435965] <TASK> [523232.435966] ? nvif_vmm_put+0x72/0x80 [nouveau] [523232.436051] ? __warn+0x81/0x130 [523232.436055] ? nvif_vmm_put+0x72/0x80 [nouveau] [523232.436138] ? report_bug+0x171/0x1a0 [523232.436142] ? handle_bug+0x3c/0x80 [523232.436144] ? exc_invalid_op+0x17/0x70 [523232.436145] ? asm_exc_invalid_op+0x1a/0x20 [523232.436149] ? nvif_vmm_put+0x72/0x80 [nouveau] [523232.436230] ? nvif_vmm_put+0x64/0x80 [nouveau] [523232.436342] nouveau_vma_del+0x80/0xd0 [nouveau] [523232.436506] nouveau_vma_new+0x1a0/0x210 [nouveau] [523232.436671] nouveau_gem_object_open+0x1d0/0x1f0 [nouveau] [523232.436835] drm_gem_handle_create_tail+0xd1/0x180 [523232.436840] drm_prime_fd_to_handle_ioctl+0x12e/0x200 [523232.436844] ? __pfx_drm_prime_fd_to_handle_ioctl+0x10/0x10 [523232.436847] drm_ioctl_kernel+0xd3/0x180 [523232.436849] drm_ioctl+0x26d/0x4b0 [523232.436851] ? __pfx_drm_prime_fd_to_handle_ioctl+0x10/0x10 [523232.436855] nouveau_drm_ioctl+0x5a/0xb0 [nouveau] [523232.437032] __x64_sys_ioctl+0x94/0xd0 [523232.437036] do_syscall_64+0x5d/0x90 [523232.437040] ? syscall_exit_to_user_mode+0x2b/0x40 [523232.437044] ? do_syscall_64+0x6c/0x90 [523232.437046] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Reported-by: Faith Ekstrand <faith.ekstrand@collabora.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240117213852.295565-1-airlied@gmail.com
2024-01-19Merge tag 'amd-drm-fixes-6.8-2024-01-18' of ↵Dave Airlie67-373/+554
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-6.8-2024-01-18: amdgpu: - DSC fixes - DC resource pool fixes - OTG fix - DML2 fixes - Aux fix - GFX10 RLC firmware handling fix - Revert a broken workaround for SMU 13.0.2 - DC writeback fix - Enable gfxoff when ROCm apps are active on gfx11 with the proper FW version amdkfd: - Fix dma-buf exports using GEM handles Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240118223127.4904-1-alexander.deucher@amd.com
2024-01-19drm/amdgpu: Enable GFXOFF for Compute on GFX11Ori Messinger1-4/+2
On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware issue, which has since been fixed after version 64. This patch only re-enables GFXOFF for GFX version 11 if the GPU's MES KIQ firmware version is newer than version 64. V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64. V3: Add parentheses to avoid GCC warning for parentheses: "suggest parentheses around comparison in operand of ‘&’" V4: Remove "V3" from commit title V5: Change commit description and insert 'Acked-by' Signed-off-by: Ori Messinger <Ori.Messinger@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-01-19drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for ↵Srinivasan Shanmugam1-3/+3
writeback requests. Return value of 'to_amdgpu_crtc' which is container_of(...) can't be null, so it's null check 'acrtc' is dropped. Fixing the below: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9302 amdgpu_dm_atomic_commit_tail() error: we previously assumed 'acrtc' could be null (see line 9299) Added 'new_crtc_state' NULL check for function 'drm_atomic_get_new_crtc_state' that retrieves the new state for a CRTC, while enabling writeback requests. Cc: stable@vger.kernel.org Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-19drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"Christian König4-65/+1
Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the HW in an active state and is an unbalanced use of the IP callbacks. Using the IP callbacks like this can lead to memory leaks, double free and imbalanced reference counters. Leaving the HW in an active state can lead to DMA accesses to memory now freed by the driver. Both is a complete no-go for driver unload so completely revert the workaround for now. This reverts commit f5c7e7797060255dbc8160734ccc5ad6183c5e04. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-01-19drm/amdkfd: init drm_client with funcs hookFlora Cui1-1/+4
otherwise drm_client_dev_unregister() would try to kfree(&adev->kfd.client). Fixes: 1819200166ce ("drm/amdkfd: Export DMABufs from KFD using GEM handles") Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-19drm/amd/display: Fix a switch statement in ↵Christophe JAILLET1-1/+1
populate_dml_output_cfg_from_stream_state() It is likely that the statement related to 'dml_edp' is misplaced. So move it in the correct "case SIGNAL_TYPE_EDP". Fixes: 7966f319c66d ("drm/amd/display: Introduce DML2") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-01-19drm/amdgpu: Fix the null pointer when load rlc firmwareMa Jun1-9/+6
If the RLC firmware is invalid because of wrong header size, the pointer to the rlc firmware is released in function amdgpu_ucode_request. There will be a null pointer error in subsequent use. So skip validation to fix it. Fixes: 3da9b71563cb ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX10") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-01-19drm/amd/display: Align the returned error code with legacy DPWayne Lin1-0/+5
[Why] For usb4 connector, AUX transaction is handled by dmub utilizing a differnt code path comparing to legacy DP connector. If the usb4 DP connector is disconnected, AUX access will report EBUSY and cause igt@kms_dp_aux_dev fail. [How] Align the error code with the one reported by legacy DP as EIO. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-19drm/amd/display: Fix DML2 watermark calculationOvidiu Bunea1-7/+7
[Why] core_mode_programming in DML2 should output watermark calculations to locals, but it incorrectly uses mode_lib [How] update code to match HW DML2 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-19drm/amd/display: Clear OPTC mem select on disableIlya Bakoulin2-0/+6
[Why] Not clearing the memory select bits prior to OPTC disable can cause DSC corruption issues when attempting to reuse a memory instance for another OPTC that enables ODM. [How] Clear the memory select bits prior to disabling an OPTC. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ilya Bakoulin <ilya.bakoulin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-19drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/ANicholas Kazlauskas1-12/+9
[Why] We can experience DENTIST hangs during optimize_bandwidth or TDRs if FIFO is toggled and hangs. [How] Port the DCN35 fixes to DCN314. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-19drm/amd/display: Add logging resource checksCharlene Liu3-3/+10
[Why] When mapping resources, resources could be unavailable. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Sung joon Kim <sungjoon.kim@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Charlene Liu <charlene.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-19drm/amd/display: Init link enc resources in dc_state only if res_pool presentsDillon Varone1-1/+2
[Why & How] res_pool is not initialized in all situations such as virtual environments, and therefore link encoder resources should not be initialized if res_pool is NULL. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Martin Leung <martin.leung@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-19drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'Srinivasan Shanmugam1-2/+6
In link_set_dsc_pps_packet(), 'struct display_stream_compressor *dsc' was dereferenced in a DC_LOGGER_INIT(dsc->ctx->logger); before the 'dsc' NULL pointer check. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_dpms.c:905 link_set_dsc_pps_packet() warn: variable dereferenced before check 'dsc' (see line 903) Cc: stable@vger.kernel.org Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amd/display: Avoid enum conversion warningNathan Chancellor1-2/+3
Clang warns (or errors with CONFIG_WERROR=y) when performing arithmetic with different enumerated types, which is usually a bug: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_dpia_bw.c:548:24: error: arithmetic between different enumeration types ('const enum dc_link_rate' and 'const enum dc_lane_count') [-Werror,-Wenum-enum-conversion] 548 | link_cap->link_rate * link_cap->lane_count * LINK_RATE_REF_FREQ_IN_KHZ * 8; | ~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~ 1 error generated. In this case, there is not a problem because the enumerated types are basically treated as '#define' values. Add an explicit cast to an integral type to silence the warning. Closes: https://github.com/ClangBuiltLinux/linux/issues/1976 Fixes: 5f3bce13266e ("drm/amd/display: Request usb4 bw for mst streams") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amd/pm: Fix smuv13.0.6 current clock reportingLijo Lazar1-1/+3
When current clock is equal to max dpm level clock, the level is not indicated correctly with *. Fix by comparing current clock against dpm level value. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x
2024-01-16drm/amd/pm: Add error log for smu v13.0.6 resetLijo Lazar1-5/+6
For all mode-2 reset fail cases, add error log. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x
2024-01-16drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'Srinivasan Shanmugam1-5/+5
Range interval [start, last] is ordered by rb_tree, rb_prev, rb_next return value still needs NULL check, thus modified from "node" to "rb_node". Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2691 svm_range_get_range_boundaries() warn: can 'node' even be NULL? Suggested-by: Philip Yang <Philip.Yang@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: drop exp hw support check for GC 9.4.3Alex Deucher1-2/+0
No longer needed. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x
2024-01-16drm/amdgpu: move debug options init prior to amdgpu device initLe Ma1-2/+2
To bring debug options into effect in early initialization phase Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: add debug flag to place fw bo on vram for frontdoor loadingLe Ma4-2/+10
Use debug_mask=0x8 param to help isolating data path issues on new systems in early phase. v2: rename the flag for explicitness (lijo) Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16Revert "drm/amdgpu: add param to specify fw bo location for front-door loading"Le Ma4-10/+2
This reverts commit c572abffe9f50c8ba33060865449313b3f588c35. Will use debug module param instead of independent module param. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amd/display: Fix variable deferencing before NULL check in ↵Srinivasan Shanmugam1-4/+7
edp_setup_replay() In edp_setup_replay(), 'struct dc *dc' & 'struct dmub_replay *replay' was dereferenced before the pointer 'link' & 'replay' NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_edp_panel_control.c:947 edp_setup_replay() warn: variable dereferenced before check 'link' (see line 933) Cc: stable@vger.kernel.org Cc: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: update regGL2C_CTRL4 value in golden settingYifan Zhang1-1/+1
This patch to update regGL2C_CTRL4 in golden setting. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x
2024-01-16drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'Srinivasan Shanmugam1-0/+1
In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' - 'adev->pm.fw' may not be released before return. Using the function release_firmware() to release adev->pm.fw. Thus fixing the below: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 'adev->pm.fw' from request_firmware() not released on lines: 1554. Cc: Monk Liu <Monk.Liu@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: Fix unsigned comparison with less than zero in ↵Srinivasan Shanmugam1-8/+2
vpe_u1_8_from_fraction() The variables 'numerator' and 'denominator', are unsigned 16-bit integer types, that can never be less than 0. Thus fixing the below: drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:62 vpe_u1_8_from_fraction() warn: unsigned 'numerator' is never less than zero. drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:63 vpe_u1_8_from_fraction() warn: unsigned 'denominator' is never less than zero. Cc: Peyton Lee <peytolee@amd.com> Cc: Lang Yu <lang.yu@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Peyton Lee <peyton.lee@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'Srinivasan Shanmugam1-7/+14
The amdgpu_gmc_vram_checking() function in emulation checks whether all of the memory range of shared system memory could be accessed by GPU, from this aspect, -EIO is returned for error scenarios. Fixes the below: drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:919 gmc_v6_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1103 gmc_v7_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1223 gmc_v8_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2344 gmc_v9_0_hw_init() warn: missing error code? 'r' Cc: Xiaojian Du <Xiaojian.Du@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: Do not program VM_L2_CNTL under SRIOVVictor Lu1-4/+6
VM_L2_CNTL* should not be programmed on driver unload under SRIOV. These regs are skipped during SRIOV driver init. Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Vignesh Chander <Vignesh.Chander@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in ↵Srinivasan Shanmugam1-1/+1
'get_platform_power_management_table()' In 'struct phm_ppm_table *ptr' allocation using kzalloc, an incorrect structure type is passed to sizeof() in kzalloc, larger structure types were used, thus using correct type 'struct phm_ppm_table' fixes the below: drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.c:203 get_platform_power_management_table() warn: struct type mismatch 'phm_ppm_table vs _ATOM_Tonga_PPM_Table' Cc: Eric Huang <JinHuiEric.Huang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: update ATHUB_MISC_CNTL offset for athub v3.3Yifan Zhang1-0/+8
This patch to update ATHUB_MISC_CNTL offset for athub v3.3 v2: correct a typo (Tim) v3: correct patch title (Lang) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: update headers for nbio v7.11Yifan Zhang1-4/+4
This patch is to update headers for nbio v7.11. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu/pm: clarify debugfs pm outputAlex Deucher1-10/+18
On APUs power is SoC power, not just GPU. Clarify that for UVD/VCE/VCN the IP is powered down, not disabled which can confusing and lead to concerns that the IP is actually not available. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: fall back to INPUT power for AVG power via INFO IOCTLAlex Deucher1-1/+6
For backwards compatibility with userspace. Fixes: 47f1724db4fe ("drm/amd: Introduce `AMDGPU_PP_SENSOR_GPU_INPUT_POWER`") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2897 Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdgpu: fix avg vs input power reporting on smu7Alex Deucher1-1/+16
Hawaii, Bonaire, Fiji, and Tonga support average power, the others support current power. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16drm/amdkfd: fixes for HMM mem allocationDafna Hirschfeld1-3/+3
Fix err return value and reset pgmap->type after checking it. Fixes: c83dee9b6394 ("drm/amdkfd: add SPM support for SVM") Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-15drm/xe: display support should not depend on EXPERTJani Nikula1-1/+1
Remove the DRM_XE_DISPLAY config dependency on EXPERT. I can only presume the idea was only experts should be able to disable it, but the effect is the opposite. Reported-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240111104716.3548744-1-jani.nikula@intel.com (cherry picked from commit 1c7531f50eaa425eca8ff726287b8df3a4a51e55) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe: Fix bounds checking in __xe_bo_placement_for_flags()Brian Welty1-6/+6
Requesting all memory regions on PVC will fill bo->placements up to XE_BO_MAX_PLACEMENTS. The subsequent call to try_add_stolen() will trip over the bounds checking even though XE_PL_STOLEN is not expected to be used in this case. This is hit with igt@xe_exec_fault_mode@once-basic-prefetch: xe 0000:8c:00.0: [drm] Assertion `*c < (sizeof(bo->placements) / sizeof((bo->placements)[0]) + ((int)(sizeof(struct { int:(-!!(__builtin_types_compatible_p(typeof((bo->placements)), typeof(&(bo->placements)[0])))); }))))` failed! WARNING: CPU: 30 PID: 6161 at drivers/gpu/drm/xe/xe_bo.c:203 __xe_bo_placement_for_flags+0x218/0x240 [xe] Is fixed here by moving the bounds checks closer to where we actually write into the bo->placement array. Fixes: 8c54ee8a8606 ("drm/xe: Ensure that we don't access the placements array out-of-bounds") Link: https://patchwork.freedesktop.org/patch/msgid/20240111002111.10190-1-brian.welty@intel.com Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> (cherry picked from commit 52e3fa3e3ea3ee05e32c1a8d72bb3ae306a4da64) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe/migrate: Fix CCS copy for small VRAM copy chunksThomas Hellström2-50/+80
Since the migrate code is using the identity map for addressing VRAM, copy chunks may become as small as 64K if the VRAM resource is fragmented. However, a chunk size smaller that 1MiB may lead to the *next* chunk's offset into the CCS metadata backup memory may not be page-aligned, and the XY_CTRL_SURF_COPY_BLT command can't handle that, and even if it could, the current code doesn't handle the offset calculaton correctly. To fix this, make sure we align the size of VRAM copy chunks to 1MiB. If the remaining data to copy is smaller than that, that's not a problem, so use the remaining size. If the VRAM copy cunk becomes fragmented due to the size alignment restriction, don't use the identity map, but instead emit PTEs into the page-table like we do for system memory. v2: - Rebase v3: - Future proof somewhat by taking into account the real data size to flat CCS metadata size ratio. (Matt Roper) - Invert a couple of if-statements for better readability. - Fix support for 4K-granularity VRAM sizes. (Tested on DG1). v4: - Fix up code comments - Fix debug printout format typo. v5: - Add a Fixes: tag. Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Fixes: e89b384cde62 ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240110163415.524165-1-thomas.hellstrom@linux.intel.com (cherry picked from commit ef51d7542d143f3fd9a48d4e2c307563661668aa) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe: unlock on error path in xe_vm_add_compute_exec_queue()Dan Carpenter1-3/+4
Drop the "&vm->lock" before returning. Fixes: 24f947d58fe5 ("drm/xe: Use DRM GPUVM helpers for external- and evicted objects") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> (cherry picked from commit cf46019e8550a810cc023af7aa020ba43103b44d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe/selftests: Fix an error pointer dereference bugDan Carpenter1-3/+2
Check if "bo" is an error pointer before calling xe_bo_lock() on it. Fixes: d6abc18d6693 ("drm/xe/xe2: Modify xe_bo_test for system memory") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> (cherry picked from commit 88ec23528b32ddb9ce2e8492f2629b0056353697) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe/device: clean up on error in probe()Dan Carpenter1-1/+1
This error path should clean up before returning. Smatch detected this bug: drivers/gpu/drm/xe/xe_device.c:487 xe_device_probe() warn: missing unwind goto? Fixes: 4cb12b71923b ("drm/xe/xe2: Determine bios enablement for flat ccs on igfx") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> (cherry picked from commit c10da95afa68060e13c5f920d96671943a7e54d9) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe: Fix build bug for GCC 11Paul E. McKenney1-1/+0
Building drivers/gpu/drm/xe/xe_gt_pagefault.c with GCC 11 results in the following build errors: ./include/linux/fortify-string.h:57:33: error: writing 16 bytes into a region of size 0 [-Werror=stringop-overflow=] 57 | #define __underlying_memcpy __builtin_memcpy | ^ ./include/linux/fortify-string.h:644:9: note: in expansion of macro ‘__underlying_memcpy’ 644 | __underlying_##op(p, q, __fortify_size); \ | ^~~~~~~~~~~~~ ./include/linux/fortify-string.h:689:26: note: in expansion of macro ‘__fortify_memcpy_chk’ 689 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ | ^~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/xe/xe_gt_pagefault.c:340:17: note: in expansion of macro ‘memcpy’ 340 | memcpy(pf_queue->data + pf_queue->tail, msg, len * sizeof(u32)); | ^~~~~~ In file included from drivers/gpu/drm/xe/xe_device_types.h:17, from drivers/gpu/drm/xe/xe_vm_types.h:16, from drivers/gpu/drm/xe/xe_bo.h:13, from drivers/gpu/drm/xe/xe_gt_pagefault.c:16: drivers/gpu/drm/xe/xe_gt_types.h:102:25: note: at offset [1144, 265324] into destination object ‘tile’ of size 8 102 | struct xe_tile *tile; | ^~~~ Fix these by removing -Wstringop-overflow from drm/xe builds. Closes: https://lore.kernel.org/all/45ad1d0f-a10f-483e-848a-76a30252edbe@paulmck-laptop/ Fixes: 7a8bc11782d3 ("drm/xe: Enable W=1 warnings by default") Suggested-by: Stephen Rothwell <sfr@rothwell.id.au> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> [ This particular warning is broken on GCC11. In future changes it will be moved to the normal C flags in the top level Makefile (out of Makefile.extrawarn), but accounting for the compiler support. Just remove it out of xe's forced extra warnings for now ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit a109d19992294736abd4f4232ea639e03eb1f9e7) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe: Check skip_guc_pc before setting SLPC flagVinay Belgaumkar2-1/+9
Don't set SLPC GuC feature ctl flag if skip_guc_pc is true. v2: Skip the freq related sysfs creation as well (Badal) v3: Remove unnecessary parenthesis (Lucas) Fixes: 975e4a3795d4 ("drm/xe: Manually setup C6 when skip_guc_pc is set") Fixes: bef52b5c7a19 ("drm/xe: Create a xe_gt_freq component for raw management and sysfs") Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20240108225842.966066-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 69cac0a8f3ef8db4d62441c4a2686ec676c9facd) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe: Fix modifying exec_queue priority in xe_migrate_initBrian Welty4-9/+14
After exec_queue has been created, we cannot simply modify q->priority. This needs to be done by the backend via q->ops. However in this case, it would be more efficient to simply pass a flag when creating the exec_queue and set the desired priority upfront during queue creation. To that end: new flag EXEC_QUEUE_FLAG_HIGH_PRIORITY is introduced. The priority field is moved to be with other scheduling properties and is now exec_queue.sched_props.priority. This is no longer set to initial value by the backend, but is now set within __xe_exec_queue_create(). Fixes: b4eecedc75c1 ("drm/xe: Fix potential deadlock handling page faults") Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> (cherry picked from commit a8004af338f6b3319476ecbed63ea49bf393fc1f) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe: Fix guc_exec_queue_set_priorityBrian Welty1-1/+1
We need to set q->priority prior to calling guc_exec_queue_add_msg() as that will call init_policies() and sets the scheduling properties to those stored in the exec_queue. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> (cherry picked from commit b16483f9f8120b530327879fa3ea576e897946da) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe: Annotate xe_ttm_stolen_mgr::mapping with __iomemThomas Hellström1-2/+2
The pointer points to IO memory, but the __iomem annotation was incorrectly placed. Annotate it correctly, update its usage accordingly and fix the corresponding sparse error. Fixes: d8b52a02cb40 ("drm/xe: Implement stolen memory.") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-5-thomas.hellstrom@linux.intel.com (cherry picked from commit dcddb6f0b06d454c9a3b2b240a43f0e7310c7f7c) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe: Annotate multiple mmio pointers with __iomemThomas Hellström1-3/+3
There are a couple of pointers pointing to MMIO space. Annotate them with __iomem and fix the corresponding sparse warnings. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Fixes: 3b0d4a557996 ("drm/xe: Move register MMIO into xe_tile") Fixes: 399a13323f0d ("drm/xe: add 28-bit address support in struct xe_reg") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Koby Elbaz <kelbaz@habana.ai> Cc: Ofir Bitton <obitton@habana.ai> Cc: Moti Haimovski <mhaimovski@habana.ai> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-4-thomas.hellstrom@linux.intel.com (cherry picked from commit 9d612ee52c6096bc70d43f54921ba2831ffbf1ad) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-01-15drm/xe: Annotate xe_mem_region::mapping with __iomemThomas Hellström2-3/+3
The pointer points to IO memory, but the __iomem annotation was incorrectly placed. Annotate it correctly, update its usage accordingly and fix the corresponding sparse error. Fixes: 0887a2e7ab62 ("drm/xe: Make xe_mem_region struct") Cc: Oak Zeng <oak.zeng@intel.com> Cc: Michael J. Ruhl <michael.j.ruhl@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-3-thomas.hellstrom@linux.intel.com (cherry picked from commit 20855b62a30538361e587cfc7c5245f07d4f826a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>