summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
AgeCommit message (Collapse)AuthorFilesLines
2024-02-15drm/amdgpu: Fix implicit assumtion in gfx11 debug flagsRajneesh Bhardwaj1-2/+2
Gfx11 debug flags mask is currently set with an implicit assumption that no other mqd update flags exist. This needs to be fixed with newly introduced flag UPDATE_FLAG_IS_GWS by the previous patch. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-25drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDsAlex Deucher1-0/+1
This needs to be set to 1 to avoid a potential deadlock in the GC 10.x and newer. On GC 9.x and older, this needs to be set to 0. This can lead to hangs in some mixed graphics and compute workloads. Updated firmware is also required for AQL. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-09-12drm/amdkfd: Checkpoint and restore queues on GFX11David Francis1-0/+41
The code in kfd_mqd_manager_v11.c to support criu dump and restore of queue state was missing. Added it; should be equivalent to kfd_mqd_manager_v10.c. CC: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-12drm/amdkfd: Update CU masking for GFX 9.4.3Mukul Joshi1-1/+1
The CU mask passed from user-space will change based on different spatial partitioning mode. As a result, update CU masking code for GFX9.4.3 to work for all partitioning modes. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-01drm/amdkfd: Add missing gfx11 MQD manager callbacksJay Cornwall1-0/+3
mqd_stride function was introduced in commit 2f77b9a242a2 ("drm/amdkfd: Update MQD management on multi XCC setup") but not assigned for gfx11. Fixes a NULL dereference in debugfs. Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.5.x
2023-07-07drm/amdkfd: Use KIQ to unmap HIQMukul Joshi1-1/+21
Currently, we unmap HIQ by directly writing to HQD registers. This doesn't work for GFX9.4.3. Instead, use KIQ to unmap HIQ, similar to how we use KIQ to map HIQ. Using KIQ to unmap HIQ works for all GFX series post GFXv9. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: add debug suspend and resume process queues operationJonathan Kim1-8/+7
In order to inspect waves from the saved context at any point during a debug session, the debugger must be able to preempt queues to trigger context save by suspending them. On queue suspend, the KFD will copy the context save header information so that the debugger can correctly crawl the appropriate size of the saved context. The debugger must then also be allowed to resume suspended queues. A queue that is newly created cannot be suspended because queue ids are recycled after destruction so the debugger needs to know that this has occurred. Query functions will be later added that will clear a given queue of its new queue status. A queue cannot be destroyed while it is suspended to preserve its saved context during debugger inspection. Have queue destruction block while a queue is suspended and unblocked when it is resumed. Likewise, if a queue is about to be destroyed, it cannot be suspended. Return the number of queues successfully suspended or resumed along with a per queue status array where the upper bits per queue status show that the request was invalid (new/destroyed queue suspend request, missing queue) or an error occurred (HWS in a fatal state so it can't suspend or resume queues). Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: apply trap workaround for gfx11Jonathan Kim1-11/+31
Due to a HW bug, waves in only half the shader arrays can enter trap. When starting a debug session, relocate all waves to the first shader array of each shader engine and mask off the 2nd shader array as unavailable. When ending a debug session, re-enable the 2nd shader array per shader engine. User CU masking per queue cannot be guaranteed to remain functional if requested during debugging (e.g. user cu mask requests only 2nd shader array as an available resource leading to zero HW resources available) nor can runtime be alerted of any of these changes during execution. Make user CU masking and debugging mutual exclusive with respect to availability. If the debugger tries to attach to a process with a user cu masked queue, return the runtime status as enabled but busy. If the debugger tries to attach and fails to reallocate queue waves to the first shader array of each shader engine, return the runtime status as enabled but with an error. In addition, like any other mutli-process debug supported devices, disable trap temporary setup per-process to avoid performance impact from setup overhead. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: setup hw debug registers on driver initializationJonathan Kim1-0/+5
Add missing debug trap registers references and initialize all debug registers on boot by clearing the hardware exception overrides and the wave allocation ID index. The debugger requires that TTMPs 6 & 7 save the dispatch ID to map waves onto dispatch during compute context inspection. In order to correctly set this up, set the special reserved CP bit by default whenever the MQD is initailized. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: Update context save handling for multi XCC setup (v2)Mukul Joshi1-0/+1
Context save handling needs to be updated for a multi XCC setup: - On a multi XCC setup, KFD needs to report context save base address and size for each XCC in MQD. - Thunk will allocate a large context save area covering all XCCs which will be equal to: num_of_xccs in a partition * size of context save area for 1 XCC. However, it will report only the size of context save area for 1 XCC only in the ioctl call. - Driver then setups the MQD correctly using the size passed from Thunk and information about number of XCCs in a partition. - Update get_wave_state function to return context save area for all XCCs in the partition. v2: update the get_wave_state function for mqd manager v11 (Morris) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Morris Zhang <Shiwu.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: Add XCC instance to kgd2kfd interface (v3)Mukul Joshi1-1/+1
Gfx 9 starts to have multiple XCC instances in one device. Add instance parameter to kgd2kfd functions where XCC instance was hard coded as 0. Also, update code to pass the correct instance number when running on a multi-XCC setup. v2: introduce the XCC instance to gfx v11 (Morris) v3: rebase (Alex) Signed-off-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Morris Zhang <Shiwu.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: Introduce kfd_node struct (v5)Mukul Joshi1-9/+9
Introduce a new structure, kfd_node, which will now represent a compute node. kfd_node is carved out of kfd_dev structure. kfd_dev struct now will become the parent of kfd_node, and will store common resources such as doorbells, GTT sub-alloctor etc. kfd_node struct will store all resources specific to a compute node, such as device queue manager, interrupt handling etc. This is the first step in adding compute partition support in KFD. v2: introduce kfd_node struct to gc v11 (Hawking) v3: make reference to kfd_dev struct through kfd_node (Morris) v4: use kfd_node instead for kfd isr/mqd functions (Morris) v5: rebase (Alex) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Morris Zhang <Shiwu.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: Enable GFX11 SDMA context empty interruptGraham Sider1-0/+4
Enable SDMA queue empty context switching. SDMA context switch due to quantum programming no longer done here (as of sdma v6), so re-name sdma_v6_0_ctx_switch_enable to sdma_v6_0_ctxempty_int_enable to reflect this. Also program SDMAx_QUEUEx_SCHEDULE_CNTL for context switch due to quantum in KFD. Set to amdgpu_sdma_phase_quantum (defaults to 32 i.e. 3200us). Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdkfd: Check PCIe atomics support on GFX11 to set CP_HQD_HQ_STATUS0[29]Sreekant Somasekharan1-0/+7
CP_HQD_HQ_STATUS0[29] bit will be used by CPFW to acknowledge whether PCIe atomics are supported. The default value of this bit is set to 0. Driver will check whether PCIe atomics are supported and set the bit to 1 if supported. This will force CPFW to use real atomic ops. If the bit is not set, CPFW will default to read/modify/write using the firmware itself. This is applicable only to GFX11 RS64 CP with MEC FW >= 509. If MEC FW < 509 and for all GFX11 F32 CP, PCIe atomics needs to be supported else it will skip the device. This commit also involves moving amdgpu_amdkfd_device_probe() function call after per-IP early_init loop in amdgpu_device_ip_early_init() function so as to check for RS64 enabled device. Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Reviewed-by: Graham Sider <Graham.Sider@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-24drm/amdkfd: To fix sdma page fault issue for GC 11Ruili Ji1-1/+14
For the MQD memory, KMD would always allocate 4K memory, and mes scheduler would write to the end of MQD for unmap flag. Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-06drm/amdgpu: Enable F32_WPTR_POLL_ENABLE in mqdRuili Ji1-1/+2
This patch is to fix the SDMA user queue doorbell missing issue on SDMA 6.0. F32_WPTR_POLL_ENABLE has to be set if doorbell mode is used. Otherwise ringing SDMA user queue doorbell can't wake up system from gfxoff. Signed-off-by: Ruili Ji <ruiliji2@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-09-29drm/amdkfd: fix MQD init for GFX11 in init_mqdGraham Sider1-0/+4
Set remaining compute_static_thread_mgmt_se* accordingly. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19drm/amdkfd: Use the consolidated MQD manager functions for GFX11Shiwu Zhang1-73/+12
To remove duplication for GFX11 as well, use the common MQD manager functions defined in kfd_mqd_manager.c for all version of managers Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-24drm/amdkfd: Enable GFX11 usermode queue oversubscriptionGraham Sider1-0/+2
Starting with GFX11, MES requires wptr BOs to be GTT allocated/mapped to GART for usermode queues in order to support oversubscription. In the case that work is submitted to an unmapped queue, MES must have a GART wptr address to determine whether the queue should be mapped. This change is accompanied with changes in MES and is applicable for MES_API_VERSION >= 2. v3: - Use amdgpu_vm_bo_lookup_mapping for wptr_bo mapping lookup - Move wptr_bo refcount increment to amdgpu_amdkfd_map_gtt_bo_to_gart - Remove list_del_init from amdgpu_amdkfd_map_gtt_bo_to_gart - Cleanup/fix create_queue wptr_bo error handling v4: - Add MES version shift/mask defines to amdgpu_mes.h - Change version check from MES_VERSION to MES_API_VERSION - Add check in kfd_ioctl_create_queue before wptr bo pin/GART map to ensure bo is a single page. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04drm/amdkfd: Add KFD support for soc21 v3Mukul Joshi1-0/+508
Add initial support for soc21 in KFD compute driver (Mukul) - Add new definition for soc21 device. - Add new file for amdgpu-kfd interface for GFX11 family. - Add new file for queue management, interrupt handling, mqd management for GFX11 family in KFD driver. - Related changes/updates for soc21 device in KFD driver. - Repurpose last 2 entries of SDMA MQD for driver use. v2: Add an optional argument into update queue operation (Mukul) v3: Switch to ip version check, replace kgd_dev with amdgpu_device (Hawking) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>