path: root/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
Age | Commit message | Author | Files | Lines
2024-03-04 | drm/amdgpu: change vm->task_info handling | Shashank Sharma | 1 | -9/+14
This patch changes the handling and lifecycle of the vm->task_info object. The major changes are:
- vm->task_info is now a dynamically allocated pointer, and its usage is reference counted.
- Two new helper functions are introduced for task_info lifecycle management:
  - amdgpu_vm_get_task_info: increments the task_info reference count before returning the info
  - amdgpu_vm_put_task_info: decrements the task_info reference count
- The last put of task_info frees it from the vm.
This patch also makes the logistical changes required for existing users of vm->task_info.
V2: Do not block all the prints when task_info not found (Felix)
V3: Fixed review comments from Felix
- Fix wrong indentation
- No debug message for -ENOMEM
- Add NULL check for task_info
- Do not duplicate the debug messages (ti vs no ti)
- Get first reference of task_info in vm_init(), put last in vm_fini()
V4: Fixed review comments from Felix
- fix double reference increment in create_task_info
- change amdgpu_vm_get_task_info_pasid
- additional changes in amdgpu_gem.c while porting
Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
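The get/put lifecycle described in this commit can be illustrated with a minimal userspace sketch. Everything here is hypothetical: `struct task_info`, `struct vm`, and the function names are stand-ins rather than the driver's definitions, and the counter is a plain `int` where kernel code would normally use an atomic reference count such as `kref`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-VM task info; not the kernel struct. */
struct task_info {
    int refcount;   /* number of outstanding references (not atomic here) */
    int pid;
};

struct vm {
    struct task_info *ti;   /* dynamically allocated, may be NULL */
};

/* Take the first reference when the VM is set up. Returns 0, or -1 on OOM. */
static int vm_init(struct vm *vm, int pid)
{
    vm->ti = calloc(1, sizeof(*vm->ti));
    if (!vm->ti)
        return -1;
    vm->ti->refcount = 1;
    vm->ti->pid = pid;
    return 0;
}

/* Count the reference up before handing the pointer out; NULL if absent. */
static struct task_info *vm_get_task_info(struct vm *vm)
{
    if (!vm->ti)
        return NULL;
    vm->ti->refcount++;
    return vm->ti;
}

/* Count the reference down; the last put frees. Returns 1 if freed. */
static int put_task_info(struct task_info *ti)
{
    if (!ti)
        return 0;
    assert(ti->refcount > 0);
    if (--ti->refcount == 0) {
        free(ti);
        return 1;
    }
    return 0;
}

/* Drop the initial reference taken in vm_init(). */
static int vm_fini(struct vm *vm)
{
    int freed = put_task_info(vm->ti);

    vm->ti = NULL;
    return freed;
}
```

A caller pairs every successful `vm_get_task_info()` with one `put_task_info()`; the object disappears only when `vm_fini()` drops the reference taken at init and no other holder remains.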
2024-02-22 | drm/amdgpu: Use correct SRIOV macro for gmc_v9_0_vm_fault_interrupt_state | Victor Lu | 1 | -4/+4
Under SRIOV, programming to VM_CONTEXT*_CNTL regs failed because the current macro does not pass through the correct xcc instance. Use the *REG32_XCC macro in this case. Without SRIOV, behaviour is unchanged by this patch. Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Zhigang Luo <Zhigang.Luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 | drm/amdgpu: Avoid fetching VRAM vendor info | Lijo Lazar | 1 | -8/+0
The present way to fetch VRAM vendor information turns out to be not reliable on GFX 9.4.3 dGPUs as well. Avoid using the data. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-31 | drm/amdgpu: Fix missing error code in 'gmc_v6/7/8/9_0_hw_init()' | Srinivasan Shanmugam | 1 | -2/+2
Return 0 for success scenarios in 'gmc_v6/7/8/9_0_hw_init()'.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:920 gmc_v6_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1104 gmc_v7_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1224 gmc_v8_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2347 gmc_v9_0_hw_init() warn: missing error code? 'r'
Fixes: fac4ebd79fed ("drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-23 | drm/amdgpu: Avoid fetching vram vendor information | Lijo Lazar | 1 | -1/+2
For GFX 9.4.3 APUs, the current method of fetching vram vendor information is not reliable. Avoid fetching the information. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-16 | drm/amdgpu: move kiq_reg_write_reg_wait() out of amdgpu_virt.c | Alex Deucher | 1 | -5/+7
It's used for more than just SR-IOV now, so move it to amdgpu_gmc.c and rename it to better match the functionality and update the comments in the code paths to better document when each path is used and why. No functional change. Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Shaoyun.Liu@amd.com Cc: Christian.Koenig@amd.com
2024-01-03 | drm/amdgpu: Fix ecc irq enable/disable unpaired | Stanley.Yang | 1 | -0/+4
The ecc_irq is disabled during the GPU mode2 reset suspend process, but is not enabled again during the GPU mode2 reset resume process.
Changed from V1: only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip; delete the amdgpu_ras_late_resume function
Changed from V2: check that umc ras is supported before putting ecc_irq
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 | drm/amdgpu: Use the right method to get IP version | Lijo Lazar | 1 | -1/+1
Replace direct usage of adev->ip_versions with amdgpu_ip_version. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 | drm/amdgpu/gmc9: disable AGP aperture | Alex Deucher | 1 | -1/+1
We've had misc reports of random IOMMU page faults when this is used. It's just a rarely used optimization anyway, so let's just disable it. It can still be toggled via the module parameter for testing. v2: leave it configurable via module parameter Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1) Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 | drm/amdgpu: add a module parameter to control the AGP aperture | Alex Deucher | 1 | -1/+1
Add a module parameter to control the AGP aperture. The AGP aperture is an aperture in the GPU's internal address space which provides direct non-paged access to the platform address space. This access is non-snooped so only uncached memory can be accessed. Add a knob so that we can toggle this for debugging. Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 | drm/amdgpu: finalizing mem_partitions at the end of GMC v9 sw_fini | Le Ma | 1 | -2/+3
The valid num_mem_partitions is required during ttm pool fini, thus move the cleanup at the end of the function. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-10 | drm/amdgpu: Change extended-scope MTYPE on GC 9.4.3 | David Yat Sin | 1 | -2/+5
Change local memory type to MTYPE_UC on revision id 0 Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-10 | drm/amdgpu: Add xcc param to SRIOV kiq write and WREG32_SOC15_IP_NO_KIQ (v4) | Victor Lu | 1 | -11/+15
WREG32/RREG32_SOC15_IP_NO_KIQ and amdgpu_virt_kiq_reg_write_reg_wait are not using the correct rlcg interface or mec engine, respectively. Add xcc instance parameter to them.
v4: Use GET_INST and squash commit with: "drm/amdgpu: Add xcc_inst param to amdgpu_virt_kiq_reg_write_reg_wait"
v3: xcc not needed for MMHUB
v2: rebase
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-10 | drm/amdgpu: fix AGP init order | Alex Deucher | 1 | -0/+2
The default AGP settings were overwriting the IP selected ones since the default was getting set after the IP ones were selected. Fixes: de59b69932e6 ("drm/amdgpu/gmc: set a default disable value for AGP") Link: https://lists.freedesktop.org/archives/amd-gfx/2023-November/100966.html Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
2023-10-27 | drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems | David Francis | 1 | -11/+22
On gfx943 APU, EXT_COHERENT should give MTYPE_CC for local and MTYPE_UC for nonlocal memory. On NUMA systems, local memory gets the local mtype, set by an override callback. If EXT_COHERENT is set, memory will be set as MTYPE_UC by default, with local memory MTYPE_CC. Add an option in the override function for this case, and add a check to ensure it is not used on UNCACHED memory. V2: Combined APU and NUMA code into one patch V3: Fixed a potential nullptr in amdgpu_vm_bo_update Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-27 | drm/amdgpu remove restriction of sriov max_pfn on Vega10 | Lin.Cao | 1 | -5/+2
Remove the restriction on sriov max_pfn so that TBA and TMA can move to high 47-bit addresses. Regression test: change the range alloc flag of libdrm to AMDGPU_VA_RANGE_HIGH; no FLR occurs when running amdgpu_test of drm. Signed-off-by: Lin.Cao <lincao12@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 | drm/amdgpu: replace reset_error_count with amdgpu_ras_reset_error_count | Tao Zhou | 1 | -7/+2
Simplify the code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 | drm/amdgpu: ratelimited override pte flags messages | Philip Yang | 1 | -8/+8
Use ratelimited version of dev_dbg to avoid flooding dmesg log. No functional change. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 | drm/amdgpu: cache gpuvm fault information for gmc7+ | Alex Deucher | 1 | -4/+7
Cache the current fault info in the vm struct. This can be queried by userspace later to help debug UMDs. Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 | drm/amdgpu: Use ttm_pages_limit to override vram reporting | Rajneesh Bhardwaj | 1 | -8/+1
On GFXIP9.4.3 APU, allow the memory reporting as per the ttm pages limit in NPS1 mode. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 | drm/amdgpu/gmc: add a way to force a particular placement for GART | Alex Deucher | 1 | -1/+1
We normally place GART based on the location of VRAM and the available address space around that, but provide an option to force a particular location for hardware that needs it. v2: Switch to passing the placement via parameter Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-27 | drm/amdgpu/gmc: set a default disable value for AGP | Alex Deucher | 1 | -1/+2
To disable AGP, the start needs to be set to a higher value than the end. Set a default disable value for the AGP aperture and allow the IP specific GMC code to enable it selectively by calling amdgpu_gmc_agp_location(). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
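A small sketch of the "start above end means disabled" convention, and of why the default has to be set before the IP-specific code runs (the ordering problem addressed by the later "fix AGP init order" commit in this log). The struct, the function names, and the address values are illustrative assumptions, not the driver's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical aperture description; field names are illustrative. */
struct aperture {
    uint64_t start;
    uint64_t end;
};

/* Default: start above end, so no address can fall inside the range. */
static void aperture_set_disabled(struct aperture *ap)
{
    ap->start = UINT64_MAX;
    ap->end = 0;
}

/* Selectively enable the aperture with a valid, non-empty range. */
static void aperture_enable(struct aperture *ap, uint64_t start, uint64_t end)
{
    assert(start <= end);
    ap->start = start;
    ap->end = end;
}

static bool aperture_contains(const struct aperture *ap, uint64_t addr)
{
    return addr >= ap->start && addr <= ap->end;
}

/*
 * Order matters: write the disabled default first, then let the
 * IP-specific choice override it. Doing it the other way round
 * would silently discard the IP-specific setting.
 */
static void gmc_init(struct aperture *ap, bool ip_wants_agp)
{
    aperture_set_disabled(ap);
    if (ip_wants_agp)
        aperture_enable(ap, 0x1000, 0x1fff);   /* made-up addresses */
}
```

With the disabled default, `aperture_contains()` is false for every address, so no separate "enabled" flag is needed.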
2023-09-26 | drm/amdgpu: further move TLB hw workarounds a layer up | Christian König | 1 | -51/+23
For the PASID flushing we already handled that at a higher layer, apply those workarounds to the standard flush as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-26 | drm/amdgpu: rework lock handling for flush_tlb v2 | Christian König | 1 | -5/+1
Instead of each implementation doing this more or less correctly move taking the reset lock at a higher level. v2: fix typo Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-26 | drm/amdgpu: drop error return from flush_gpu_tlb_pasid | Christian König | 1 | -5/+3
That function never fails, drop the error return. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-26 | drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb_pasid | Christian König | 1 | -76/+35
Testing for reset is pointless since the reset can start right after the test. The same PASID can be used by more than one VMID, invalidate each of them. Move the KIQ and all the workaround handling into common GMC code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
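The point that one PASID can be bound to several VMIDs can be sketched as a scan that does not stop at the first match. The table layout, `NUM_VMIDS`, and `flush_pasid()` are assumptions made for illustration; the real code issues hardware invalidations rather than recording a bitmask.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_VMIDS 16   /* assumed table size for this sketch */

/*
 * Walk a hypothetical VMID -> PASID table and record every VMID that
 * currently maps to the requested PASID. Returns the number of VMIDs
 * that would be invalidated; *flushed_mask has one bit per VMID.
 */
static int flush_pasid(const uint16_t pasid_of_vmid[NUM_VMIDS],
                       uint16_t pasid, uint32_t *flushed_mask)
{
    int flushed = 0;

    *flushed_mask = 0;
    for (int vmid = 0; vmid < NUM_VMIDS; vmid++) {
        if (pasid_of_vmid[vmid] != pasid)
            continue;
        /* No early exit: the same PASID may appear again later. */
        *flushed_mask |= 1u << vmid;
        flushed++;
    }
    return flushed;
}
```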
2023-09-26 | drm/amdgpu: fix value of some UMC parameters for UMC v12 | Tao Zhou | 1 | -1/+3
Prepare for bad page retirement. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-26 | drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb | Christian König | 1 | -11/+18
The KIQ code path was ignoring the second flush. Also avoid long lines and re-calculating the register offsets over and over again. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-20 | drm/amdgpu: Add EXT_COHERENT memory allocation flags | David Francis | 1 | -1/+4
These flags (for GEM and SVM allocations) allocate memory that allows for system-scope atomic semantics. On GFX943 these flags cause caches to be avoided on non-local memory. On all other ASICs they are identical in functionality to the equivalent COHERENT flags. Corresponding Thunk patch is at https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88 Reviewed-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-20 | drm/amdgpu: Use function for IP version check | Lijo Lazar | 1 | -44/+52
Use an inline function for version check. Gives more flexibility to handle any format changes. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-12 | drm/amdgpu: add channel index table for UMC v12 | Tao Zhou | 1 | -0/+1
Get UMC physical channel index according to node id, umc instance and channel instance. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-06 | drm/amdgpu: Add umc v12_0 ras functions | Candice Li | 1 | -2/+15
Add umc v12_0 ras error querying. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 | drm/amdgpu: Fix kcalloc over kzalloc in 'gmc_v9_0_init_mem_ranges' | Srinivasan Shanmugam | 1 | -4/+3
Replace kzalloc(n * sizeof(...), ...) with kcalloc(n, sizeof(...), ...) since kcalloc is the preferred API in case of allocating with multiply. Fixes the below: WARNING: Prefer kcalloc over kzalloc with multiply Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
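The reason an array allocator is preferred over an open-coded multiply is that `n * sizeof(...)` can wrap around and yield a buffer that is too small. The userspace sketch below uses `calloc()` as an analogue of `kcalloc()`; `struct mem_range` and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative element type; not the driver's structure. */
struct mem_range {
    uint64_t start;
    uint64_t size;
};

/* Open-coded multiply: the product can wrap and under-allocate. */
static size_t unchecked_bytes(size_t n, size_t elem)
{
    return n * elem;
}

/* Checked multiply: returns 0 and sets *out only if the product fits. */
static int checked_bytes(size_t n, size_t elem, size_t *out)
{
    if (elem != 0 && n > SIZE_MAX / elem)
        return -1;
    *out = n * elem;
    return 0;
}

/* Array-style allocation: count and element size are passed separately. */
static struct mem_range *alloc_ranges(size_t n)
{
    return calloc(n, sizeof(struct mem_range));
}
```

Passing the count and the element size as separate arguments lets the allocator perform the overflow check itself, which is the behaviour the checkpatch warning is steering towards.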
2023-08-16 | drm/amdgpu: Add memory vendor information | Lijo Lazar | 1 | -8/+18
For ASICs with GC v9.4.3, determine the vendor information from scratch register. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-10 | drm/amdgpu: Prefer dev_warn over printk | Srinivasan Shanmugam | 1 | -1/+1
Fix the below warning: WARNING: Prefer [subsystem eg: netdev]_warn([subsystem]dev, ... then dev_warn(dev, ... then pr_warn(... to printk(KERN_WARNING ... Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-07 | drm/amdgpu: Fix error & warnings in gmc_v9_0.c | Srinivasan Shanmugam | 1 | -20/+17
Fix below checkpatch error & warnings:
ERROR: that open brace { should be on the previous line
WARNING: static const char * array should probably be static const char * const
WARNING: Block comments use * on subsequent lines
WARNING: Block comments use a trailing */ on a separate line
Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-07 | drm/amdgpu: Update invalid PTE flag setting | Mukul Joshi | 1 | -0/+1
Update the invalid PTE flag setting with TF enabled. This is to ensure, in addition to transitioning the retry fault to a no-retry fault, it also causes the wavefront to enter the trap handler. With the current setting, the fault only transitions to a no-retry fault. Additionally, have 2 sets of invalid PTE settings, one for TF enabled, the other for TF disabled. The setting with TF disabled, doesn't work with TF enabled. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-16 | drm/amdgpu: Enable translate further for GC v9.4.3 | Philip Yang | 1 | -0/+1
To extend UTCL2 reach. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 | drm/amdgpu: Fix up missing parameter in kdoc for 'inst' in gmc_ v7, v8, v9, v10, v11.c | Srinivasan Shanmugam | 1 | -0/+1
Fix these warnings by adding 'inst' arguments to kdocs. gcc with W=1
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:428: warning: Function parameter or member 'inst' not described in 'gmc_v7_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:626: warning: Function parameter or member 'inst' not described in 'gmc_v8_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c:423: warning: Function parameter or member 'inst' not described in 'gmc_v10_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:328: warning: Function parameter or member 'inst' not described in 'gmc_v11_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:950: warning: Function parameter or member 'inst' not described in 'gmc_v9_0_flush_gpu_tlb_pasid'
Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 | drm/amdgpu: bypass bios dependent operations | Shiwu Zhang | 1 | -24/+39
BIOS reading does not currently work, so bypass all BIOS-related operations.
v2: hardcode the vram info for APP_APU case (hawking)
v3: correct the vram_width with channel number * channel size (lijo)
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 | drm/amdgpu: Fix unsigned comparison with zero in gmc_v9_0_process_interrupt() | Harshit Mogalapalli | 1 | -2/+2
Smatch warns: drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:579: unsigned 'xcc_id' is never less than zero. gfx_v9_4_3_ih_to_xcc_inst() returns negative numbers as well. Fix this by changing type of xcc_id to int. Fixes: 98b2e9cad227 ("drm/amdgpu: correct the vmhub index when page fault occurs") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
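The class of bug Smatch reports here can be reproduced in a few lines: a helper returns a negative value on failure, but the caller stores the result in an unsigned variable, so a `< 0` check can never fire. `node_to_inst()` and its limit of 8 are hypothetical stand-ins for the real lookup.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lookup: returns an instance index, or -1 if none matches. */
static int node_to_inst(uint32_t node_id)
{
    return node_id < 8 ? (int)node_id : -1;
}

/*
 * What the buggy caller effectively stored: -1 converted to an
 * unsigned type becomes a huge positive value, never "less than zero".
 */
static uint32_t inst_as_unsigned(uint32_t node_id)
{
    return (uint32_t)node_to_inst(node_id);
}

/* Fixed caller: keep the result signed so the error can be detected. */
static int lookup_inst(uint32_t node_id, int *xcc_id)
{
    int id = node_to_inst(node_id);

    if (id < 0)
        return -1;
    *xcc_id = id;
    return 0;
}
```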
2023-06-09 | drm/amdgpu/gmc9: fix 64 bit division in partition code | Alex Deucher | 1 | -5/+6
Rework logic or use do_div() to avoid problems on 32 bit.
v2: add a missing case for XCP macro
v3: fix out of bounds array access
v4: fix xcp handling harder
Acked-by: Guchun Chen <guchun.chen@amd.com> (v1) Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> (v3) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
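In the kernel, `do_div(n, base)` divides the 64-bit value `n` in place by a 32-bit `base` and returns the remainder, which avoids an open-coded 64-bit `/` on 32-bit builds. The sketch below only mirrors that calling convention in userspace; `div_u64_inplace()` and `partition_size()` are hypothetical names, and the plain `/` inside is exactly what the kernel macro replaces.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in with do_div()-like semantics: divide *n in place
 * by a 32-bit base and return the remainder.
 */
static uint32_t div_u64_inplace(uint64_t *n, uint32_t base)
{
    uint32_t rem;

    assert(base != 0);
    rem = (uint32_t)(*n % base);
    *n /= base;
    return rem;
}

/* Split a memory size evenly across partitions, discarding the remainder. */
static uint64_t partition_size(uint64_t total, uint32_t num_partitions)
{
    uint64_t size = total;

    (void)div_u64_inplace(&size, num_partitions);
    return size;
}
```

The in-place update is the part that is easy to get wrong: the quotient ends up in the first argument and the return value is the remainder, not the other way round.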
2023-06-09 | drm/amdkfd: Store xcp partition id to amdgpu bo | Philip Yang | 1 | -1/+1
For memory accounting per compute partition, and for exporting a drm amdgpu bo and then importing it to KFD, we need the xcp id to account the memory usage or to find the KFD node of the original amdgpu bo, so that the KFD bo is created on the correct adev KFD node. Set xcp_id_plus1 of amdgpu_bo_param when creating the bo and store xcp_id in the amdgpu bo. Add a helper macro to get the mem_id from adev and xcp_id. v2: squash in fix ("drm/amdgpu: Fix BO creation failure on GFX 9.4.3 dGPU") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
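One plausible reading of the `xcp_id_plus1` name is the usual "plus one" encoding: a zero-initialized parameter block then means "no partition given", while partition 0 is stored as 1. The sketch below illustrates that idea only; the struct layout, helper names, and the -1 sentinel are assumptions, not taken from the driver.

```c
#include <assert.h>

#define XCP_ID_NONE (-1)   /* hypothetical "no partition" sentinel */

/* Creation parameters: zero-initialized means no partition was chosen. */
struct bo_param {
    int xcp_id_plus1;
};

/* The created object stores the decoded id. */
struct bo {
    int xcp_id;
};

/* Encode a partition id into the parameter block. */
static void bo_param_set_xcp_id(struct bo_param *bp, int xcp_id)
{
    assert(xcp_id >= 0);
    bp->xcp_id_plus1 = xcp_id + 1;
}

/* Decode on creation: 0 becomes XCP_ID_NONE, k + 1 becomes k. */
static void bo_create(struct bo *bo, const struct bo_param *bp)
{
    bo->xcp_id = bp->xcp_id_plus1 - 1;
}
```

The benefit of this encoding is that existing callers which never touch the new field keep their old behaviour after a plain `memset` or `= {0}` initialization.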
2023-06-09 | drm/amdkfd: Update MTYPE for far memory partition | Philip Yang | 1 | -8/+7
Use MTYPE_RW/MTYPE_CC for mapping system memory or VRAM to KFD node within the same memory partition, use MTYPE_NC for mapping on KFD node from the far memory partition of the same socket or from another socket on same XGMI hive. On NPS4 or 4P system, MTYPE will be overridden per page depending on the memory NUMA node id and vm->mem_id. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 | drm/amdgpu/bu: update mtype_local parameter settings | Graham Sider | 1 | -6/+6
Update mtype_local module parameter to use MTYPE_RW by default.
0: MTYPE_RW (default)
1: MTYPE_NC
2: MTYPE_CC
Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
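The parameter-to-MTYPE mapping listed in this commit can be written as a small lookup. The enum below is illustrative (its numeric values are not the hardware encodings), and falling back to the default for out-of-range input is an assumption of this sketch rather than documented driver behaviour.

```c
#include <assert.h>

/* Illustrative enumerators; values are NOT the hardware MTYPE encodings. */
enum mtype {
    MTYPE_RW,
    MTYPE_NC,
    MTYPE_CC,
};

/* Map the mtype_local parameter value to a memory type. */
static enum mtype mtype_from_param(int mtype_local)
{
    switch (mtype_local) {
    case 1:
        return MTYPE_NC;
    case 2:
        return MTYPE_CC;
    case 0:
    default:
        /* 0 is the documented default; other values fall back to it here. */
        return MTYPE_RW;
    }
}
```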
2023-06-09 | drm/amdgpu/bu: add mtype_local as a module parameter | David Francis | 1 | -3/+16
Selects the MTYPE to be used for local memory, (0 = MTYPE_CC (default), 1 = MTYPE_NC, 2 = MTYPE_RW) v2: squash in build fix (Alex) Reviewed-by: Graham Sider <Graham.Sider@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 | drm/amdgpu: Override MTYPE per page on GFXv9.4.3 APUs | Felix Kuehling | 1 | -0/+64
On GFXv9.4.3 NUMA APUs, system memory locality must be determined per page to choose the correct MTYPE. This patch adds a GMC callback that can provide this per-page override and implements it for native mode. Carve-out mode is not yet supported and will use the safe default (remote) MTYPE for system memory. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 | drm/amdgpu: Fix per-BO MTYPE selection for GFXv9.4.3 | Felix Kuehling | 1 | -24/+16
Treat system memory on NUMA systems as remote by default. Overriding with a more efficient MTYPE per page will be implemented in the next patch. No need for a special case for APP APUs. System memory is handled the same for carve-out and native mode. And VRAM doesn't exist in native mode. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 | drm/amdgpu/bu: Add use_mtype_cc_wa module param | Graham Sider | 1 | -3/+7
By default, set use_mtype_cc_wa to 1 to set PTE coherence flag MTYPE_CC instead of MTYPE_RW by default. This is required for the time being to mitigate a bug causing XCCs to hit stale data due to TCC marking fully dirty lines as exclusive. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 | drm/amdgpu: Use legacy TLB flush for gfx943 | Graham Sider | 1 | -0/+12
Invalidate TLBs via a legacy flush request (flush_type=0) prior to the heavyweight flush requests (flush_type=2) in gmc_v9_0.c. This is temporarily required to mitigate a bug causing CPC UTCL1 to return stale translations after invalidation requests in address range mode. v2: squash in long term fix "drm/amdgpu: disable extra gfx943 legacy flush on rev1+" Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
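The workaround ordering described here, a legacy flush (type 0) issued ahead of a heavyweight flush (type 2), can be sketched with a recorder in place of real register writes. `struct flush_log`, `issue_flush()`, and the `needs_legacy_wa` flag are hypothetical; how the driver decides that the workaround applies (for example by hardware revision) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FLUSHES 4

/* Records the flush types in the order they would reach the hardware. */
struct flush_log {
    int types[MAX_FLUSHES];
    int count;
};

static void issue_flush(struct flush_log *log, int flush_type)
{
    assert(log->count < MAX_FLUSHES);
    log->types[log->count++] = flush_type;
}

/*
 * When the workaround is needed, precede a heavyweight flush (type 2)
 * with a legacy flush (type 0). Other flush types are issued as-is.
 */
static void flush_tlb(struct flush_log *log, int flush_type,
                      bool needs_legacy_wa)
{
    if (needs_legacy_wa && flush_type == 2)
        issue_flush(log, 0);
    issue_flush(log, flush_type);
}
```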