summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
AgeCommit message (Collapse)AuthorFilesLines
2024-06-10drm/amdgpu: Fix the BO release clear memory warningArunpravin Paneer Selvam1-2/+0
This happens when the amdgpu_bo_release_notify running before amdgpu_ttm_set_buffer_funcs_status set the buffer funcs to enabled. check the buffer funcs enablement before calling the fill buffer memory. v2:(Christian) - Apply it only for GEM buffers and since GEM buffers are only allocated/freed while the driver is loaded we never run into the issue to clear with buffer funcs disabled. v3:(Mario) - drop the stable tag as this will presumably go into a -fixes PR for 6.10 Log snip: *ERROR* Trying to clear memory with ring turned off. RIP: 0010:amdgpu_bo_release_notify+0x201/0x220 [amdgpu] Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality") Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Tested-by: Richard Gong <richard.gong@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240610180401.9540-1-Arunpravin.PaneerSelvam@amd.com
2024-05-15Merge tag 'drm-next-2024-05-15' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds1-6/+19
Pull drm updates from Dave Airlie: "This is the main pull request for the drm subsystems for 6.10. In drivers the main thing is a new driver for ARM Mali firmware based GPUs, otherwise there are a lot of changes to amdgpu/xe/i915/msm and scattered changes to everything else. In the core a bunch of headers and Kconfig was refactored, along with the addition of a new panic handler which is meant to provide a user friendly message when a panic happens and graphical display is enabled. New drivers: - panthor: ARM Mali/Immortalis CSF-based GPU driver Core: - add a CONFIG_DRM_WERROR option - make more headers self-contained - grab resv lock in pin/unpin - fix vmap resv locking - EDID/eDP panel matching - Kconfig cleanups - DT sound bindings - Add SIZE_HINTS property for cursor planes - Add struct drm_edid_product_id and helpers. - Use drm device based logging in more drm functions. - drop seq_file.h from a bunch of places - use drm_edid driver conversions dp: - DP Tunnel documentation - MST read sideband cap - Adaptive sync SDP prep work ttm: - improve placement for TTM BOs in idle/busy handling panic: - Fixes for drm-panic, and option to test it. - Add drm panic to simpledrm, mgag200, imx, ast bridge: - improve init ordering - adv7511: allow GPIO pin sharing - tc358775: add tc358675 support panel: - AUO B120XAN01.0 - Samsung s6e3fa7 - BOE NT116WHM-N44 - CMN N116BCA-EA1, - CrystalClear CMT430B19N00 - Startek KD050HDFIA020-C020A - powertip PH128800T006-ZHC01 - Innolux G121X1-L03 - LG sw43408 - Khadas TS050 V2 - EDO RM69380 OLED - CSOT MNB601LS1-1 amdgpu: - HDCP/ODM/RAS fixes - Devcoredump improvements - Expose VCN activity via sysfs - SMY 13.0.x updates - Enable fast updates on DCN 3.1.4 - Add dclk and vclk reporting on additional devices - Add ACA RAS infrastructure - Implement TLB flush fence - EEPROM handling fixes - SMUIO 14.0.2 support - SMU 14.0.1 Updates - SMU 14.0.2 support - Sync page table freeing with TLB flushes - DML2 refactor - DC debug improvements - DCN 3.5.x Updates - GPU reset fixes - HDP fix for second GFX pipe on GC 10.x - Enable secondary GFX pipe on GC 10.3 - Refactor and clean up BACO/BOCO/BAMACO handling - Remove invalid TTM resource start check - UAF fix in VA IOCTL - GPUVM page fault redirection to secondary IH rings for IH 6.x - Initial support for mapping kernel queues via MES - Fix VRAM memory accounting amdkfd: - MQD handling cleanup - Preemption handling fixes for XCDs - TLB flush fix for GC 9.4.2 - Properly clean up workqueue during module unload - Fix memory leak process create failure - Range check CP bad op exception targets to avoid reporting invalid exceptions to userspace - Fix eviction fence handling - Fix leak in GPU memory allocation failure case - DMABuf import handling fix - Enable SQ watchpoint for gfx10 i915: - Adding new DG2 PCI ID - add context hints for GT frequency - enable only one CCS for compute workloads - new workarounds - Fix UAF on destroy against retire race and remove two earlier partial fixes - Limit the reserved VM space to only the platforms that need it - Fix gt reset with GuC submission is disable - Add and use gt_to_guc() wrapper i915/xe display: - Lunar Lake display enabling, including cdclk and other refactors - BIOS/VBT/opregion related refactor - Digital port related refactor/clean-up - Fix 2s boot time regression on DP panel replay init - Remove duplication on audio enable/disable on SDVO and g4x+ DP - Disable AuxCCS framebuffers if built for Xe - Make crtc disable more atomic - Increase DP idle pattern wait timeout to 2ms - Start using container_of_const() for some extra const safety - Fix Jasper Lake boot freeze - Enable MST mode for 128b/132b single-stream sideband - Enable Adaptive Sync SDP Support for DP - Fix MTL supported DP rates - removal of UHBR13.5 - PLL refactoring - Limit eDP MSO pipe only for display version 20 - More display refactor towards independence from i915 dev_priv - Convert i915/xe fbdev to DRM client - More initial work to make display code more independent from i915 xe: - improved error capture - clean up some uAPI leftovers - devcoredump update - Add BMG mocs table - Handle GSCCS ER interrupt - Implement xe2- and GuC workarounds - struct xe_device cleanup - Hwmon updates - Add LRC parsing for more GPU instruction - Increase VM_BIND number of per-ioctl Ops - drm/xe: Add XE_BO_GGTT_INVALIDATE flag - Initial development for SR-IOV support - Add new PCI IDs to DG2 platform - Move userptr over to start using hmm_range_fault msm: - Switched to generating register header files during build process instead of shipping pre-generated headers - Merged DPU and MDP4 format databases. - DP: - Stop using compat string to distinguish DP and eDP cases - Added support for X Elite platform (X1E80100) - Reworked DP aux/audio support - Added SM6350 DP to the bindings - GPU: - a7xx perfcntr reg fixes - MAINTAINERS updates - a750 devcoredump support radeon: - Silence UBSAN warnings related to flexible arrays nouveau: - move some uAPI objects to uapi headers omapdrm: - console fix ast: - add i2c polling qaic: - add debugfs entries exynos: - fix platform_driver .owner - drop cleanup code mediatek: - Use devm_platform_get_and_ioremap_resource() in mtk_hdmi_ddc_probe() - Add GAMMA 12-bit LUT support for MT8188 - Rename mtk_drm_* to mtk_* - Drop driver owner initialization - Correct calculation formula of PHY Timing" * tag 'drm-next-2024-05-15' of https://gitlab.freedesktop.org/drm/kernel: (1477 commits) drm/xe/ads: Use flexible-array drm/xe: Use ordered WQ for G2H handler drm/msm/gen_header: allow skipping the validation drm/msm/a6xx: Cleanup indexed regs const'ness drm/msm: Add devcoredump support for a750 drm/msm: Adjust a7xx GBIF debugbus dumping drm/msm: Update a6xx registers XML drm/msm: Fix imported a750 snapshot header for upstream drm/msm: Import a750 snapshot registers from kgsl MAINTAINERS: Add Konrad Dybcio as a reviewer for the Adreno driver MAINTAINERS: Add a separate entry for Qualcomm Adreno GPU drivers drm/msm/a6xx: Avoid a nullptr dereference when speedbin setting fails drm/msm/adreno: fix CP cycles stat retrieval on a7xx drm/msm/a7xx: allow writing to CP_BV counter selection registers drm: zynqmp_dpsub: Always register bridge Revert "drm/bridge: ti-sn65dsi83: Fix enable error path" drm/fb_dma: Add checks in drm_fb_dma_get_scanout_buffer() drm/fbdev-generic: Do not set physical framebuffer address drm/panthor: Fix the FW reset logic drm/panthor: Make sure we handle 'unknown group state' case properly ...
2024-05-01drm/amdgpu: once more fix the call oder in amdgpu_ttm_move() v2Christian König1-5/+9
This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap. The basic problem here is that after the move the old location is simply not available any more. Some fixes were suggested, but essentially we should call the move notification before actually moving things because only this way we have the correct order for DMA-buf and VM move notifications as well. Also rework the statistic handling so that we don't update the eviction counter before the move. v2: add missing NULL check Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: 94aeb4117343 ("drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3171 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org
2024-04-30Merge tag 'amd-drm-next-6.10-2024-04-26' of ↵Dave Airlie1-2/+8
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.10-2024-04-26: amdgpu: - Misc code cleanups and refactors - Support setting reset method at runtime - Report OD status - SMU 14.0.1 fixes - SDMA 4.4.2 fixes - VPE fixes - MES fixes - Update BO eviction priorities - UMSCH fixes - Reset fixes - Freesync fixes - GFXIP 9.4.3 fixes - SDMA 5.2 fixes - MES UAF fix - RAS updates - Devcoredump updates for dumping IP state - DSC fixes - JPEG fix - Fix VRAM memory accounting - VCN 5.0 fixes - MES fixes - UMC 12.0 updates - Modify contiguous flags handling - Initial support for mapping kernel queues via MES amdkfd: - Fix rescheduling of restore worker - VRAM accounting for SVM migrations - mGPU fix - Enable SQ watchpoint for gfx10 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240426221245.1613332-1-alexander.deucher@amd.com
2024-04-29Merge v6.9-rc6 into drm-nextDaniel Vetter1-0/+2
Thomas needs the defio fixes, Maíra needs the vkms fixes and Joonas has some fun with i915-gem conflicts. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-04-27drm/amdgpu: Modify the contiguous flags behaviourArunpravin Paneer Selvam1-1/+7
Now we have two flags for contiguous VRAM buffer allocation. If the application request for AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS, it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the buffer's placement function. This patch will change the default behaviour of the two flags. When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS - This means contiguous is not mandatory. - we will try to allocate the contiguous buffer. Say if the allocation fails, we fallback to allocate the individual pages. When we setTTM_PL_FLAG_CONTIGUOUS - This means contiguous allocation is mandatory. - we are setting this in amdgpu_bo_pin_restricted() before bo validation and check this flag in the vram manager file. - if this is set, we should allocate the buffer pages contiguously. the allocation fails, we return -ENOSPC. v2: - keep the mem_flags and bo->flags check as is(Christian) - place the TTM_PL_FLAG_CONTIGUOUS flag setting into the amdgpu_bo_pin_restricted function placement range iteration loop(Christian) - rename find_pages with amdgpu_vram_mgr_calculate_pages_per_block (Christian) - Keep the kernel BO allocation as is(Christain) - If BO pin vram allocation failed, we need to return -ENOSPC as RDMA cannot work with scattered VRAM pages(Philip) v3(Christian): - keep contiguous flag handling outside of pages_per_block calculation - remove the hacky implementation in contiguous flag error handling code v4(Christian): - use any variable and return value for non-contiguous fallback v5: rebase to amd-staging-drm-next branch Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-27drm/amdgpu: replace tmz flag into buffer flagFrank Min1-1/+1
Replace tmz flag into buffer flag to make it easier to understand and extend Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-24drm/amdgpu: Update BO eviction prioritiesFelix Kuehling1-0/+2
Make SVM BOs more likely to get evicted than other BOs. These BOs opportunistically use available VRAM, but can fall back relatively seamlessly to system memory. It also avoids SVM migrations evicting other, more important BOs as they will evict other SVM allocations first. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23drm/amdgpu: Update BO eviction prioritiesFelix Kuehling1-0/+2
Make SVM BOs more likely to get evicted than other BOs. These BOs opportunistically use available VRAM, but can fall back relatively seamlessly to system memory. It also avoids SVM migrations evicting other, more important BOs as they will evict other SVM allocations first. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23Merge drm/drm-next into drm-misc-nextMaxime Ripard1-11/+11
Maíra needs a backmerge to apply v3d patches, and Danilo for some nouveau patches. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-04-22drm/amdgpu: Enable clear page functionalityArunpravin Paneer Selvam1-4/+5
Add clear page support in vram memory region. v1(Christian): - Dont handle clear page as TTM flag since when moving the BO back in from GTT again we don't need that. - Make a specialized version of amdgpu_fill_buffer() which only clears the VRAM areas which are not already cleared - Drop the TTM_PL_FLAG_WIPE_ON_RELEASE check in amdgpu_object.c v2: - Modify the function name amdgpu_ttm_* (Alex) - Drop the delayed parameter (Christian) - handle amdgpu_res_cleared(&cursor) just above the size calculation (Christian) - Use AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE for clearing the buffers in the free path to properly wait for fences etc.. (Christian) v3(Christian): - Remove buffer clear code in VRAM manager instead change the AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE handling to set the DRM_BUDDY_CLEARED flag. - Remove ! from amdgpu_res_cleared(&cursor) check. v4(Christian): - vres flag setting move to vram manager file - use dma_fence_get_stub in amdgpu_ttm_clear_buffer function - make fence a mandatory parameter and drop the if and the get/put dance Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240419063538.11957-2-Arunpravin.PaneerSelvam@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2024-04-22Merge tag 'amd-drm-next-6.10-2024-04-19' of ↵Dave Airlie1-11/+11
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.10-2024-04-19: amdgpu: - DC resource allocation logic updates - DC IPS fixes - DC YUV fixes - DMCUB fixes - DML2 fixes - Devcoredump updates - USB-C DSC fix - Misc display code cleanups - PSR fixes - MES timeout fix - RAS updates - UAF fix in VA IOCTL - Fix visible VRAM handling during faults - Fix IP discovery handling during PCI rescans - Misc code cleanups - PSP 14 updates - More runtime PM code rework - SMU 14.0.2 support - GPUVM page fault redirection to secondary IH rings for IH 6.x - Suspend/resume fixes - SR-IOV fixes amdkfd: - Fix eviction fence handling - Fix leak in GPU memory allocation failure case - DMABuf import handling fix radeon: - Silence UBSAN warnings related to flexible arrays Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240419224332.2938259-1-alexander.deucher@amd.com
2024-04-17drm/amdgpu: fix visible VRAM handling during faultsChristian König1-11/+11
When we removed the hacky start code check we actually didn't took into account that *all* VRAM pages needs to be CPU accessible. Clean up the code and unify the handling into a single helper which checks if the whole resource is CPU accessible. The only place where a partial check would make sense is during eviction, but that is neglitible. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: aed01a68047b ("drm/amdgpu: Remove TTM resource->start visible VRAM condition v2") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org
2024-04-17drm/amdgpu: fix visible VRAM handling during faultsChristian König1-11/+11
When we removed the hacky start code check we actually didn't took into account that *all* VRAM pages needs to be CPU accessible. Clean up the code and unify the handling into a single helper which checks if the whole resource is CPU accessible. The only place where a partial check would make sense is during eviction, but that is neglitible. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: aed01a68047b ("drm/amdgpu: Remove TTM resource->start visible VRAM condition v2") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org
2024-03-01drm/amdgpu: use GTT only as fallback for VRAM|GTTChristian König1-0/+6
Try to fill up VRAM as well by setting the busy flag on GTT allocations. This fixes the issue that when VRAM was evacuated for suspend it's never filled up again unless the application is restarted. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240229134003.3688-2-christian.koenig@amd.com
2024-02-16drm/amdgpu: add shared fdinfo statsAlex Deucher1-0/+11
Add shared stats. Useful for seeing shared memory. v2: take dma-buf into account as well v3: use the new gem helper Link: https://lore.kernel.org/all/20231207180225.439482-1-alexander.deucher@amd.com/ Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Rob Clark <robdclark@gmail.com> Reviewed-by: Christian König <christian.keonig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2024-01-29Merge drm/drm-next into drm-misc-nextMaxime Ripard1-15/+10
Kickstart 6.9 development cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-01-25drm/ttm: replace busy placement with flags v6Somalapuram Amaranath1-5/+1
Instead of a list of separate busy placement add flags which indicate that a placement should only be used when there is room or if we need to evict. v2: add missing TTM_PL_FLAG_IDLE for i915 v3: fix auto build test ERROR on drm-tip/drm-tip v4: fix some typos pointed out by checkpatch v5: cleanup some rebase problems with VMWGFX v6: implement some missing VMWGFX functionality pointed out by Zack, rename the flags as suggested by Michel, rebase on drm-tip and adjust XE as well Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Reviewed-by: Zack Rusin <zack.rusin@broadcom.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240112125158.2748-4-christian.koenig@amd.com
2023-12-14drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heapWang, Beyond1-12/+1
Issue: during evict or validate happened on amdgpu_bo, the 'from' and 'to' is always same in ftrace event of amdgpu_bo_move where calling the 'trace_amdgpu_bo_move', the comment says move_notify is called before move happens, but actually it is called after move happens, here the new_mem is same as bo->resource Fix: move trace_amdgpu_bo_move from move_notify to amdgpu_bo_move Signed-off-by: Wang, Beyond <Wang.Beyond@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14drm/amdgpu: warn when there are still mappings when a BO is destroyed v2Christian König1-0/+2
This can only happen when there is a reference counting bug. v2: fix typo Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-30drm/amdgpu: fix AGP addressing when GART is not at 0Alex Deucher1-3/+7
This worked by luck if the GART aperture ended up at 0. When we ended up moving GART on some chips, the GART aperture ended up offsetting the AGP address since the resource->start is a GART offset, not an MC address. Fix this by moving the AGP address setup into amdgpu_bo_gpu_offset_no_check(). v2: check mem_type before checking agp v3: check if the ttm bo has a ttm_tt allocated yet Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11") Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reported-by: Jesse Zhang <Jesse.Zhang@amd.com> Reported-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: christian.koenig@amd.com Cc: mario.limonciello@amd.com
2023-11-10drm/amdgpu: fix AGP init orderAlex Deucher1-3/+0
The default AGP settings were overwriting the IP selected ones since the default was getting set after the IP ones were selected. Fixes: de59b69932e6 ("drm/amdgpu/gmc: set a default disable value for AGP") Link: https://lists.freedesktop.org/archives/amd-gfx/2023-November/100966.html Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
2023-09-27drm/amdgpu/gmc: set a default disable value for AGPAlex Deucher1-0/+3
To disable AGP, the start needs to be set to a higher value than the end. Set a default disable value for the AGP aperture and allow the IP specific GMC code to enable it selectively be calling amdgpu_gmc_agp_location(). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30drm/amd: Simplify the bo size check funcitonMa Jun1-17/+12
Simplify the code logic of size check function amdgpu_bo_validate_size Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-08drm/amdgpu: accommodate DOMAIN/PL_DOORBELLAlex Deucher1-1/+10
This patch adds changes: - to accommodate the new GEM domain for DOORBELLs - to accommodate the new TTM PL for DOORBELLs in order to manage doorbell pages as GEM object. V2: Addressed reviwe comments from Christian - drop the doorbell changes for pinning/unpinning - drop the doorbell changes for dma-buf map - drop the doorbell changes for sgt - no need to handle TTM_PL_FLAG_CONTIGUOUS for doorbell - add caching type for doorbell V3: - Removed unrelated empty line (Christian) - Add PL_DOORBELL in mem_type_to_domain() as well (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
2023-07-25drm/amdgpu: add VISIBLE info in amdgpu_bo_print_infoPierre-Eric Pelloux-Prayer1-13/+21
This allows tools to distinguish between VRAM and visible VRAM. Use the opportunity to fix locking before accessing bo. v2: squash in unused variable fix Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-15drm/amdgpu: make sure that BOs have a backing storeChristian König1-1/+5
It's perfectly possible that the BO is about to be destroyed and doesn't have a backing store associated with it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-15Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar ↵Arunpravin Paneer Selvam1-1/+1
system" This reverts commit c105518679b6e87232874ffc989ec403bee59664. This patch disables the TOPDOWN flag for APU and few dGPU cards which has the VRAM size equal to the BAR size. When we enable the TOPDOWN flag, we get the free blocks at the highest available memory region and we don't split the lower order blocks. This change is required to keep off the fragmentation related issues particularly in ASIC which has VRAM space <= 500MiB Hence, we are reverting this patch. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vramHoratio Zhang1-6/+4
Use the function of amdgpu_bo_vm_destroy to handle the resource release of shadow bo. During the amdgpu_mes_self_test, shadow bo released, but vmbo->shadow_list was not, which caused a null pointer reference error in amdgpu_device_recover_vram when GPU reset. Fixes: 6c032c37ac3e ("drm/amdgpu: Fix vram recover doesn't work after whole GPU reset (v2)") Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Acked-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Add a low priority scheduler for VRAM clearingMukul Joshi1-2/+2
Add a low priority DRM scheduler for VRAM clearing instead of using the exisiting high priority scheduler. Use the high priority scheduler for migrations and evictions. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: Store xcp partition id to amdgpu boPhilip Yang1-5/+10
For memory accounting per compute partition and export drm amdgpu bo and then import to KFD, we need the xcp id to account the memory usage or find the KFD node of the original amdgpu bo to create the KFD bo on the correct adev KFD node. Set xcp_id_plus1 of amdgpu_bo_param to create bo and store xcp_id to amddgpu bo. Add helper macro to get the mem_id from adev and xcp_id. v2: squash in fix ("drm/amdgpu: Fix BO creation failure on GFX 9.4.3 dGPU") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: dGPU mode set VRAM range lpfn as exclusivePhilip Yang1-1/+5
TTM place lpfn is exclusive used as end (start + size) in drm and buddy allocator, adev->gmc memory partition range lpfn is inclusive (start + size - 1), should plus 1 to set TTM place lpfn. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: dGPU mode placement support memory partitionPhilip Yang1-3/+8
dGPU mode uses VRAM manager to validate bo, amdgpu bo placement use the mem_id to get the allocation range first, last page frame number from xcp manager, pass to drm buddy allocator as the allowed range. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Add memory partition mem_id to amdgpu_boPhilip Yang1-0/+3
Add mem_id_plus1 parameter to amdgpu_gem_object_create and pass it to amdgpu_bo_create. For dGPU mode allocation, mem_id is used by VRAM manager to get the memory partition fpfn, lpfn from xcp manager. For APU native mode allocation, mem_id is used to get NUMA node id from xcp manager, then pass to TTM as numa pool id to alloc memory from the specific NUMA node. mem_id -1 means for entire VRAM or any NUMA nodes. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Fix unmapping of apertureLijo Lazar1-2/+1
When aperture size is zero, there is no mapping done. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Handle VRAM dependencies on GFXIP9.4.3Rajneesh Bhardwaj1-1/+1
[For 1P NPS1 mode driver bringup] Changes required to initialize the amdgpu driver with frontdoor firmware loading and discovery=2 with the native mode SBIOS that enables CPU GPU unified interleaved memory. sudo modprobe amdgpu discovery=2 Once PSP TMR region is reported via the ACPI interface, the dependency on the ip_discovery.bin will be removed. Choice of where to allocate driver table is given to each IP version. In general, both GTT and VRAM domains will be considered. If one of the tables has a strict restriction for VRAM domain, then only VRAM domain is considered. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> (lijo: Modified the handling for SMU Tables) Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amd/amdgpu: Fix warnings in amdgpu _object, _ring.cSrinivasan Shanmugam1-5/+5
Fix below warnings reported by checkpatch: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: static const char * array should probably be static const char * const WARNING: space prohibited between function name and open parenthesis '(' WARNING: braces {} are not necessary for single statement blocks WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-20Merge tag 'amd-drm-next-6.4-2023-03-17' of ↵Dave Airlie1-5/+22
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.4-2023-03-17: amdgpu: - Misc code cleanups - Documentation fixes - Make kobj structures const - Add thermal throttling adjustments for supported APUs - UMC RAS fixes - Display reset fixes - DCN 3.2 fixes - Freesync fixes - DC code reorg - Generalize dmabuf import to work with KFD - DC DML fixes - SRIOV fixes - UVD code cleanups - IH 4.4.2 updates - HDP 4.4.2 updates - SDMA 4.4.2 updates - PSP 13.0.6 updates - Add capped/uncapped workload handling for supported APUs - DCN 3.1.4 updates - Re-org DC Kconfig - USB4 fixes - Reorg DC plane and stream handling - Register vga_switcheroo for apple-gmux - SMU 13.0.6 updates - Fix error checking in read_mm_registers functions for affected families - VCN 4.0.4 fix - Drop redundant pci_enable_pcie_error_reporting() call - RDNA2 SMU OD suspend/resume fix - Expose additional memory stats via fdinfo - RAS fixes - Misc display fixes - DP MST fixes - IOMMU regression fix for KFD amdkfd: - Make kobj structures const - Support for exporting buffers via dmabuf - Multi-VMA page migration fixes - NBIO fixes - Misc code cleanups - Fix possible double free - Fix possible UAF radeon: - iMac fix UAPI: - KFD dmabuf export support. Required for importing KFD buffers into GEM contexts and for RDMA P2P support. Proposed user mode changes: https://github.com/fxkamd/ROCT-Thunk-Interface/commits/fxkamd/dmabuf From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230317164416.138340-1-alexander.deucher@amd.com
2023-03-14Merge tag 'drm-misc-next-2023-03-07' of ↵Dave Airlie1-6/+3
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.4-rc1: Note: Only changes since pull request from 2023-02-23 are included here. UAPI Changes: - Convert rockchip bindings to YAML. - Constify kobj_type structure in dma-buf. - FBDEV cmdline parser fixes, and other small fbdev fixes for mode parsing. Cross-subsystem Changes: - Add Neil Armstrong as linaro maintainer. - Actually signal the private stub dma-fence. Core Changes: - Add function for adding syncobj dep to sched_job and use it in panfrost, v3d. - Improve DisplayID 2.0 topology parsing and EDID parsing in general. - Add a gem eviction function and callback for generic GEM shrinker purposes. - Prepare to convert shmem helper to use the GEM reservation lock instead of own locking. (Actual commit itself got reverted for now) - Move the suballocator from radeon and amdgpu drivers to core in preparation for Xe. - Assorted small fixes and documentation. - Fixes to HPD polling. - Assorted small fixes in simpledrm, bridge, accel, shmem-helper, and the selftest of format-helper. - Remove dummy resource when ttm bo is created, and during pipelined gutting. Fix all drivers to accept a NULL ttm_bo->resource. - Handle pinned BO moving prevention in ttm core. - Set drm panel-bridge orientation before connector is registered. - Remove dumb_destroy callback. - Add documentation to GEM_CLOSE, PRIME_HANDLE_TO_FD, PRIME_FD_TO_HANDLE, GETFB2 ioctl's. - Add atomic enable_plane callback, use it in ast, mgag200, tidss. Driver Changes: - Use drm_gem_objects_lookup in vc4. - Assorted small fixes to virtio, ast, bridge/tc358762, meson, nouveau. - Allow virtio KMS to be disabled and compiled out. - Add Radxa 8/10HD, Samsung AMS495QA01 panels. - Fix ivpu compiler errors. - Assorted fixes to drm/panel, malidp, rockchip, ivpu, amdgpu, vgem, nouveau, vc4. - Assorted cleanups, simplifications and fixes to vmwgfx. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ac1f5186-54bb-02f4-ac56-907f5b76f3de@linux.intel.com
2023-03-14drm/amdgpu: expose more memory stats in fdinfoMarek Olšák1-5/+22
This will be used for performance investigations. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-10drm/amdgpu: Fix call trace warning and hang when removing amdgpu devicelyndonli1-1/+1
On GPUs with RAS enabled, below call trace and hang are observed when shutting down device. v2: use DRM device unplugged flag instead of shutdown flag as the check to prevent memory wipe in shutdown stage. [ +0.000000] RIP: 0010:amdgpu_vram_mgr_fini+0x18d/0x1c0 [amdgpu] [ +0.000001] PKRU: 55555554 [ +0.000001] Call Trace: [ +0.000001] <TASK> [ +0.000002] amdgpu_ttm_fini+0x140/0x1c0 [amdgpu] [ +0.000183] amdgpu_bo_fini+0x27/0xa0 [amdgpu] [ +0.000184] gmc_v11_0_sw_fini+0x2b/0x40 [amdgpu] [ +0.000163] amdgpu_device_fini_sw+0xb6/0x510 [amdgpu] [ +0.000152] amdgpu_driver_release_kms+0x16/0x30 [amdgpu] [ +0.000090] drm_dev_release+0x28/0x50 [drm] [ +0.000016] devm_drm_dev_init_release+0x38/0x60 [drm] [ +0.000011] devm_action_release+0x15/0x20 [ +0.000003] release_nodes+0x40/0xc0 [ +0.000001] devres_release_all+0x9e/0xe0 [ +0.000001] device_unbind_cleanup+0x12/0x80 [ +0.000003] device_release_driver_internal+0xff/0x160 [ +0.000001] driver_detach+0x4a/0x90 [ +0.000001] bus_remove_driver+0x6c/0xf0 [ +0.000001] driver_unregister+0x31/0x50 [ +0.000001] pci_unregister_driver+0x40/0x90 [ +0.000003] amdgpu_exit+0x15/0x120 [amdgpu] Signed-off-by: lyndonli <Lyndon.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-07drm/amdgpu: Fix call trace warning and hang when removing amdgpu devicelyndonli1-1/+1
On GPUs with RAS enabled, below call trace and hang are observed when shutting down device. v2: use DRM device unplugged flag instead of shutdown flag as the check to prevent memory wipe in shutdown stage. [ +0.000000] RIP: 0010:amdgpu_vram_mgr_fini+0x18d/0x1c0 [amdgpu] [ +0.000001] PKRU: 55555554 [ +0.000001] Call Trace: [ +0.000001] <TASK> [ +0.000002] amdgpu_ttm_fini+0x140/0x1c0 [amdgpu] [ +0.000183] amdgpu_bo_fini+0x27/0xa0 [amdgpu] [ +0.000184] gmc_v11_0_sw_fini+0x2b/0x40 [amdgpu] [ +0.000163] amdgpu_device_fini_sw+0xb6/0x510 [amdgpu] [ +0.000152] amdgpu_driver_release_kms+0x16/0x30 [amdgpu] [ +0.000090] drm_dev_release+0x28/0x50 [drm] [ +0.000016] devm_drm_dev_init_release+0x38/0x60 [drm] [ +0.000011] devm_action_release+0x15/0x20 [ +0.000003] release_nodes+0x40/0xc0 [ +0.000001] devres_release_all+0x9e/0xe0 [ +0.000001] device_unbind_cleanup+0x12/0x80 [ +0.000003] device_release_driver_internal+0xff/0x160 [ +0.000001] driver_detach+0x4a/0x90 [ +0.000001] bus_remove_driver+0x6c/0xf0 [ +0.000001] driver_unregister+0x31/0x50 [ +0.000001] pci_unregister_driver+0x40/0x90 [ +0.000003] amdgpu_exit+0x15/0x120 [amdgpu] Signed-off-by: lyndonli <Lyndon.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-24drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar systemShane Xiao1-1/+1
Since VRAM manager is changed from drm mm to drm buddy, the TOP_DOWN flag should not be set by default in the large bar system. Removing this flag helps improve drm buddy allocator efficiency and reduce the risk of splitting higher order block into lower order. Signed-off-by: Shane Xiao <shane.xiao@amd.com> Reviewed-by: Christian K�nig <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-09drm/amdgpu: Remove TTM resource->start visible VRAM condition v2Somalapuram Amaranath1-6/+3
Use amdgpu_bo_in_cpu_visible_vram() instead. v2 (chk): fix test inversion Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230208090106.9659-2-Amaranath.Somalapuram@amd.com
2023-01-20drm/amdgpu: print bo inode number instead of ptrPierre-Eric Pelloux-Prayer1-2/+2
This allows to correlate the infos printed by /sys/kernel/debug/dri/n/amdgpu_gem_info to the ones found in /proc/.../fdinfo and /sys/kernel/debug/dma_buf/bufinfo. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-10drm/amdgpu: Fix potential NULL dereferenceLuben Tuikov1-2/+3
Fix potential NULL dereference, in the case when "man", the resource manager might be NULL, when/if we print debug information. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> Cc: Dan Carpenter <error27@gmail.com> Cc: kernel test robot <lkp@intel.com> Fixes: 7554886daa31ea ("drm/amdgpu: Fix size validation for non-exclusive domains (v4)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-15drm/amdgpu: revert "generally allow over-commit during BO allocation"Christian König1-1/+5
This reverts commit f9d00a4a8dc8fff951c97b3213f90d6bc7a72175. This causes problem for KFD because when we overcommit we accidentially bind the BO to GTT for moving it into VRAM. We also need to make sure that this is done only as fallback after trying to evict first. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-15drm/amdgpu: Remove unnecessary domain argumentLuben Tuikov1-5/+5
Remove the "domain" argument to amdgpu_bo_create_kernel_at() since this function takes an "offset" argument which is the offset off of VRAM, and as such allocation always takes place in VRAM. Thus, the "domain" argument is unnecessary. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-15drm/amdgpu: Fix size validation for non-exclusive domains (v4)Luben Tuikov1-11/+8
Fix amdgpu_bo_validate_size() to check whether the TTM domain manager for the requested memory exists, else we get a kernel oops when dereferencing "man". v2: Make the patch standalone, i.e. not dependent on local patches. v3: Preserve old behaviour and just check that the manager pointer is not NULL. v4: Complain if GTT domain requested and it is uninitialized--most likely a bug. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-14drm/amdgpu: WARN when freeing kernel memory during suspendChristian König1-0/+2
When buffers are freed during suspend there is no guarantee that they can be re-allocated during resume. The PSP subsystem seems to be quite buggy regarding this, so add a WARN_ON() to point out those bugs. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexdeucher@amd.com> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>