summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
AgeCommit message (Collapse)AuthorFilesLines
2023-04-14drm/amdgpu: move vmhub out of amdgpu_ring_funcs (v4)Le Ma1-2/+2
It looks better to place this field in ring structure. Also drop the repeated ring funcs definitions if there's no difference except for vmhub field. v2: rename the field to vm_hub like others (Le) v3: apply the changes to new ip blocks (Hawking) v4: fix vcn sw ring (Alex) Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amdgpu: Rework retry fault removalMukul Joshi1-3/+33
Rework retry fault removal from the software filter by storing an expired timestamp for a fault that is being removed. When a new fault comes, and it matches an entry in the sw filter, it will be added as a new fault only when its timestamp is greater than the timestamp expiry of the fault in the sw filter. This helps in avoiding stale faults being added back into the filter and preventing legitimate faults from being handled. Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31drm/amdkfd: Set noretry/xnack for GC 9.4.3Amber Lin1-0/+1
For GC 9.4.3, disable retry as default and XNACK can be different modes per process. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-16drm/amdgpu: Rework xgmi_wafl_pcs ras sw_initHawking Zhang1-5/+4
To align with other IP blocks. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-16drm/amdgpu: Rework mca ras sw_initHawking Zhang1-0/+13
To align with other IP blocks Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-14drm/amdgpu: Move hdp ras block init to ras sw_initHawking Zhang1-0/+5
Initialize hdp ras block only when mmhub ip block supports ras features. Driver queries ras capabilities after early_init, ras block init needs to be moved to sw_init. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-14drm/amdgpu: Move mmhub ras block init to ras sw_initHawking Zhang1-0/+5
Initialize mmhub ras block only when mmhub ip block supports ras features. Driver queries ras capabilities after early_init, ras block init needs to be moved to sw_init. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-14drm/amdgpu: Move umc ras block init to gmc ras sw_initHawking Zhang1-1/+8
Initialize umc ras block only when umc ip block supports ras. Driver queries ras capabilities after early_init, ras block init needs to be moved to sw_init. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-24drm/amdgpu: add tmz support for GC 10.3.6Jesse Zhang1-0/+1
this patch to add tmz support for GC 10.3.6 Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-16Merge tag 'amd-drm-next-6.3-2023-01-06' of ↵Dave Airlie1-1/+8
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.3-2023-01-06: amdgpu: - secure display support for multiple displays - DML optimizations - DCN 3.2 updates - PSR updates - DP 2.1 updates - SR-IOV RAS updates - VCN RAS support - SMU 13.x updates - Switch 1 element arrays to flexible arrays - Add RAS support for DF 4.3 - Stack size improvements - S0ix rework - Soft reset fix - Allow 0 as a vram limit on APUs - Display fixes - Misc code cleanups - Documentation fixes - Handle profiling modes for SMU13.x amdkfd: - Error handling fixes - PASID fixes radeon: - Switch 1 element arrays to flexible arrays drm: - Add DP adaptive sync DPCD definitions UAPI: - Add new INFO queries for peak and min sclk/mclk for profile modes on newer chips Proposed mesa patch: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/278 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230106222037.7870-1-alexander.deucher@amd.com
2023-01-04Merge tag 'drm-misc-next-2023-01-03' of ↵Daniel Vetter1-0/+1
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.3: UAPI Changes: * connector: Support analog-TV mode property * media: Add MEDIA_BUS_FMT_RGB565_1X24_CPADHI, MEDIA_BUS_FMT_RGB666_1X18 and MEDIA_BUS_FMT_RGB666_1X24_CPADHI Cross-subsystem Changes: * dma-buf: Documentation fixes * i2c: Introduce i2c_client_get_device_id() helper Core Changes: * Improve support for analog TV output * bridge: Remove unused drm_bridge_chain functions * debugfs: Add per-device helpers and convert various DRM drivers * dp-mst: Various fixes * fbdev emulation: Always pick 32 bpp as default * KUnit: Add tests for managed helpers; Various cleanups * panel-orientation: Add quirks for Lenovo Yoga Tab 3 X90F and DynaBook K50 * TTM: Open-code ttm_bo_wait() and remove the helper Driver Changes: * Fix preferred depth and bpp values throughout DRM drivers * Remove #CONFIG_PM guards throughout DRM drivers * ast: Various fixes * bridge: Implement i2c's probe_new in various drivers; Fixes; ite-it6505: Locking fixes, Cache EDID data; ite-it66121: Support IT6610 chip, Cleanups; lontium-tl9611: Fix HDMI on DragonBoard 845c; parade-ps8640: Use atomic bridge functions * gud: Convert to DRM shadow-plane helpers; Perform flushing synchronously during atomic update * ili9486: Support 16-bit pixel data * imx: Split off IPUv3 driver; Various fixes * mipi-dbi: Convert to DRM shadow-plane helpers plus rsp driver changes;i Support separate I/O-voltage supply * mxsfb: Depend on ARCH_MXS or ARCH_MXC * omapdrm: Various fixes * panel: Use ktime_get_boottime() to measure power-down delay in various drivers; Fix auto-suspend delay in various drivers; orisetech-ota5601a: Add support * sprd: Cleanups * sun4i: Convert to new TV-mode property * tidss: Various fixes * v3d: Various fixes * vc4: Convert to new TV-mode property; Support Kunit tests; Cleanups; dpi: Support RGB565 and RGB666 formats; dsi: Convert DSI driver to bridge * virtio: Improve tracing * vkms: Support small cursors in IGT tests; Various fixes Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/Y7QIwlfElAYWxRcR@linux-uq9g
2023-01-04drm/amdgpu: allow zero as vram limitChristian König1-1/+1
This allows testing the driver without any VRAM. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-04drm/amdgpu: cleanup visible vram size handlingChristian König1-0/+7
Centralize the limit handling and validation in one place instead of spreading that around in different hw generations. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-06drm/amdgpu: add tmz support for GC IP v11.0.4Tim Huang1-0/+1
Add tmz support for GC 11.0.4. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-06drm/ttm: merge ttm_bo_api.h and ttm_bo_driver.h v2Christian König1-0/+1
Merge and cleanup the two headers into a single description of the object API. Also move all the documentation to the implementation and drop unnecessary includes from the header. No functional change. v2: minimal checkpatch.pl cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-4-christian.koenig@amd.com
2022-11-23drm/amd/amdgpu: reserve vm invalidation engine for firmwareJack Xiao1-0/+6
If mes enabled, reserve VM invalidation engine 5 for firmware. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15drm/amdgpu: there is no vbios fb on devices with no display hw (v2)Alex Deucher1-1/+1
If we enable virtual display functionality on parts with no display hardware we can end up trying to check for and reserve the vbios FB area on devices where it doesn't exist. Check if display hardware is actually present on the hardware before trying to reserve the memory. v2: move the check into common code Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: add tmz support for GC 11.0.1Yifan Zhang1-0/+1
this patch to add tmz support for GC 11.0.1. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-29drm/amdgpu: remove switch from amdgpu_gmc_noretry_setGraham Sider1-39/+9
Simplify the logic in amdgpu_gmc_noretry_set by getting rid of the switch. Also set noretry=1 as default for GFX 10.3.0 and greater since retry faults are not supported. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03drm/amdgpu: enable tmz by default for GC 10.3.7Sunil Khatri1-2/+2
Add IP GC 10.3.7 in the list of target to have tmz enabled by default. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.18.x
2022-05-26drm/amdgpu: add support of tmz for GC 10.3.7Sunil Khatri1-0/+2
Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26drm/amdgpu: change code name to ip version for tmz setSunil Khatri1-9/+18
Use IP version rather then code name of IPs for tmz set. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-11drm/amd/amdgpu: Fix asm/hypervisor.h build error.Yongqiang Sun1-0/+4
Add CONFIG_X86 check to fix the build error. Fixes: 49aa98ca30cd18 ("drm/amd/amdgpu: Only reserve vram for firmware with vega9 MS_HYPERV host.") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-07drm/amd/amdgpu: Only reserve vram for firmware with vega9 MS_HYPERV host.Yongqiang Sun1-4/+5
driver loading failed on VEGA10 SRIOV VF with linux host due to a wide range of stolen reserved vram. Since VEGA10 SRIOV VF need to reserve vram for firmware with windows Hyper_V host specifically, check hypervisor type to only reserve memory for it, and the range of the reserved vram can be limited to between 5M-7M area. Fixes: faad5ccac1eaae ("drm/amdgpu: Add stolen reserved memory for MI25 SRIOV.") Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-25drm/amdgpu: set noretry for gfx 10.3.7Prike Liang1-0/+1
Disable xnack on the gfx10.3.7 for the KFD test. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-25drm/amdgpu: set noretry=1 for GFX 10.3.4Felix Kuehling1-2/+3
Retry faults are not supported on GFX 10.3.4. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-25drm/amdgpu: set noretry=1 for gc 10.3.6Yifan Zhang1-0/+1
this patch to set noretry=1 for gc 10.3.6. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-25drm/amdgpu: add more cases to noretry=1Alex Deucher1-0/+3
Port current list from amd-staging-drm-next. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-15drm/amdgpu: Add stolen reserved memory for MI25 SRIOV.Yongqiang Sun1-0/+9
MI25 SRIOV guest driver loading failed due to allocated memory overlaps with firmware reserved area. Allocate stolen reserved memory for MI25 SRIOV specifically to avoid the memory overlap. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-15drm/amdgpu: Merge get_reserved_allocation to get_vbios_allocations.Yongqiang Sun1-19/+13
Some ASICs need reserved memory for firmware or other components, which is not allowed to be used by driver. amdgpu_gmc_get_reserved_allocation is to handle additional areas. To avoid any missing calling, merged amdgpu_gmc_get_reserved_allocation to amdgpu_gmc_get_vbios_allocations. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-03drm/amdgpu: convert code name to ip version for noretry setYifan Zhang1-6/+5
Use IP version rather than codename for noretry set. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-03drm/amdgpu: centrally calls the .ras_fini function of all ras blocksyipechai1-10/+0
centrally calls the .ras_fini function of all ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-03drm/amdgpu: Optimize xxx_ras_fini function of each ras blockyipechai1-4/+4
1. Move the variables of ras block instance members from specific xxx_ras_fini to general ras_fini call. 2. Function calls inside the modules only use parameters passed from xxx_ras_fini instead of ras block instance members. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-03drm/amdgpu: Modify .ras_fini function pointer parameteryipechai1-4/+4
Modify .ras_fini function pointer parameter so that we can remove redundant intermediate calls in some ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-22drm/amdgpu: enable TMZ option for onwards asicPrike Liang1-0/+1
The TMZ is disabled by default and enable TMZ option for the IP discovery based asic will help on the TMZ function verification. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17drm/amdgpu: define amdgpu_ras_late_init to call all ras blocks' .ras_late_inityipechai1-44/+0
Define amdgpu_ras_late_init to call all ras blocks' .ras_late_init. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17drm/amdgpu: Optimize xxx_ras_late_init function of each ras blockyipechai1-2/+2
1. Move calling ras block instance members from module internal function to the top calling xxx_ras_late_init. 2. Module internal function calls can only use parameter variables of xxx_ras_late_init instead of ras block instance members. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17drm/amdgpu: Remove redundant calls of ras_late_init in mca ras blockyipechai1-3/+3
Remove redundant calls of ras_late_init in mca ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17drm/amdgpu: Remove redundant calls of ras_late_init in mmhub ras blockyipechai1-1/+1
Remove redundant calls of ras_late_init in mmhub ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-17drm/amdgpu: Remove redundant calls of ras_late_init in hdp ras blockyipechai1-1/+1
Remove redundant calls of ras_late_init in hdp ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-14drm/amdgpu: Optimize amdgpu_xgmi_ras_late_init/amdgpu_xgmi_ras_fini function ↵yipechai1-0/+1
code Optimize amdgpu_xgmi_ras_late_init/amdgpu_xgmi_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-26drm/amdgpu: Move xgmi ras initialization from .late_init to .early_inityipechai1-5/+10
Move xgmi ras initialization from .late_init to .early_init, which let xgmi ras can be initialized only once. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-22drm/amdgpu: fix the page fault caused by uninitialized variablesXiaojian Du1-3/+3
This patch will fix the page fault caused by uninitialized variables. Error Log: ...... [ 130.246323] [drm] GART: num cpu pages 131072, num gpu pages 131072 [ 131.963112] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000). [ 131.963130] BUG: unable to handle page fault for address: 000000000002db80 [ 131.963181] #PF: supervisor write access in kernel mode [ 131.963210] #PF: error_code(0x0002) - not-present page [ 131.963233] PGD 0 P4D 0 [ 131.963253] Oops: 0002 [#1] SMP NOPTI [ 131.963273] CPU: 3 PID: 1411 Comm: modprobe Not tainted 5.13.0+ #1 [ 131.963338] RIP: 0010:osq_lock+0x4d/0x120 [ 131.963381] Code: 10 00 00 00 00 48 c7 02 00 00 00 00 89 42 14 87 07 85 c0 0f 84 d0 00 00 00 83 e8 01 48 98 48 03 0c c5 00 d9 ea 9c 48 89 4a 08 <48> 89 11 44 8b 42 10 45 85 c0 0f 85 af 00 00 00 55 48 89 fe 65 4c [ 131.963460] RSP: 0018:ffffa40481717768 EFLAGS: 00010202 [ 131.963483] RAX: fffffffffffffffe RBX: ffffa40481717920 RCX: 000000000002db80 [ 131.963520] RDX: ffff9256fecedb80 RSI: ffff9256cbed2e80 RDI: ffffa40481717ac4 [ 131.963547] RBP: ffffa40481717808 R08: ffffa40481717920 R09: 00000000ffffffff [ 131.963582] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 [ 131.963609] R13: ffffa40481717ac4 R14: ffffa40481717ab8 R15: ffff9256c9480000 [ 131.963646] FS: 00007f23d9b9c540(0000) GS:ffff9256fecc0000(0000) knlGS:0000000000000000 [ 131.963687] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 131.963721] CR2: 000000000002db80 CR3: 0000000008444000 CR4: 00000000000506e0 [ 131.963758] Call Trace: [ 131.963772] ? __ww_mutex_lock.isra.0+0x3a2/0x760 [ 131.963810] ? prb_read_valid+0x1c/0x20 [ 131.963830] ? console_unlock+0x2fe/0x4f0 [ 131.963849] __ww_mutex_lock_interruptible_slowpath+0x16/0x20 [ 131.963882] ww_mutex_lock_interruptible+0x83/0x90 [ 131.963908] amdgpu_bo_create_reserved+0xf0/0x1e0 [amdgpu] [ 131.964237] amdgpu_bo_create_kernel+0x17/0x80 [amdgpu] [ 131.964509] amdgpu_gmc_vram_checking+0x41/0xf0 [amdgpu] [ 131.964807] gmc_v10_0_hw_init+0x105/0x120 [amdgpu] [ 131.965108] amdgpu_device_init.cold+0x1aa4/0x1e3e [amdgpu] ...... Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-20drm/amdgpu: add vram check function for GMCXiaojian Du1-0/+46
This patch will add vram check function for GMC block. It will write pattern data to the vram and then read back from the vram, so that to verify the work status of vram. This patch will cover gmc v6/7/8/9/10. Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-15drm/amdgpu: Modify mca block to fit for the unified ras block data and opsyipechai1-9/+6
1.Modify mca block to fit for the unified ras block data and ops. 2.Define special .ras_block_match function for mca block to identify itself. 3.Change amdgpu_mca_ras_funcs to amdgpu_mca_ras_block(amdgpu_mca_ras had been used), and the corresponding variable name remove _funcs suffix. 4.Remove the const flag of cma ras variable so that cma ras block can be able to be inserted into amdgpu device ras block link list. 5.Invoke amdgpu_ras_register_ras_block function to register cma ras block into amdgpu device ras block link list. 6.Remove the redundant code about cma in amdgpu_ras.c after using the unified ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-15drm/amdgpu: Modify umc block to fit for the unified ras block data and opsyipechai1-6/+4
1.Modify umc block to fit for the unified ras block data and ops. 2.Change amdgpu_umc_ras_funcs to amdgpu_umc_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of umc ras variable so that umc ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register umc ras block into amdgpu device ras block link list. 5.Remove the redundant code about umc in amdgpu_ras.c after using the unified ras block. 6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of umc versions. If .ras_late_init and .ras_fini had been defined by the selected umc version, the defined functions will take effect; if not defined, default fill them with amdgpu_umc_ras_late_init and amdgpu_umc_ras_fini. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-15drm/amdgpu: Modify mmhub block to fit for the unified ras block data and opsyipechai1-6/+4
1.Modify mmhub block to fit for the unified ras block data and ops. 2.Change amdgpu_mmhub_ras_funcs to amdgpu_mmhub_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of mmhub ras variable so that mmhub ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register mmhub ras block into amdgpu device ras block link list. 5.Remove the redundant code about mmhub in amdgpu_ras.c after using the unified ras block. 5.Remove the redundant code about mmhub in amdgpu_ras.c after using the unified ras block. 6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of mmhub versions. If .ras_late_init and .ras_fini had been defined by the selected mmhub version, the defined functions will take effect; if not defined, default fill them with amdgpu_mmhub_ras_late_init and amdgpu_mmhub_ras_fini. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-15drm/amdgpu: Modify hdp block to fit for the unified ras block data and opsyipechai1-6/+4
1.Modify hdp block to fit for the unified ras block data and ops. 2.Change amdgpu_hdp_ras_funcs to amdgpu_hdp_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of hdp ras variable so that hdp ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register hdp ras block into amdgpu device ras block link list. 5.Remove the redundant code about hdp in amdgpu_ras.c after using the unified ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-15drm/amdgpu: Modify xgmi block to fit for the unified ras block data and opsyipechai1-8/+8
1.Modify gmc block to fit for the unified ras block data and ops. 2.Change amdgpu_xgmi_ras_funcs to amdgpu_xgmi_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of gmc ras variable so that gmc ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register gmc ras block into amdgpu device ras block link list. 5.Remove the redundant code about gmc in amdgpu_ras.c after using the unified ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-02drm/amdgpu: handle IH ring1 overflowPhilip Yang1-1/+7
IH ring1 is used to process GPU retry fault, overflow is enabled to drain retry fault because we want receive other interrupts while handling retry fault to recover range. There is no overflow flag set when wptr pass rptr. Use timestamp of rptr and wptr to handle overflow and drain retry fault. If fault timestamp goes backward, the fault is filtered and should not be processed. Drain fault is finished if processed_timestamp is equal to or larger than checkpoint timestamp. Add amdgpu_ih_functions interface decode_iv_ts for different chips to get timestamp from IV entry with different iv size and timestamp offset. amdgpu_ih_decode_iv_ts_helper is used for vega10, vega20, navi10. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>