author: Rodrigo Vivi <rodrigo.vivi@intel.com> 2024-04-24 01:18:16 +0300
committer: Rodrigo Vivi <rodrigo.vivi@intel.com> 2024-04-24 19:12:58 +0300
commit: 8ed9aaae39f39130b7a3eb2726be05d7f64b344c
tree: 8dacb1868ca9e95b581d0f122d89106b5c958b9f /drivers/gpu/drm/xe/xe_device.h
parent: 692818678e80e5999ee1975953f7c6f82cb4a2be
drm/xe: Force wedged state and block GT reset upon any GPU hang
In many validation situations when debugging GPU hangs, it is useful to
preserve the GT state from the moment the timeout occurred.

This patch introduces a module parameter that can be used in situations
like this.

If the xe.wedged module parameter is set to 2, Xe will be declared wedged
on every single execution timeout (a.k.a. GPU hang), right after the
devcoredump snapshot capture, without attempting any kind of GT reset
and blocking any kind of execution entirely.

v2: Really block gt_reset from guc side. (Lucas)
    s/wedged/busted (Lucas)

v3: - s/busted/wedged
    - Really use global_flags (Dafna)
    - More robust timeout handling when wedging it.

v4: A really robust clean exit done by Matt Brost.
    No more kernel warns on unbind.

v5: Simplify error message (Lucas)

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Dafna Hirschfeld <dhirschfeld@habana.ai>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Himanshu Somaiya <himanshu.somaiya@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423221817.1285081-3-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
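The policy described in the commit message can be sketched in plain userspace C. This is only an illustration of the described behaviour, not the driver's code: `struct sketch_device`, `sketch_handle_timeout()`, `sketch_submit()` and `SKETCH_WEDGED_ON_ANY_HANG` are hypothetical names, and the reset taken for values other than 2 is an assumption, since the message only describes the value 2.

```c
#include <stdbool.h>

/* Hypothetical constant: the commit message only describes the value 2. */
#define SKETCH_WEDGED_ON_ANY_HANG 2

/* Hypothetical stand-in for the device state; not the kernel's struct xe_device. */
struct sketch_device {
	int wedged_mode;	/* models the xe.wedged module parameter */
	bool wedged;		/* once true, all execution is refused */
	int snapshots;		/* counts simulated devcoredump captures */
	int resets;		/* counts simulated GT resets */
};

/* Models the reaction to one execution timeout (a.k.a. GPU hang). */
static void sketch_handle_timeout(struct sketch_device *dev)
{
	/* The snapshot is captured first, so the hang state is preserved. */
	dev->snapshots++;

	if (dev->wedged_mode == SKETCH_WEDGED_ON_ANY_HANG) {
		/* Declare wedged right after the snapshot and skip any reset. */
		dev->wedged = true;
		return;
	}

	/* Assumption for this sketch: other values attempt a reset. */
	dev->resets++;
}

/* Models an execution request; returns false when it is blocked. */
static bool sketch_submit(const struct sketch_device *dev)
{
	return !dev->wedged;
}
```

With `wedged_mode` set to 2, a single call to `sketch_handle_timeout()` leaves `resets` at 0 and makes every later `sketch_submit()` return false, which mirrors the "no GT reset, no further execution" behaviour the message describes.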
Diffstat (limited to 'drivers/gpu/drm/xe/xe_device.h')
-rw-r--r-- drivers/gpu/drm/xe/xe_device.h | 15
1 file changed, 1 insertion, 14 deletions
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index d2e4249d37ce..9ede45fc062a 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -172,19 +172,6 @@ static inline bool xe_device_wedged(struct xe_device *xe)
 	return atomic_read(&xe->wedged);
 }
 
-static inline void xe_device_declare_wedged(struct xe_device *xe)
-{
-	if (!atomic_xchg(&xe->wedged, 1)) {
-		xe->needs_flr_on_fini = true;
-		drm_err(&xe->drm,
-			"CRITICAL: Xe has declared device %s as wedged.\n"
-			"IOCTLs and executions are blocked until device is probed again with unbind and bind operations:\n"
-			"echo '%s' > /sys/bus/pci/drivers/xe/unbind\n"
-			"echo '%s' > /sys/bus/pci/drivers/xe/bind\n"
-			"Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n",
-			dev_name(xe->drm.dev), dev_name(xe->drm.dev),
-			dev_name(xe->drm.dev));
-	}
-}
+void xe_device_declare_wedged(struct xe_device *xe);
 
 #endif
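
The inline helper removed above relies on `atomic_xchg()` returning the previous value, so that only the first caller sets the flag and logs the message. A minimal userspace sketch of that declare-once pattern follows, using C11 `atomic_exchange()` in place of the kernel primitive. `struct mock_device`, `mock_device_wedged()`, `mock_device_declare_wedged()` and the `messages_logged` counter are hypothetical names for illustration; the sketch does not show what the new out-of-line definition contains, since this diff covers only the header.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct xe_device; not the kernel type. */
struct mock_device {
	atomic_int wedged;		/* 0 = healthy, 1 = wedged */
	bool needs_flr_on_fini;		/* mirrors the field set in the removed helper */
	int messages_logged;		/* illustration only: counts emitted messages */
};

/* Mirrors xe_device_wedged(): a plain atomic read of the flag. */
static bool mock_device_wedged(struct mock_device *dev)
{
	return atomic_load(&dev->wedged) != 0;
}

/* Mirrors the shape of the removed inline helper. */
static void mock_device_declare_wedged(struct mock_device *dev)
{
	assert(dev != NULL);

	/*
	 * atomic_exchange() stores 1 and returns the old value, so the body
	 * runs only for the caller that flips the flag from 0 to 1.
	 */
	if (!atomic_exchange(&dev->wedged, 1)) {
		dev->needs_flr_on_fini = true;
		dev->messages_logged++;
		fprintf(stderr, "mock device declared wedged\n");
	}
}
```

Calling `mock_device_declare_wedged()` repeatedly on the same device increments `messages_logged` only once, which is the property that keeps the long error message from being repeated on every subsequent hang.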