summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
diff options
context:
space:
mode:
authorJiange Zhao <Jiange.Zhao@amd.com>2020-04-26 12:57:00 +0300
committerAlex Deucher <alexander.deucher@amd.com>2020-05-18 18:23:37 +0300
commit728e7e0cd61899208e924472b9e641dbeb0775c4 (patch)
treef37d2f8d3060d473562da259577d4d4a891f8280 /drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
parentb7f0656a25467fc26eb7fc375caf38ee99f5d004 (diff)
downloadlinux-728e7e0cd61899208e924472b9e641dbeb0775c4.tar.xz
drm/amdgpu: Add autodump debugfs node for gpu reset v8
When GPU got timeout, it would notify an interested part of an opportunity to dump info before actual GPU reset. A usermode app would open 'autodump' node under debugfs system and poll() for readable/writable. When a GPU reset is due, amdgpu would notify usermode app through wait_queue_head and give it 10 minutes to dump info. After usermode app has done its work, this 'autodump' node is closed. On node closure, amdgpu gets to know the dump is done through the completion that is triggered in release(). There is no write or read callback because necessary info can be obtained through dmesg and umr. Messages back and forth between usermode app and amdgpu are unnecessary. v2: (1) changed 'registered' to 'app_listening' (2) add a mutex in open() to prevent race condition v3 (chk): grab the reset lock to avoid race in autodump_open, rename debugfs file to amdgpu_autodump, provide autodump_read as well, style and code cleanups v4: add 'bool app_listening' to differentiate situations, so that the node can be reopened; also, there is no need to wait for completion when no app is waiting for a dump. v5: change 'bool app_listening' to 'enum amdgpu_autodump_state' add 'app_state_mutex' for race conditions: (1)Only 1 user can open this file node (2)wait_dump() can only take effect after poll() executed. (3)eliminated the race condition between release() and wait_dump() v6: removed 'enum amdgpu_autodump_state' and 'app_state_mutex' removed state checking in amdgpu_debugfs_wait_dump Improve on top of version 3 so that the node can be reopened. v7: move reinit_completion into open() so that only one user can open it. v8: remove complete_all() from amdgpu_debugfs_wait_dump(). Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h')
-rw-r--r--drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h6
1 files changed, 6 insertions, 0 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
index de12d1101526..2803884d338d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
@@ -31,6 +31,11 @@ struct amdgpu_debugfs {
unsigned num_files;
};
+struct amdgpu_autodump {
+ struct completion dumping;
+ struct wait_queue_head gpu_hang;
+};
+
int amdgpu_debugfs_regs_init(struct amdgpu_device *adev);
int amdgpu_debugfs_init(struct amdgpu_device *adev);
void amdgpu_debugfs_fini(struct amdgpu_device *adev);
@@ -40,3 +45,4 @@ int amdgpu_debugfs_add_files(struct amdgpu_device *adev,
int amdgpu_debugfs_fence_init(struct amdgpu_device *adev);
int amdgpu_debugfs_firmware_init(struct amdgpu_device *adev);
int amdgpu_debugfs_gem_init(struct amdgpu_device *adev);
+int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev);