diff options
Diffstat (limited to 'Documentation/gpu')
-rw-r--r-- | Documentation/gpu/amdgpu/apu-asic-info-table.csv | 4 | ||||
-rw-r--r-- | Documentation/gpu/drm-usage-stats.rst | 91 | ||||
-rw-r--r-- | Documentation/gpu/rfc/index.rst | 4 | ||||
-rw-r--r-- | Documentation/gpu/rfc/xe.rst | 235 | ||||
-rw-r--r-- | Documentation/gpu/todo.rst | 7 | ||||
-rw-r--r-- | Documentation/gpu/vkms.rst | 7 |
6 files changed, 310 insertions, 38 deletions
diff --git a/Documentation/gpu/amdgpu/apu-asic-info-table.csv b/Documentation/gpu/amdgpu/apu-asic-info-table.csv index 395a7b7bfaef..2e76b427ba1e 100644 --- a/Documentation/gpu/amdgpu/apu-asic-info-table.csv +++ b/Documentation/gpu/amdgpu/apu-asic-info-table.csv @@ -5,6 +5,8 @@ Ryzen 4000 series, RENOIR, DCN 2.1, 9.3, VCN 2.2, 4.1.2, 11.0.3 Ryzen 3000 series / AMD Ryzen Embedded V1*/R1* with Radeon Vega Gfx, RAVEN2, DCN 1.0, 9.2.2, VCN 1.0.1, 4.1.1, 10.0.1 SteamDeck, VANGOGH, DCN 3.0.1, 10.3.1, VCN 3.1.0, 5.2.1, 11.5.0 Ryzen 5000 series / Ryzen 7x30 series, GREEN SARDINE / Cezanne / Barcelo / Barcelo-R, DCN 2.1, 9.3, VCN 2.2, 4.1.1, 12.0.1 -Ryzen 6000 series / Ryzen 7x35 series, YELLOW CARP / Rembrandt / Rembrandt+, 3.1.2, 10.3.3, VCN 3.1.1, 5.2.3, 13.0.3 +Ryzen 6000 series / Ryzen 7x35 series / Ryzen 7x36 series, YELLOW CARP / Rembrandt / Rembrandt-R, 3.1.2, 10.3.3, VCN 3.1.1, 5.2.3, 13.0.3 Ryzen 7000 series (AM5), Raphael, 3.1.5, 10.3.6, 3.1.2, 5.2.6, 13.0.5 +Ryzen 7x45 series (FL1), / Dragon Range, 3.1.5, 10.3.6, 3.1.2, 5.2.6, 13.0.5 Ryzen 7x20 series, Mendocino, 3.1.6, 10.3.7, 3.1.1, 5.2.7, 13.0.8 +Ryzen 7x40 series, Phoenix, 3.1.4, 11.0.1 / 11.0.4, 4.0.2, 6.0.1, 13.0.4 / 13.0.11
\ No newline at end of file diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index b46327356e80..fe35a291ff3e 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -24,7 +24,7 @@ File format specification - All keys shall be prefixed with `drm-`. - Whitespace between the delimiter and first non-whitespace character shall be ignored when parsing. -- Neither keys or values are allowed to contain whitespace characters. +- Keys are not allowed to contain whitespace characters. - Numerical key value pairs can end with optional unit string. - Data type of the value is fixed as defined in the specification. @@ -39,12 +39,13 @@ Data types ---------- - <uint> - Unsigned integer without defining the maximum value. -- <str> - String excluding any above defined reserved characters or whitespace. +- <keystr> - String excluding any above defined reserved characters or whitespace. +- <valstr> - String. Mandatory fully standardised keys --------------------------------- -- drm-driver: <str> +- drm-driver: <valstr> String shall contain the name this driver registered as via the respective `struct drm_driver` data structure. @@ -52,6 +53,9 @@ String shall contain the name this driver registered as via the respective Optional fully standardised keys -------------------------------- +Identification +^^^^^^^^^^^^^^ + - drm-pdev: <aaaa:bb.cc.d> For PCI devices this should contain the PCI slot address of the device in @@ -69,10 +73,13 @@ scope of each device, in which case `drm-pdev` shall be present as well. Userspace should make sure to not double account any usage statistics by using the above described criteria in order to associate data to individual clients. -- drm-engine-<str>: <uint> ns +Utilization +^^^^^^^^^^^ + +- drm-engine-<keystr>: <uint> ns GPUs usually contain multiple execution engines. Each shall be given a stable -and unique name (str), with possible values documented in the driver specific +and unique name (keystr), with possible values documented in the driver specific documentation. Value shall be in specified time units which the respective GPU engine spent @@ -84,31 +91,19 @@ larger value within a reasonable period. Upon observing a value lower than what was previously read, userspace is expected to stay with that larger previous value until a monotonic update is seen. -- drm-engine-capacity-<str>: <uint> +- drm-engine-capacity-<keystr>: <uint> Engine identifier string must be the same as the one specified in the -drm-engine-<str> tag and shall contain a greater than zero number in case the +drm-engine-<keystr> tag and shall contain a greater than zero number in case the exported engine corresponds to a group of identical hardware engines. In the absence of this tag parser shall assume capacity of one. Zero capacity is not allowed. -- drm-memory-<str>: <uint> [KiB|MiB] - -Each possible memory type which can be used to store buffer objects by the -GPU in question shall be given a stable and unique name to be returned as the -string here. - -Value shall reflect the amount of storage currently consumed by the buffer -object belong to this client, in the respective memory region. - -Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB' -indicating kibi- or mebi-bytes. - -- drm-cycles-<str> <uint> +- drm-cycles-<keystr>: <uint> Engine identifier string must be the same as the one specified in the -drm-engine-<str> tag and shall contain the number of busy cycles for the given +drm-engine-<keystr> tag and shall contain the number of busy cycles for the given engine. Values are not required to be constantly monotonic if it makes the driver @@ -117,16 +112,60 @@ larger value within a reasonable period. Upon observing a value lower than what was previously read, userspace is expected to stay with that larger previous value until a monotonic update is seen. -- drm-maxfreq-<str> <uint> [Hz|MHz|KHz] +- drm-maxfreq-<keystr>: <uint> [Hz|MHz|KHz] Engine identifier string must be the same as the one specified in the -drm-engine-<str> tag and shall contain the maximum frequency for the given -engine. Taken together with drm-cycles-<str>, this can be used to calculate -percentage utilization of the engine, whereas drm-engine-<str> only reflects +drm-engine-<keystr> tag and shall contain the maximum frequency for the given +engine. Taken together with drm-cycles-<keystr>, this can be used to calculate +percentage utilization of the engine, whereas drm-engine-<keystr> only reflects time active without considering what frequency the engine is operating as a percentage of it's maximum frequency. +Memory +^^^^^^ + +- drm-memory-<region>: <uint> [KiB|MiB] + +Each possible memory type which can be used to store buffer objects by the +GPU in question shall be given a stable and unique name to be returned as the +string here. The name "memory" is reserved to refer to normal system memory. + +Value shall reflect the amount of storage currently consumed by the buffer +objects belong to this client, in the respective memory region. + +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB' +indicating kibi- or mebi-bytes. + +- drm-shared-<region>: <uint> [KiB|MiB] + +The total size of buffers that are shared with another file (ie. have more +than a single handle). + +- drm-total-<region>: <uint> [KiB|MiB] + +The total size of buffers that including shared and private memory. + +- drm-resident-<region>: <uint> [KiB|MiB] + +The total size of buffers that are resident in the specified region. + +- drm-purgeable-<region>: <uint> [KiB|MiB] + +The total size of buffers that are purgeable. + +- drm-active-<region>: <uint> [KiB|MiB] + +The total size of buffers that are active on one or more engines. + +Implementation Details +====================== + +Drivers should use drm_show_fdinfo() in their `struct file_operations`, and +implement &drm_driver.show_fdinfo if they wish to provide any stats which +are not provided by drm_show_fdinfo(). But even driver specific stats should +be documented above and where possible, aligned with other drivers. + Driver specific implementations -=============================== +------------------------------- :ref:`i915-usage-stats` diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst index 476719771eef..e4f7b005138d 100644 --- a/Documentation/gpu/rfc/index.rst +++ b/Documentation/gpu/rfc/index.rst @@ -31,3 +31,7 @@ host such documentation: .. toctree:: i915_vm_bind.rst + +.. toctree:: + + xe.rst diff --git a/Documentation/gpu/rfc/xe.rst b/Documentation/gpu/rfc/xe.rst new file mode 100644 index 000000000000..2516fe141db6 --- /dev/null +++ b/Documentation/gpu/rfc/xe.rst @@ -0,0 +1,235 @@ +========================== +Xe – Merge Acceptance Plan +========================== +Xe is a new driver for Intel GPUs that supports both integrated and +discrete platforms starting with Tiger Lake (first Intel Xe Architecture). + +This document aims to establish a merge plan for the Xe, by writing down clear +pre-merge goals, in order to avoid unnecessary delays. + +Xe – Overview +============= +The main motivation of Xe is to have a fresh base to work from that is +unencumbered by older platforms, whilst also taking the opportunity to +rearchitect our driver to increase sharing across the drm subsystem, both +leveraging and allowing us to contribute more towards other shared components +like TTM and drm/scheduler. + +This is also an opportunity to start from the beginning with a clean uAPI that is +extensible by design and already aligned with the modern userspace needs. For +this reason, the memory model is solely based on GPU Virtual Address space +bind/unbind (‘VM_BIND’) of GEM buffer objects (BOs) and execution only supporting +explicit synchronization. With persistent mapping across the execution, the +userspace does not need to provide a list of all required mappings during each +submission. + +The new driver leverages a lot from i915. As for display, the intent is to share +the display code with the i915 driver so that there is maximum reuse there. + +As for the power management area, the goal is to have a much-simplified support +for the system suspend states (S-states), PCI device suspend states (D-states), +GPU/Render suspend states (R-states) and frequency management. It should leverage +as much as possible all the existent PCI-subsystem infrastructure (pm and +runtime_pm) and underlying firmware components such PCODE and GuC for the power +states and frequency decisions. + +Repository: + +https://gitlab.freedesktop.org/drm/xe/kernel (branch drm-xe-next) + +Xe – Platforms +============== +Currently, Xe is already functional and has experimental support for multiple +platforms starting from Tiger Lake, with initial support in userspace implemented +in Mesa (for Iris and Anv, our OpenGL and Vulkan drivers), as well as in NEO +(for OpenCL and Level0). + +During a transition period, platforms will be supported by both Xe and i915. +However, the force_probe mechanism existent in both drivers will allow only one +official and by-default probe at a given time. + +For instance, in order to probe a DG2 which PCI ID is 0x5690 by Xe instead of +i915, the following set of parameters need to be used: + +``` +i915.force_probe=!5690 xe.force_probe=5690 +``` + +In both drivers, the ‘.require_force_probe’ protection forces the user to use the +force_probe parameter while the driver is under development. This protection is +only removed when the support for the platform and the uAPI are stable. Stability +which needs to be demonstrated by CI results. + +In order to avoid user space regressions, i915 will continue to support all the +current platforms that are already out of this protection. Xe support will be +forever experimental and dependent on the usage of force_probe for these +platforms. + +When the time comes for Xe, the protection will be lifted on Xe and kept in i915. + +Xe driver will be protected with both STAGING Kconfig and force_probe. Changes in +the uAPI are expected while the driver is behind these protections. STAGING will +be removed when the driver uAPI gets to a mature state where we can guarantee the +‘no regression’ rule. Then force_probe will be lifted only for future platforms +that will be productized with Xe driver, but not with i915. + +Xe – Pre-Merge Goals +==================== + +Drm_scheduler +------------- +Xe primarily uses Firmware based scheduling (GuC FW). However, it will use +drm_scheduler as the scheduler ‘frontend’ for userspace submission in order to +resolve syncobj and dma-buf implicit sync dependencies. However, drm_scheduler is +not yet prepared to handle the 1-to-1 relationship between drm_gpu_scheduler and +drm_sched_entity. + +Deeper changes to drm_scheduler should *not* be required to get Xe accepted, but +some consensus needs to be reached between Xe and other community drivers that +could also benefit from this work, for coupling FW based/assisted submission such +as the ARM’s new Mali GPU driver, and others. + +As a key measurable result, the patch series introducing Xe itself shall not +depend on any other patch touching drm_scheduler itself that was not yet merged +through drm-misc. This, by itself, already includes the reach of an agreement for +uniform 1 to 1 relationship implementation / usage across drivers. + +GPU VA +------ +Two main goals of Xe are meeting together here: + +1) Have an uAPI that aligns with modern UMD needs. + +2) Early upstream engagement. + +RedHat engineers working on Nouveau proposed a new DRM feature to handle keeping +track of GPU virtual address mappings. This is still not merged upstream, but +this aligns very well with our goals and with our VM_BIND. The engagement with +upstream and the port of Xe towards GPUVA is already ongoing. + +As a key measurable result, Xe needs to be aligned with the GPU VA and working in +our tree. Missing Nouveau patches should *not* block Xe and any needed GPUVA +related patch should be independent and present on dri-devel or acked by +maintainers to go along with the first Xe pull request towards drm-next. + +DRM_VM_BIND +----------- +Nouveau, and Xe are all implementing ‘VM_BIND’ and new ‘Exec’ uAPIs in order to +fulfill the needs of the modern uAPI. Xe merge should *not* be blocked on the +development of a common new drm_infrastructure. However, the Xe team needs to +engage with the community to explore the options of a common API. + +As a key measurable result, the DRM_VM_BIND needs to be documented in this file +below, or this entire block deleted if the consensus is for independent drivers +vm_bind ioctls. + +Although having a common DRM level IOCTL for VM_BIND is not a requirement to get +Xe merged, it is mandatory to enforce the overall locking scheme for all major +structs and list (so vm and vma). So, a consensus is needed, and possibly some +common helpers. If helpers are needed, they should be also documented in this +document. + +ASYNC VM_BIND +------------- +Although having a common DRM level IOCTL for VM_BIND is not a requirement to get +Xe merged, it is mandatory to have a consensus with other drivers and Mesa. +It needs to be clear how to handle async VM_BIND and interactions with userspace +memory fences. Ideally with helper support so people don't get it wrong in all +possible ways. + +As a key measurable result, the benefits of ASYNC VM_BIND and a discussion of +various flavors, error handling and a sample API should be documented here or in +a separate document pointed to by this document. + +Userptr integration and vm_bind +------------------------------- +Different drivers implement different ways of dealing with execution of userptr. +With multiple drivers currently introducing support to VM_BIND, the goal is to +aim for a DRM consensus on what’s the best way to have that support. To some +extent this is already getting addressed itself with the GPUVA where likely the +userptr will be a GPUVA with a NULL GEM call VM bind directly on the userptr. +However, there are more aspects around the rules for that and the usage of +mmu_notifiers, locking and other aspects. + +This task here has the goal of introducing a documentation of the basic rules. + +The documentation *needs* to first live in this document (API session below) and +then moved to another more specific document or at Xe level or at DRM level. + +Documentation should include: + + * The userptr part of the VM_BIND api. + + * Locking, including the page-faulting case. + + * O(1) complexity under VM_BIND. + +Some parts of userptr like mmu_notifiers should become GPUVA or DRM helpers when +the second driver supporting VM_BIND+userptr appears. Details to be defined when +the time comes. + +Long running compute: minimal data structure/scaffolding +-------------------------------------------------------- +The generic scheduler code needs to include the handling of endless compute +contexts, with the minimal scaffolding for preempt-ctx fences (probably on the +drm_sched_entity) and making sure drm_scheduler can cope with the lack of job +completion fence. + +The goal is to achieve a consensus ahead of Xe initial pull-request, ideally with +this minimal drm/scheduler work, if needed, merged to drm-misc in a way that any +drm driver, including Xe, could re-use and add their own individual needs on top +in a next stage. However, this should not block the initial merge. + +This is a non-blocker item since the driver without the support for the long +running compute enabled is not a showstopper. + +Display integration with i915 +----------------------------- +In order to share the display code with the i915 driver so that there is maximum +reuse, the i915/display/ code is built twice, once for i915.ko and then for +xe.ko. Currently, the i915/display code in Xe tree is polluted with many 'ifdefs' +depending on the build target. The goal is to refactor both Xe and i915/display +code simultaneously in order to get a clean result before they land upstream, so +that display can already be part of the initial pull request towards drm-next. + +However, display code should not gate the acceptance of Xe in upstream. Xe +patches will be refactored in a way that display code can be removed, if needed, +from the first pull request of Xe towards drm-next. The expectation is that when +both drivers are part of the drm-tip, the introduction of cleaner patches will be +easier and speed up. + +Drm_exec +-------- +Helper to make dma_resv locking for a big number of buffers is getting removed in +the drm_exec series proposed in https://patchwork.freedesktop.org/patch/524376/ +If that happens, Xe needs to change and incorporate the changes in the driver. +The goal is to engage with the Community to understand if the best approach is to +move that to the drivers that are using it or if we should keep the helpers in +place waiting for Xe to get merged. + +This item ties into the GPUVA, VM_BIND, and even long-running compute support. + +As a key measurable result, we need to have a community consensus documented in +this document and the Xe driver prepared for the changes, if necessary. + +Dev_coredump +------------ + +Xe needs to align with other drivers on the way that the error states are +dumped, avoiding a Xe only error_state solution. The goal is to use devcoredump +infrastructure to report error states, since it produces a standardized way +by exposing a virtual and temporary /sys/class/devcoredump device. + +As the key measurable result, Xe driver needs to provide GPU snapshots captured +at hang time through devcoredump, but without depending on any core modification +of devcoredump infrastructure itself. + +Later, when we are in-tree, the goal is to collaborate with devcoredump +infrastructure with overall possible improvements, like multiple file support +for better organization of the dumps, snapshot support, dmesg extra print, +and whatever may make sense and help the overall infrastructure. + +Xe – uAPI high level overview +============================= + +...Warning: To be done in follow up patches after/when/where the main consensus in various items are individually reached. diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst index 1f8a5ebe188e..68bdafa0284f 100644 --- a/Documentation/gpu/todo.rst +++ b/Documentation/gpu/todo.rst @@ -276,11 +276,8 @@ Various hold-ups: - Need to switch to drm_fbdev_generic_setup(), otherwise a lot of the custom fb setup code can't be deleted. -- Many drivers wrap drm_gem_fb_create() only to check for valid formats. For - atomic drivers we could check for valid formats by calling - drm_plane_check_pixel_format() against all planes, and pass if any plane - supports the format. For non-atomic that's not possible since like the format - list for the primary plane is fake and we'd therefor reject valid formats. +- Need to switch to drm_gem_fb_create(), as now drm_gem_fb_create() checks for + valid formats for atomic drivers. - Many drivers subclass drm_framebuffer, we'd need a embedding compatible version of the varios drm_gem_fb_create functions. Maybe called diff --git a/Documentation/gpu/vkms.rst b/Documentation/gpu/vkms.rst index 49db221c0f52..ba04ac7c2167 100644 --- a/Documentation/gpu/vkms.rst +++ b/Documentation/gpu/vkms.rst @@ -118,14 +118,9 @@ Add Plane Features There's lots of plane features we could add support for: -- ARGB format on primary plane: blend the primary plane into background with - translucent alpha. - - Add background color KMS property[Good to get started]. -- Full alpha blending on all planes. - -- Rotation, scaling. +- Scaling. - Additional buffer formats, especially YUV formats for video like NV12. Low/high bpp RGB formats would also be interesting. |