From 2033659c22132695f6e4a70570c330e735164e55 Mon Sep 17 00:00:00 2001 From: Dmitry Baryshkov Date: Mon, 1 Apr 2024 05:42:38 +0300 Subject: drm/msm: import A5xx XML display registers database Import Adreno registers database for A5xx from the Mesa, commit 639488f924d9 ("freedreno/registers: limit the rules schema"). Signed-off-by: Dmitry Baryshkov Patchwork: https://patchwork.freedesktop.org/patch/585857/ Link: https://lore.kernel.org/r/20240401-fd-xml-shipped-v5-8-4bdb277a85a1@linaro.org --- drivers/gpu/drm/msm/registers/adreno/a5xx.xml | 3039 +++++++++++++++++++++++++ 1 file changed, 3039 insertions(+) create mode 100644 drivers/gpu/drm/msm/registers/adreno/a5xx.xml (limited to 'drivers/gpu/drm/msm/registers') diff --git a/drivers/gpu/drm/msm/registers/adreno/a5xx.xml b/drivers/gpu/drm/msm/registers/adreno/a5xx.xml new file mode 100644 index 000000000000..bd8df5945166 --- /dev/null +++ b/drivers/gpu/drm/msm/registers/adreno/a5xx.xml @@ -0,0 +1,3039 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Configures the mapping between VSC_PIPE buffer and + bin, X/Y specify the bin index in the horiz/vert + direction (0,0 is upper left, 0,1 is leftmost bin + on second row, and so on). 
W/H specify the number + of bins assigned to this VSC_PIPE in the horiz/vert + dimension. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + LRZ: (Low Resolution Z ??) + ---- + + I think it serves two functions, early discard of primitives in binning + pass without needing full resolution depth buffer, and also functions as + a depth-prepass, used during the GMEM draws to discard primitives that + would not be visible due to later draws. + + The LRZ buffer always seems to be z16 format, regardless of actual + depth buffer format. + + Note that LRZ write should be disabled when blend/stencil/etc is enabled, + since the occluded primitive can still contribute to final color value + of a fragment. + + Only enabled for GL_LESS/GL_LEQUAL/GL_GREATER/GL_GEQUAL? + + + + LRZ write also disabled for blend/etc. + + update MAX instead of MIN value, ie. GL_GREATER/GL_GEQUAL + + + + + + + + Pitch is depth width (in pixels) / 8 (aligned to 32). Height + is also divided by 8 (ie. 
covers 8x8 pixels) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Z_READ_ENABLE bit is set for zfunc other than GL_ALWAYS or GL_NEVER + + + + + + + + + stride of depth/stencil buffer + + + size of layer + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Blits: + ------ + + Blits are triggered by CP_EVENT_WRITE:BLIT, compared to previous + generations where they shared most of the gl pipeline and were + triggered by CP_DRAW_INDX* + + For gmem->mem blob uses RB_BLIT_CNTL.BUF to specify src of + blit (ie MRTn, ZS, etc) and RB_BLIT_DST_LO/HI for destination + gpuaddr. The gmem offset is taken from RB_MRT[n].BASE_LO/HI + + For mem->gmem blob uses just MRT0 or ZS and RB_BLIT_DST_LO/HI + for the GMEM offset, and gpuaddr from RB_MRT[0].BASE_LO/HI + (I suppose this is just to avoid trashing RB_MRT[1..7]??) + + + + + + + + + + + + + + + + + + + + + + + + + For MASK, if RB_BLIT_CNTL.BUF=BLIT_ZS: + 1 - depth + 2 - stencil + 3 - depth+stencil + if RB_BLIT_CNTL.BUF=BLIT_MRTn + then probably a component mask, I always see 0xf + + + + + + Buffer Metadata (flag buffers): + ------------------------------- + + Blob seems to stick some metadata at the front of the buffer, + both z/s and MRT. I think this is same as UBWC (bandwidth + compression) metadata that mdp 1.7 and later supports. See + 1d3fae5698ce5358caab87a15383b690941697e8 in downstream kernel. + UBWC seems to stand for "universal bandwidth compression". + + Before glReadPixels() it does a pair of BYPASS blits (at least + if metadata is used) presumably to resolve metadata. 
+ + NOTES: see: getUBwcBlockSize(), getUBwcMetaBufferSize() at + https://android.googlesource.com/platform/hardware/qcom/display/+/android-6.0.1_r40/msm8994/libgralloc/alloc_controller.cpp + (note that bpp is in bytes, not bits, so really cpp) + + Example Layout 2d w/ mipmap levels: + + 100x2000, ifmt=GL_RG, fmt=GL_RG16F, type=GL_FLOAT, meta=64x512@0x8000 (7x500) + base=c072e000, offset=16384, size=1703936 + + color flags + 0 c073a000 c0732000 - level 0 flags is address + 1 c0838000 c0834000 programmed in texture state + 2 c0879000 c0877000 + 3 c089a000 c0899000 + 4 c08ab000 c08aa000 + 5 c08b4000 c08b3000 + 6 c08b9000 c08b8000 + 7 c08bc000 c08bb000 + 8 c08be000 c08bd000 + 9 c08c0000 c08bf000 + 10 c08c2000 c08c1000 + + ARRAY_PITCH is the combined size of all the levels plus flags, + so 0xc08c3000 - 0xc0732000 = 0x00191000 (1642496); each level + takes up a minimum of 2 pages (since color and flags parts are + each page aligned). + + { TILE_MODE = TILE5_3 | SWIZ_X = A5XX_TEX_X | SWIZ_Y = A5XX_TEX_Y | SWIZ_Z = A5XX_TEX_ZERO | SWIZ_W = A5XX_TEX_ONE | MIPLVLS = 0 | FMT = TFMT5_16_16_FLOAT | SWAP = WZYX } + { WIDTH = 100 | HEIGHT = 2000 } + { FETCHSIZE = TFETCH5_4_BYTE | PITCH = 512 | TYPE = A5XX_TEX_2D } + { ARRAY_PITCH = 1642496 | 0x18800000 } - NOTE c2dc always has 0x18800000 but + { BASE_LO = 0xc0732000 } this varies for blob gles driver.. + { BASE_HI = 0 | DEPTH = 1 } not sure what it is + + + + + + + + + + + + + + + + + + + + + + + + + + num of varyings plus four for gl_Position (plus one if gl_PointSize) + plus # of transform-feedback (streamout) varyings if using the + hw streamout (rather than stg instructions in shader) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Stream-Out: + ----------- + + VPC_SO[0..3] registers set up details about streamout buffers, and + number of components to write to each. + + VPC_SO_PROG provides the mapping between output varyings and the SO + buffers.
It is written multiple times (via a CP_CONTEXT_REG_BUNCH + packet, not sure if that matters), each write can handle up to two + components of stream-out output. Order matches up to OUTLOC, + including padding. So, if outputting first 3 varyings: + + SP_VS_OUT[0].REG: { A_REGID = r0.w | A_COMPMASK = 0xf | B_REGID = r0.x | B_COMPMASK = 0x7 } + SP_VS_OUT[0x1].REG: { A_REGID = r1.w | A_COMPMASK = 0x3 | B_REGID = r2.y | B_COMPMASK = 0xf } + SP_VS_VPC_DST[0].REG: { OUTLOC0 = 0 | OUTLOC1 = 4 | OUTLOC2 = 8 | OUTLOC3 = 12 } + + Then: + + VPC_SO_PROG: { A_BUF = 0 | A_OFF = 0 | A_EN | A_BUF = 0 | B_OFF = 4 | B_EN } + VPC_SO_PROG: { A_BUF = 0 | A_OFF = 8 | A_EN | A_BUF = 0 | B_OFF = 12 | B_EN } + VPC_SO_PROG: { A_BUF = 2 | A_OFF = 0 | A_EN | A_BUF = 2 | B_OFF = 4 | B_EN } + VPC_SO_PROG: { A_BUF = 2 | A_OFF = 8 | A_EN | A_BUF = 0 | B_OFF = 0 } + VPC_SO_PROG: { A_BUF = 1 | A_OFF = 0 | A_EN | A_BUF = 1 | B_OFF = 4 | B_EN } + + Note that varying order is OUTLOC0, OUTLOC2, OUTLOC1, and note + the padding between OUTLOC1 and OUTLOC2. + + The BUF bitfield indicates which of the four streamout buffers + to write into at the specified offset. + + The VPC_SO[n].FLUSH_BASE_LO/HI is used for hw to write back next + offset which gets loaded back into VPC_SO[n].BUFFER_OFFSET via a + CP_MEM_TO_REG. Probably can be ignored until we have GS/etc, at + which point we can't calculate the offset on the CPU. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + The size of memory that ldp/stp can address. + + + + Guessing that this is the same as a3xx/a6xx. 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + per MRT + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Texture sampler dwords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Texture constant dwords + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Pitch in bytes (so actually stride) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Pitch in bytes (so actually stride) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + -- cgit v1.2.3