Diffstat (limited to 'Documentation/driver-api')
-rw-r--r--  Documentation/driver-api/basics.rst        8
-rw-r--r--  Documentation/driver-api/edac.rst        120
-rw-r--r--  Documentation/driver-api/gpio/legacy.rst   31
-rw-r--r--  Documentation/driver-api/ptp.rst           29
4 files changed, 157 insertions, 31 deletions
diff --git a/Documentation/driver-api/basics.rst b/Documentation/driver-api/basics.rst
index 4b4d8e28d3be..7671b531ba1a 100644
--- a/Documentation/driver-api/basics.rst
+++ b/Documentation/driver-api/basics.rst
@@ -84,7 +84,13 @@ Reference counting
Atomics
-------
-.. kernel-doc:: arch/x86/include/asm/atomic.h
+.. kernel-doc:: include/linux/atomic/atomic-instrumented.h
+ :internal:
+
+.. kernel-doc:: include/linux/atomic/atomic-arch-fallback.h
+ :internal:
+
+.. kernel-doc:: include/linux/atomic/atomic-long.h
:internal:
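+
+As a brief, hypothetical illustration of the atomic API documented by these
+headers (the reference counter below is made up for the example)::
+
+	#include <linux/atomic.h>
+
+	static atomic_t foo_users = ATOMIC_INIT(0);
+
+	static void foo_get(void)
+	{
+		atomic_inc(&foo_users);		/* take a reference */
+	}
+
+	static bool foo_put(void)
+	{
+		/* true when the last reference has been dropped */
+		return atomic_dec_and_test(&foo_users);
+	}
+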
Kernel objects manipulation
diff --git a/Documentation/driver-api/edac.rst b/Documentation/driver-api/edac.rst
index b8c742aa0a71..f4f044b95c4f 100644
--- a/Documentation/driver-api/edac.rst
+++ b/Documentation/driver-api/edac.rst
@@ -106,6 +106,16 @@ will occupy those chip-select rows.
This term is avoided because it is unclear when needing to distinguish
between chip-select rows and socket sets.
+* High Bandwidth Memory (HBM)
+
+HBM is a new memory type with low power consumption and ultra-wide
+communication lanes. It uses vertically stacked memory chips (DRAM dies)
+interconnected by microscopic wires called "through-silicon vias," or
+TSVs.
+
+Several stacks of HBM chips connect to the CPU or GPU through an ultra-fast
+interconnect called the "interposer". Therefore, HBM's characteristics
+are nearly indistinguishable from on-chip integrated RAM.
Memory Controllers
------------------
@@ -176,3 +186,113 @@ nodes::
the L1 and L2 directories would be "edac_device_block's"
.. kernel-doc:: drivers/edac/edac_device.h
+
+
+Heterogeneous system support
+----------------------------
+
+An AMD heterogeneous system is built by connecting the data fabrics of
+both CPUs and GPUs via custom xGMI links. Thus, the data fabric on the
+GPU nodes can be accessed the same way as the data fabric on CPU nodes.
+
+The MI200 accelerators are data center GPUs. They have 2 data fabrics,
+and each GPU data fabric contains four Unified Memory Controllers (UMC).
+Each UMC contains eight channels. Each UMC channel controls one 128-bit
+HBM2e (2 GB) channel (equivalent to 8 x 2 GB ranks per UMC). This creates
+a 4096-bit DRAM data bus per data fabric.
+
+While each UMC interfaces with a 16 GB (8-high x 2 GB DRAM) HBM stack, each
+UMC channel interfaces with 2 GB of DRAM (represented as a rank).
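+
+Put differently, each data fabric presents 4 UMCs x 8 channels x 128 bits =
+4096 bits of DRAM data bus, and each UMC fronts 8 channels x 2 GB = 16 GB
+of HBM.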
+
+Memory controllers on AMD GPU nodes are represented in EDAC as follows::
+
+ GPU DF / GPU Node -> EDAC MC
+ GPU UMC -> EDAC CSROW
+ GPU UMC channel -> EDAC CHANNEL
+
+For example: a heterogeneous system with 1 AMD CPU is connected to
+4 MI200 (Aldebaran) GPUs using xGMI.
+
+Some more heterogeneous hardware details:
+
+- The CPU UMC (Unified Memory Controller) is mostly the same as the GPU UMC.
+ They have chip selects (csrows) and channels. However, the layouts are different
+ for performance, physical layout, or other reasons.
+- CPU UMCs use 1 channel, so UMC = EDAC channel. This follows the marketing
+  convention that a CPU has X memory channels, etc.
+- CPU UMCs use up to 4 chip selects, so UMC chip select = EDAC CSROW.
+- GPU UMCs use 1 chip select, so UMC = EDAC CSROW.
+- GPU UMCs use 8 channels, so UMC channel = EDAC channel.
+
+The EDAC subsystem provides a mechanism to handle AMD heterogeneous
+systems by calling system specific ops for both CPUs and GPUs.
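+
+For example, with the mapping above, a hypothetical sketch (not taken from
+the actual amd64_edac driver) of how such ops could register one GPU node
+with the EDAC core might look like::
+
+	struct edac_mc_layer layers[2];
+	struct mem_ctl_info *mci;
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = 4;			/* 4 UMCs per GPU node */
+	layers[0].is_virt_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = 8;			/* 8 channels per UMC */
+	layers[1].is_virt_csrow = false;
+
+	/* "gpu_node_id" is a placeholder for the node being registered. */
+	mci = edac_mc_alloc(gpu_node_id, ARRAY_SIZE(layers), layers, 0);
+	if (!mci)
+		return -ENOMEM;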
+
+AMD GPU nodes are enumerated in sequential order based on the PCI
+hierarchy, and the first GPU node is assumed to have a Node ID value
+following those of the CPU nodes after the latter are fully populated::
+
+ $ ls /sys/devices/system/edac/mc/
+ mc0 - CPU MC node 0
+ mc1 |
+ mc2 |- GPU card[0] => node 0(mc1), node 1(mc2)
+ mc3 |
+ mc4 |- GPU card[1] => node 0(mc3), node 1(mc4)
+ mc5 |
+ mc6 |- GPU card[2] => node 0(mc5), node 1(mc6)
+ mc7 |
+ mc8 |- GPU card[3] => node 0(mc7), node 1(mc8)
+
+For example, a heterogeneous system with one AMD CPU is connected to
+four MI200 (Aldebaran) GPUs using xGMI. This topology can be represented
+via the following sysfs entries::
+
+ /sys/devices/system/edac/mc/..
+
+ CPU # CPU node
+ ├── mc 0
+
+ GPU Nodes are enumerated sequentially after CPU nodes have been populated
+ GPU card 1 # Each MI200 GPU has 2 nodes/mcs
+ ├── mc 1 # GPU node 0 == mc1, Each MC node has 4 UMCs/CSROWs
+ │   ├── csrow 0 # UMC 0
+ │   │   ├── channel 0 # Each UMC has 8 channels
+ │   │   ├── channel 1 # size of each channel is 2 GB, so each UMC has 16 GB
+ │   │   ├── channel 2
+ │   │   ├── channel 3
+ │   │   ├── channel 4
+ │   │   ├── channel 5
+ │   │   ├── channel 6
+ │   │   ├── channel 7
+ │   ├── csrow 1 # UMC 1
+ │   │   ├── channel 0
+ │   │   ├── ..
+ │   │   ├── channel 7
+ │   ├── .. ..
+ │   ├── csrow 3 # UMC 3
+ │   │   ├── channel 0
+ │   │   ├── ..
+ │   │   ├── channel 7
+ │   ├── rank 0
+ │   ├── .. ..
+ │   ├── rank 31 # total 32 ranks/dimms from 4 UMCs
+ ├
+ ├── mc 2 # GPU node 1 == mc2
+ │   ├── .. # each GPU node has a total of 64 GB
+
+ GPU card 2
+ ├── mc 3
+ │   ├── ..
+ ├── mc 4
+ │   ├── ..
+
+ GPU card 3
+ ├── mc 5
+ │   ├── ..
+ ├── mc 6
+ │   ├── ..
+
+ GPU card 4
+ ├── mc 7
+ │   ├── ..
+ ├── mc 8
+ │   ├── ..
diff --git a/Documentation/driver-api/gpio/legacy.rst b/Documentation/driver-api/gpio/legacy.rst
index 78372853c6d4..b6505914791c 100644
--- a/Documentation/driver-api/gpio/legacy.rst
+++ b/Documentation/driver-api/gpio/legacy.rst
@@ -165,8 +165,7 @@ Most GPIO controllers can be accessed with memory read/write instructions.
Those don't need to sleep, and can safely be done from inside hard
(nonthreaded) IRQ handlers and similar contexts.
-Use the following calls to access such GPIOs,
-for which gpio_cansleep() will always return false (see below)::
+Use the following calls to access such GPIOs::
/* GPIO INPUT: return zero or nonzero */
int gpio_get_value(unsigned gpio);
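+
+A minimal usage sketch of the non-sleeping accessors (the GPIO number and
+label below are made up for the example)::
+
+	int err = gpio_request(42, "sample-output");
+
+	if (err)
+		return err;
+
+	gpio_direction_output(42, 0);	/* claim direction, drive low */
+	gpio_set_value(42, 1);		/* drive high; safe in hardirq context */
+
+	gpio_free(42);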
@@ -200,13 +199,6 @@ Some GPIO controllers must be accessed using message based busses like I2C
or SPI. Commands to read or write those GPIO values require waiting to
get to the head of a queue to transmit a command and get its response.
This requires sleeping, which can't be done from inside IRQ handlers.
-
-Platforms that support this type of GPIO distinguish them from other GPIOs
-by returning nonzero from this call (which requires a valid GPIO number,
-which should have been previously allocated with gpio_request)::
-
- int gpio_cansleep(unsigned gpio);
-
To access such GPIOs, a different set of accessors is defined::
/* GPIO INPUT: return zero or nonzero, might sleep */
@@ -215,7 +207,6 @@ To access such GPIOs, a different set of accessors is defined::
/* GPIO OUTPUT, might sleep */
void gpio_set_value_cansleep(unsigned gpio, int value);
-
Accessing such GPIOs requires a context which may sleep, for example
a threaded IRQ handler, and those accessors must be used instead of
spinlock-safe accessors without the cansleep() name suffix.
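+
+A short sketch, assuming 'gpio' is the number of a previously requested GPIO
+sitting behind an I2C or SPI expander::
+
+	/* read the line; may block while the bus transaction completes */
+	int value = gpio_get_value_cansleep(gpio);
+
+	/* drive the line; likewise only legal in a context that may sleep */
+	gpio_set_value_cansleep(gpio, !value);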
@@ -319,11 +310,6 @@ where 'flags' is currently defined to specify the following properties:
* GPIOF_INIT_LOW - as output, set initial level to LOW
* GPIOF_INIT_HIGH - as output, set initial level to HIGH
- * GPIOF_OPEN_DRAIN - gpio pin is open drain type.
- * GPIOF_OPEN_SOURCE - gpio pin is open source type.
-
- * GPIOF_EXPORT_DIR_FIXED - export gpio to sysfs, keep direction
- * GPIOF_EXPORT_DIR_CHANGEABLE - also export, allow changing direction
Since GPIOF_INIT_* are only valid when configured as output, group the valid
combinations as:
@@ -332,20 +318,6 @@ combinations as:
* GPIOF_OUT_INIT_LOW - configured as output, initial level LOW
* GPIOF_OUT_INIT_HIGH - configured as output, initial level HIGH
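+
+For example, claiming and configuring a GPIO in one call with one of these
+combinations (the GPIO number and label are made up)::
+
+	int err = gpio_request_one(42, GPIOF_OUT_INIT_HIGH, "reset-line");
+
+	if (err)
+		return err;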
-When setting the flag as GPIOF_OPEN_DRAIN then it will assume that pins is
-open drain type. Such pins will not be driven to 1 in output mode. It is
-require to connect pull-up on such pins. By enabling this flag, gpio lib will
-make the direction to input when it is asked to set value of 1 in output mode
-to make the pin HIGH. The pin is make to LOW by driving value 0 in output mode.
-
-When setting the flag as GPIOF_OPEN_SOURCE then it will assume that pins is
-open source type. Such pins will not be driven to 0 in output mode. It is
-require to connect pull-down on such pin. By enabling this flag, gpio lib will
-make the direction to input when it is asked to set value of 0 in output mode
-to make the pin LOW. The pin is make to HIGH by driving value 1 in output mode.
-
-In the future, these flags can be extended to support more properties.
-
Furthermore, to ease the claim/release of multiple GPIOs, 'struct gpio' is
introduced to encapsulate all three fields as::
@@ -556,7 +528,6 @@ code, which always dispatches through the gpio_chip::
#define gpio_get_value __gpio_get_value
#define gpio_set_value __gpio_set_value
- #define gpio_cansleep __gpio_cansleep
Fancier implementations could instead define those as inline functions with
logic optimizing access to specific SOC-based GPIOs. For example, if the
diff --git a/Documentation/driver-api/ptp.rst b/Documentation/driver-api/ptp.rst
index 664838ae7776..5e033c3b11b3 100644
--- a/Documentation/driver-api/ptp.rst
+++ b/Documentation/driver-api/ptp.rst
@@ -73,6 +73,22 @@ Writing clock drivers
class driver, since the lock may also be needed by the clock
driver's interrupt service routine.
+PTP hardware clock requirements for '.adjphase'
+-----------------------------------------------
+
+ The 'struct ptp_clock_info' interface has a '.adjphase' function.
+ In order to implement this function, the PHC must meet the following
+ requirements:
+
+ * The PHC implements a servo algorithm internally that is used to
+ correct the offset passed in the '.adjphase' call.
+ * When other PTP adjustment functions are called, the PHC servo
+ algorithm is disabled.
+
+ **NOTE:** '.adjphase' is not a simple time adjustment function that
+ 'jumps' the PHC clock time based on the provided offset. It should
+ correct the provided offset using an internal algorithm.
+
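+ A hypothetical driver sketch of such an implementation (the "foo" names,
+ including the foo_phc_servo_set_offset() helper, are made up): the offset
+ is handed to the PHC's internal servo rather than used to step the clock::
+
+    struct foo_phc {
+        struct ptp_clock_info caps;
+        /* device handle, register mappings, locks, ... */
+    };
+
+    static int foo_phc_adjphase(struct ptp_clock_info *ptp, s32 phase)
+    {
+        struct foo_phc *phc = container_of(ptp, struct foo_phc, caps);
+
+        /*
+         * Program the requested phase offset (in nanoseconds); the
+         * hardware servo then slews the clock toward it.
+         */
+        return foo_phc_servo_set_offset(phc, phase);
+    }
+
+    static struct ptp_clock_info foo_phc_caps = {
+        .owner    = THIS_MODULE,
+        .name     = "foo PHC",
+        .adjphase = foo_phc_adjphase,
+    };
+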
Supported hardware
==================
@@ -106,3 +122,16 @@ Supported hardware
- LPF settings (bandwidth, phase limiting, automatic holdover, physical layer assist (per ITU-T G.8273.2))
- Programmable output PTP clocks, any frequency up to 1GHz (to other PHY/MAC time stampers, refclk to ASSPs/SoCs/FPGAs)
- Lock to GNSS input, automatic switching between GNSS and user-space PHC control (optional)
+
+ * NVIDIA Mellanox
+
+ - GPIO
+ - Certain variants of ConnectX-6 Dx and later products support one
+ GPIO which can time stamp external triggers and one GPIO to produce
+ periodic signals.
+ - Certain variants of ConnectX-5 and older products support one GPIO,
+ configured to either time stamp external triggers or produce
+ periodic signals.
+ - PHC instances
+ - All ConnectX devices have a free-running counter
+ - ConnectX-6 Dx and later devices have a UTC format counter