summaryrefslogtreecommitdiff
path: root/Documentation/driver-api
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/driver-api')
-rw-r--r--Documentation/driver-api/dma-buf.rst6
-rw-r--r--Documentation/driver-api/dmaengine/client.rst2
-rw-r--r--Documentation/driver-api/dmaengine/dmatest.rst2
-rw-r--r--Documentation/driver-api/gpio/legacy.rst17
-rw-r--r--Documentation/driver-api/hsi.rst4
-rw-r--r--Documentation/driver-api/index.rst9
-rw-r--r--Documentation/driver-api/io-mapping.rst4
-rw-r--r--Documentation/driver-api/md/md-cluster.rst2
-rw-r--r--Documentation/driver-api/md/raid5-cache.rst2
-rw-r--r--Documentation/driver-api/media/drivers/davinci-vpbe-devel.rst39
-rw-r--r--Documentation/driver-api/media/drivers/index.rst1
-rw-r--r--Documentation/driver-api/media/drivers/vidtv.rst2
-rw-r--r--Documentation/driver-api/media/dtv-demux.rst2
-rw-r--r--Documentation/driver-api/media/v4l2-subdev.rst4
-rw-r--r--Documentation/driver-api/mei/nfc.rst2
-rw-r--r--Documentation/driver-api/nfc/nfc-hci.rst2
-rw-r--r--Documentation/driver-api/nvdimm/nvdimm.rst2
-rw-r--r--Documentation/driver-api/nvdimm/security.rst2
-rw-r--r--Documentation/driver-api/phy/phy.rst24
-rw-r--r--Documentation/driver-api/pin-control.rst500
-rw-r--r--Documentation/driver-api/pldmfw/index.rst2
-rw-r--r--Documentation/driver-api/serial/driver.rst2
-rw-r--r--Documentation/driver-api/surface_aggregator/client.rst12
-rw-r--r--Documentation/driver-api/surface_aggregator/ssh.rst38
-rw-r--r--Documentation/driver-api/thermal/index.rst1
-rw-r--r--Documentation/driver-api/thermal/intel_dptf.rst3
-rw-r--r--Documentation/driver-api/thermal/intel_powerclamp.rst320
-rw-r--r--Documentation/driver-api/tty/n_gsm.rst19
-rw-r--r--Documentation/driver-api/usb/dwc3.rst2
-rw-r--r--Documentation/driver-api/usb/usb3-debug-port.rst2
-rw-r--r--Documentation/driver-api/vfio-mediated-device.rst108
-rw-r--r--Documentation/driver-api/vfio.rst82
-rw-r--r--Documentation/driver-api/virtio/index.rst11
-rw-r--r--Documentation/driver-api/virtio/virtio.rst145
-rw-r--r--Documentation/driver-api/virtio/writing_virtio_drivers.rst197
35 files changed, 749 insertions, 823 deletions
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 622b8156d212..2e8dfd1a66b6 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -1,5 +1,5 @@
-Buffer Sharing and Synchronization
-==================================
+Buffer Sharing and Synchronization (dma-buf)
+============================================
The dma-buf subsystem provides the framework for sharing buffers for
hardware (DMA) access across multiple device drivers and subsystems, and
@@ -264,7 +264,7 @@ through memory management dependencies which userspace is unaware of, which
randomly hangs workloads until the timeout kicks in. Workloads, which from
userspace's perspective, do not contain a deadlock. In such a mixed fencing
architecture there is no single entity with knowledge of all dependencies.
-Thefore preventing such deadlocks from within the kernel is not possible.
+Therefore preventing such deadlocks from within the kernel is not possible.
The only solution to avoid dependencies loops is by not allowing indefinite
fences in the kernel. This means:
diff --git a/Documentation/driver-api/dmaengine/client.rst b/Documentation/driver-api/dmaengine/client.rst
index bfd057b21a00..ecf139f73da4 100644
--- a/Documentation/driver-api/dmaengine/client.rst
+++ b/Documentation/driver-api/dmaengine/client.rst
@@ -175,7 +175,7 @@ The details of these operations are:
driver can ask for the pointer, maximum size and the currently used size of
the metadata and can directly update or read it.
- Becasue the DMA driver manages the memory area containing the metadata,
+ Because the DMA driver manages the memory area containing the metadata,
clients must make sure that they do not try to access or get the pointer
after their transfer completion callback has run for the descriptor.
If no completion callback has been defined for the transfer, then the
diff --git a/Documentation/driver-api/dmaengine/dmatest.rst b/Documentation/driver-api/dmaengine/dmatest.rst
index cf9859cd0b43..e2a63cefd783 100644
--- a/Documentation/driver-api/dmaengine/dmatest.rst
+++ b/Documentation/driver-api/dmaengine/dmatest.rst
@@ -89,7 +89,7 @@ The following command returns the state of the test. ::
% cat /sys/module/dmatest/parameters/run
-To wait for test completion userpace can poll 'run' until it is false, or use
+To wait for test completion userspace can poll 'run' until it is false, or use
the wait parameter. Specifying 'wait=1' when loading the module causes module
initialization to pause until a test run has completed, while reading
/sys/module/dmatest/parameters/wait waits for any running test to complete
diff --git a/Documentation/driver-api/gpio/legacy.rst b/Documentation/driver-api/gpio/legacy.rst
index e17910cc3271..a0559d93efd1 100644
--- a/Documentation/driver-api/gpio/legacy.rst
+++ b/Documentation/driver-api/gpio/legacy.rst
@@ -387,9 +387,6 @@ map between them using calls like::
/* map GPIO numbers to IRQ numbers */
int gpio_to_irq(unsigned gpio);
- /* map IRQ numbers to GPIO numbers (avoid using this) */
- int irq_to_gpio(unsigned irq);
-
Those return either the corresponding number in the other namespace, or
else a negative errno code if the mapping can't be done. (For example,
some GPIOs can't be used as IRQs.) It is an unchecked error to use a GPIO
@@ -405,11 +402,6 @@ devices, by the board-specific initialization code. Note that IRQ trigger
options are part of the IRQ interface, e.g. IRQF_TRIGGER_FALLING, as are
system wakeup capabilities.
-Non-error values returned from irq_to_gpio() would most commonly be used
-with gpio_get_value(), for example to initialize or update driver state
-when the IRQ is edge-triggered. Note that some platforms don't support
-this reverse mapping, so you should avoid using it.
-
Emulating Open Drain Signals
----------------------------
@@ -735,10 +727,6 @@ requested using gpio_request()::
/* reverse gpio_export() */
void gpio_unexport();
- /* create a sysfs link to an exported GPIO node */
- int gpio_export_link(struct device *dev, const char *name,
- unsigned gpio)
-
After a kernel driver requests a GPIO, it may only be made available in
the sysfs interface by gpio_export(). The driver can control whether the
signal direction may change. This helps drivers prevent userspace code
@@ -748,11 +736,6 @@ This explicit exporting can help with debugging (by making some kinds
of experiments easier), or can provide an always-there interface that's
suitable for documenting as part of a board support package.
-After the GPIO has been exported, gpio_export_link() allows creating
-symlinks from elsewhere in sysfs to the GPIO sysfs node. Drivers can
-use this to provide the interface under their own device in sysfs with
-a descriptive name.
-
API Reference
=============
diff --git a/Documentation/driver-api/hsi.rst b/Documentation/driver-api/hsi.rst
index f9cec02b72a1..01b6bebfbd1a 100644
--- a/Documentation/driver-api/hsi.rst
+++ b/Documentation/driver-api/hsi.rst
@@ -4,7 +4,7 @@ High Speed Synchronous Serial Interface (HSI)
Introduction
---------------
-High Speed Syncronous Interface (HSI) is a fullduplex, low latency protocol,
+High Speed Synchronous Interface (HSI) is a full duplex, low latency protocol,
that is optimized for die-level interconnect between an Application Processor
and a Baseband chipset. It has been specified by the MIPI alliance in 2003 and
implemented by multiple vendors since then.
@@ -52,7 +52,7 @@ hsi-char Device
------------------
Each port automatically registers a generic client driver called hsi_char,
-which provides a charecter device for userspace representing the HSI port.
+which provides a character device for userspace representing the HSI port.
It can be used to communicate via HSI from userspace. Userspace may
configure the hsi_char device using the following ioctl commands:
diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst
index d3a58f77328e..ff9aa1afdc62 100644
--- a/Documentation/driver-api/index.rst
+++ b/Documentation/driver-api/index.rst
@@ -1,6 +1,8 @@
-========================================
-The Linux driver implementer's API guide
-========================================
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Driver implementer's API guide
+==============================
The kernel offers a wide variety of interfaces to support the development
of device drivers. This document is an only somewhat organized collection
@@ -106,6 +108,7 @@ available subsections can be seen below.
vfio-mediated-device
vfio
vfio-pci-device-specific-driver-acceptance
+ virtio/index
xilinx/index
xillybus
zorro
diff --git a/Documentation/driver-api/io-mapping.rst b/Documentation/driver-api/io-mapping.rst
index a0cfb15988df..7274204b0435 100644
--- a/Documentation/driver-api/io-mapping.rst
+++ b/Documentation/driver-api/io-mapping.rst
@@ -44,7 +44,7 @@ This _wc variant returns a write-combining map to the page and may only be
used with mappings created by io_mapping_create_wc()
Temporary mappings are only valid in the context of the caller. The mapping
-is not guaranteed to be globaly visible.
+is not guaranteed to be globally visible.
io_mapping_map_local_wc() has a side effect on X86 32bit as it disables
migration to make the mapping code work. No caller can rely on this side
@@ -78,7 +78,7 @@ variant, although this may be significantly slower::
unsigned long offset)
This works like io_mapping_map_atomic/local_wc() except it has no side
-effects and the pointer is globaly visible.
+effects and the pointer is globally visible.
The mappings are released with::
diff --git a/Documentation/driver-api/md/md-cluster.rst b/Documentation/driver-api/md/md-cluster.rst
index 96eb52cec7eb..e93f35e0e157 100644
--- a/Documentation/driver-api/md/md-cluster.rst
+++ b/Documentation/driver-api/md/md-cluster.rst
@@ -65,7 +65,7 @@ There are three groups of locks for managing the device:
2.3 new-device management
-------------------------
- A single lock: "no-new-dev" is used to co-ordinate the addition of
+ A single lock: "no-new-dev" is used to coordinate the addition of
new devices - this must be synchronized across the array.
Normally all nodes hold a concurrent-read lock on this device.
diff --git a/Documentation/driver-api/md/raid5-cache.rst b/Documentation/driver-api/md/raid5-cache.rst
index d7a15f44a7c3..5f947cbc2e78 100644
--- a/Documentation/driver-api/md/raid5-cache.rst
+++ b/Documentation/driver-api/md/raid5-cache.rst
@@ -81,7 +81,7 @@ The write-through and write-back cache use the same disk format. The cache disk
is organized as a simple write log. The log consists of 'meta data' and 'data'
pairs. The meta data describes the data. It also includes checksum and sequence
ID for recovery identification. Data can be IO data and parity data. Data is
-checksumed too. The checksum is stored in the meta data ahead of the data. The
+checksummed too. The checksum is stored in the meta data ahead of the data. The
checksum is an optimization because MD can write meta and data freely without
worry about the order. MD superblock has a field pointed to the valid meta data
of log head.
diff --git a/Documentation/driver-api/media/drivers/davinci-vpbe-devel.rst b/Documentation/driver-api/media/drivers/davinci-vpbe-devel.rst
deleted file mode 100644
index 4e87bdbc7ae4..000000000000
--- a/Documentation/driver-api/media/drivers/davinci-vpbe-devel.rst
+++ /dev/null
@@ -1,39 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-The VPBE V4L2 driver design
-===========================
-
-File partitioning
------------------
-
- V4L2 display device driver
- drivers/media/platform/ti/davinci/vpbe_display.c
- drivers/media/platform/ti/davinci/vpbe_display.h
-
- VPBE display controller
- drivers/media/platform/ti/davinci/vpbe.c
- drivers/media/platform/ti/davinci/vpbe.h
-
- VPBE venc sub device driver
- drivers/media/platform/ti/davinci/vpbe_venc.c
- drivers/media/platform/ti/davinci/vpbe_venc.h
- drivers/media/platform/ti/davinci/vpbe_venc_regs.h
-
- VPBE osd driver
- drivers/media/platform/ti/davinci/vpbe_osd.c
- drivers/media/platform/ti/davinci/vpbe_osd.h
- drivers/media/platform/ti/davinci/vpbe_osd_regs.h
-
-To be done
-----------
-
-vpbe display controller
- - Add support for external encoders.
- - add support for selecting external encoder as default at probe time.
-
-vpbe venc sub device
- - add timings for supporting ths8200
- - add support for LogicPD LCD.
-
-FB drivers
- - Add support for fbdev drivers.- Ready and part of subsequent patches.
diff --git a/Documentation/driver-api/media/drivers/index.rst b/Documentation/driver-api/media/drivers/index.rst
index 61d64366ab51..c4123a16b5f9 100644
--- a/Documentation/driver-api/media/drivers/index.rst
+++ b/Documentation/driver-api/media/drivers/index.rst
@@ -15,7 +15,6 @@ Video4Linux (V4L) drivers
bttv-devel
cx2341x-devel
cx88-devel
- davinci-vpbe-devel
fimc-devel
pvrusb2
pxa_camera
diff --git a/Documentation/driver-api/media/drivers/vidtv.rst b/Documentation/driver-api/media/drivers/vidtv.rst
index 673bdff919ea..54f269f478d3 100644
--- a/Documentation/driver-api/media/drivers/vidtv.rst
+++ b/Documentation/driver-api/media/drivers/vidtv.rst
@@ -28,7 +28,7 @@ Currently, it consists of:
takes parameters at initialization that will dictate how the simulation
behaves.
-- Code reponsible for encoding a valid MPEG Transport Stream, which is then
+- Code responsible for encoding a valid MPEG Transport Stream, which is then
passed to the bridge driver. This fake stream contains some hardcoded content.
For now, we have a single, audio-only channel containing a single MPEG
Elementary Stream, which in turn contains a SMPTE 302m encoded sine-wave.
diff --git a/Documentation/driver-api/media/dtv-demux.rst b/Documentation/driver-api/media/dtv-demux.rst
index c0ae5dec5328..144124142622 100644
--- a/Documentation/driver-api/media/dtv-demux.rst
+++ b/Documentation/driver-api/media/dtv-demux.rst
@@ -24,7 +24,7 @@ unless this is fixed in the HW platform.
The demux kABI only controls front-ends regarding to their connections with
demuxes; the kABI used to set the other front-end parameters, such as
-tuning, are devined via the Digital TV Frontend kABI.
+tuning, are defined via the Digital TV Frontend kABI.
The functions that implement the abstract interface demux should be defined
static or module private and registered to the Demux core for external
diff --git a/Documentation/driver-api/media/v4l2-subdev.rst b/Documentation/driver-api/media/v4l2-subdev.rst
index 260cfa8c3f3d..602dadaa81d8 100644
--- a/Documentation/driver-api/media/v4l2-subdev.rst
+++ b/Documentation/driver-api/media/v4l2-subdev.rst
@@ -321,7 +321,7 @@ response to video node operations. This hides the complexity of the underlying
hardware from applications. For complex devices, finer-grained control of the
device than what the video nodes offer may be required. In those cases, bridge
drivers that implement :ref:`the media controller API <media_controller>` may
-opt for making the subdevice operations directly accessible from userpace.
+opt for making the subdevice operations directly accessible from userspace.
Device nodes named ``v4l-subdev``\ *X* can be created in ``/dev`` to access
sub-devices directly. If a sub-device supports direct userspace configuration
@@ -574,7 +574,7 @@ issues with subdevice drivers that let the V4L2 core manage the active state,
as they expect to receive the appropriate state as a parameter. To help the
conversion of subdevice drivers to a managed active state without having to
convert all callers at the same time, an additional wrapper layer has been
-added to v4l2_subdev_call(), which handles the NULL case by geting and locking
+added to v4l2_subdev_call(), which handles the NULL case by getting and locking
the callee's active state with :c:func:`v4l2_subdev_lock_and_get_active_state()`,
and unlocking the state after the call.
diff --git a/Documentation/driver-api/mei/nfc.rst b/Documentation/driver-api/mei/nfc.rst
index b5b6fc96f85e..8fe8664c28cc 100644
--- a/Documentation/driver-api/mei/nfc.rst
+++ b/Documentation/driver-api/mei/nfc.rst
@@ -3,7 +3,7 @@
MEI NFC
-------
-Some Intel 8 and 9 Serieses chipsets supports NFC devices connected behind
+Some Intel 8 and 9 Series chipsets support NFC devices connected behind
the Intel Management Engine controller.
MEI client bus exposes the NFC chips as NFC phy devices and enables
binding with Microread and NXP PN544 NFC device driver from the Linux NFC
diff --git a/Documentation/driver-api/nfc/nfc-hci.rst b/Documentation/driver-api/nfc/nfc-hci.rst
index f10fe53aa9fe..486aa647c456 100644
--- a/Documentation/driver-api/nfc/nfc-hci.rst
+++ b/Documentation/driver-api/nfc/nfc-hci.rst
@@ -150,7 +150,7 @@ LLC
Communication between the CPU and the chip often requires some link layer
protocol. Those are isolated as modules managed by the HCI layer. There are
-currently two modules : nop (raw transfert) and shdlc.
+currently two modules : nop (raw transfer) and shdlc.
A new llc must implement the following functions::
struct nfc_llc_ops {
diff --git a/Documentation/driver-api/nvdimm/nvdimm.rst b/Documentation/driver-api/nvdimm/nvdimm.rst
index be8587a558e1..ca16b5acbf30 100644
--- a/Documentation/driver-api/nvdimm/nvdimm.rst
+++ b/Documentation/driver-api/nvdimm/nvdimm.rst
@@ -82,7 +82,7 @@ LABEL:
Metadata stored on a DIMM device that partitions and identifies
(persistently names) capacity allocated to different PMEM namespaces. It
also indicates whether an address abstraction like a BTT is applied to
- the namepsace. Note that traditional partition tables, GPT/MBR, are
+ the namespace. Note that traditional partition tables, GPT/MBR, are
layered on top of a PMEM namespace, or an address abstraction like BTT
if present, but partition support is deprecated going forward.
diff --git a/Documentation/driver-api/nvdimm/security.rst b/Documentation/driver-api/nvdimm/security.rst
index 7aab71524116..eb3d35e6a95c 100644
--- a/Documentation/driver-api/nvdimm/security.rst
+++ b/Documentation/driver-api/nvdimm/security.rst
@@ -83,7 +83,7 @@ passed in.
6. Freeze
---------
The freeze operation does not require any keys. The security config can be
-frozen by a user with root privelege.
+frozen by a user with root privilege.
7. Disable
----------
diff --git a/Documentation/driver-api/phy/phy.rst b/Documentation/driver-api/phy/phy.rst
index 8e8b3e8f9523..81785c084f3e 100644
--- a/Documentation/driver-api/phy/phy.rst
+++ b/Documentation/driver-api/phy/phy.rst
@@ -103,27 +103,31 @@ it. This framework provides the following APIs to get a reference to the PHY.
::
struct phy *phy_get(struct device *dev, const char *string);
- struct phy *phy_optional_get(struct device *dev, const char *string);
struct phy *devm_phy_get(struct device *dev, const char *string);
struct phy *devm_phy_optional_get(struct device *dev,
const char *string);
+ struct phy *devm_of_phy_get(struct device *dev, struct device_node *np,
+ const char *con_id);
+ struct phy *devm_of_phy_optional_get(struct device *dev,
+ struct device_node *np,
+ const char *con_id);
struct phy *devm_of_phy_get_by_index(struct device *dev,
struct device_node *np,
int index);
-phy_get, phy_optional_get, devm_phy_get and devm_phy_optional_get can
-be used to get the PHY. In the case of dt boot, the string arguments
+phy_get, devm_phy_get and devm_phy_optional_get can be used to get the PHY.
+In the case of dt boot, the string arguments
should contain the phy name as given in the dt data and in the case of
non-dt boot, it should contain the label of the PHY. The two
devm_phy_get associates the device with the PHY using devres on
successful PHY get. On driver detach, release function is invoked on
-the devres data and devres data is freed. phy_optional_get and
-devm_phy_optional_get should be used when the phy is optional. These
-two functions will never return -ENODEV, but instead returns NULL when
-the phy cannot be found.Some generic drivers, such as ehci, may use multiple
-phys and for such drivers referencing phy(s) by name(s) does not make sense. In
-this case, devm_of_phy_get_by_index can be used to get a phy reference based on
-the index.
+the devres data and devres data is freed.
+The _optional_get variants should be used when the phy is optional. These
+functions will never return -ENODEV, but instead return NULL when
+the phy cannot be found.
+Some generic drivers, such as ehci, may use multiple phys. In this case,
+devm_of_phy_get or devm_of_phy_get_by_index can be used to get a phy
+reference based on name or index.
It should be noted that NULL is a valid phy reference. All phy
consumer calls on the NULL phy become NOPs. That is the release calls,
diff --git a/Documentation/driver-api/pin-control.rst b/Documentation/driver-api/pin-control.rst
index 0022e930e93e..4639912dc9cc 100644
--- a/Documentation/driver-api/pin-control.rst
+++ b/Documentation/driver-api/pin-control.rst
@@ -11,20 +11,18 @@ This subsystem deals with:
- Multiplexing of pins, pads, fingers (etc) see below for details
- Configuration of pins, pads, fingers (etc), such as software-controlled
- biasing and driving mode specific pins, such as pull-up/down, open drain,
+ biasing and driving mode specific pins, such as pull-up, pull-down, open drain,
load capacitance etc.
Top-level interface
===================
-Definition of PIN CONTROLLER:
+Definitions:
-- A pin controller is a piece of hardware, usually a set of registers, that
+- A PIN CONTROLLER is a piece of hardware, usually a set of registers, that
can control PINs. It may be able to multiplex, bias, set load capacitance,
set drive strength, etc. for individual pins or groups of pins.
-Definition of PIN:
-
- PINS are equal to pads, fingers, balls or whatever packaging input or
output line you want to control and these are denoted by unsigned integers
in the range 0..maxpin. This numberspace is local to each PIN CONTROLLER, so
@@ -57,7 +55,9 @@ Here is an example of a PGA (Pin Grid Array) chip seen from underneath::
1 o o o o o o o o
To register a pin controller and name all the pins on this package we can do
-this in our driver::
+this in our driver:
+
+.. code-block:: c
#include <linux/pinctrl/pinctrl.h>
@@ -78,14 +78,13 @@ this in our driver::
.owner = THIS_MODULE,
};
- int __init foo_probe(void)
+ int __init foo_init(void)
{
int error;
struct pinctrl_dev *pctl;
- error = pinctrl_register_and_init(&foo_desc, <PARENT>,
- NULL, &pctl);
+ error = pinctrl_register_and_init(&foo_desc, <PARENT>, NULL, &pctl);
if (error)
return error;
@@ -95,20 +94,20 @@ this in our driver::
To enable the pinctrl subsystem and the subgroups for PINMUX and PINCONF and
selected drivers, you need to select them from your machine's Kconfig entry,
since these are so tightly integrated with the machines they are used on.
-See for example arch/arm/mach-ux500/Kconfig for an example.
+See ``arch/arm/mach-ux500/Kconfig`` for an example.
Pins usually have fancier names than this. You can find these in the datasheet
for your chip. Notice that the core pinctrl.h file provides a fancy macro
-called PINCTRL_PIN() to create the struct entries. As you can see I enumerated
-the pins from 0 in the upper left corner to 63 in the lower right corner.
+called ``PINCTRL_PIN()`` to create the struct entries. As you can see the pins are
+enumerated from 0 in the upper left corner to 63 in the lower right corner.
This enumeration was arbitrarily chosen, in practice you need to think
through your numbering system so that it matches the layout of registers
and such things in your driver, or the code may become complicated. You must
also consider matching of offsets to the GPIO ranges that may be handled by
the pin controller.
-For a padring with 467 pads, as opposed to actual pins, I used an enumeration
-like this, walking around the edge of the chip, which seems to be industry
+For a padding with 467 pads, as opposed to actual pins, the enumeration will
+be like this, walking around the edge of the chip, which seems to be industry
standard too (all these pads had names, too)::
@@ -132,50 +131,38 @@ on { 0, 8, 16, 24 }, and a group of pins dealing with an I2C interface on pins
on { 24, 25 }.
These two groups are presented to the pin control subsystem by implementing
-some generic pinctrl_ops like this::
+some generic ``pinctrl_ops`` like this:
- #include <linux/pinctrl/pinctrl.h>
+.. code-block:: c
- struct foo_group {
- const char *name;
- const unsigned int *pins;
- const unsigned num_pins;
- };
+ #include <linux/pinctrl/pinctrl.h>
static const unsigned int spi0_pins[] = { 0, 8, 16, 24 };
static const unsigned int i2c0_pins[] = { 24, 25 };
- static const struct foo_group foo_groups[] = {
- {
- .name = "spi0_grp",
- .pins = spi0_pins,
- .num_pins = ARRAY_SIZE(spi0_pins),
- },
- {
- .name = "i2c0_grp",
- .pins = i2c0_pins,
- .num_pins = ARRAY_SIZE(i2c0_pins),
- },
+ static const struct pingroup foo_groups[] = {
+ PINCTRL_PINGROUP("spi0_grp", spi0_pins, ARRAY_SIZE(spi0_pins)),
+ PINCTRL_PINGROUP("i2c0_grp", i2c0_pins, ARRAY_SIZE(i2c0_pins)),
};
-
static int foo_get_groups_count(struct pinctrl_dev *pctldev)
{
return ARRAY_SIZE(foo_groups);
}
static const char *foo_get_group_name(struct pinctrl_dev *pctldev,
- unsigned selector)
+ unsigned int selector)
{
return foo_groups[selector].name;
}
- static int foo_get_group_pins(struct pinctrl_dev *pctldev, unsigned selector,
- const unsigned **pins,
- unsigned *num_pins)
+ static int foo_get_group_pins(struct pinctrl_dev *pctldev,
+ unsigned int selector,
+ const unsigned int **pins,
+ unsigned int *npins)
{
- *pins = (unsigned *) foo_groups[selector].pins;
- *num_pins = foo_groups[selector].num_pins;
+ *pins = foo_groups[selector].pins;
+ *npins = foo_groups[selector].npins;
return 0;
}
@@ -185,13 +172,12 @@ some generic pinctrl_ops like this::
.get_group_pins = foo_get_group_pins,
};
-
static struct pinctrl_desc foo_desc = {
- ...
- .pctlops = &foo_pctrl_ops,
+ ...
+ .pctlops = &foo_pctrl_ops,
};
-The pin control subsystem will call the .get_groups_count() function to
+The pin control subsystem will call the ``.get_groups_count()`` function to
determine the total number of legal selectors, then it will call the other functions
to retrieve the name and pins of the group. Maintaining the data structure of
the groups is up to the driver, this is just a simple example - in practice you
@@ -204,59 +190,62 @@ Pin configuration
Pins can sometimes be software-configured in various ways, mostly related
to their electronic properties when used as inputs or outputs. For example you
-may be able to make an output pin high impedance, or "tristate" meaning it is
+may be able to make an output pin high impedance (Hi-Z), or "tristate" meaning it is
effectively disconnected. You may be able to connect an input pin to VDD or GND
using a certain resistor value - pull up and pull down - so that the pin has a
stable value when nothing is driving the rail it is connected to, or when it's
unconnected.
Pin configuration can be programmed by adding configuration entries into the
-mapping table; see section "Board/machine configuration" below.
+mapping table; see section `Board/machine configuration`_ below.
The format and meaning of the configuration parameter, PLATFORM_X_PULL_UP
above, is entirely defined by the pin controller driver.
The pin configuration driver implements callbacks for changing pin
-configuration in the pin controller ops like this::
+configuration in the pin controller ops like this:
+
+.. code-block:: c
- #include <linux/pinctrl/pinctrl.h>
#include <linux/pinctrl/pinconf.h>
+ #include <linux/pinctrl/pinctrl.h>
+
#include "platform_x_pindefs.h"
static int foo_pin_config_get(struct pinctrl_dev *pctldev,
- unsigned offset,
- unsigned long *config)
+ unsigned int offset,
+ unsigned long *config)
{
struct my_conftype conf;
- ... Find setting for pin @ offset ...
+ /* ... Find setting for pin @ offset ... */
*config = (unsigned long) conf;
}
static int foo_pin_config_set(struct pinctrl_dev *pctldev,
- unsigned offset,
- unsigned long config)
+ unsigned int offset,
+ unsigned long config)
{
struct my_conftype *conf = (struct my_conftype *) config;
switch (conf) {
case PLATFORM_X_PULL_UP:
...
- }
+ break;
}
}
- static int foo_pin_config_group_get (struct pinctrl_dev *pctldev,
- unsigned selector,
- unsigned long *config)
+ static int foo_pin_config_group_get(struct pinctrl_dev *pctldev,
+ unsigned selector,
+ unsigned long *config)
{
...
}
- static int foo_pin_config_group_set (struct pinctrl_dev *pctldev,
- unsigned selector,
- unsigned long config)
+ static int foo_pin_config_group_set(struct pinctrl_dev *pctldev,
+ unsigned selector,
+ unsigned long config)
{
...
}
@@ -281,8 +270,8 @@ The GPIO drivers may want to perform operations of various types on the same
physical pins that are also registered as pin controller pins.
First and foremost, the two subsystems can be used as completely orthogonal,
-see the section named "pin control requests from drivers" and
-"drivers needing both pin control and GPIOs" below for details. But in some
+see the section named `Pin control requests from drivers`_ and
+`Drivers needing both pin control and GPIOs`_ below for details. But in some
situations a cross-subsystem mapping between pins and GPIOs is needed.
Since the pin controller subsystem has its pinspace local to the pin controller
@@ -291,7 +280,13 @@ controller handles control of a certain GPIO pin. Since a single pin controller
may be muxing several GPIO ranges (typically SoCs that have one set of pins,
but internally several GPIO silicon blocks, each modelled as a struct
gpio_chip) any number of GPIO ranges can be added to a pin controller instance
-like this::
+like this:
+
+.. code-block:: c
+
+ #include <linux/gpio/driver.h>
+
+ #include <linux/pinctrl/pinctrl.h>
struct gpio_chip chip_a;
struct gpio_chip chip_b;
@@ -302,7 +297,7 @@ like this::
.base = 32,
.pin_base = 32,
.npins = 16,
- .gc = &chip_a;
+ .gc = &chip_a,
};
static struct pinctrl_gpio_range gpio_range_b = {
@@ -314,16 +309,18 @@ like this::
.gc = &chip_b;
};
+ int __init foo_init(void)
{
struct pinctrl_dev *pctl;
...
pinctrl_add_gpio_range(pctl, &gpio_range_a);
pinctrl_add_gpio_range(pctl, &gpio_range_b);
+ ...
}
So this complex system has one pin controller handling two different
GPIO chips. "chip a" has 16 pins and "chip b" has 8 pins. The "chip a" and
-"chip b" have different .pin_base, which means a start pin number of the
+"chip b" have different ``pin_base``, which means a start pin number of the
GPIO range.
The GPIO range of "chip a" starts from the GPIO base of 32 and actual
@@ -331,7 +328,7 @@ pin range also starts from 32. However "chip b" has different starting
offset for the GPIO range and pin range. The GPIO range of "chip b" starts
from GPIO number 48, while the pin range of "chip b" starts from 64.
-We can convert a gpio number to actual pin number using this "pin_base".
+We can convert a gpio number to actual pin number using this ``pin_base``.
They are mapped in the global GPIO pin space at:
chip a:
@@ -343,9 +340,11 @@ chip b:
The above examples assume the mapping between the GPIOs and pins is
linear. If the mapping is sparse or haphazard, an array of arbitrary pin
-numbers can be encoded in the range like this::
+numbers can be encoded in the range like this:
+
+.. code-block:: c
- static const unsigned range_pins[] = { 14, 1, 22, 17, 10, 8, 6, 2 };
+ static const unsigned int range_pins[] = { 14, 1, 22, 17, 10, 8, 6, 2 };
static struct pinctrl_gpio_range gpio_range = {
.name = "chip",
@@ -353,16 +352,17 @@ numbers can be encoded in the range like this::
.base = 32,
.pins = &range_pins,
.npins = ARRAY_SIZE(range_pins),
- .gc = &chip;
+ .gc = &chip,
};
-In this case the pin_base property will be ignored. If the name of a pin
+In this case the ``pin_base`` property will be ignored. If the name of a pin
group is known, the pins and npins elements of the above structure can be
-initialised using the function pinctrl_get_group_pins(), e.g. for pin
-group "foo"::
+initialised using the function ``pinctrl_get_group_pins()``, e.g. for pin
+group "foo":
- pinctrl_get_group_pins(pctl, "foo", &gpio_range.pins,
- &gpio_range.npins);
+.. code-block:: c
+
+ pinctrl_get_group_pins(pctl, "foo", &gpio_range.pins, &gpio_range.npins);
When GPIO-specific functions in the pin control subsystem are called, these
ranges will be used to look up the appropriate pin controller by inspecting
@@ -378,8 +378,8 @@ will get a pin number into its handled number range. Further it is also passed
the range ID value, so that the pin controller knows which range it should
deal with.
-Calling pinctrl_add_gpio_range from pinctrl driver is DEPRECATED. Please see
-section 2.1 of Documentation/devicetree/bindings/gpio/gpio.txt on how to bind
+Calling ``pinctrl_add_gpio_range()`` from pinctrl driver is DEPRECATED. Please see
+section 2.1 of ``Documentation/devicetree/bindings/gpio/gpio.txt`` on how to bind
pinctrl and gpio drivers.
@@ -466,10 +466,10 @@ in your machine configuration. It is inspired by the clk, GPIO and regulator
subsystems, so devices will request their mux setting, but it's also possible
to request a single pin for e.g. GPIO.
-Definitions:
+The conventions are:
- FUNCTIONS can be switched in and out by a driver residing with the pin
- control subsystem in the drivers/pinctrl/* directory of the kernel. The
+ control subsystem in the ``drivers/pinctrl`` directory of the kernel. The
pin control driver knows the possible functions. In the example above you can
identify three pinmux functions, one for spi, one for i2c and one for mmc.
@@ -515,11 +515,13 @@ Definitions:
In the example case we can define that this particular machine shall
use device spi0 with pinmux function fspi0 group gspi0 and i2c0 on function
fi2c0 group gi2c0, on the primary pin controller, we get mappings
- like these::
+ like these:
+
+ .. code-block:: c
{
{"map-spi0", spi0, pinctrl0, fspi0, gspi0},
- {"map-i2c0", i2c0, pinctrl0, fi2c0, gi2c0}
+ {"map-i2c0", i2c0, pinctrl0, fi2c0, gi2c0},
}
Every map must be assigned a state name, pin controller, device and
@@ -569,80 +571,51 @@ is possible to perform the requested mux setting, poke the hardware so that
this happens.
Pinmux drivers are required to supply a few callback functions, some are
-optional. Usually the set_mux() function is implemented, writing values into
+optional. Usually the ``.set_mux()`` function is implemented, writing values into
some certain registers to activate a certain mux setting for a certain pin.
-A simple driver for the above example will work by setting bits 0, 1, 2, 3 or 4
+A simple driver for the above example will work by setting bits 0, 1, 2, 3, 4, or 5
into some register named MUX to select a certain function with a certain
-group of pins would work something like this::
+group of pins would work something like this:
+
+.. code-block:: c
#include <linux/pinctrl/pinctrl.h>
#include <linux/pinctrl/pinmux.h>
- struct foo_group {
- const char *name;
- const unsigned int *pins;
- const unsigned num_pins;
- };
-
- static const unsigned spi0_0_pins[] = { 0, 8, 16, 24 };
- static const unsigned spi0_1_pins[] = { 38, 46, 54, 62 };
- static const unsigned i2c0_pins[] = { 24, 25 };
- static const unsigned mmc0_1_pins[] = { 56, 57 };
- static const unsigned mmc0_2_pins[] = { 58, 59 };
- static const unsigned mmc0_3_pins[] = { 60, 61, 62, 63 };
-
- static const struct foo_group foo_groups[] = {
- {
- .name = "spi0_0_grp",
- .pins = spi0_0_pins,
- .num_pins = ARRAY_SIZE(spi0_0_pins),
- },
- {
- .name = "spi0_1_grp",
- .pins = spi0_1_pins,
- .num_pins = ARRAY_SIZE(spi0_1_pins),
- },
- {
- .name = "i2c0_grp",
- .pins = i2c0_pins,
- .num_pins = ARRAY_SIZE(i2c0_pins),
- },
- {
- .name = "mmc0_1_grp",
- .pins = mmc0_1_pins,
- .num_pins = ARRAY_SIZE(mmc0_1_pins),
- },
- {
- .name = "mmc0_2_grp",
- .pins = mmc0_2_pins,
- .num_pins = ARRAY_SIZE(mmc0_2_pins),
- },
- {
- .name = "mmc0_3_grp",
- .pins = mmc0_3_pins,
- .num_pins = ARRAY_SIZE(mmc0_3_pins),
- },
+ static const unsigned int spi0_0_pins[] = { 0, 8, 16, 24 };
+ static const unsigned int spi0_1_pins[] = { 38, 46, 54, 62 };
+ static const unsigned int i2c0_pins[] = { 24, 25 };
+ static const unsigned int mmc0_1_pins[] = { 56, 57 };
+ static const unsigned int mmc0_2_pins[] = { 58, 59 };
+ static const unsigned int mmc0_3_pins[] = { 60, 61, 62, 63 };
+
+ static const struct pingroup foo_groups[] = {
+ PINCTRL_PINGROUP("spi0_0_grp", spi0_0_pins, ARRAY_SIZE(spi0_0_pins)),
+ PINCTRL_PINGROUP("spi0_1_grp", spi0_1_pins, ARRAY_SIZE(spi0_1_pins)),
+ PINCTRL_PINGROUP("i2c0_grp", i2c0_pins, ARRAY_SIZE(i2c0_pins)),
+ PINCTRL_PINGROUP("mmc0_1_grp", mmc0_1_pins, ARRAY_SIZE(mmc0_1_pins)),
+ PINCTRL_PINGROUP("mmc0_2_grp", mmc0_2_pins, ARRAY_SIZE(mmc0_2_pins)),
+ PINCTRL_PINGROUP("mmc0_3_grp", mmc0_3_pins, ARRAY_SIZE(mmc0_3_pins)),
};
-
static int foo_get_groups_count(struct pinctrl_dev *pctldev)
{
return ARRAY_SIZE(foo_groups);
}
static const char *foo_get_group_name(struct pinctrl_dev *pctldev,
- unsigned selector)
+ unsigned int selector)
{
return foo_groups[selector].name;
}
- static int foo_get_group_pins(struct pinctrl_dev *pctldev, unsigned selector,
- const unsigned ** pins,
- unsigned * num_pins)
+ static int foo_get_group_pins(struct pinctrl_dev *pctldev, unsigned int selector,
+ const unsigned int **pins,
+ unsigned int *npins)
{
- *pins = (unsigned *) foo_groups[selector].pins;
- *num_pins = foo_groups[selector].num_pins;
+ *pins = foo_groups[selector].pins;
+ *npins = foo_groups[selector].npins;
return 0;
}
@@ -652,33 +625,14 @@ group of pins would work something like this::
.get_group_pins = foo_get_group_pins,
};
- struct foo_pmx_func {
- const char *name;
- const char * const *groups;
- const unsigned num_groups;
- };
-
static const char * const spi0_groups[] = { "spi0_0_grp", "spi0_1_grp" };
static const char * const i2c0_groups[] = { "i2c0_grp" };
- static const char * const mmc0_groups[] = { "mmc0_1_grp", "mmc0_2_grp",
- "mmc0_3_grp" };
+ static const char * const mmc0_groups[] = { "mmc0_1_grp", "mmc0_2_grp", "mmc0_3_grp" };
- static const struct foo_pmx_func foo_functions[] = {
- {
- .name = "spi0",
- .groups = spi0_groups,
- .num_groups = ARRAY_SIZE(spi0_groups),
- },
- {
- .name = "i2c0",
- .groups = i2c0_groups,
- .num_groups = ARRAY_SIZE(i2c0_groups),
- },
- {
- .name = "mmc0",
- .groups = mmc0_groups,
- .num_groups = ARRAY_SIZE(mmc0_groups),
- },
+ static const struct pinfunction foo_functions[] = {
+ PINCTRL_PINFUNCTION("spi0", spi0_groups, ARRAY_SIZE(spi0_groups)),
+ PINCTRL_PINFUNCTION("i2c0", i2c0_groups, ARRAY_SIZE(i2c0_groups)),
+ PINCTRL_PINFUNCTION("mmc0", mmc0_groups, ARRAY_SIZE(mmc0_groups)),
};
static int foo_get_functions_count(struct pinctrl_dev *pctldev)
@@ -686,26 +640,26 @@ group of pins would work something like this::
return ARRAY_SIZE(foo_functions);
}
- static const char *foo_get_fname(struct pinctrl_dev *pctldev, unsigned selector)
+ static const char *foo_get_fname(struct pinctrl_dev *pctldev, unsigned int selector)
{
return foo_functions[selector].name;
}
- static int foo_get_groups(struct pinctrl_dev *pctldev, unsigned selector,
- const char * const **groups,
- unsigned * const num_groups)
+ static int foo_get_groups(struct pinctrl_dev *pctldev, unsigned int selector,
+ const char * const **groups,
+ unsigned int * const ngroups)
{
*groups = foo_functions[selector].groups;
- *num_groups = foo_functions[selector].num_groups;
+ *ngroups = foo_functions[selector].ngroups;
return 0;
}
- static int foo_set_mux(struct pinctrl_dev *pctldev, unsigned selector,
- unsigned group)
+ static int foo_set_mux(struct pinctrl_dev *pctldev, unsigned int selector,
+ unsigned int group)
{
- u8 regbit = (1 << selector + group);
+ u8 regbit = BIT(group);
- writeb((readb(MUX)|regbit), MUX);
+ writeb((readb(MUX) | regbit), MUX);
return 0;
}
@@ -724,16 +678,17 @@ group of pins would work something like this::
.pmxops = &foo_pmxops,
};
-In the example activating muxing 0 and 1 at the same time setting bits
-0 and 1, uses one pin in common so they would collide.
+In the example activating muxing 0 and 2 at the same time setting bits
+0 and 2, uses pin 24 in common so they would collide. All the same for
+the muxes 1 and 5, which have pin 62 in common.
The beauty of the pinmux subsystem is that since it keeps track of all
pins and who is using them, it will already have denied an impossible
request like that, so the driver does not need to worry about such
things - when it gets a selector passed in, the pinmux subsystem makes
sure no other device or GPIO assignment is already using the selected
-pins. Thus bits 0 and 1 in the control register will never be set at the
-same time.
+pins. Thus bits 0 and 2, or 1 and 5 in the control register will never
+be set at the same time.
All the above functions are mandatory to implement for a pinmux driver.
@@ -742,18 +697,18 @@ Pin control interaction with the GPIO subsystem
===============================================
Note that the following implies that the use case is to use a certain pin
-from the Linux kernel using the API in <linux/gpio.h> with gpio_request()
+from the Linux kernel using the API in ``<linux/gpio/consumer.h>`` with gpiod_get()
and similar functions. There are cases where you may be using something
that your datasheet calls "GPIO mode", but actually is just an electrical
configuration for a certain device. See the section below named
-"GPIO mode pitfalls" for more details on this scenario.
+`GPIO mode pitfalls`_ for more details on this scenario.
-The public pinmux API contains two functions named pinctrl_gpio_request()
-and pinctrl_gpio_free(). These two functions shall *ONLY* be called from
-gpiolib-based drivers as part of their gpio_request() and
-gpio_free() semantics. Likewise the pinctrl_gpio_direction_[input|output]
-shall only be called from within respective gpio_direction_[input|output]
-gpiolib implementation.
+The public pinmux API contains two functions named ``pinctrl_gpio_request()``
+and ``pinctrl_gpio_free()``. These two functions shall *ONLY* be called from
+gpiolib-based drivers as part of their ``.request()`` and ``.free()`` semantics.
+Likewise the ``pinctrl_gpio_direction_input()`` / ``pinctrl_gpio_direction_output()``
+shall only be called from within respective ``.direction_input()`` /
+``.direction_output()`` gpiolib implementation.
NOTE that platforms and individual drivers shall *NOT* request GPIO pins to be
controlled e.g. muxed in. Instead, implement a proper gpiolib driver and have
@@ -767,8 +722,8 @@ In this case, the function array would become 64 entries for each GPIO
setting and then the device functions.
For this reason there are two functions a pin control driver can implement
-to enable only GPIO on an individual pin: .gpio_request_enable() and
-.gpio_disable_free().
+to enable only GPIO on an individual pin: ``.gpio_request_enable()`` and
+``.gpio_disable_free()``.
This function will pass in the affected GPIO range identified by the pin
controller core, so you know which GPIO pins are being affected by the request
@@ -776,12 +731,12 @@ operation.
If your driver needs to have an indication from the framework of whether the
GPIO pin shall be used for input or output you can implement the
-.gpio_set_direction() function. As described this shall be called from the
+``.gpio_set_direction()`` function. As described this shall be called from the
gpiolib driver and the affected GPIO range, pin offset and desired direction
will be passed along to this function.
Alternatively to using these special functions, it is fully allowed to use
-named functions for each GPIO pin, the pinctrl_gpio_request() will attempt to
+named functions for each GPIO pin, the ``pinctrl_gpio_request()`` will attempt to
obtain the function "gpioN" where "N" is the global GPIO pin number if no
special GPIO-handler is registered.
@@ -794,7 +749,7 @@ is taken to mean different things than what the kernel does, the developer
may be confused by a datasheet talking about a pin being possible to set
into "GPIO mode". It appears that what hardware engineers mean with
"GPIO mode" is not necessarily the use case that is implied in the kernel
-interface <linux/gpio.h>: a pin that you grab from kernel code and then
+interface ``<linux/gpio/consumer.h>``: a pin that you grab from kernel code and then
either listen for input or drive high/low to assert/deassert some
external line.
@@ -805,9 +760,10 @@ for a device.
The GPIO portions of a pin and its relation to a certain pin controller
configuration and muxing logic can be constructed in several ways. Here
-are two examples::
+are two examples.
+
+Example **(A)**::
- (A)
pin config
logic regs
| +- SPI
@@ -836,9 +792,7 @@ simultaneous access to the same pin from GPIO and pin multiplexing
consumers on hardware of this type. The pinctrl driver should set this flag
accordingly.
-::
-
- (B)
+Example **(B)**::
pin config
logic regs
@@ -882,7 +836,7 @@ hardware and shall be put into different subsystems:
Depending on the exact HW register design, some functions exposed by the
GPIO subsystem may call into the pinctrl subsystem in order to
-co-ordinate register settings across HW modules. In particular, this may
+coordinate register settings across HW modules. In particular, this may
be needed for HW with separate GPIO and pin controller HW modules, where
e.g. GPIO direction is determined by a register in the pin controller HW
module rather than the GPIO HW module.
@@ -899,14 +853,14 @@ If you make a 1-to-1 map to the GPIO subsystem for this pin, you may start
to think that you need to come up with something really complex, that the
pin shall be used for UART TX and GPIO at the same time, that you will grab
a pin control handle and set it to a certain state to enable UART TX to be
-muxed in, then twist it over to GPIO mode and use gpio_direction_output()
+muxed in, then twist it over to GPIO mode and use gpiod_direction_output()
to drive it low during sleep, then mux it over to UART TX again when you
-wake up and maybe even gpio_request/gpio_free as part of this cycle. This
+wake up and maybe even gpiod_get() / gpiod_put() as part of this cycle. This
all gets very complicated.
The solution is to not think that what the datasheet calls "GPIO mode"
-has to be handled by the <linux/gpio.h> interface. Instead view this as
-a certain pin config setting. Look in e.g. <linux/pinctrl/pinconf-generic.h>
+has to be handled by the ``<linux/gpio/consumer.h>`` interface. Instead view this as
+a certain pin config setting. Look in e.g. ``<linux/pinctrl/pinconf-generic.h>``
and you find this in the documentation:
PIN_CONFIG_OUTPUT:
@@ -915,7 +869,9 @@ and you find this in the documentation:
So it is perfectly possible to push a pin into "GPIO mode" and drive the
line low as part of the usual pin control map. So for example your UART
-driver may look like this::
+driver may look like this:
+
+.. code-block:: c
#include <linux/pinctrl/consumer.h>
@@ -928,13 +884,13 @@ driver may look like this::
/* Normal mode */
retval = pinctrl_select_state(pinctrl, pins_default);
+
/* Sleep mode */
retval = pinctrl_select_state(pinctrl, pins_sleep);
And your machine configuration may look like this:
---------------------------------------------------
-::
+.. code-block:: c
static unsigned long uart_default_mode[] = {
PIN_CONF_PACKED(PIN_CONFIG_DRIVE_PUSH_PULL, 0),
@@ -946,16 +902,17 @@ And your machine configuration may look like this:
static struct pinctrl_map pinmap[] __initdata = {
PIN_MAP_MUX_GROUP("uart", PINCTRL_STATE_DEFAULT, "pinctrl-foo",
- "u0_group", "u0"),
+ "u0_group", "u0"),
PIN_MAP_CONFIGS_PIN("uart", PINCTRL_STATE_DEFAULT, "pinctrl-foo",
- "UART_TX_PIN", uart_default_mode),
+ "UART_TX_PIN", uart_default_mode),
PIN_MAP_MUX_GROUP("uart", PINCTRL_STATE_SLEEP, "pinctrl-foo",
- "u0_group", "gpio-mode"),
+ "u0_group", "gpio-mode"),
PIN_MAP_CONFIGS_PIN("uart", PINCTRL_STATE_SLEEP, "pinctrl-foo",
- "UART_TX_PIN", uart_sleep_mode),
+ "UART_TX_PIN", uart_sleep_mode),
};
- foo_init(void) {
+ foo_init(void)
+ {
pinctrl_register_mappings(pinmap, ARRAY_SIZE(pinmap));
}
@@ -995,7 +952,9 @@ part of this.
A pin controller configuration for a machine looks pretty much like a simple
regulator configuration, so for the example array above we want to enable i2c
-and spi on the second function mapping::
+and spi on the second function mapping:
+
+.. code-block:: c
#include <linux/pinctrl/machine.h>
@@ -1030,13 +989,17 @@ must match a function provided by the pinmux driver handling this pin range.
As you can see we may have several pin controllers on the system and thus
we need to specify which one of them contains the functions we wish to map.
-You register this pinmux mapping to the pinmux subsystem by simply::
+You register this pinmux mapping to the pinmux subsystem by simply:
+
+.. code-block:: c
ret = pinctrl_register_mappings(mapping, ARRAY_SIZE(mapping));
Since the above construct is pretty common there is a helper macro to make
it even more compact which assumes you want to use pinctrl-foo and position
-0 for mapping, for example::
+0 for mapping, for example:
+
+.. code-block:: c
static struct pinctrl_map mapping[] __initdata = {
PIN_MAP_MUX_GROUP("foo-i2c.o", PINCTRL_STATE_DEFAULT,
@@ -1046,7 +1009,9 @@ it even more compact which assumes you want to use pinctrl-foo and position
The mapping table may also contain pin configuration entries. It's common for
each pin/group to have a number of configuration entries that affect it, so
the table entries for configuration reference an array of config parameters
-and values. An example using the convenience macros is shown below::
+and values. An example using the convenience macros is shown below:
+
+.. code-block:: c
static unsigned long i2c_grp_configs[] = {
FOO_PIN_DRIVEN,
@@ -1073,8 +1038,10 @@ Finally, some devices expect the mapping table to contain certain specific
named states. When running on hardware that doesn't need any pin controller
configuration, the mapping table must still contain those named states, in
order to explicitly indicate that the states were provided and intended to
-be empty. Table entry macro PIN_MAP_DUMMY_STATE serves the purpose of defining
-a named state without causing any pin controller to be programmed::
+be empty. Table entry macro ``PIN_MAP_DUMMY_STATE()`` serves the purpose of defining
+a named state without causing any pin controller to be programmed:
+
+.. code-block:: c
static struct pinctrl_map mapping[] __initdata = {
PIN_MAP_DUMMY_STATE("foo-i2c.0", PINCTRL_STATE_DEFAULT),
@@ -1085,7 +1052,9 @@ Complex mappings
================
As it is possible to map a function to different groups of pins an optional
-.group can be specified like this::
+.group can be specified like this:
+
+.. code-block:: c
...
{
@@ -1107,13 +1076,15 @@ As it is possible to map a function to different groups of pins an optional
...
This example mapping is used to switch between two positions for spi0 at
-runtime, as described further below under the heading "Runtime pinmuxing".
+runtime, as described further below under the heading `Runtime pinmuxing`_.
Further it is possible for one named state to affect the muxing of several
groups of pins, say for example in the mmc0 example above, where you can
additively expand the mmc0 bus from 2 to 4 to 8 pins. If we want to use all
-three groups for a total of 2+2+4 = 8 pins (for an 8-bit MMC bus as is the
-case), we define a mapping like this::
+three groups for a total of 2 + 2 + 4 = 8 pins (for an 8-bit MMC bus as is the
+case), we define a mapping like this:
+
+.. code-block:: c
...
{
@@ -1167,13 +1138,17 @@ case), we define a mapping like this::
...
The result of grabbing this mapping from the device with something like
-this (see next paragraph)::
+this (see next paragraph):
+
+.. code-block:: c
p = devm_pinctrl_get(dev);
s = pinctrl_lookup_state(p, "8bit");
ret = pinctrl_select_state(p, s);
-or more simply::
+or more simply:
+
+.. code-block:: c
p = devm_pinctrl_get_select(dev, "8bit");
@@ -1188,7 +1163,7 @@ Pin control requests from drivers
=================================
When a device driver is about to probe the device core will automatically
-attempt to issue pinctrl_get_select_default() on these devices.
+attempt to issue ``pinctrl_get_select_default()`` on these devices.
This way driver writers do not need to add any of the boilerplate code
of the type found below. However when doing fine-grained state selection
and not using the "default" state, you may have to do some device driver
@@ -1206,12 +1181,14 @@ some cases where a driver needs to e.g. switch between different mux mappings
at runtime this is not possible.
A typical case is if a driver needs to switch bias of pins from normal
-operation and going to sleep, moving from the PINCTRL_STATE_DEFAULT to
-PINCTRL_STATE_SLEEP at runtime, re-biasing or even re-muxing pins to save
+operation and going to sleep, moving from the ``PINCTRL_STATE_DEFAULT`` to
+``PINCTRL_STATE_SLEEP`` at runtime, re-biasing or even re-muxing pins to save
current in sleep mode.
A driver may request a certain control state to be activated, usually just the
-default state like this::
+default state like this:
+
+.. code-block:: c
#include <linux/pinctrl/consumer.h>
@@ -1251,49 +1228,49 @@ arrangement on your bus.
The semantics of the pinctrl APIs are:
-- pinctrl_get() is called in process context to obtain a handle to all pinctrl
+- ``pinctrl_get()`` is called in process context to obtain a handle to all pinctrl
information for a given client device. It will allocate a struct from the
kernel memory to hold the pinmux state. All mapping table parsing or similar
slow operations take place within this API.
-- devm_pinctrl_get() is a variant of pinctrl_get() that causes pinctrl_put()
+- ``devm_pinctrl_get()`` is a variant of pinctrl_get() that causes ``pinctrl_put()``
to be called automatically on the retrieved pointer when the associated
device is removed. It is recommended to use this function over plain
- pinctrl_get().
+ ``pinctrl_get()``.
-- pinctrl_lookup_state() is called in process context to obtain a handle to a
+- ``pinctrl_lookup_state()`` is called in process context to obtain a handle to a
specific state for a client device. This operation may be slow, too.
-- pinctrl_select_state() programs pin controller hardware according to the
+- ``pinctrl_select_state()`` programs pin controller hardware according to the
definition of the state as given by the mapping table. In theory, this is a
fast-path operation, since it only involved blasting some register settings
into hardware. However, note that some pin controllers may have their
registers on a slow/IRQ-based bus, so client devices should not assume they
- can call pinctrl_select_state() from non-blocking contexts.
+ can call ``pinctrl_select_state()`` from non-blocking contexts.
-- pinctrl_put() frees all information associated with a pinctrl handle.
+- ``pinctrl_put()`` frees all information associated with a pinctrl handle.
-- devm_pinctrl_put() is a variant of pinctrl_put() that may be used to
- explicitly destroy a pinctrl object returned by devm_pinctrl_get().
+- ``devm_pinctrl_put()`` is a variant of ``pinctrl_put()`` that may be used to
+ explicitly destroy a pinctrl object returned by ``devm_pinctrl_get()``.
However, use of this function will be rare, due to the automatic cleanup
that will occur even without calling it.
- pinctrl_get() must be paired with a plain pinctrl_put().
- pinctrl_get() may not be paired with devm_pinctrl_put().
- devm_pinctrl_get() can optionally be paired with devm_pinctrl_put().
- devm_pinctrl_get() may not be paired with plain pinctrl_put().
+ ``pinctrl_get()`` must be paired with a plain ``pinctrl_put()``.
+ ``pinctrl_get()`` may not be paired with ``devm_pinctrl_put()``.
+ ``devm_pinctrl_get()`` can optionally be paired with ``devm_pinctrl_put()``.
+ ``devm_pinctrl_get()`` may not be paired with plain ``pinctrl_put()``.
Usually the pin control core handled the get/put pair and call out to the
device drivers bookkeeping operations, like checking available functions and
-the associated pins, whereas select_state pass on to the pin controller
+the associated pins, whereas ``pinctrl_select_state()`` pass on to the pin controller
driver which takes care of activating and/or deactivating the mux setting by
quickly poking some registers.
-The pins are allocated for your device when you issue the devm_pinctrl_get()
+The pins are allocated for your device when you issue the ``devm_pinctrl_get()``
call, after this you should be able to see this in the debugfs listing of all
pins.
-NOTE: the pinctrl system will return -EPROBE_DEFER if it cannot find the
+NOTE: the pinctrl system will return ``-EPROBE_DEFER`` if it cannot find the
requested pinctrl handles, for example if the pinctrl driver has not yet
registered. Thus make sure that the error path in your driver gracefully
cleans up and is ready to retry the probing later in the startup process.
@@ -1305,18 +1282,20 @@ Drivers needing both pin control and GPIOs
Again, it is discouraged to let drivers lookup and select pin control states
themselves, but again sometimes this is unavoidable.
-So say that your driver is fetching its resources like this::
+So say that your driver is fetching its resources like this:
+
+.. code-block:: c
#include <linux/pinctrl/consumer.h>
- #include <linux/gpio.h>
+ #include <linux/gpio/consumer.h>
struct pinctrl *pinctrl;
- int gpio;
+ struct gpio_desc *gpio;
pinctrl = devm_pinctrl_get_select_default(&dev);
- gpio = devm_gpio_request(&dev, 14, "foo");
+ gpio = devm_gpiod_get(&dev, "foo");
-Here we first request a certain pin state and then request GPIO 14 to be
+Here we first request a certain pin state and then request GPIO "foo" to be
used. If you're using the subsystems orthogonally like this, you should
nominally always get your pinctrl handle and select the desired pinctrl
state BEFORE requesting the GPIO. This is a semantic convention to avoid
@@ -1331,9 +1310,9 @@ probing, nevertheless orthogonal to the GPIO subsystem.
But there are also situations where it makes sense for the GPIO subsystem
to communicate directly with the pinctrl subsystem, using the latter as a
back-end. This is when the GPIO driver may call out to the functions
-described in the section "Pin control interaction with the GPIO subsystem"
+described in the section `Pin control interaction with the GPIO subsystem`_
above. This only involves per-pin multiplexing, and will be completely
-hidden behind the gpio_*() function namespace. In this case, the driver
+hidden behind the gpiod_*() function namespace. In this case, the driver
need not interact with the pin control subsystem at all.
If a pin control driver and a GPIO driver is dealing with the same pins
@@ -1348,12 +1327,14 @@ System pin control hogging
==========================
Pin control map entries can be hogged by the core when the pin controller
-is registered. This means that the core will attempt to call pinctrl_get(),
-lookup_state() and select_state() on it immediately after the pin control
-device has been registered.
+is registered. This means that the core will attempt to call ``pinctrl_get()``,
+``pinctrl_lookup_state()`` and ``pinctrl_select_state()`` on it immediately after
+the pin control device has been registered.
This occurs for mapping table entries where the client device name is equal
-to the pin controller device name, and the state name is PINCTRL_STATE_DEFAULT::
+to the pin controller device name, and the state name is ``PINCTRL_STATE_DEFAULT``:
+
+.. code-block:: c
{
.dev_name = "pinctrl-foo",
@@ -1365,7 +1346,9 @@ to the pin controller device name, and the state name is PINCTRL_STATE_DEFAULT::
Since it may be common to request the core to hog a few always-applicable
mux settings on the primary pin controller, there is a convenience macro for
-this::
+this:
+
+.. code-block:: c
PIN_MAP_MUX_GROUP_HOG_DEFAULT("pinctrl-foo", NULL /* group */,
"power_func")
@@ -1385,7 +1368,9 @@ function, but with different named in the mapping as described under
This snippet first initializes a state object for both groups (in foo_probe()),
then muxes the function in the pins defined by group A, and finally muxes it in
-on the pins defined by group B::
+on the pins defined by group B:
+
+.. code-block:: c
#include <linux/pinctrl/consumer.h>
@@ -1413,14 +1398,14 @@ on the pins defined by group B::
/* Enable on position A */
ret = pinctrl_select_state(p, s1);
if (ret < 0)
- ...
+ ...
...
/* Enable on position B */
ret = pinctrl_select_state(p, s2);
if (ret < 0)
- ...
+ ...
...
}
@@ -1432,6 +1417,7 @@ can be used by different functions at different times on a running system.
Debugfs files
=============
+
These files are created in ``/sys/kernel/debug/pinctrl``:
- ``pinctrl-devices``: prints each pin controller device along with columns to
@@ -1440,7 +1426,7 @@ These files are created in ``/sys/kernel/debug/pinctrl``:
- ``pinctrl-handles``: prints each configured pin controller handle and the
corresponding pinmux maps
-- ``pinctrl-maps``: print all pinctrl maps
+- ``pinctrl-maps``: prints all pinctrl maps
A sub-directory is created inside of ``/sys/kernel/debug/pinctrl`` for each pin
controller device containing these files:
@@ -1448,20 +1434,22 @@ controller device containing these files:
- ``pins``: prints a line for each pin registered on the pin controller. The
pinctrl driver may add additional information such as register contents.
-- ``gpio-ranges``: print ranges that map gpio lines to pins on the controller
+- ``gpio-ranges``: prints ranges that map gpio lines to pins on the controller
-- ``pingroups``: print all pin groups registered on the pin controller
+- ``pingroups``: prints all pin groups registered on the pin controller
-- ``pinconf-pins``: print pin config settings for each pin
+- ``pinconf-pins``: prints pin config settings for each pin
-- ``pinconf-groups``: print pin config settings per pin group
+- ``pinconf-groups``: prints pin config settings per pin group
-- ``pinmux-functions``: print each pin function along with the pin groups that
+- ``pinmux-functions``: prints each pin function along with the pin groups that
map to the pin function
-- ``pinmux-pins``: iterate through all pins and print mux owner, gpio owner
+- ``pinmux-pins``: iterates through all pins and prints mux owner, gpio owner
and if the pin is a hog
-- ``pinmux-select``: write to this file to activate a pin function for a group::
+- ``pinmux-select``: write to this file to activate a pin function for a group:
+
+ .. code-block:: sh
echo "<group-name function-name>" > pinmux-select
diff --git a/Documentation/driver-api/pldmfw/index.rst b/Documentation/driver-api/pldmfw/index.rst
index ad2c33ece30f..fd871b83f34f 100644
--- a/Documentation/driver-api/pldmfw/index.rst
+++ b/Documentation/driver-api/pldmfw/index.rst
@@ -20,7 +20,7 @@ Overview of the ``pldmfw`` library
The ``pldmfw`` library is intended to be used by device drivers for
implementing device flash update based on firmware files following the PLDM
-firwmare file format.
+firmware file format.
It is implemented using an ops table that allows device drivers to provide
the underlying device specific functionality.
diff --git a/Documentation/driver-api/serial/driver.rst b/Documentation/driver-api/serial/driver.rst
index 98d268555dcc..84b43061c11b 100644
--- a/Documentation/driver-api/serial/driver.rst
+++ b/Documentation/driver-api/serial/driver.rst
@@ -24,7 +24,7 @@ console support.
Console Support
---------------
-The serial core provides a few helper functions. This includes identifing
+The serial core provides a few helper functions. This includes identifying
the correct port structure (via uart_get_console()) and decoding command line
arguments (uart_parse_options()).
diff --git a/Documentation/driver-api/surface_aggregator/client.rst b/Documentation/driver-api/surface_aggregator/client.rst
index 27f95abdbe99..e100ab0a24cc 100644
--- a/Documentation/driver-api/surface_aggregator/client.rst
+++ b/Documentation/driver-api/surface_aggregator/client.rst
@@ -19,7 +19,7 @@
.. |ssam_notifier_unregister| replace:: :c:func:`ssam_notifier_unregister`
.. |ssam_device_notifier_register| replace:: :c:func:`ssam_device_notifier_register`
.. |ssam_device_notifier_unregister| replace:: :c:func:`ssam_device_notifier_unregister`
-.. |ssam_request_sync| replace:: :c:func:`ssam_request_sync`
+.. |ssam_request_do_sync| replace:: :c:func:`ssam_request_do_sync`
.. |ssam_event_mask| replace:: :c:type:`enum ssam_event_mask <ssam_event_mask>`
@@ -191,7 +191,7 @@ data received from it is converted from little-endian to host endianness.
* they do not correspond to an actual SAM/EC request.
*/
rqst.target_category = SSAM_SSH_TC_SAM;
- rqst.target_id = 0x01;
+ rqst.target_id = SSAM_SSH_TID_SAM;
rqst.command_id = 0x02;
rqst.instance_id = 0x03;
rqst.flags = SSAM_REQUEST_HAS_RESPONSE;
@@ -209,12 +209,12 @@ data received from it is converted from little-endian to host endianness.
* with the SSAM_REQUEST_HAS_RESPONSE flag set in the specification
* above.
*/
- status = ssam_request_sync(ctrl, &rqst, &resp);
+ status = ssam_request_do_sync(ctrl, &rqst, &resp);
/*
* Alternatively use
*
- * ssam_request_sync_onstack(ctrl, &rqst, &resp, sizeof(arg_le));
+ * ssam_request_do_sync_onstack(ctrl, &rqst, &resp, sizeof(arg_le));
*
* to perform the request, allocating the message buffer directly
* on the stack as opposed to allocation via kzalloc().
@@ -230,7 +230,7 @@ data received from it is converted from little-endian to host endianness.
return status;
}
-Note that |ssam_request_sync| in its essence is a wrapper over lower-level
+Note that |ssam_request_do_sync| in its essence is a wrapper over lower-level
request primitives, which may also be used to perform requests. Refer to its
implementation and documentation for more details.
@@ -241,7 +241,7 @@ one of the generator macros, for example via:
SSAM_DEFINE_SYNC_REQUEST_W(__ssam_tmp_perf_mode_set, __le32, {
.target_category = SSAM_SSH_TC_TMP,
- .target_id = 0x01,
+ .target_id = SSAM_SSH_TID_SAM,
.command_id = 0x03,
.instance_id = 0x00,
});
diff --git a/Documentation/driver-api/surface_aggregator/ssh.rst b/Documentation/driver-api/surface_aggregator/ssh.rst
index bf007d6c9873..b955b673838b 100644
--- a/Documentation/driver-api/surface_aggregator/ssh.rst
+++ b/Documentation/driver-api/surface_aggregator/ssh.rst
@@ -13,6 +13,7 @@
.. |DATA_NSQ| replace:: ``DATA_NSQ``
.. |TC| replace:: ``TC``
.. |TID| replace:: ``TID``
+.. |SID| replace:: ``SID``
.. |IID| replace:: ``IID``
.. |RQID| replace:: ``RQID``
.. |CID| replace:: ``CID``
@@ -76,7 +77,7 @@ after the frame structure and before the payload. The payload is followed by
its own CRC (over all payload bytes). If the payload is not present (i.e.
the frame has ``LEN=0``), the CRC of the payload is still present and will
evaluate to ``0xffff``. The |LEN| field does not include any of the CRCs, it
-equals the number of bytes inbetween the CRC of the frame and the CRC of the
+equals the number of bytes between the CRC of the frame and the CRC of the
payload.
Additionally, the following fixed two-byte sequences are used:
@@ -219,13 +220,13 @@ following fields, packed together and in order:
- |u8|
- Target category.
- * - |TID| (out)
+ * - |TID|
- |u8|
- - Target ID for outgoing (host to EC) commands.
+ - Target ID for commands/messages.
- * - |TID| (in)
+ * - |SID|
- |u8|
- - Target ID for incoming (EC to host) commands.
+ - Source ID for commands/messages.
* - |IID|
- |u8|
@@ -286,19 +287,20 @@ general, however, a single target category should map to a single reserved
event request ID.
Furthermore, requests, responses, and events have an associated target ID
-(``TID``). This target ID is split into output (host to EC) and input (EC to
-host) fields, with the respecting other field (e.g. output field on incoming
-messages) set to zero. Two ``TID`` values are known: Primary (``0x01``) and
-secondary (``0x02``). In general, the response to a request should have the
-same ``TID`` value, however, the field (output vs. input) should be used in
-accordance to the direction in which the response is sent (i.e. on the input
-field, as responses are generally sent from the EC to the host).
-
-Note that, even though requests and events should be uniquely identifiable
-by target category and command ID alone, the EC may require specific
-target ID and instance ID values to accept a command. A command that is
-accepted for ``TID=1``, for example, may not be accepted for ``TID=2``
-and vice versa.
+(``TID``) and source ID (``SID``). These two fields indicate where a message
+originates from (``SID``) and what the intended target of the message is
+(``TID``). Note that a response to a specific request therefore has the source
+and target IDs swapped when compared to the original request (i.e. the request
+target is the response source and the request source is the response target).
+See (:c:type:`enum ssh_request_id <ssh_request_id>`) for possible values of
+both.
+
+Note that, even though requests and events should be uniquely identifiable by
+target category and command ID alone, the EC may require specific target ID and
+instance ID values to accept a command. A command that is accepted for
+``TID=1``, for example, may not be accepted for ``TID=2`` and vice versa. While
+this may not always hold in reality, you can think of different target/source
+IDs indicating different physical ECs with potentially different feature sets.
Limitations and Observations
diff --git a/Documentation/driver-api/thermal/index.rst b/Documentation/driver-api/thermal/index.rst
index 030306ffa408..a886028014ab 100644
--- a/Documentation/driver-api/thermal/index.rst
+++ b/Documentation/driver-api/thermal/index.rst
@@ -14,7 +14,6 @@ Thermal
exynos_thermal
exynos_thermal_emulation
- intel_powerclamp
nouveau_thermal
x86_pkg_temperature_thermal
intel_dptf
diff --git a/Documentation/driver-api/thermal/intel_dptf.rst b/Documentation/driver-api/thermal/intel_dptf.rst
index 372bdb4d04c6..f5c193cccbda 100644
--- a/Documentation/driver-api/thermal/intel_dptf.rst
+++ b/Documentation/driver-api/thermal/intel_dptf.rst
@@ -84,6 +84,9 @@ DPTF ACPI Drivers interface
https:/github.com/intel/thermal_daemon for decoding
thermal table.
+``production_mode`` (RO)
+ When different from zero, manufacturer locked thermal configuration
+ from further changes.
ACPI Thermal Relationship table interface
------------------------------------------
diff --git a/Documentation/driver-api/thermal/intel_powerclamp.rst b/Documentation/driver-api/thermal/intel_powerclamp.rst
deleted file mode 100644
index 3f6dfb0b3ea6..000000000000
--- a/Documentation/driver-api/thermal/intel_powerclamp.rst
+++ /dev/null
@@ -1,320 +0,0 @@
-=======================
-Intel Powerclamp Driver
-=======================
-
-By:
- - Arjan van de Ven <arjan@linux.intel.com>
- - Jacob Pan <jacob.jun.pan@linux.intel.com>
-
-.. Contents:
-
- (*) Introduction
- - Goals and Objectives
-
- (*) Theory of Operation
- - Idle Injection
- - Calibration
-
- (*) Performance Analysis
- - Effectiveness and Limitations
- - Power vs Performance
- - Scalability
- - Calibration
- - Comparison with Alternative Techniques
-
- (*) Usage and Interfaces
- - Generic Thermal Layer (sysfs)
- - Kernel APIs (TBD)
-
-INTRODUCTION
-============
-
-Consider the situation where a system’s power consumption must be
-reduced at runtime, due to power budget, thermal constraint, or noise
-level, and where active cooling is not preferred. Software managed
-passive power reduction must be performed to prevent the hardware
-actions that are designed for catastrophic scenarios.
-
-Currently, P-states, T-states (clock modulation), and CPU offlining
-are used for CPU throttling.
-
-On Intel CPUs, C-states provide effective power reduction, but so far
-they’re only used opportunistically, based on workload. With the
-development of intel_powerclamp driver, the method of synchronizing
-idle injection across all online CPU threads was introduced. The goal
-is to achieve forced and controllable C-state residency.
-
-Test/Analysis has been made in the areas of power, performance,
-scalability, and user experience. In many cases, clear advantage is
-shown over taking the CPU offline or modulating the CPU clock.
-
-
-THEORY OF OPERATION
-===================
-
-Idle Injection
---------------
-
-On modern Intel processors (Nehalem or later), package level C-state
-residency is available in MSRs, thus also available to the kernel.
-
-These MSRs are::
-
- #define MSR_PKG_C2_RESIDENCY 0x60D
- #define MSR_PKG_C3_RESIDENCY 0x3F8
- #define MSR_PKG_C6_RESIDENCY 0x3F9
- #define MSR_PKG_C7_RESIDENCY 0x3FA
-
-If the kernel can also inject idle time to the system, then a
-closed-loop control system can be established that manages package
-level C-state. The intel_powerclamp driver is conceived as such a
-control system, where the target set point is a user-selected idle
-ratio (based on power reduction), and the error is the difference
-between the actual package level C-state residency ratio and the target idle
-ratio.
-
-Injection is controlled by high priority kernel threads, spawned for
-each online CPU.
-
-These kernel threads, with SCHED_FIFO class, are created to perform
-clamping actions of controlled duty ratio and duration. Each per-CPU
-thread synchronizes its idle time and duration, based on the rounding
-of jiffies, so accumulated errors can be prevented to avoid a jittery
-effect. Threads are also bound to the CPU such that they cannot be
-migrated, unless the CPU is taken offline. In this case, threads
-belong to the offlined CPUs will be terminated immediately.
-
-Running as SCHED_FIFO and relatively high priority, also allows such
-scheme to work for both preemptable and non-preemptable kernels.
-Alignment of idle time around jiffies ensures scalability for HZ
-values. This effect can be better visualized using a Perf timechart.
-The following diagram shows the behavior of kernel thread
-kidle_inject/cpu. During idle injection, it runs monitor/mwait idle
-for a given "duration", then relinquishes the CPU to other tasks,
-until the next time interval.
-
-The NOHZ schedule tick is disabled during idle time, but interrupts
-are not masked. Tests show that the extra wakeups from scheduler tick
-have a dramatic impact on the effectiveness of the powerclamp driver
-on large scale systems (Westmere system with 80 processors).
-
-::
-
- CPU0
- ____________ ____________
- kidle_inject/0 | sleep | mwait | sleep |
- _________| |________| |_______
- duration
- CPU1
- ____________ ____________
- kidle_inject/1 | sleep | mwait | sleep |
- _________| |________| |_______
- ^
- |
- |
- roundup(jiffies, interval)
-
-Only one CPU is allowed to collect statistics and update global
-control parameters. This CPU is referred to as the controlling CPU in
-this document. The controlling CPU is elected at runtime, with a
-policy that favors BSP, taking into account the possibility of a CPU
-hot-plug.
-
-In terms of dynamics of the idle control system, package level idle
-time is considered largely as a non-causal system where its behavior
-cannot be based on the past or current input. Therefore, the
-intel_powerclamp driver attempts to enforce the desired idle time
-instantly as given input (target idle ratio). After injection,
-powerclamp monitors the actual idle for a given time window and adjust
-the next injection accordingly to avoid over/under correction.
-
-When used in a causal control system, such as a temperature control,
-it is up to the user of this driver to implement algorithms where
-past samples and outputs are included in the feedback. For example, a
-PID-based thermal controller can use the powerclamp driver to
-maintain a desired target temperature, based on integral and
-derivative gains of the past samples.
-
-
-
-Calibration
------------
-During scalability testing, it is observed that synchronized actions
-among CPUs become challenging as the number of cores grows. This is
-also true for the ability of a system to enter package level C-states.
-
-To make sure the intel_powerclamp driver scales well, online
-calibration is implemented. The goals for doing such a calibration
-are:
-
-a) determine the effective range of idle injection ratio
-b) determine the amount of compensation needed at each target ratio
-
-Compensation to each target ratio consists of two parts:
-
- a) steady state error compensation
- This is to offset the error occurring when the system can
- enter idle without extra wakeups (such as external interrupts).
-
- b) dynamic error compensation
- When an excessive amount of wakeups occurs during idle, an
- additional idle ratio can be added to quiet interrupts, by
- slowing down CPU activities.
-
-A debugfs file is provided for the user to examine compensation
-progress and results, such as on a Westmere system::
-
- [jacob@nex01 ~]$ cat
- /sys/kernel/debug/intel_powerclamp/powerclamp_calib
- controlling cpu: 0
- pct confidence steady dynamic (compensation)
- 0 0 0 0
- 1 1 0 0
- 2 1 1 0
- 3 3 1 0
- 4 3 1 0
- 5 3 1 0
- 6 3 1 0
- 7 3 1 0
- 8 3 1 0
- ...
- 30 3 2 0
- 31 3 2 0
- 32 3 1 0
- 33 3 2 0
- 34 3 1 0
- 35 3 2 0
- 36 3 1 0
- 37 3 2 0
- 38 3 1 0
- 39 3 2 0
- 40 3 3 0
- 41 3 1 0
- 42 3 2 0
- 43 3 1 0
- 44 3 1 0
- 45 3 2 0
- 46 3 3 0
- 47 3 0 0
- 48 3 2 0
- 49 3 3 0
-
-Calibration occurs during runtime. No offline method is available.
-Steady state compensation is used only when confidence levels of all
-adjacent ratios have reached satisfactory level. A confidence level
-is accumulated based on clean data collected at runtime. Data
-collected during a period without extra interrupts is considered
-clean.
-
-To compensate for excessive amounts of wakeup during idle, additional
-idle time is injected when such a condition is detected. Currently,
-we have a simple algorithm to double the injection ratio. A possible
-enhancement might be to throttle the offending IRQ, such as delaying
-EOI for level triggered interrupts. But it is a challenge to be
-non-intrusive to the scheduler or the IRQ core code.
-
-
-CPU Online/Offline
-------------------
-Per-CPU kernel threads are started/stopped upon receiving
-notifications of CPU hotplug activities. The intel_powerclamp driver
-keeps track of clamping kernel threads, even after they are migrated
-to other CPUs, after a CPU offline event.
-
-
-Performance Analysis
-====================
-This section describes the general performance data collected on
-multiple systems, including Westmere (80P) and Ivy Bridge (4P, 8P).
-
-Effectiveness and Limitations
------------------------------
-The maximum range that idle injection is allowed is capped at 50
-percent. As mentioned earlier, since interrupts are allowed during
-forced idle time, excessive interrupts could result in less
-effectiveness. The extreme case would be doing a ping -f to generated
-flooded network interrupts without much CPU acknowledgement. In this
-case, little can be done from the idle injection threads. In most
-normal cases, such as scp a large file, applications can be throttled
-by the powerclamp driver, since slowing down the CPU also slows down
-network protocol processing, which in turn reduces interrupts.
-
-When control parameters change at runtime by the controlling CPU, it
-may take an additional period for the rest of the CPUs to catch up
-with the changes. During this time, idle injection is out of sync,
-thus not able to enter package C- states at the expected ratio. But
-this effect is minor, in that in most cases change to the target
-ratio is updated much less frequently than the idle injection
-frequency.
-
-Scalability
------------
-Tests also show a minor, but measurable, difference between the 4P/8P
-Ivy Bridge system and the 80P Westmere server under 50% idle ratio.
-More compensation is needed on Westmere for the same amount of
-target idle ratio. The compensation also increases as the idle ratio
-gets larger. The above reason constitutes the need for the
-calibration code.
-
-On the IVB 8P system, compared to an offline CPU, powerclamp can
-achieve up to 40% better performance per watt. (measured by a spin
-counter summed over per CPU counting threads spawned for all running
-CPUs).
-
-Usage and Interfaces
-====================
-The powerclamp driver is registered to the generic thermal layer as a
-cooling device. Currently, it’s not bound to any thermal zones::
-
- jacob@chromoly:/sys/class/thermal/cooling_device14$ grep . *
- cur_state:0
- max_state:50
- type:intel_powerclamp
-
-cur_state allows user to set the desired idle percentage. Writing 0 to
-cur_state will stop idle injection. Writing a value between 1 and
-max_state will start the idle injection. Reading cur_state returns the
-actual and current idle percentage. This may not be the same value
-set by the user in that current idle percentage depends on workload
-and includes natural idle. When idle injection is disabled, reading
-cur_state returns value -1 instead of 0 which is to avoid confusing
-100% busy state with the disabled state.
-
-Example usage:
-- To inject 25% idle time::
-
- $ sudo sh -c "echo 25 > /sys/class/thermal/cooling_device80/cur_state
-
-If the system is not busy and has more than 25% idle time already,
-then the powerclamp driver will not start idle injection. Using Top
-will not show idle injection kernel threads.
-
-If the system is busy (spin test below) and has less than 25% natural
-idle time, powerclamp kernel threads will do idle injection. Forced
-idle time is accounted as normal idle in that common code path is
-taken as the idle task.
-
-In this example, 24.1% idle is shown. This helps the system admin or
-user determine the cause of slowdown, when a powerclamp driver is in action::
-
-
- Tasks: 197 total, 1 running, 196 sleeping, 0 stopped, 0 zombie
- Cpu(s): 71.2%us, 4.7%sy, 0.0%ni, 24.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
- Mem: 3943228k total, 1689632k used, 2253596k free, 74960k buffers
- Swap: 4087804k total, 0k used, 4087804k free, 945336k cached
-
- PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
- 3352 jacob 20 0 262m 644 428 S 286 0.0 0:17.16 spin
- 3341 root -51 0 0 0 0 D 25 0.0 0:01.62 kidle_inject/0
- 3344 root -51 0 0 0 0 D 25 0.0 0:01.60 kidle_inject/3
- 3342 root -51 0 0 0 0 D 25 0.0 0:01.61 kidle_inject/1
- 3343 root -51 0 0 0 0 D 25 0.0 0:01.60 kidle_inject/2
- 2935 jacob 20 0 696m 125m 35m S 5 3.3 0:31.11 firefox
- 1546 root 20 0 158m 20m 6640 S 3 0.5 0:26.97 Xorg
- 2100 jacob 20 0 1223m 88m 30m S 3 2.3 0:23.68 compiz
-
-Tests have shown that by using the powerclamp driver as a cooling
-device, a PID based userspace thermal controller can manage to
-control CPU temperature effectively, when no other thermal influence
-is added. For example, a UltraBook user can compile the kernel under
-certain temperature (below most active trip points).
diff --git a/Documentation/driver-api/tty/n_gsm.rst b/Documentation/driver-api/tty/n_gsm.rst
index 35d7381515b0..9447b8a3b8e2 100644
--- a/Documentation/driver-api/tty/n_gsm.rst
+++ b/Documentation/driver-api/tty/n_gsm.rst
@@ -25,6 +25,8 @@ Config Initiator
#. Switch the serial line to using the n_gsm line discipline by using
``TIOCSETD`` ioctl.
+#. Configure the mux using ``GSMIOC_GETCONF_EXT``/``GSMIOC_SETCONF_EXT`` ioctl if needed.
+
#. Configure the mux using ``GSMIOC_GETCONF``/``GSMIOC_SETCONF`` ioctl.
#. Obtain base gsmtty number for the used serial port.
@@ -42,6 +44,7 @@ Config Initiator
int ldisc = N_GSM0710;
struct gsm_config c;
+ struct gsm_config_ext ce;
struct termios configuration;
uint32_t first;
@@ -62,6 +65,12 @@ Config Initiator
/* use n_gsm line discipline */
ioctl(fd, TIOCSETD, &ldisc);
+ /* get n_gsm extended configuration */
+ ioctl(fd, GSMIOC_GETCONF_EXT, &ce);
+ /* use keep-alive once every 5s for modem connection supervision */
+ ce.keep_alive = 500;
+ /* set the new extended configuration */
+ ioctl(fd, GSMIOC_SETCONF_EXT, &ce);
/* get n_gsm configuration */
ioctl(fd, GSMIOC_GETCONF, &c);
/* we are initiator and need encoding 0 (basic) */
@@ -106,6 +115,9 @@ Config Requester
#. Switch the serial line to using the *n_gsm* line discipline by using
``TIOCSETD`` ioctl.
+#. Configure the mux using ``GSMIOC_GETCONF_EXT``/``GSMIOC_SETCONF_EXT``
+ ioctl if needed.
+
#. Configure the mux using ``GSMIOC_GETCONF``/``GSMIOC_SETCONF`` ioctl.
#. Obtain base gsmtty number for the used serial port::
@@ -119,6 +131,7 @@ Config Requester
int ldisc = N_GSM0710;
struct gsm_config c;
+ struct gsm_config_ext ce;
struct termios configuration;
uint32_t first;
@@ -132,6 +145,12 @@ Config Requester
/* use n_gsm line discipline */
ioctl(fd, TIOCSETD, &ldisc);
+ /* get n_gsm extended configuration */
+ ioctl(fd, GSMIOC_GETCONF_EXT, &ce);
+ /* use keep-alive once every 5s for peer connection supervision */
+ ce.keep_alive = 500;
+ /* set the new extended configuration */
+ ioctl(fd, GSMIOC_SETCONF_EXT, &ce);
/* get n_gsm configuration */
ioctl(fd, GSMIOC_GETCONF, &c);
/* we are requester and need encoding 0 (basic) */
diff --git a/Documentation/driver-api/usb/dwc3.rst b/Documentation/driver-api/usb/dwc3.rst
index 8b36ff11cef9..e3d6a620997f 100644
--- a/Documentation/driver-api/usb/dwc3.rst
+++ b/Documentation/driver-api/usb/dwc3.rst
@@ -18,7 +18,7 @@ controller which can be configured in one of 4 ways:
4. Hub configuration
Linux currently supports several versions of this controller. In all
-likelyhood, the version in your SoC is already supported. At the time
+likelihood, the version in your SoC is already supported. At the time
of this writing, known tested versions range from 2.02a to 3.10a. As a
rule of thumb, anything above 2.02a should work reliably well.
diff --git a/Documentation/driver-api/usb/usb3-debug-port.rst b/Documentation/driver-api/usb/usb3-debug-port.rst
index b9fd131f4723..d4610457b052 100644
--- a/Documentation/driver-api/usb/usb3-debug-port.rst
+++ b/Documentation/driver-api/usb/usb3-debug-port.rst
@@ -48,7 +48,7 @@ kernel boot parameter::
"earlyprintk=xdbc"
If there are multiple xHCI controllers in your system, you can
-append a host contoller index to this kernel parameter. This
+append a host controller index to this kernel parameter. This
index starts from 0.
Current design doesn't support DbC runtime suspend/resume. As
diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst
index fdf7d69378ec..bbd548b66b42 100644
--- a/Documentation/driver-api/vfio-mediated-device.rst
+++ b/Documentation/driver-api/vfio-mediated-device.rst
@@ -60,7 +60,7 @@ devices as examples, as these devices are the first devices to use this module::
| mdev.ko |
| +-----------+ | mdev_register_parent() +--------------+
| | | +<------------------------+ |
- | | | | | nvidia.ko |<-> physical
+ | | | | | ccw_device.ko|<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
| | Physical | |
@@ -69,12 +69,6 @@ devices as examples, as these devices are the first devices to use this module::
| | | | | i915.ko |<-> physical
| | | +------------------------>+ | device
| | | | callbacks +--------------+
- | | | |
- | | | | mdev_register_parent() +--------------+
- | | | +<------------------------+ |
- | | | | | ccw_device.ko|<-> physical
- | | | +------------------------>+ | device
- | | | | callbacks +--------------+
| +-----------+ |
+---------------+
@@ -270,106 +264,6 @@ these callbacks are supported in the TYPE1 IOMMU module. To enable them for
other IOMMU backend modules, such as PPC64 sPAPR module, they need to provide
these two callback functions.
-Using the Sample Code
-=====================
-
-mtty.c in samples/vfio-mdev/ directory is a sample driver program to
-demonstrate how to use the mediated device framework.
-
-The sample driver creates an mdev device that simulates a serial port over a PCI
-card.
-
-1. Build and load the mtty.ko module.
-
- This step creates a dummy device, /sys/devices/virtual/mtty/mtty/
-
- Files in this device directory in sysfs are similar to the following::
-
- # tree /sys/devices/virtual/mtty/mtty/
- /sys/devices/virtual/mtty/mtty/
- |-- mdev_supported_types
- | |-- mtty-1
- | | |-- available_instances
- | | |-- create
- | | |-- device_api
- | | |-- devices
- | | `-- name
- | `-- mtty-2
- | |-- available_instances
- | |-- create
- | |-- device_api
- | |-- devices
- | `-- name
- |-- mtty_dev
- | `-- sample_mtty_dev
- |-- power
- | |-- autosuspend_delay_ms
- | |-- control
- | |-- runtime_active_time
- | |-- runtime_status
- | `-- runtime_suspended_time
- |-- subsystem -> ../../../../class/mtty
- `-- uevent
-
-2. Create a mediated device by using the dummy device that you created in the
- previous step::
-
- # echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
- /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
-
-3. Add parameters to qemu-kvm::
-
- -device vfio-pci,\
- sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001
-
-4. Boot the VM.
-
- In the Linux guest VM, with no hardware on the host, the device appears
- as follows::
-
- # lspci -s 00:05.0 -xxvv
- 00:05.0 Serial controller: Device 4348:3253 (rev 10) (prog-if 02 [16550])
- Subsystem: Device 4348:3253
- Physical Slot: 5
- Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
- Stepping- SERR- FastB2B- DisINTx-
- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
- <TAbort- <MAbort- >SERR- <PERR- INTx-
- Interrupt: pin A routed to IRQ 10
- Region 0: I/O ports at c150 [size=8]
- Region 1: I/O ports at c158 [size=8]
- Kernel driver in use: serial
- 00: 48 43 53 32 01 00 00 02 10 02 00 07 00 00 00 00
- 10: 51 c1 00 00 59 c1 00 00 00 00 00 00 00 00 00 00
- 20: 00 00 00 00 00 00 00 00 00 00 00 00 48 43 53 32
- 30: 00 00 00 00 00 00 00 00 00 00 00 00 0a 01 00 00
-
- In the Linux guest VM, dmesg output for the device is as follows:
-
- serial 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
- 0000:00:05.0: ttyS1 at I/O 0xc150 (irq = 10) is a 16550A
- 0000:00:05.0: ttyS2 at I/O 0xc158 (irq = 10) is a 16550A
-
-
-5. In the Linux guest VM, check the serial ports::
-
- # setserial -g /dev/ttyS*
- /dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
- /dev/ttyS1, UART: 16550A, Port: 0xc150, IRQ: 10
- /dev/ttyS2, UART: 16550A, Port: 0xc158, IRQ: 10
-
-6. Using minicom or any terminal emulation program, open port /dev/ttyS1 or
- /dev/ttyS2 with hardware flow control disabled.
-
-7. Type data on the minicom terminal or send data to the terminal emulation
- program and read the data.
-
- Data is loop backed from hosts mtty driver.
-
-8. Destroy the mediated device that you created::
-
- # echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001/remove
-
References
==========
diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
index c663b6f97825..50b690f7f663 100644
--- a/Documentation/driver-api/vfio.rst
+++ b/Documentation/driver-api/vfio.rst
@@ -249,19 +249,21 @@ VFIO bus driver API
VFIO bus drivers, such as vfio-pci make use of only a few interfaces
into VFIO core. When devices are bound and unbound to the driver,
-the driver should call vfio_register_group_dev() and
-vfio_unregister_group_dev() respectively::
+Following interfaces are called when devices are bound to and
+unbound from the driver::
- void vfio_init_group_dev(struct vfio_device *device,
- struct device *dev,
- const struct vfio_device_ops *ops);
- void vfio_uninit_group_dev(struct vfio_device *device);
int vfio_register_group_dev(struct vfio_device *device);
+ int vfio_register_emulated_iommu_dev(struct vfio_device *device);
void vfio_unregister_group_dev(struct vfio_device *device);
-The driver should embed the vfio_device in its own structure and call
-vfio_init_group_dev() to pre-configure it before going to registration
-and call vfio_uninit_group_dev() after completing the un-registration.
+The driver should embed the vfio_device in its own structure and use
+vfio_alloc_device() to allocate the structure, and can register
+@init/@release callbacks to manage any private state wrapping the
+vfio_device::
+
+ vfio_alloc_device(dev_struct, member, dev, ops);
+ void vfio_put_device(struct vfio_device *device);
+
vfio_register_group_dev() indicates to the core to begin tracking the
iommu_group of the specified dev and register the dev as owned by a VFIO bus
driver. Once vfio_register_group_dev() returns it is possible for userspace to
@@ -270,28 +272,64 @@ ready before calling it. The driver provides an ops structure for callbacks
similar to a file operations structure::
struct vfio_device_ops {
- int (*open)(struct vfio_device *vdev);
+ char *name;
+ int (*init)(struct vfio_device *vdev);
void (*release)(struct vfio_device *vdev);
+ int (*bind_iommufd)(struct vfio_device *vdev,
+ struct iommufd_ctx *ictx, u32 *out_device_id);
+ void (*unbind_iommufd)(struct vfio_device *vdev);
+ int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
+ int (*open_device)(struct vfio_device *vdev);
+ void (*close_device)(struct vfio_device *vdev);
ssize_t (*read)(struct vfio_device *vdev, char __user *buf,
size_t count, loff_t *ppos);
- ssize_t (*write)(struct vfio_device *vdev,
- const char __user *buf,
- size_t size, loff_t *ppos);
+ ssize_t (*write)(struct vfio_device *vdev, const char __user *buf,
+ size_t count, loff_t *size);
long (*ioctl)(struct vfio_device *vdev, unsigned int cmd,
unsigned long arg);
- int (*mmap)(struct vfio_device *vdev,
- struct vm_area_struct *vma);
+ int (*mmap)(struct vfio_device *vdev, struct vm_area_struct *vma);
+ void (*request)(struct vfio_device *vdev, unsigned int count);
+ int (*match)(struct vfio_device *vdev, char *buf);
+ void (*dma_unmap)(struct vfio_device *vdev, u64 iova, u64 length);
+ int (*device_feature)(struct vfio_device *device, u32 flags,
+ void __user *arg, size_t argsz);
};
Each function is passed the vdev that was originally registered
-in the vfio_register_group_dev() call above. This allows the bus driver
-to obtain its private data using container_of(). The open/release
-callbacks are issued when a new file descriptor is created for a
-device (via VFIO_GROUP_GET_DEVICE_FD). The ioctl interface provides
-a direct pass through for VFIO_DEVICE_* ioctls. The read/write/mmap
-interfaces implement the device region access defined by the device's
-own VFIO_DEVICE_GET_REGION_INFO ioctl.
+in the vfio_register_group_dev() or vfio_register_emulated_iommu_dev()
+call above. This allows the bus driver to obtain its private data using
+container_of().
+
+::
+
+ - The init/release callbacks are issued when vfio_device is initialized
+ and released.
+
+ - The open/close device callbacks are issued when the first
+ instance of a file descriptor for the device is created (eg.
+ via VFIO_GROUP_GET_DEVICE_FD) for a user session.
+
+ - The ioctl callback provides a direct pass through for some VFIO_DEVICE_*
+ ioctls.
+
+ - The [un]bind_iommufd callbacks are issued when the device is bound to
+ and unbound from iommufd.
+
+ - The attach_ioas callback is issued when the device is attached to an
+ IOAS managed by the bound iommufd. The attached IOAS is automatically
+ detached when the device is unbound from iommufd.
+
+ - The read/write/mmap callbacks implement the device region access defined
+ by the device's own VFIO_DEVICE_GET_REGION_INFO ioctl.
+
+ - The request callback is issued when device is going to be unregistered,
+ such as when trying to unbind the device from the vfio bus driver.
+ - The dma_unmap callback is issued when a range of iovas are unmapped
+ in the container or IOAS attached by the device. Drivers which make
+ use of the vfio page pinning interface must implement this callback in
+ order to unpin pages within the dma_unmap range. Drivers must tolerate
+ this callback even before calls to open_device().
PPC64 sPAPR implementation note
-------------------------------
diff --git a/Documentation/driver-api/virtio/index.rst b/Documentation/driver-api/virtio/index.rst
new file mode 100644
index 000000000000..528b14b291e3
--- /dev/null
+++ b/Documentation/driver-api/virtio/index.rst
@@ -0,0 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======
+Virtio
+======
+
+.. toctree::
+ :maxdepth: 1
+
+ virtio
+ writing_virtio_drivers
diff --git a/Documentation/driver-api/virtio/virtio.rst b/Documentation/driver-api/virtio/virtio.rst
new file mode 100644
index 000000000000..7947b4ca690e
--- /dev/null
+++ b/Documentation/driver-api/virtio/virtio.rst
@@ -0,0 +1,145 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _virtio:
+
+===============
+Virtio on Linux
+===============
+
+Introduction
+============
+
+Virtio is an open standard that defines a protocol for communication
+between drivers and devices of different types, see Chapter 5 ("Device
+Types") of the virtio spec (`[1]`_). Originally developed as a standard
+for paravirtualized devices implemented by a hypervisor, it can be used
+to interface any compliant device (real or emulated) with a driver.
+
+For illustrative purposes, this document will focus on the common case
+of a Linux kernel running in a virtual machine and using paravirtualized
+devices provided by the hypervisor, which exposes them as virtio devices
+via standard mechanisms such as PCI.
+
+
+Device - Driver communication: virtqueues
+=========================================
+
+Although the virtio devices are really an abstraction layer in the
+hypervisor, they're exposed to the guest as if they are physical devices
+using a specific transport method -- PCI, MMIO or CCW -- that is
+orthogonal to the device itself. The virtio spec defines these transport
+methods in detail, including device discovery, capabilities and
+interrupt handling.
+
+The communication between the driver in the guest OS and the device in
+the hypervisor is done through shared memory (that's what makes virtio
+devices so efficient) using specialized data structures called
+virtqueues, which are actually ring buffers [#f1]_ of buffer descriptors
+similar to the ones used in a network device:
+
+.. kernel-doc:: include/uapi/linux/virtio_ring.h
+ :identifiers: struct vring_desc
+
+All the buffers the descriptors point to are allocated by the guest and
+used by the host either for reading or for writing but not for both.
+
+Refer to Chapter 2.5 ("Virtqueues") of the virtio spec (`[1]`_) for the
+reference definitions of virtqueues and "Virtqueues and virtio ring: How
+the data travels" blog post (`[2]`_) for an illustrated overview of how
+the host device and the guest driver communicate.
+
+The :c:type:`vring_virtqueue` struct models a virtqueue, including the
+ring buffers and management data. Embedded in this struct is the
+:c:type:`virtqueue` struct, which is the data structure that's
+ultimately used by virtio drivers:
+
+.. kernel-doc:: include/linux/virtio.h
+ :identifiers: struct virtqueue
+
+The callback function pointed by this struct is triggered when the
+device has consumed the buffers provided by the driver. More
+specifically, the trigger will be an interrupt issued by the hypervisor
+(see vring_interrupt()). Interrupt request handlers are registered for
+a virtqueue during the virtqueue setup process (transport-specific).
+
+.. kernel-doc:: drivers/virtio/virtio_ring.c
+ :identifiers: vring_interrupt
+
+
+Device discovery and probing
+============================
+
+In the kernel, the virtio core contains the virtio bus driver and
+transport-specific drivers like `virtio-pci` and `virtio-mmio`. Then
+there are individual virtio drivers for specific device types that are
+registered to the virtio bus driver.
+
+How a virtio device is found and configured by the kernel depends on how
+the hypervisor defines it. Taking the `QEMU virtio-console
+<https://gitlab.com/qemu-project/qemu/-/blob/master/hw/char/virtio-console.c>`__
+device as an example. When using PCI as a transport method, the device
+will present itself on the PCI bus with vendor 0x1af4 (Red Hat, Inc.)
+and device id 0x1003 (virtio console), as defined in the spec, so the
+kernel will detect it as it would do with any other PCI device.
+
+During the PCI enumeration process, if a device is found to match the
+virtio-pci driver (according to the virtio-pci device table, any PCI
+device with vendor id = 0x1af4)::
+
+ /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
+ static const struct pci_device_id virtio_pci_id_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
+ { 0 }
+ };
+
+then the virtio-pci driver is probed and, if the probing goes well, the
+device is registered to the virtio bus::
+
+ static int virtio_pci_probe(struct pci_dev *pci_dev,
+ const struct pci_device_id *id)
+ {
+ ...
+
+ if (force_legacy) {
+ rc = virtio_pci_legacy_probe(vp_dev);
+ /* Also try modern mode if we can't map BAR0 (no IO space). */
+ if (rc == -ENODEV || rc == -ENOMEM)
+ rc = virtio_pci_modern_probe(vp_dev);
+ if (rc)
+ goto err_probe;
+ } else {
+ rc = virtio_pci_modern_probe(vp_dev);
+ if (rc == -ENODEV)
+ rc = virtio_pci_legacy_probe(vp_dev);
+ if (rc)
+ goto err_probe;
+ }
+
+ ...
+
+ rc = register_virtio_device(&vp_dev->vdev);
+
+When the device is registered to the virtio bus the kernel will look
+for a driver in the bus that can handle the device and call that
+driver's ``probe`` method.
+
+At this point, the virtqueues will be allocated and configured by
+calling the appropriate ``virtio_find`` helper function, such as
+virtio_find_single_vq() or virtio_find_vqs(), which will end up calling
+a transport-specific ``find_vqs`` method.
+
+
+References
+==========
+
+_`[1]` Virtio Spec v1.2:
+https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html
+
+.. Check for later versions of the spec as well.
+
+_`[2]` Virtqueues and virtio ring: How the data travels
+https://www.redhat.com/en/blog/virtqueues-and-virtio-ring-how-data-travels
+
+.. rubric:: Footnotes
+
+.. [#f1] that's why they may be also referred to as virtrings.
diff --git a/Documentation/driver-api/virtio/writing_virtio_drivers.rst b/Documentation/driver-api/virtio/writing_virtio_drivers.rst
new file mode 100644
index 000000000000..e14c58796d25
--- /dev/null
+++ b/Documentation/driver-api/virtio/writing_virtio_drivers.rst
@@ -0,0 +1,197 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _writing_virtio_drivers:
+
+======================
+Writing Virtio Drivers
+======================
+
+Introduction
+============
+
+This document serves as a basic guideline for driver programmers that
+need to hack a new virtio driver or understand the essentials of the
+existing ones. See :ref:`Virtio on Linux <virtio>` for a general
+overview of virtio.
+
+
+Driver boilerplate
+==================
+
+As a bare minimum, a virtio driver needs to register in the virtio bus
+and configure the virtqueues for the device according to its spec, the
+configuration of the virtqueues in the driver side must match the
+virtqueue definitions in the device. A basic driver skeleton could look
+like this::
+
+ #include <linux/virtio.h>
+ #include <linux/virtio_ids.h>
+ #include <linux/virtio_config.h>
+ #include <linux/module.h>
+
+ /* device private data (one per device) */
+ struct virtio_dummy_dev {
+ struct virtqueue *vq;
+ };
+
+ static void virtio_dummy_recv_cb(struct virtqueue *vq)
+ {
+ struct virtio_dummy_dev *dev = vq->vdev->priv;
+ char *buf;
+ unsigned int len;
+
+ while ((buf = virtqueue_get_buf(dev->vq, &len)) != NULL) {
+ /* process the received data */
+ }
+ }
+
+ static int virtio_dummy_probe(struct virtio_device *vdev)
+ {
+ struct virtio_dummy_dev *dev = NULL;
+
+ /* initialize device data */
+ dev = kzalloc(sizeof(struct virtio_dummy_dev), GFP_KERNEL);
+ if (!dev)
+ return -ENOMEM;
+
+ /* the device has a single virtqueue */
+ dev->vq = virtio_find_single_vq(vdev, virtio_dummy_recv_cb, "input");
+ if (IS_ERR(dev->vq)) {
+ kfree(dev);
+ return PTR_ERR(dev->vq);
+
+ }
+ vdev->priv = dev;
+
+ /* from this point on, the device can notify and get callbacks */
+ virtio_device_ready(vdev);
+
+ return 0;
+ }
+
+ static void virtio_dummy_remove(struct virtio_device *vdev)
+ {
+ struct virtio_dummy_dev *dev = vdev->priv;
+
+ /*
+ * disable vq interrupts: equivalent to
+ * vdev->config->reset(vdev)
+ */
+ virtio_reset_device(vdev);
+
+ /* detach unused buffers */
+ while ((buf = virtqueue_detach_unused_buf(dev->vq)) != NULL) {
+ kfree(buf);
+ }
+
+ /* remove virtqueues */
+ vdev->config->del_vqs(vdev);
+
+ kfree(dev);
+ }
+
+ static const struct virtio_device_id id_table[] = {
+ { VIRTIO_ID_DUMMY, VIRTIO_DEV_ANY_ID },
+ { 0 },
+ };
+
+ static struct virtio_driver virtio_dummy_driver = {
+ .driver.name = KBUILD_MODNAME,
+ .driver.owner = THIS_MODULE,
+ .id_table = id_table,
+ .probe = virtio_dummy_probe,
+ .remove = virtio_dummy_remove,
+ };
+
+ module_virtio_driver(virtio_dummy_driver);
+ MODULE_DEVICE_TABLE(virtio, id_table);
+ MODULE_DESCRIPTION("Dummy virtio driver");
+ MODULE_LICENSE("GPL");
+
+The device id ``VIRTIO_ID_DUMMY`` here is a placeholder, virtio drivers
+should be added only for devices that are defined in the spec, see
+include/uapi/linux/virtio_ids.h. Device ids need to be at least reserved
+in the virtio spec before being added to that file.
+
+If your driver doesn't have to do anything special in its ``init`` and
+``exit`` methods, you can use the module_virtio_driver() helper to
+reduce the amount of boilerplate code.
+
+The ``probe`` method does the minimum driver setup in this case
+(memory allocation for the device data) and initializes the
+virtqueue. virtio_device_ready() is used to enable the virtqueue and to
+notify the device that the driver is ready to manage the device
+("DRIVER_OK"). The virtqueues are anyway enabled automatically by the
+core after ``probe`` returns.
+
+.. kernel-doc:: include/linux/virtio_config.h
+ :identifiers: virtio_device_ready
+
+In any case, the virtqueues need to be enabled before adding buffers to
+them.
+
+Sending and receiving data
+==========================
+
+The virtio_dummy_recv_cb() callback in the code above will be triggered
+when the device notifies the driver after it finishes processing a
+descriptor or descriptor chain, either for reading or writing. However,
+that's only the second half of the virtio device-driver communication
+process, as the communication is always started by the driver regardless
+of the direction of the data transfer.
+
+To configure a buffer transfer from the driver to the device, first you
+have to add the buffers -- packed as `scatterlists` -- to the
+appropriate virtqueue using any of the virtqueue_add_inbuf(),
+virtqueue_add_outbuf() or virtqueue_add_sgs(), depending on whether you
+need to add one input `scatterlist` (for the device to fill in), one
+output `scatterlist` (for the device to consume) or multiple
+`scatterlists`, respectively. Then, once the virtqueue is set up, a call
+to virtqueue_kick() sends a notification that will be serviced by the
+hypervisor that implements the device::
+
+ struct scatterlist sg[1];
+ sg_init_one(sg, buffer, BUFLEN);
+ virtqueue_add_inbuf(dev->vq, sg, 1, buffer, GFP_ATOMIC);
+ virtqueue_kick(dev->vq);
+
+.. kernel-doc:: drivers/virtio/virtio_ring.c
+ :identifiers: virtqueue_add_inbuf
+
+.. kernel-doc:: drivers/virtio/virtio_ring.c
+ :identifiers: virtqueue_add_outbuf
+
+.. kernel-doc:: drivers/virtio/virtio_ring.c
+ :identifiers: virtqueue_add_sgs
+
+Then, after the device has read or written the buffers prepared by the
+driver and notifies it back, the driver can call virtqueue_get_buf() to
+read the data produced by the device (if the virtqueue was set up with
+input buffers) or simply to reclaim the buffers if they were already
+consumed by the device:
+
+.. kernel-doc:: drivers/virtio/virtio_ring.c
+ :identifiers: virtqueue_get_buf_ctx
+
+The virtqueue callbacks can be disabled and re-enabled using the
+virtqueue_disable_cb() and the family of virtqueue_enable_cb() functions
+respectively. See drivers/virtio/virtio_ring.c for more details:
+
+.. kernel-doc:: drivers/virtio/virtio_ring.c
+ :identifiers: virtqueue_disable_cb
+
+.. kernel-doc:: drivers/virtio/virtio_ring.c
+ :identifiers: virtqueue_enable_cb
+
+But note that some spurious callbacks can still be triggered under
+certain scenarios. The way to disable callbacks reliably is to reset the
+device or the virtqueue (virtio_reset_device()).
+
+
+References
+==========
+
+_`[1]` Virtio Spec v1.2:
+https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html
+
+Check for later versions of the spec as well.