diff options
Diffstat (limited to 'Documentation/networking/device_drivers')
12 files changed, 710 insertions, 107 deletions
diff --git a/Documentation/networking/device_drivers/cellular/qualcomm/rmnet.rst b/Documentation/networking/device_drivers/cellular/qualcomm/rmnet.rst index 70643b58de05..4118384cf8eb 100644 --- a/Documentation/networking/device_drivers/cellular/qualcomm/rmnet.rst +++ b/Documentation/networking/device_drivers/cellular/qualcomm/rmnet.rst @@ -27,34 +27,136 @@ these MAP frames and send them to appropriate PDN's. 2. Packet format ================ -a. MAP packet (data / control) +a. MAP packet v1 (data / control) -MAP header has the same endianness of the IP packet. +MAP header fields are in big endian format. Packet format:: - Bit 0 1 2-7 8 - 15 16 - 31 + Bit 0 1 2-7 8-15 16-31 Function Command / Data Reserved Pad Multiplexer ID Payload length - Bit 32 - x - Function Raw Bytes + + Bit 32-x + Function Raw bytes Command (1)/ Data (0) bit value is to indicate if the packet is a MAP command -or data packet. Control packet is used for transport level flow control. Data +or data packet. Command packet is used for transport level flow control. Data packets are standard IP packets. -Reserved bits are usually zeroed out and to be ignored by receiver. +Reserved bits must be zero when sent and ignored when received. -Padding is number of bytes to be added for 4 byte alignment if required by -hardware. +Padding is the number of bytes to be appended to the payload to +ensure 4 byte alignment. Multiplexer ID is to indicate the PDN on which data has to be sent. Payload length includes the padding length but does not include MAP header length. -b. MAP packet (command specific):: +b. Map packet v4 (data / control) + +MAP header fields are in big endian format. + +Packet format:: + + Bit 0 1 2-7 8-15 16-31 + Function Command / Data Reserved Pad Multiplexer ID Payload length + + Bit 32-(x-33) (x-32)-x + Function Raw bytes Checksum offload header + +Command (1)/ Data (0) bit value is to indicate if the packet is a MAP command +or data packet. Command packet is used for transport level flow control. Data +packets are standard IP packets. + +Reserved bits must be zero when sent and ignored when received. + +Padding is the number of bytes to be appended to the payload to +ensure 4 byte alignment. + +Multiplexer ID is to indicate the PDN on which data has to be sent. + +Payload length includes the padding length but does not include MAP header +length. + +Checksum offload header, has the information about the checksum processing done +by the hardware.Checksum offload header fields are in big endian format. + +Packet format:: + + Bit 0-14 15 16-31 + Function Reserved Valid Checksum start offset + + Bit 31-47 48-64 + Function Checksum length Checksum value + +Reserved bits must be zero when sent and ignored when received. + +Valid bit indicates whether the partial checksum is calculated and is valid. +Set to 1, if its is valid. Set to 0 otherwise. + +Padding is the number of bytes to be appended to the payload to +ensure 4 byte alignment. + +Checksum start offset, Indicates the offset in bytes from the beginning of the +IP header, from which modem computed checksum. + +Checksum length is the Length in bytes starting from CKSUM_START_OFFSET, +over which checksum is computed. + +Checksum value, indicates the checksum computed. + +c. MAP packet v5 (data / control) + +MAP header fields are in big endian format. + +Packet format:: + + Bit 0 1 2-7 8-15 16-31 + Function Command / Data Next header Pad Multiplexer ID Payload length + + Bit 32-x + Function Raw bytes + +Command (1)/ Data (0) bit value is to indicate if the packet is a MAP command +or data packet. Command packet is used for transport level flow control. Data +packets are standard IP packets. + +Next header is used to indicate the presence of another header, currently is +limited to checksum header. + +Padding is the number of bytes to be appended to the payload to +ensure 4 byte alignment. + +Multiplexer ID is to indicate the PDN on which data has to be sent. + +Payload length includes the padding length but does not include MAP header +length. + +d. Checksum offload header v5 + +Checksum offload header fields are in big endian format. + + Bit 0 - 6 7 8-15 16-31 + Function Header Type Next Header Checksum Valid Reserved + +Header Type is to indicate the type of header, this usually is set to CHECKSUM + +Header types += ========================================== +0 Reserved +1 Reserved +2 checksum header + +Checksum Valid is to indicate whether the header checksum is valid. Value of 1 +implies that checksum is calculated on this packet and is valid, value of 0 +indicates that the calculated packet checksum is invalid. + +Reserved bits must be zero when sent and ignored when received. + +e. MAP packet v1/v5 (command specific):: - Bit 0 1 2-7 8 - 15 16 - 31 + Bit 0 1 2-7 8 - 15 16 - 31 Function Command Reserved Pad Multiplexer ID Payload length Bit 32 - 39 40 - 45 46 - 47 48 - 63 Function Command name Reserved Command Type Reserved @@ -74,7 +176,7 @@ Command types 3 is for error during processing of commands = ========================================== -c. Aggregation +f. Aggregation Aggregation is multiple MAP packets (can be data or command) delivered to rmnet in a single linear skb. rmnet will process the individual diff --git a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst index f8c6469f2bd2..01b2a69b0cb0 100644 --- a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst +++ b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst @@ -11,12 +11,12 @@ ENA is a networking interface designed to make good use of modern CPU features and system architectures. The ENA device exposes a lightweight management interface with a -minimal set of memory mapped registers and extendable command set +minimal set of memory mapped registers and extendible command set through an Admin Queue. The driver supports a range of ENA devices, is link-speed independent -(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has -a negotiated and extendable feature set. +(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc), and has +a negotiated and extendible feature set. Some ENA devices support SR-IOV. This driver is used for both the SR-IOV Physical Function (PF) and Virtual Function (VF) devices. @@ -27,9 +27,9 @@ is advertised by the device via the Admin Queue), a dedicated MSI-X interrupt vector per Tx/Rx queue pair, adaptive interrupt moderation, and CPU cacheline optimized data placement. -The ENA driver supports industry standard TCP/IP offload features such -as checksum offload and TCP transmit segmentation offload (TSO). -Receive-side scaling (RSS) is supported for multi-core scaling. +The ENA driver supports industry standard TCP/IP offload features such as +checksum offload. Receive-side scaling (RSS) is supported for multi-core +scaling. The ENA driver and its corresponding devices implement health monitoring mechanisms such as watchdog, enabling the device and driver @@ -38,22 +38,20 @@ debug logs. Some of the ENA devices support a working mode called Low-latency Queue (LLQ), which saves several more microseconds. - ENA Source Code Directory Structure =================================== ================= ====================================================== ena_com.[ch] Management communication layer. This layer is - responsible for the handling all the management - (admin) communication between the device and the - driver. + responsible for the handling all the management + (admin) communication between the device and the + driver. ena_eth_com.[ch] Tx/Rx data path. ena_admin_defs.h Definition of ENA management interface. ena_eth_io_defs.h Definition of ENA data path interface. ena_common_defs.h Common definitions for ena_com layer. ena_regs_defs.h Definition of ENA PCI memory-mapped (MMIO) registers. ena_netdev.[ch] Main Linux kernel driver. -ena_syfsfs.[ch] Sysfs files. ena_ethtool.c ethtool callbacks. ena_pci_id_tbl.h Supported device IDs. ================= ====================================================== @@ -69,7 +67,7 @@ ENA management interface is exposed by means of: - Asynchronous Event Notification Queue (AENQ) ENA device MMIO Registers are accessed only during driver -initialization and are not involved in further normal device +initialization and are not used during further normal device operation. AQ is used for submitting management commands, and the @@ -100,28 +98,27 @@ group may have multiple syndromes, as shown below The events are: - ==================== =============== - Group Syndrome - ==================== =============== - Link state change **X** - Fatal error **X** - Notification Suspend traffic - Notification Resume traffic - Keep-Alive **X** - ==================== =============== +==================== =============== +Group Syndrome +==================== =============== +Link state change **X** +Fatal error **X** +Notification Suspend traffic +Notification Resume traffic +Keep-Alive **X** +==================== =============== ACQ and AENQ share the same MSI-X vector. -Keep-Alive is a special mechanism that allows monitoring of the -device's health. The driver maintains a watchdog (WD) handler which, -if fired, logs the current state and statistics then resets and -restarts the ENA device and driver. A Keep-Alive event is delivered by -the device every second. The driver re-arms the WD upon reception of a -Keep-Alive event. A missed Keep-Alive event causes the WD handler to -fire. +Keep-Alive is a special mechanism that allows monitoring the device's health. +A Keep-Alive event is delivered by the device every second. +The driver maintains a watchdog (WD) handler which logs the current state and +statistics. If the keep-alive events aren't delivered as expected the WD resets +the device and the driver. Data Path Interface =================== + I/O operations are based on Tx and Rx Submission Queues (Tx SQ and Rx SQ correspondingly). Each SQ has a completion queue (CQ) associated with it. @@ -131,26 +128,24 @@ physical memory. The ENA driver supports two Queue Operation modes for Tx SQs: -- Regular mode +- **Regular mode:** + In this mode the Tx SQs reside in the host's memory. The ENA + device fetches the ENA Tx descriptors and packet data from host + memory. - * In this mode the Tx SQs reside in the host's memory. The ENA - device fetches the ENA Tx descriptors and packet data from host - memory. +- **Low Latency Queue (LLQ) mode or "push-mode":** + In this mode the driver pushes the transmit descriptors and the + first 128 bytes of the packet directly to the ENA device memory + space. The rest of the packet payload is fetched by the + device. For this operation mode, the driver uses a dedicated PCI + device memory BAR, which is mapped with write-combine capability. -- Low Latency Queue (LLQ) mode or "push-mode". - - * In this mode the driver pushes the transmit descriptors and the - first 128 bytes of the packet directly to the ENA device memory - space. The rest of the packet payload is fetched by the - device. For this operation mode, the driver uses a dedicated PCI - device memory BAR, which is mapped with write-combine capability. + **Note that** not all ENA devices support LLQ, and this feature is negotiated + with the device upon initialization. If the ENA device does not + support LLQ mode, the driver falls back to the regular mode. The Rx SQs support only the regular mode. -Note: Not all ENA devices support LLQ, and this feature is negotiated - with the device upon initialization. If the ENA device does not - support LLQ mode, the driver falls back to the regular mode. - The driver supports multi-queue for both Tx and Rx. This has various benefits: @@ -165,6 +160,7 @@ benefits: Interrupt Modes =============== + The driver assigns a single MSI-X vector per queue pair (for both Tx and Rx directions). The driver assigns an additional dedicated MSI-X vector for management (for ACQ and AENQ). @@ -190,20 +186,21 @@ unmasked by the driver after NAPI processing is complete. Interrupt Moderation ==================== + ENA driver and device can operate in conventional or adaptive interrupt moderation mode. -In conventional mode the driver instructs device to postpone interrupt +**In conventional mode** the driver instructs device to postpone interrupt posting according to static interrupt delay value. The interrupt delay -value can be configured through ethtool(8). The following ethtool -parameters are supported by the driver: tx-usecs, rx-usecs +value can be configured through `ethtool(8)`. The following `ethtool` +parameters are supported by the driver: ``tx-usecs``, ``rx-usecs`` -In adaptive interrupt moderation mode the interrupt delay value is +**In adaptive interrupt** moderation mode the interrupt delay value is updated by the driver dynamically and adjusted every NAPI cycle according to the traffic nature. -Adaptive coalescing can be switched on/off through ethtool(8) -adaptive_rx on|off parameter. +Adaptive coalescing can be switched on/off through `ethtool(8)`'s +:code:`adaptive_rx on|off` parameter. More information about Adaptive Interrupt Moderation (DIM) can be found in Documentation/networking/net_dim.rst @@ -214,17 +211,10 @@ The rx_copybreak is initialized by default to ENA_DEFAULT_RX_COPYBREAK and can be configured by the ETHTOOL_STUNABLE command of the SIOCETHTOOL ioctl. -SKB -=== -The driver-allocated SKB for frames received from Rx handling using -NAPI context. The allocation method depends on the size of the packet. -If the frame length is larger than rx_copybreak, napi_get_frags() -is used, otherwise netdev_alloc_skb_ip_align() is used, the buffer -content is copied (by CPU) to the SKB, and the buffer is recycled. - Statistics ========== -The user can obtain ENA device and driver statistics using ethtool. + +The user can obtain ENA device and driver statistics using `ethtool`. The driver can collect regular or extended statistics (including per-queue stats) from the device. @@ -232,22 +222,23 @@ In addition the driver logs the stats to syslog upon device reset. MTU === + The driver supports an arbitrarily large MTU with a maximum that is negotiated with the device. The driver configures MTU using the SetFeature command (ENA_ADMIN_MTU property). The user can change MTU -via ip(8) and similar legacy tools. +via `ip(8)` and similar legacy tools. Stateless Offloads ================== + The ENA driver supports: -- TSO over IPv4/IPv6 -- TSO with ECN - IPv4 header checksum offload - TCP/UDP over IPv4/IPv6 checksum offloads RSS === + - The ENA device supports RSS that allows flexible Rx traffic steering. - Toeplitz and CRC32 hash functions are supported. @@ -260,41 +251,42 @@ RSS function delivered in the Rx CQ descriptor is set in the received SKB. - The user can provide a hash key, hash function, and configure the - indirection table through ethtool(8). + indirection table through `ethtool(8)`. DATA PATH ========= + Tx -- -ena_start_xmit() is called by the stack. This function does the following: +:code:`ena_start_xmit()` is called by the stack. This function does the following: -- Maps data buffers (skb->data and frags). -- Populates ena_buf for the push buffer (if the driver and device are - in push mode.) +- Maps data buffers (``skb->data`` and frags). +- Populates ``ena_buf`` for the push buffer (if the driver and device are + in push mode). - Prepares ENA bufs for the remaining frags. -- Allocates a new request ID from the empty req_id ring. The request +- Allocates a new request ID from the empty ``req_id`` ring. The request ID is the index of the packet in the Tx info. This is used for - out-of-order TX completions. + out-of-order Tx completions. - Adds the packet to the proper place in the Tx ring. -- Calls ena_com_prepare_tx(), an ENA communication layer that converts - the ena_bufs to ENA descriptors (and adds meta ENA descriptors as - needed.) +- Calls :code:`ena_com_prepare_tx()`, an ENA communication layer that converts + the ``ena_bufs`` to ENA descriptors (and adds meta ENA descriptors as + needed). * This function also copies the ENA descriptors and the push buffer - to the Device memory space (if in push mode.) + to the Device memory space (if in push mode). -- Writes doorbell to the ENA device. +- Writes a doorbell to the ENA device. - When the ENA device finishes sending the packet, a completion interrupt is raised. - The interrupt handler schedules NAPI. -- The ena_clean_tx_irq() function is called. This function handles the +- The :code:`ena_clean_tx_irq()` function is called. This function handles the completion descriptors generated by the ENA, with a single completion descriptor per completed packet. - * req_id is retrieved from the completion descriptor. The tx_info of - the packet is retrieved via the req_id. The data buffers are - unmapped and req_id is returned to the empty req_id ring. + * ``req_id`` is retrieved from the completion descriptor. The ``tx_info`` of + the packet is retrieved via the ``req_id``. The data buffers are + unmapped and ``req_id`` is returned to the empty ``req_id`` ring. * The function stops when the completion descriptors are completed or the budget is reached. @@ -303,12 +295,11 @@ Rx - When a packet is received from the ENA device. - The interrupt handler schedules NAPI. -- The ena_clean_rx_irq() function is called. This function calls - ena_rx_pkt(), an ENA communication layer function, which returns the - number of descriptors used for a new unhandled packet, and zero if +- The :code:`ena_clean_rx_irq()` function is called. This function calls + :code:`ena_com_rx_pkt()`, an ENA communication layer function, which returns the + number of descriptors used for a new packet, and zero if no new packet is found. -- Then it calls the ena_clean_rx_irq() function. -- ena_eth_rx_skb() checks packet length: +- :code:`ena_rx_skb()` checks packet length: * If the packet is small (len < rx_copybreak), the driver allocates a SKB for the new packet, and copies the packet payload into the @@ -317,9 +308,10 @@ Rx - In this way the original data buffer is not passed to the stack and is reused for future Rx packets. - * Otherwise the function unmaps the Rx buffer, then allocates the - new SKB structure and hooks the Rx buffer to the SKB frags. + * Otherwise the function unmaps the Rx buffer, sets the first + descriptor as `skb`'s linear part and the other descriptors as the + `skb`'s frags. - The new SKB is updated with the necessary information (protocol, - checksum hw verify result, etc.), and then passed to the network - stack, using the NAPI interface function napi_gro_receive(). + checksum hw verify result, etc), and then passed to the network + stack, using the NAPI interface function :code:`napi_gro_receive()`. diff --git a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/dpio-driver.rst b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/dpio-driver.rst index c50fd46631e0..e4ebfe62a183 100644 --- a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/dpio-driver.rst +++ b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/dpio-driver.rst @@ -1,5 +1,6 @@ .. include:: <isonum.txt> +=================================== DPAA2 DPIO (Data Path I/O) Overview =================================== diff --git a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/index.rst b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/index.rst index ee40fcc5ddff..62f4a4aff6ec 100644 --- a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/index.rst +++ b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/index.rst @@ -9,3 +9,4 @@ DPAA2 Documentation dpio-driver ethernet-driver mac-phy-support + switch-driver diff --git a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst new file mode 100644 index 000000000000..8bf411b857d4 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst @@ -0,0 +1,217 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. include:: <isonum.txt> + +=================== +DPAA2 Switch driver +=================== + +:Copyright: |copy| 2021 NXP + +The DPAA2 Switch driver probes on the Datapath Switch (DPSW) object which can +be instantiated on the following DPAA2 SoCs and their variants: LS2088A and +LX2160A. + +The driver uses the switch device driver model and exposes each switch port as +a network interface, which can be included in a bridge or used as a standalone +interface. Traffic switched between ports is offloaded into the hardware. + +The DPSW can have ports connected to DPNIs or to DPMACs for external access. +:: + + [ethA] [ethB] [ethC] [ethD] [ethE] [ethF] + : : : : : : + : : : : : : + [dpaa2-eth] [dpaa2-eth] [ dpaa2-switch ] + : : : : : : kernel + ============================================================================= + : : : : : : hardware + [DPNI] [DPNI] [============= DPSW =================] + | | | | | | + | ---------- | [DPMAC] [DPMAC] + ------------------------------- | | + | | + [PHY] [PHY] + +Creating an Ethernet Switch +=========================== + +The dpaa2-switch driver probes on DPSW devices found on the fsl-mc bus. These +devices can be either created statically through the boot time configuration +file - DataPath Layout (DPL) - or at runtime using the DPAA2 object APIs +(incorporated already into the restool userspace tool). + +At the moment, the dpaa2-switch driver imposes the following restrictions on +the DPSW object that it will probe: + + * The minimum number of FDBs should be at least equal to the number of switch + interfaces. This is necessary so that separation of switch ports can be + done, ie when not under a bridge, each switch port will have its own FDB. + :: + + fsl_dpaa2_switch dpsw.0: The number of FDBs is lower than the number of ports, cannot probe + + * Both the broadcast and flooding configuration should be per FDB. This + enables the driver to restrict the broadcast and flooding domains of each + FDB depending on the switch ports that are sharing it (aka are under the + same bridge). + :: + + fsl_dpaa2_switch dpsw.0: Flooding domain is not per FDB, cannot probe + fsl_dpaa2_switch dpsw.0: Broadcast domain is not per FDB, cannot probe + + * The control interface of the switch should not be disabled + (DPSW_OPT_CTRL_IF_DIS not passed as a create time option). Without the + control interface, the driver is not capable to provide proper Rx/Tx traffic + support on the switch port netdevices. + :: + + fsl_dpaa2_switch dpsw.0: Control Interface is disabled, cannot probe + +Besides the configuration of the actual DPSW object, the dpaa2-switch driver +will need the following DPAA2 objects: + + * 1 DPMCP - A Management Command Portal object is needed for any interraction + with the MC firmware. + + * 1 DPBP - A Buffer Pool is used for seeding buffers intended for the Rx path + on the control interface. + + * Access to at least one DPIO object (Software Portal) is needed for any + enqueue/dequeue operation to be performed on the control interface queues. + The DPIO object will be shared, no need for a private one. + +Switching features +================== + +The driver supports the configuration of L2 forwarding rules in hardware for +port bridging as well as standalone usage of the independent switch interfaces. + +The hardware is not configurable with respect to VLAN awareness, thus any DPAA2 +switch port should be used only in usecases with a VLAN aware bridge:: + + $ ip link add dev br0 type bridge vlan_filtering 1 + + $ ip link add dev br1 type bridge + $ ip link set dev ethX master br1 + Error: fsl_dpaa2_switch: Cannot join a VLAN-unaware bridge + +Topology and loop detection through STP is supported when ``stp_state 1`` is +used at bridge create :: + + $ ip link add dev br0 type bridge vlan_filtering 1 stp_state 1 + +L2 FDB manipulation (add/delete/dump) is supported. + +HW FDB learning can be configured on each switch port independently through +bridge commands. When the HW learning is disabled, a fast age procedure will be +run and any previously learnt addresses will be removed. +:: + + $ bridge link set dev ethX learning off + $ bridge link set dev ethX learning on + +Restricting the unknown unicast and multicast flooding domain is supported, but +not independently of each other:: + + $ ip link set dev ethX type bridge_slave flood off mcast_flood off + $ ip link set dev ethX type bridge_slave flood off mcast_flood on + Error: fsl_dpaa2_switch: Cannot configure multicast flooding independently of unicast. + +Broadcast flooding on a switch port can be disabled/enabled through the brport sysfs:: + + $ echo 0 > /sys/bus/fsl-mc/devices/dpsw.Y/net/ethX/brport/broadcast_flood + +Offloads +======== + +Routing actions (redirect, trap, drop) +-------------------------------------- + +The DPAA2 switch is able to offload flow-based redirection of packets making +use of ACL tables. Shared filter blocks are supported by sharing a single ACL +table between multiple ports. + +The following flow keys are supported: + + * Ethernet: dst_mac/src_mac + * IPv4: dst_ip/src_ip/ip_proto/tos + * VLAN: vlan_id/vlan_prio/vlan_tpid/vlan_dei + * L4: dst_port/src_port + +Also, the matchall filter can be used to redirect the entire traffic received +on a port. + +As per flow actions, the following are supported: + + * drop + * mirred egress redirect + * trap + +Each ACL entry (filter) can be setup with only one of the listed +actions. + +Example 1: send frames received on eth4 with a SA of 00:01:02:03:04:05 to the +CPU:: + + $ tc qdisc add dev eth4 clsact + $ tc filter add dev eth4 ingress flower src_mac 00:01:02:03:04:05 skip_sw action trap + +Example 2: drop frames received on eth4 with VID 100 and PCP of 3:: + + $ tc filter add dev eth4 ingress protocol 802.1q flower skip_sw vlan_id 100 vlan_prio 3 action drop + +Example 3: redirect all frames received on eth4 to eth1:: + + $ tc filter add dev eth4 ingress matchall action mirred egress redirect dev eth1 + +Example 4: Use a single shared filter block on both eth5 and eth6:: + + $ tc qdisc add dev eth5 ingress_block 1 clsact + $ tc qdisc add dev eth6 ingress_block 1 clsact + $ tc filter add block 1 ingress flower dst_mac 00:01:02:03:04:04 skip_sw \ + action trap + $ tc filter add block 1 ingress protocol ipv4 flower src_ip 192.168.1.1 skip_sw \ + action mirred egress redirect dev eth3 + +Mirroring +~~~~~~~~~ + +The DPAA2 switch supports only per port mirroring and per VLAN mirroring. +Adding mirroring filters in shared blocks is also supported. + +When using the tc-flower classifier with the 802.1q protocol, only the +''vlan_id'' key will be accepted. Mirroring based on any other fields from the +802.1q protocol will be rejected:: + + $ tc qdisc add dev eth8 ingress_block 1 clsact + $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_prio 3 action mirred egress mirror dev eth6 + Error: fsl_dpaa2_switch: Only matching on VLAN ID supported. + We have an error talking to the kernel + +If a mirroring VLAN filter is requested on a port, the VLAN must to be +installed on the switch port in question either using ''bridge'' or by creating +a VLAN upper device if the switch port is used as a standalone interface:: + + $ tc qdisc add dev eth8 ingress_block 1 clsact + $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 200 action mirred egress mirror dev eth6 + Error: VLAN must be installed on the switch port. + We have an error talking to the kernel + + $ bridge vlan add vid 200 dev eth8 + $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 200 action mirred egress mirror dev eth6 + + $ ip link add link eth8 name eth8.200 type vlan id 200 + $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 200 action mirred egress mirror dev eth6 + +Also, it should be noted that the mirrored traffic will be subject to the same +egress restrictions as any other traffic. This means that when a mirrored +packet will reach the mirror port, if the VLAN found in the packet is not +installed on the port it will get dropped. + +The DPAA2 switch supports only a single mirroring destination, thus multiple +mirror rules can be installed but their ''to'' port has to be the same:: + + $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 200 action mirred egress mirror dev eth6 + $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 100 action mirred egress mirror dev eth7 + Error: fsl_dpaa2_switch: Multiple mirror ports not supported. + We have an error talking to the kernel diff --git a/Documentation/networking/device_drivers/ethernet/google/gve.rst b/Documentation/networking/device_drivers/ethernet/google/gve.rst index 793693cef6e3..6d73ee78f3d7 100644 --- a/Documentation/networking/device_drivers/ethernet/google/gve.rst +++ b/Documentation/networking/device_drivers/ethernet/google/gve.rst @@ -47,13 +47,24 @@ The driver interacts with the device in the following ways: - Transmit and Receive Queues - See description below +Descriptor Formats +------------------ +GVE supports two descriptor formats: GQI and DQO. These two formats have +entirely different descriptors, which will be described below. + Registers --------- -All registers are MMIO and big endian. +All registers are MMIO. The registers are used for initializing and configuring the device as well as querying device status in response to management interrupts. +Endianness +---------- +- Admin Queue messages and registers are all Big Endian. +- GQI descriptors and datapath registers are Big Endian. +- DQO descriptors and datapath registers are Little Endian. + Admin Queue (AQ) ---------------- The Admin Queue is a PAGE_SIZE memory block, treated as an array of AQ @@ -97,10 +108,10 @@ the queues associated with that interrupt. The handler for these irqs schedule the napi for that block to run and poll the queues. -Traffic Queues --------------- -gVNIC's queues are composed of a descriptor ring and a buffer and are -assigned to a notification block. +GQI Traffic Queues +------------------ +GQI queues are composed of a descriptor ring and a buffer and are assigned to a +notification block. The descriptor rings are power-of-two-sized ring buffers consisting of fixed-size descriptors. They advance their head pointer using a __be32 @@ -121,3 +132,35 @@ Receive The buffers for receive rings are put into a data ring that is the same length as the descriptor ring and the head and tail pointers advance over the rings together. + +DQO Traffic Queues +------------------ +- Every TX and RX queue is assigned a notification block. + +- TX and RX buffers queues, which send descriptors to the device, use MMIO + doorbells to notify the device of new descriptors. + +- RX and TX completion queues, which receive descriptors from the device, use a + "generation bit" to know when a descriptor was populated by the device. The + driver initializes all bits with the "current generation". The device will + populate received descriptors with the "next generation" which is inverted + from the current generation. When the ring wraps, the current/next generation + are swapped. + +- It's the driver's responsibility to ensure that the RX and TX completion + queues are not overrun. This can be accomplished by limiting the number of + descriptors posted to HW. + +- TX packets have a 16 bit completion_tag and RX buffers have a 16 bit + buffer_id. These will be returned on the TX completion and RX queues + respectively to let the driver know which packet/buffer was completed. + +Transmit +~~~~~~~~ +A packet's buffers are DMA mapped for the device to access before transmission. +After the packet was successfully transmitted, the buffers are unmapped. + +Receive +~~~~~~~ +The driver posts fixed sized buffers to HW on the RX buffer queue. The packet +received on the associated RX queue may span multiple descriptors. diff --git a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst index 2d3f6bd969a2..ac35bd472bdc 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst @@ -466,7 +466,7 @@ network. PTP support varies among Intel devices that support this driver. Use "ethtool -T <netdev name>" to get a definitive list of PTP capabilities supported by the device. -IEEE 802.1ad (QinQ) Support +IEEE 802.1ad (QinQ) Support --------------------------- The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as @@ -523,8 +523,8 @@ of a port's bandwidth (should it be available). The sum of all the values for Maximum Bandwidth is not restricted, because no more than 100% of a port's bandwidth can ever be used. -NOTE: X710/XXV710 devices fail to enable Max VFs (64) when Multiple Functions -per Port (MFP) and SR-IOV are enabled. An error from i40e is logged that says +NOTE: X710/XXV710 devices fail to enable Max VFs (64) when Multiple Functions +per Port (MFP) and SR-IOV are enabled. An error from i40e is logged that says "add vsi failed for VF N, aq_err 16". To workaround the issue, enable less than 64 virtual functions (VFs). diff --git a/Documentation/networking/device_drivers/ethernet/intel/iavf.rst b/Documentation/networking/device_drivers/ethernet/intel/iavf.rst index 25330b7b5168..151af0a8da9c 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/iavf.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/iavf.rst @@ -113,7 +113,7 @@ which the AVF is associated. The following are base mode features: - AVF device ID - HW mailbox is used for VF to PF communications (including on Windows) -IEEE 802.1ad (QinQ) Support +IEEE 802.1ad (QinQ) Support --------------------------- The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst index 936a10f1942c..4b59cf2c599f 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst @@ -12,6 +12,7 @@ Contents - `Enabling the driver and kconfig options`_ - `Devlink info`_ - `Devlink parameters`_ +- `Bridge offload`_ - `mlx5 subfunction`_ - `mlx5 function attributes`_ - `Devlink health reporters`_ @@ -217,6 +218,37 @@ users try to enable them. $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev +Bridge offload +============== +The mlx5 driver implements support for offloading bridge rules when in switchdev +mode. Linux bridge FDBs are automatically offloaded when mlx5 switchdev +representor is attached to bridge. + +- Change device to switchdev mode:: + + $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev + +- Attach mlx5 switchdev representor 'enp8s0f0' to bridge netdev 'bridge1':: + + $ ip link set enp8s0f0 master bridge1 + +VLANs +----- +Following bridge VLAN functions are supported by mlx5: + +- VLAN filtering (including multiple VLANs per port):: + + $ ip link set bridge1 type bridge vlan_filtering 1 + $ bridge vlan add dev enp8s0f0 vid 2-3 + +- VLAN push on bridge ingress:: + + $ bridge vlan add dev enp8s0f0 vid 3 pvid + +- VLAN pop on bridge egress:: + + $ bridge vlan add dev enp8s0f0 vid 3 untagged + mlx5 subfunction ================ mlx5 supports subfunction management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface. @@ -568,3 +600,103 @@ tc and eswitch offloads tracepoints: $ cat /sys/kernel/debug/tracing/trace ... kworker/u48:7-2221 [009] ...1 1475.387435: mlx5e_rep_neigh_update: netdev: ens1f0 MAC: 24:8a:07:9a:17:9a IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_connected=1 + +Bridge offloads tracepoints: + +- mlx5_esw_bridge_fdb_entry_init: trace bridge FDB entry offloaded to mlx5:: + + $ echo mlx5:mlx5_esw_bridge_fdb_entry_init >> set_event + $ cat /sys/kernel/debug/tracing/trace + ... + kworker/u20:9-2217 [003] ...1 318.582243: mlx5_esw_bridge_fdb_entry_init: net_device=enp8s0f0_0 addr=e4:fd:05:08:00:02 vid=0 flags=0 used=0 + +- mlx5_esw_bridge_fdb_entry_cleanup: trace bridge FDB entry deleted from mlx5:: + + $ echo mlx5:mlx5_esw_bridge_fdb_entry_cleanup >> set_event + $ cat /sys/kernel/debug/tracing/trace + ... + ip-2581 [005] ...1 318.629871: mlx5_esw_bridge_fdb_entry_cleanup: net_device=enp8s0f0_1 addr=e4:fd:05:08:00:03 vid=0 flags=0 used=16 + +- mlx5_esw_bridge_fdb_entry_refresh: trace bridge FDB entry offload refreshed in + mlx5:: + + $ echo mlx5:mlx5_esw_bridge_fdb_entry_refresh >> set_event + $ cat /sys/kernel/debug/tracing/trace + ... + kworker/u20:8-3849 [003] ...1 466716: mlx5_esw_bridge_fdb_entry_refresh: net_device=enp8s0f0_0 addr=e4:fd:05:08:00:02 vid=3 flags=0 used=0 + +- mlx5_esw_bridge_vlan_create: trace bridge VLAN object add on mlx5 + representor:: + + $ echo mlx5:mlx5_esw_bridge_vlan_create >> set_event + $ cat /sys/kernel/debug/tracing/trace + ... + ip-2560 [007] ...1 318.460258: mlx5_esw_bridge_vlan_create: vid=1 flags=6 + +- mlx5_esw_bridge_vlan_cleanup: trace bridge VLAN object delete from mlx5 + representor:: + + $ echo mlx5:mlx5_esw_bridge_vlan_cleanup >> set_event + $ cat /sys/kernel/debug/tracing/trace + ... + bridge-2582 [007] ...1 318.653496: mlx5_esw_bridge_vlan_cleanup: vid=2 flags=8 + +- mlx5_esw_bridge_vport_init: trace mlx5 vport assigned with bridge upper + device:: + + $ echo mlx5:mlx5_esw_bridge_vport_init >> set_event + $ cat /sys/kernel/debug/tracing/trace + ... + ip-2560 [007] ...1 318.458915: mlx5_esw_bridge_vport_init: vport_num=1 + +- mlx5_esw_bridge_vport_cleanup: trace mlx5 vport removed from bridge upper + device:: + + $ echo mlx5:mlx5_esw_bridge_vport_cleanup >> set_event + $ cat /sys/kernel/debug/tracing/trace + ... + ip-5387 [000] ...1 573713: mlx5_esw_bridge_vport_cleanup: vport_num=1 + +Eswitch QoS tracepoints: + +- mlx5_esw_vport_qos_create: trace creation of transmit scheduler arbiter for vport:: + + $ echo mlx5:mlx5_esw_vport_qos_create >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + <...>-23496 [018] .... 73136.838831: mlx5_esw_vport_qos_create: (0000:82:00.0) vport=2 tsar_ix=4 bw_share=0, max_rate=0 group=000000007b576bb3 + +- mlx5_esw_vport_qos_config: trace configuration of transmit scheduler arbiter for vport:: + + $ echo mlx5:mlx5_esw_vport_qos_config >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + <...>-26548 [023] .... 75754.223823: mlx5_esw_vport_qos_config: (0000:82:00.0) vport=1 tsar_ix=3 bw_share=34, max_rate=10000 group=000000007b576bb3 + +- mlx5_esw_vport_qos_destroy: trace deletion of transmit scheduler arbiter for vport:: + + $ echo mlx5:mlx5_esw_vport_qos_destroy >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + <...>-27418 [004] .... 76546.680901: mlx5_esw_vport_qos_destroy: (0000:82:00.0) vport=1 tsar_ix=3 + +- mlx5_esw_group_qos_create: trace creation of transmit scheduler arbiter for rate group:: + + $ echo mlx5:mlx5_esw_group_qos_create >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + <...>-26578 [008] .... 75776.022112: mlx5_esw_group_qos_create: (0000:82:00.0) group=000000008dac63ea tsar_ix=5 + +- mlx5_esw_group_qos_config: trace configuration of transmit scheduler arbiter for rate group:: + + $ echo mlx5:mlx5_esw_group_qos_config >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + <...>-27303 [020] .... 76461.455356: mlx5_esw_group_qos_config: (0000:82:00.0) group=000000008dac63ea tsar_ix=5 bw_share=100 max_rate=20000 + +- mlx5_esw_group_qos_destroy: trace deletion of transmit scheduler arbiter for group:: + + $ echo mlx5:mlx5_esw_group_qos_destroy >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + <...>-27418 [006] .... 76547.187258: mlx5_esw_group_qos_destroy: (0000:82:00.0) group=000000007b576bb3 tsar_ix=1 diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst index d8279de7bf25..3a5a1d46e77e 100644 --- a/Documentation/networking/device_drivers/index.rst +++ b/Documentation/networking/device_drivers/index.rst @@ -18,6 +18,7 @@ Contents: qlogic/index wan/index wifi/index + wwan/index .. only:: subproject and html diff --git a/Documentation/networking/device_drivers/wwan/index.rst b/Documentation/networking/device_drivers/wwan/index.rst new file mode 100644 index 000000000000..1cb8c7371401 --- /dev/null +++ b/Documentation/networking/device_drivers/wwan/index.rst @@ -0,0 +1,18 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +WWAN Device Drivers +=================== + +Contents: + +.. toctree:: + :maxdepth: 2 + + iosm + +.. only:: subproject and html + + Indices + ======= + + * :ref:`genindex` diff --git a/Documentation/networking/device_drivers/wwan/iosm.rst b/Documentation/networking/device_drivers/wwan/iosm.rst new file mode 100644 index 000000000000..aceb0223eb46 --- /dev/null +++ b/Documentation/networking/device_drivers/wwan/iosm.rst @@ -0,0 +1,96 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +.. Copyright (C) 2020-21 Intel Corporation + +.. _iosm_driver_doc: + +=========================================== +IOSM Driver for Intel M.2 PCIe based Modems +=========================================== +The IOSM (IPC over Shared Memory) driver is a WWAN PCIe host driver developed +for linux or chrome platform for data exchange over PCIe interface between +Host platform & Intel M.2 Modem. The driver exposes interface conforming to the +MBIM protocol [1]. Any front end application ( eg: Modem Manager) could easily +manage the MBIM interface to enable data communication towards WWAN. + +Basic usage +=========== +MBIM functions are inactive when unmanaged. The IOSM driver only provides a +userspace interface MBIM "WWAN PORT" representing MBIM control channel and does +not play any role in managing the functionality. It is the job of a userspace +application to detect port enumeration and enable MBIM functionality. + +Examples of few such userspace application are: +- mbimcli (included with the libmbim [2] library), and +- Modem Manager [3] + +Management Applications to carry out below required actions for establishing +MBIM IP session: +- open the MBIM control channel +- configure network connection settings +- connect to network +- configure IP network interface + +Management application development +================================== +The driver and userspace interfaces are described below. The MBIM protocol is +described in [1] Mobile Broadband Interface Model v1.0 Errata-1. + +MBIM control channel userspace ABI +---------------------------------- + +/dev/wwan0mbim0 character device +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The driver exposes an MBIM interface to the MBIM function by implementing +MBIM WWAN Port. The userspace end of the control channel pipe is a +/dev/wwan0mbim0 character device. Application shall use this interface for +MBIM protocol communication. + +Fragmentation +~~~~~~~~~~~~~ +The userspace application is responsible for all control message fragmentation +and defragmentation as per MBIM specification. + +/dev/wwan0mbim0 write() +~~~~~~~~~~~~~~~~~~~~~~~ +The MBIM control messages from the management application must not exceed the +negotiated control message size. + +/dev/wwan0mbim0 read() +~~~~~~~~~~~~~~~~~~~~~~ +The management application must accept control messages of up the negotiated +control message size. + +MBIM data channel userspace ABI +------------------------------- + +wwan0-X network device +~~~~~~~~~~~~~~~~~~~~~~ +The IOSM driver exposes IP link interface "wwan0-X" of type "wwan" for IP +traffic. Iproute network utility is used for creating "wwan0-X" network +interface and for associating it with MBIM IP session. The Driver supports +upto 8 IP sessions for simultaneous IP communication. + +The userspace management application is responsible for creating new IP link +prior to establishing MBIM IP session where the SessionId is greater than 0. + +For example, creating new IP link for a MBIM IP session with SessionId 1: + + ip link add dev wwan0-1 parentdev-name wwan0 type wwan linkid 1 + +The driver will automatically map the "wwan0-1" network device to MBIM IP +session 1. + +References +========== +[1] "MBIM (Mobile Broadband Interface Model) Errata-1" + - https://www.usb.org/document-library/ + +[2] libmbim - "a glib-based library for talking to WWAN modems and + devices which speak the Mobile Interface Broadband Model (MBIM) + protocol" + - http://www.freedesktop.org/wiki/Software/libmbim/ + +[3] Modem Manager - "a DBus-activated daemon which controls mobile + broadband (2G/3G/4G) devices and connections" + - http://www.freedesktop.org/wiki/Software/ModemManager/ |