Diffstat (limited to 'Documentation/networking/device_drivers/ethernet')
6 files changed, 155 insertions, 41 deletions
diff --git a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
index 8bcb173e0353..5eaa3ab6c73e 100644
--- a/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
+++ b/Documentation/networking/device_drivers/ethernet/amazon/ena.rst
@@ -38,6 +38,7 @@ debug logs.
 Some of the ENA devices support a working mode called Low-latency
 Queue (LLQ), which saves several more microseconds.
 
+
 ENA Source Code Directory Structure
 ===================================
 
@@ -205,6 +206,8 @@ Adaptive coalescing can be switched on/off through `ethtool(8)`'s
 More information about Adaptive Interrupt Moderation (DIM) can be found in
 Documentation/networking/net_dim.rst
 
+.. _`RX copybreak`:
+
 RX copybreak
 ============
 The rx_copybreak is initialized by default to ENA_DEFAULT_RX_COPYBREAK
@@ -315,3 +318,34 @@ Rx
 - The new SKB is updated with the necessary information (protocol,
   checksum hw verify result, etc), and then passed to the network
   stack, using the NAPI interface function :code:`napi_gro_receive()`.
+
+Dynamic RX Buffers (DRB)
+------------------------
+
+Each RX descriptor in the RX ring is a single memory page (which is either 4KB
+or 16KB long depending on system's configurations).
+To reduce the memory allocations required when dealing with a high rate of small
+packets, the driver tries to reuse the remaining RX descriptor's space if more
+than 2KB of this page remain unused.
+
+A simple example of this mechanism is the following sequence of events:
+
+::
+
+        1. Driver allocates page-sized RX buffer and passes it to hardware
+                +----------------------+
+                |4KB RX Buffer         |
+                +----------------------+
+
+        2. A 300Bytes packet is received on this buffer
+
+        3. The driver increases the ref count on this page and returns it back to
+        HW as an RX buffer of size 4KB - 300Bytes = 3796 Bytes
+               +----+--------------------+
+               |****|3796 Bytes RX Buffer|
+               +----+--------------------+
+
+This mechanism isn't used when an XDP program is loaded, or when the
+RX packet is less than rx_copybreak bytes (in which case the packet is
+copied out of the RX buffer into the linear part of a new skb allocated
+for it and the RX buffer remains the same size, see `RX copybreak`_).
diff --git a/Documentation/networking/device_drivers/ethernet/intel/ice.rst b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
index 69695e5511f4..e4d065c55ea8 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/ice.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
@@ -84,24 +84,6 @@ Once the VM shuts down, or otherwise releases the VF, the command will
 complete.
 
 
-Important notes for SR-IOV and Link Aggregation
------------------------------------------------
-Link Aggregation is mutually exclusive with SR-IOV.
-
-- If Link Aggregation is active, SR-IOV VFs cannot be created on the PF.
-- If SR-IOV is active, you cannot set up Link Aggregation on the interface.
-
-Bridging and MACVLAN are also affected by this. If you wish to use bridging or
-MACVLAN with SR-IOV, you must set up bridging or MACVLAN before enabling
-SR-IOV. If you are using bridging or MACVLAN in conjunction with SR-IOV, and
-you want to remove the interface from the bridge or MACVLAN, you must follow
-these steps:
-
-1. Destroy SR-IOV VFs if they exist
-2. Remove the interface from the bridge or MACVLAN
-3. Recreate SRIOV VFs as needed
-
-
 Additional Features and Configurations
 ======================================
 
diff --git a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
index 5ba9015336e2..bfd233cfac35 100644
--- a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
+++ b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
@@ -13,6 +13,7 @@ Contents
 - `Drivers`_
 - `Basic packet flow`_
 - `Devlink health reporters`_
+- `Quality of service`_
 
 Overview
 ========
@@ -287,3 +288,47 @@ For example::
 	NIX_AF_ERR:
 	        NIX Error Interrupt Reg : 64
 	        Rx on unmapped PF_FUNC
+
+
+Quality of service
+==================
+
+
+Hardware algorithms used in scheduling
+--------------------------------------
+
+octeontx2 silicon and CN10K transmit interface consists of five transmit levels
+starting from SMQ/MDQ, TL4 to TL1. Each packet will traverse MDQ, TL4 to TL1
+levels. Each level contains an array of queues to support scheduling and shaping.
+The hardware uses the below algorithms depending on the priority of scheduler queues.
+Once the user creates tc classes with different priorities, the driver configures
+schedulers allocated to the class with specified priority along with rate-limiting
+configuration.
+
+1. Strict Priority
+
+      -  Once packets are submitted to MDQ, hardware picks all active MDQs having different priority
+         using strict priority.
+
+2. Round Robin
+
+      -  Active MDQs having the same priority level are chosen using round robin.
+
+
+Setup HTB offload
+-----------------
+
+1. Enable HW TC offload on the interface::
+
+        # ethtool -K <interface> hw-tc-offload on
+
+2. Create htb root::
+
+        # tc qdisc add dev <interface> clsact
+        # tc qdisc replace dev <interface> root handle 1: htb offload
+
+3. Create tc classes with different priorities::
+
+        # tc class add dev <interface> parent 1: classid 1:1 htb rate 10Gbit prio 1
+
+        # tc class add dev <interface> parent 1: classid 1:2 htb rate 10Gbit prio 7
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
index 6b2d1fe74ecf..a395df9c2751 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
@@ -797,6 +797,16 @@ Counters on the NIC port that is connected to a eSwitch.
        RoCE/UD/RC traffic) [#accel]_.
      - Acceleration
 
+   * - `vport_loopback_packets`
+     - Unicast, multicast and broadcast packets that were loop-back (received
+       and transmitted), IB/Eth [#accel]_.
+     - Acceleration
+
+   * - `vport_loopback_bytes`
+     - Unicast, multicast and broadcast bytes that were loop-back (received
+       and transmitted), IB/Eth [#accel]_.
+     - Acceleration
+
    * - `rx_steer_missed_packets`
      - Number of packets that was received by the NIC, however was discarded
        because it did not match any flow in the NIC flow table.
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst
index 3a7a714cc08f..a4edf908b707 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst
@@ -40,6 +40,7 @@ flow_steering_mode: Device flow steering mode
 ---------------------------------------------
 The flow steering mode parameter controls the flow steering mode of the driver.
 Two modes are supported:
+
 1. 'dmfs' - Device managed flow steering.
 2. 'smfs' - Software/Driver managed flow steering.
 
@@ -99,6 +100,7 @@ between representors and stacked devices.
 By default metadata is enabled on the supported devices in E-switch.
 Metadata is applicable only for E-switch in switchdev mode and
 users may disable it when NONE of the below use cases will be in use:
+
 1. HCA is in Dual/multi-port RoCE mode.
 2. VF/SF representor bonding (Usually used for Live migration)
 3. Stacked devices
@@ -180,7 +182,8 @@ User commands examples:
 
     $ devlink health diagnose pci/0000:82:00.0 reporter tx
 
-NOTE: This command has valid output only when interface is up, otherwise the command has empty output.
+.. note::
+   This command has valid output only when interface is up, otherwise the command has empty output.
 
 - Show number of tx errors indicated, number of recover flows ended successfully,
   is autorecover enabled and graceful period from last recover::
@@ -232,8 +235,9 @@ User commands examples:
 
     $ devlink health dump show pci/0000:82:00.0 reporter fw
 
-NOTE: This command can run only on the PF which has fw tracer ownership,
-running it on other PF or any VF will return "Operation not permitted".
+.. note::
+   This command can run only on the PF which has fw tracer ownership,
+   running it on other PF or any VF will return "Operation not permitted".
 
 fw fatal reporter
 -----------------
@@ -256,7 +260,8 @@ User commands examples:
 
     $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal
 
-NOTE: This command can run only on PF.
+.. note::
+   This command can run only on PF.
 
 vnic reporter
 -------------
@@ -265,28 +270,44 @@ It is responsible for querying the vnic diagnostic counters from fw and displayi
 them in realtime.
 
 Description of the vnic counters:
-total_q_under_processor_handle: number of queues in an error state due to
-an async error or errored command.
-send_queue_priority_update_flow: number of QP/SQ priority/SL update
-events.
-cq_overrun: number of times CQ entered an error state due to an
-overflow.
-async_eq_overrun: number of times an EQ mapped to async events was
-overrun.
-comp_eq_overrun: number of times an EQ mapped to completion events was
-overrun.
-quota_exceeded_command: number of commands issued and failed due to quota
-exceeded.
-invalid_command: number of commands issued and failed dues to any reason
-other than quota exceeded.
-nic_receive_steering_discard: number of packets that completed RX flow
-steering but were discarded due to a mismatch in flow table.
+
+- total_q_under_processor_handle
+        number of queues in an error state due to
+        an async error or errored command.
+- send_queue_priority_update_flow
+        number of QP/SQ priority/SL update events.
+- cq_overrun
+        number of times CQ entered an error state due to an overflow.
+- async_eq_overrun
+        number of times an EQ mapped to async events was overrun.
+- comp_eq_overrun
+        number of times an EQ mapped to completion events was overrun.
+- quota_exceeded_command
+        number of commands issued and failed due to quota exceeded.
+- invalid_command
+        number of commands issued and failed due to any reason other than quota
+        exceeded.
+- nic_receive_steering_discard
+        number of packets that completed RX flow
+        steering but were discarded due to a mismatch in flow table.
+- generated_pkt_steering_fail
+        number of packets generated by the VNIC experiencing unexpected steering
+        failure (at any point in steering flow).
+- handled_pkt_steering_fail
+        number of packets handled by the VNIC experiencing unexpected steering
+        failure (at any point in steering flow owned by the VNIC, including the FDB
+        for the eswitch owner).
 
 User commands examples:
+
-- Diagnose PF/VF vnic counters
+- Diagnose PF/VF vnic counters::
+
         $ devlink health diagnose pci/0000:82:00.1 reporter vnic
+
 - Diagnose representor vnic counters (performed by supplying devlink port of the
-  representor, which can be obtained via devlink port command)
+  representor, which can be obtained via devlink port command)::
+
         $ devlink health diagnose pci/0000:82:00.1/65537 reporter vnic
 
-NOTE: This command can run over all interfaces such as PF/VF and representor ports.
+.. note::
+   This command can run over all interfaces such as PF/VF and representor ports.
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst
index 01deedb71597..6e3f5ee8b0d0 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst
@@ -45,6 +45,28 @@ Following bridge VLAN functions are supported by mlx5:
 Subfunction
 ===========
 
+Subfunction which are spawned over the E-switch are created only with devlink
+device, and by default all the SF auxiliary devices are disabled.
+This will allow user to configure the SF before the SF have been fully probed,
+which will save time.
+
+Usage example:
+
+- Create SF::
+
+    $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11
+    $ devlink port function set pci/0000:08:00.0/32768 hw_addr 00:00:00:00:00:11 state active
+
+- Enable ETH auxiliary device::
+
+    $ devlink dev param set auxiliary/mlx5_core.sf.1 name enable_eth value true cmode driverinit
+
+- Now, in order to fully probe the SF, use devlink reload::
+
+    $ devlink dev reload auxiliary/mlx5_core.sf.1
+
+mlx5 supports ETH,rdma and vdpa (vnet) auxiliary devices devlink params (see :ref:`Documentation/networking/devlink/devlink-params.rst <devlink_params_generic>`).
+
 mlx5 supports subfunction management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface.
 A subfunction has its own function capabilities and its own resources. This |