From ea5bacaa2cec6967ed337f4d0ad6034123ca737b Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab
Date: Thu, 30 Apr 2020 18:04:03 +0200
Subject: docs: networking: convert netdev-features.txt to ReST

Not much to be done here:

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab
Signed-off-by: David S. Miller
---
 Documentation/networking/checksum-offloads.rst |   2 +-
 Documentation/networking/index.rst             |   1 +
 Documentation/networking/netdev-features.rst   | 184 +++++++++++++++++++++++++
 Documentation/networking/netdev-features.txt   | 181 ------------------------
 include/linux/netdev_features.h                |   2 +-
 5 files changed, 187 insertions(+), 183 deletions(-)
 create mode 100644 Documentation/networking/netdev-features.rst
 delete mode 100644 Documentation/networking/netdev-features.txt

diff --git a/Documentation/networking/checksum-offloads.rst b/Documentation/networking/checksum-offloads.rst
index 905c8a84b103..69b23cf6879e 100644
--- a/Documentation/networking/checksum-offloads.rst
+++ b/Documentation/networking/checksum-offloads.rst
@@ -59,7 +59,7 @@ recomputed for each resulting segment. See the skbuff.h comment (section 'E')
 for more details.
 
 A driver declares its offload capabilities in netdev->hw_features; see
-Documentation/networking/netdev-features.txt for more. Note that a device
+Documentation/networking/netdev-features.rst for more. Note that a device
 which only advertises NETIF_F_IP[V6]_CSUM must still obey the csum_start and
 csum_offset given in the SKB; if it tries to deduce these itself in hardware
 (as some NICs do) the driver should check that the values in the SKB match
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index e58f872d401d..4c6aa3db97d4 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -81,6 +81,7 @@ Contents:
    mpls-sysctl
    multiqueue
    netconsole
+   netdev-features
 
 .. only:: subproject and html
 
diff --git a/Documentation/networking/netdev-features.rst b/Documentation/networking/netdev-features.rst
new file mode 100644
index 000000000000..a2d7d7160e39
--- /dev/null
+++ b/Documentation/networking/netdev-features.rst
@@ -0,0 +1,184 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================================
+Netdev features mess and how to get out from it alive
+=====================================================
+
+Author:
+	Michał Mirosław
+
+
+
+Part I: Feature sets
+====================
+
+Long gone are the days when a network card would just take and give packets
+verbatim. Today's devices add multiple features and bugs (read: offloads)
+that relieve an OS of various tasks like generating and checking checksums,
+splitting packets, classifying them. Those capabilities and their state
+are commonly referred to as netdev features in Linux kernel world.
+
+There are currently three sets of features relevant to the driver, and
+one used internally by network core:
+
+ 1. netdev->hw_features set contains features whose state may possibly
+    be changed (enabled or disabled) for a particular device by user's
+    request. This set should be initialized in ndo_init callback and not
+    changed later.
+
+ 2. netdev->features set contains features which are currently enabled
+    for a device. This should be changed only by network core or in
+    error paths of ndo_set_features callback.
+
+ 3. netdev->vlan_features set contains features whose state is inherited
+    by child VLAN devices (limits netdev->features set). This is currently
+    used for all VLAN devices whether tags are stripped or inserted in
+    hardware or software.
+
+ 4. netdev->wanted_features set contains feature set requested by user.
+    This set is filtered by ndo_fix_features callback whenever it or
+    some device-specific conditions change. This set is internal to
+    networking core and should not be referenced in drivers.
+
+
+
+Part II: Controlling enabled features
+=====================================
+
+When current feature set (netdev->features) is to be changed, new set
+is calculated and filtered by calling ndo_fix_features callback
+and netdev_fix_features(). If the resulting set differs from current
+set, it is passed to ndo_set_features callback and (if the callback
+returns success) replaces value stored in netdev->features.
+NETDEV_FEAT_CHANGE notification is issued after that whenever current
+set might have changed.
+
+The following events trigger recalculation:
+ 1. device's registration, after ndo_init returned success
+ 2. user requested changes in features state
+ 3. netdev_update_features() is called
+
+ndo_*_features callbacks are called with rtnl_lock held. Missing callbacks
+are treated as always returning success.
+
+A driver that wants to trigger recalculation must do so by calling
+netdev_update_features() while holding rtnl_lock. This should not be done
+from ndo_*_features callbacks. netdev->features should not be modified by
+driver except by means of ndo_fix_features callback.
+
+
+
+Part III: Implementation hints
+==============================
+
+ * ndo_fix_features:
+
+All dependencies between features should be resolved here. The resulting
+set can be reduced further by networking core imposed limitations (as coded
+in netdev_fix_features()). For this reason it is safer to disable a feature
+when its dependencies are not met instead of forcing the dependency on.
+
+This callback should not modify hardware nor driver state (should be
+stateless). It can be called multiple times between successive
+ndo_set_features calls.
+
+Callback must not alter features contained in NETIF_F_SOFT_FEATURES or
+NETIF_F_NEVER_CHANGE sets. The exception is NETIF_F_VLAN_CHALLENGED but
+care must be taken as the change won't affect already configured VLANs.
+
+ * ndo_set_features:
+
+Hardware should be reconfigured to match passed feature set. The set
+should not be altered unless some error condition happens that can't
+be reliably detected in ndo_fix_features. In this case, the callback
+should update netdev->features to match resulting hardware state.
+Errors returned are not (and cannot be) propagated anywhere except dmesg.
+(Note: successful return is zero, >0 means silent error.)
+
+
+
+Part IV: Features
+=================
+
+For current list of features, see include/linux/netdev_features.h.
+This section describes semantics of some of them.
+
+ * Transmit checksumming
+
+For complete description, see comments near the top of include/linux/skbuff.h.
+
+Note: NETIF_F_HW_CSUM is a superset of NETIF_F_IP_CSUM + NETIF_F_IPV6_CSUM.
+It means that device can fill TCP/UDP-like checksum anywhere in the packets
+whatever headers there might be.
+
+ * Transmit TCP segmentation offload
+
+NETIF_F_TSO_ECN means that hardware can properly split packets with CWR bit
+set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).
+
+ * Transmit UDP segmentation offload
+
+NETIF_F_GSO_UDP_L4 accepts a single UDP header with a payload that exceeds
+gso_size. On segmentation, it segments the payload on gso_size boundaries and
+replicates the network and UDP headers (fixing up the last one if less than
+gso_size).
+
+ * Transmit DMA from high memory
+
+On platforms where this is relevant, NETIF_F_HIGHDMA signals that
+ndo_start_xmit can handle skbs with frags in high memory.
+
+ * Transmit scatter-gather
+
+Those features say that ndo_start_xmit can handle fragmented skbs:
+NETIF_F_SG --- paged skbs (skb_shinfo()->frags), NETIF_F_FRAGLIST ---
+chained skbs (skb->next/prev list).
+
+ * Software features
+
+Features contained in NETIF_F_SOFT_FEATURES are features of networking
+stack. Driver should not change behaviour based on them.
+
+ * LLTX driver (deprecated for hardware drivers)
+
+NETIF_F_LLTX is meant to be used by drivers that don't need locking at all,
+e.g. software tunnels.
+
+This is also used in a few legacy drivers that implement their
+own locking, don't use it for new (hardware) drivers.
+
+ * netns-local device
+
+NETIF_F_NETNS_LOCAL is set for devices that are not allowed to move between
+network namespaces (e.g. loopback).
+
+Don't use it in drivers.
+
+ * VLAN challenged
+
+NETIF_F_VLAN_CHALLENGED should be set for devices which can't cope with VLAN
+headers. Some drivers set this because the cards can't handle the bigger MTU.
+[FIXME: Those cases could be fixed in VLAN code by allowing only reduced-MTU
+VLANs. This may be not useful, though.]
+
+* rx-fcs
+
+This requests that the NIC append the Ethernet Frame Checksum (FCS)
+to the end of the skb data. This allows sniffers and other tools to
+read the CRC recorded by the NIC on receipt of the packet.
+
+* rx-all
+
+This requests that the NIC receive all possible frames, including errored
+frames (such as bad FCS, etc). This can be helpful when sniffing a link with
+bad packets on it. Some NICs may receive more packets if also put into normal
+PROMISC mode.
+
+* rx-gro-hw
+
+This requests that the NIC enables Hardware GRO (generic receive offload).
+Hardware GRO is basically the exact reverse of TSO, and is generally
+stricter than Hardware LRO. A packet stream merged by Hardware GRO must
+be re-segmentable by GSO or TSO back to the exact original packet stream.
+Hardware GRO is dependent on RXCSUM since every packet successfully merged
+by hardware must also have the checksum verified by hardware.
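Note (not part of the patch): the recalculation flow from Part II and the
"disable rather than force" advice from Part III can be sketched in a small
userspace mock. This is only an illustration under stated assumptions, not
kernel code: features_t, F_RXCSUM, F_GRO_HW, struct mock_dev and all mock_*
functions are hypothetical stand-ins, the bit values are made up, and the
core's own netdev_fix_features() filtering is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real types and bits live in
 * include/linux/netdev_features.h and are not reproduced here. */
typedef uint64_t features_t;

#define F_RXCSUM (1ULL << 0)	/* stand-in for the RXCSUM feature */
#define F_GRO_HW (1ULL << 1)	/* stand-in for the rx-gro-hw feature */

struct mock_dev {
	features_t wanted_features;	/* set requested by the user */
	features_t features;		/* set currently enabled */
	int set_calls;			/* times "hardware" was reconfigured */
};

/* Plays the role of ndo_fix_features: stateless, resolves dependencies.
 * Hardware GRO depends on RXCSUM, so the dependent feature is dropped
 * instead of RXCSUM being forced on. */
static features_t mock_fix_features(features_t features)
{
	if (!(features & F_RXCSUM))
		features &= ~F_GRO_HW;
	return features;
}

/* Plays the role of ndo_set_features: zero means success. */
static int mock_set_features(struct mock_dev *dev, features_t features)
{
	(void)features;		/* a real driver would program hardware here */
	dev->set_calls++;
	return 0;
}

/* Recalculation as described in Part II: filter the wanted set, and only
 * if the result differs from the current set pass it to the set callback;
 * on success it replaces the current set. */
static void mock_update_features(struct mock_dev *dev)
{
	features_t next = mock_fix_features(dev->wanted_features);

	if (next == dev->features)
		return;
	if (mock_set_features(dev, next) == 0)
		dev->features = next;
}

/* Usage example for the mock above. */
static void mock_demo(void)
{
	struct mock_dev dev = { .wanted_features = F_GRO_HW };

	mock_update_features(&dev);	/* GRO_HW alone is filtered out */
	assert(dev.features == 0 && dev.set_calls == 0);

	dev.wanted_features = F_GRO_HW | F_RXCSUM;
	mock_update_features(&dev);	/* dependency met, set is applied */
	assert(dev.features == (F_GRO_HW | F_RXCSUM) && dev.set_calls == 1);
}
```

Keeping the fix step free of side effects is what allows it to run any
number of times between two set calls, as the text requires; the mock only
touches device state in mock_set_features() and mock_update_features().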
diff --git a/Documentation/networking/netdev-features.txt b/Documentation/networking/netdev-features.txt
deleted file mode 100644
index 58dd1c1e3c65..000000000000
--- a/Documentation/networking/netdev-features.txt
+++ /dev/null
@@ -1,181 +0,0 @@
-Netdev features mess and how to get out from it alive
-=====================================================
-
-Author:
-	Michał Mirosław
-
-
-
- Part I: Feature sets
-======================
-
-Long gone are the days when a network card would just take and give packets
-verbatim. Today's devices add multiple features and bugs (read: offloads)
-that relieve an OS of various tasks like generating and checking checksums,
-splitting packets, classifying them. Those capabilities and their state
-are commonly referred to as netdev features in Linux kernel world.
-
-There are currently three sets of features relevant to the driver, and
-one used internally by network core:
-
- 1. netdev->hw_features set contains features whose state may possibly
-    be changed (enabled or disabled) for a particular device by user's
-    request. This set should be initialized in ndo_init callback and not
-    changed later.
-
- 2. netdev->features set contains features which are currently enabled
-    for a device. This should be changed only by network core or in
-    error paths of ndo_set_features callback.
-
- 3. netdev->vlan_features set contains features whose state is inherited
-    by child VLAN devices (limits netdev->features set). This is currently
-    used for all VLAN devices whether tags are stripped or inserted in
-    hardware or software.
-
- 4. netdev->wanted_features set contains feature set requested by user.
-    This set is filtered by ndo_fix_features callback whenever it or
-    some device-specific conditions change. This set is internal to
-    networking core and should not be referenced in drivers.
-
-
-
- Part II: Controlling enabled features
-=======================================
-
-When current feature set (netdev->features) is to be changed, new set
-is calculated and filtered by calling ndo_fix_features callback
-and netdev_fix_features(). If the resulting set differs from current
-set, it is passed to ndo_set_features callback and (if the callback
-returns success) replaces value stored in netdev->features.
-NETDEV_FEAT_CHANGE notification is issued after that whenever current
-set might have changed.
-
-The following events trigger recalculation:
- 1. device's registration, after ndo_init returned success
- 2. user requested changes in features state
- 3. netdev_update_features() is called
-
-ndo_*_features callbacks are called with rtnl_lock held. Missing callbacks
-are treated as always returning success.
-
-A driver that wants to trigger recalculation must do so by calling
-netdev_update_features() while holding rtnl_lock. This should not be done
-from ndo_*_features callbacks. netdev->features should not be modified by
-driver except by means of ndo_fix_features callback.
-
-
-
- Part III: Implementation hints
-================================
-
- * ndo_fix_features:
-
-All dependencies between features should be resolved here. The resulting
-set can be reduced further by networking core imposed limitations (as coded
-in netdev_fix_features()). For this reason it is safer to disable a feature
-when its dependencies are not met instead of forcing the dependency on.
-
-This callback should not modify hardware nor driver state (should be
-stateless). It can be called multiple times between successive
-ndo_set_features calls.
-
-Callback must not alter features contained in NETIF_F_SOFT_FEATURES or
-NETIF_F_NEVER_CHANGE sets. The exception is NETIF_F_VLAN_CHALLENGED but
-care must be taken as the change won't affect already configured VLANs.
-
- * ndo_set_features:
-
-Hardware should be reconfigured to match passed feature set. The set
-should not be altered unless some error condition happens that can't
-be reliably detected in ndo_fix_features. In this case, the callback
-should update netdev->features to match resulting hardware state.
-Errors returned are not (and cannot be) propagated anywhere except dmesg.
-(Note: successful return is zero, >0 means silent error.)
-
-
-
- Part IV: Features
-===================
-
-For current list of features, see include/linux/netdev_features.h.
-This section describes semantics of some of them.
-
- * Transmit checksumming
-
-For complete description, see comments near the top of include/linux/skbuff.h.
-
-Note: NETIF_F_HW_CSUM is a superset of NETIF_F_IP_CSUM + NETIF_F_IPV6_CSUM.
-It means that device can fill TCP/UDP-like checksum anywhere in the packets
-whatever headers there might be.
-
- * Transmit TCP segmentation offload
-
-NETIF_F_TSO_ECN means that hardware can properly split packets with CWR bit
-set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).
-
- * Transmit UDP segmentation offload
-
-NETIF_F_GSO_UDP_L4 accepts a single UDP header with a payload that exceeds
-gso_size. On segmentation, it segments the payload on gso_size boundaries and
-replicates the network and UDP headers (fixing up the last one if less than
-gso_size).
-
- * Transmit DMA from high memory
-
-On platforms where this is relevant, NETIF_F_HIGHDMA signals that
-ndo_start_xmit can handle skbs with frags in high memory.
-
- * Transmit scatter-gather
-
-Those features say that ndo_start_xmit can handle fragmented skbs:
-NETIF_F_SG --- paged skbs (skb_shinfo()->frags), NETIF_F_FRAGLIST ---
-chained skbs (skb->next/prev list).
-
- * Software features
-
-Features contained in NETIF_F_SOFT_FEATURES are features of networking
-stack. Driver should not change behaviour based on them.
-
- * LLTX driver (deprecated for hardware drivers)
-
-NETIF_F_LLTX is meant to be used by drivers that don't need locking at all,
-e.g. software tunnels.
-
-This is also used in a few legacy drivers that implement their
-own locking, don't use it for new (hardware) drivers.
-
- * netns-local device
-
-NETIF_F_NETNS_LOCAL is set for devices that are not allowed to move between
-network namespaces (e.g. loopback).
-
-Don't use it in drivers.
-
- * VLAN challenged
-
-NETIF_F_VLAN_CHALLENGED should be set for devices which can't cope with VLAN
-headers. Some drivers set this because the cards can't handle the bigger MTU.
-[FIXME: Those cases could be fixed in VLAN code by allowing only reduced-MTU
-VLANs. This may be not useful, though.]
-
-* rx-fcs
-
-This requests that the NIC append the Ethernet Frame Checksum (FCS)
-to the end of the skb data. This allows sniffers and other tools to
-read the CRC recorded by the NIC on receipt of the packet.
-
-* rx-all
-
-This requests that the NIC receive all possible frames, including errored
-frames (such as bad FCS, etc). This can be helpful when sniffing a link with
-bad packets on it. Some NICs may receive more packets if also put into normal
-PROMISC mode.
-
-* rx-gro-hw
-
-This requests that the NIC enables Hardware GRO (generic receive offload).
-Hardware GRO is basically the exact reverse of TSO, and is generally
-stricter than Hardware LRO. A packet stream merged by Hardware GRO must
-be re-segmentable by GSO or TSO back to the exact original packet stream.
-Hardware GRO is dependent on RXCSUM since every packet successfully merged
-by hardware must also have the checksum verified by hardware.
diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 9d53c5ad272c..2cc3cf80b49a 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -89,7 +89,7 @@ enum {
 	 * Add your fresh new feature above and remember to update
 	 * netdev_features_strings[] in net/core/ethtool.c and maybe
 	 * some feature mask #defines below. Please also describe it
-	 * in Documentation/networking/netdev-features.txt.
+	 * in Documentation/networking/netdev-features.rst.
 	 */
 
 	/**/NETDEV_FEATURE_COUNT
-- 
cgit v1.2.3