summaryrefslogtreecommitdiff
path: root/Documentation/networking
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/networking')
-rw-r--r--Documentation/networking/device_drivers/appletalk/cops.rst80
-rw-r--r--Documentation/networking/device_drivers/appletalk/index.rst18
-rw-r--r--Documentation/networking/device_drivers/ethernet/index.rst1
-rw-r--r--Documentation/networking/device_drivers/ethernet/intel/idpf.rst160
-rw-r--r--Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst2
-rw-r--r--Documentation/networking/device_drivers/index.rst1
-rw-r--r--Documentation/networking/devlink/i40e.rst59
-rw-r--r--Documentation/networking/devlink/index.rst29
-rw-r--r--Documentation/networking/filter.rst4
-rw-r--r--Documentation/networking/index.rst1
-rw-r--r--Documentation/networking/ip-sysctl.rst31
-rw-r--r--Documentation/networking/ipddp.rst78
-rw-r--r--Documentation/networking/msg_zerocopy.rst13
-rw-r--r--Documentation/networking/netconsole.rst22
-rw-r--r--Documentation/networking/pktgen.rst12
-rw-r--r--Documentation/networking/scaling.rst42
-rw-r--r--Documentation/networking/sfp-phylink.rst10
-rw-r--r--Documentation/networking/xdp-rx-metadata.rst7
18 files changed, 380 insertions, 190 deletions
diff --git a/Documentation/networking/device_drivers/appletalk/cops.rst b/Documentation/networking/device_drivers/appletalk/cops.rst
deleted file mode 100644
index 964ba80599a9..000000000000
--- a/Documentation/networking/device_drivers/appletalk/cops.rst
+++ /dev/null
@@ -1,80 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-========================================
-The COPS LocalTalk Linux driver (cops.c)
-========================================
-
-By Jay Schulist <jschlst@samba.org>
-
-This driver has two modes and they are: Dayna mode and Tangent mode.
-Each mode corresponds with the type of card. It has been found
-that there are 2 main types of cards and all other cards are
-the same and just have different names or only have minor differences
-such as more IO ports. As this driver is tested it will
-become more clear exactly what cards are supported.
-
-Right now these cards are known to work with the COPS driver. The
-LT-200 cards work in a somewhat more limited capacity than the
-DL200 cards, which work very well and are in use by many people.
-
-TANGENT driver mode:
- - Tangent ATB-II, Novell NL-1000, Daystar Digital LT-200
-
-DAYNA driver mode:
- - Dayna DL2000/DaynaTalk PC (Half Length), COPS LT-95,
- - Farallon PhoneNET PC III, Farallon PhoneNET PC II
-
-Other cards possibly supported mode unknown though:
- - Dayna DL2000 (Full length)
-
-The COPS driver defaults to using Dayna mode. To change the driver's
-mode if you built a driver with dual support use board_type=1 or
-board_type=2 for Dayna or Tangent with insmod.
-
-Operation/loading of the driver
-===============================
-
-Use modprobe like this: /sbin/modprobe cops.o (IO #) (IRQ #)
-If you do not specify any options the driver will try and use the IO = 0x240,
-IRQ = 5. As of right now I would only use IRQ 5 for the card, if autoprobing.
-
-To load multiple COPS driver Localtalk cards you can do one of the following::
-
- insmod cops io=0x240 irq=5
- insmod -o cops2 cops io=0x260 irq=3
-
-Or in lilo.conf put something like this::
-
- append="ether=5,0x240,lt0 ether=3,0x260,lt1"
-
-Then bring up the interface with ifconfig. It will look something like this::
-
- lt0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-F7-00-00-00-00-00-00-00-00
- inet addr:192.168.1.2 Bcast:192.168.1.255 Mask:255.255.255.0
- UP BROADCAST RUNNING NOARP MULTICAST MTU:600 Metric:1
- RX packets:0 errors:0 dropped:0 overruns:0 frame:0
- TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 coll:0
-
-Netatalk Configuration
-======================
-
-You will need to configure atalkd with something like the following to make
-it work with the cops.c driver.
-
-* For single LTalk card use::
-
- dummy -seed -phase 2 -net 2000 -addr 2000.10 -zone "1033"
- lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033"
-
-* For multiple cards, Ethernet and LocalTalk::
-
- eth0 -seed -phase 2 -net 3000 -addr 3000.20 -zone "1033"
- lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033"
-
-* For multiple LocalTalk cards, and an Ethernet card.
-
-* Order seems to matter here, Ethernet last::
-
- lt0 -seed -phase 1 -net 1000 -addr 1000.10 -zone "LocalTalk1"
- lt1 -seed -phase 1 -net 2000 -addr 2000.20 -zone "LocalTalk2"
- eth0 -seed -phase 2 -net 3000 -addr 3000.30 -zone "EtherTalk"
diff --git a/Documentation/networking/device_drivers/appletalk/index.rst b/Documentation/networking/device_drivers/appletalk/index.rst
deleted file mode 100644
index c196baeb0856..000000000000
--- a/Documentation/networking/device_drivers/appletalk/index.rst
+++ /dev/null
@@ -1,18 +0,0 @@
-.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
-
-AppleTalk Device Drivers
-========================
-
-Contents:
-
-.. toctree::
- :maxdepth: 2
-
- cops
-
-.. only:: subproject and html
-
- Indices
- =======
-
- * :ref:`genindex`
diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst
index 9827e816084b..43de285b8a92 100644
--- a/Documentation/networking/device_drivers/ethernet/index.rst
+++ b/Documentation/networking/device_drivers/ethernet/index.rst
@@ -32,6 +32,7 @@ Contents:
intel/e1000
intel/e1000e
intel/fm10k
+ intel/idpf
intel/igb
intel/igbvf
intel/ixgbe
diff --git a/Documentation/networking/device_drivers/ethernet/intel/idpf.rst b/Documentation/networking/device_drivers/ethernet/intel/idpf.rst
new file mode 100644
index 000000000000..adb16e2abd21
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/intel/idpf.rst
@@ -0,0 +1,160 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+==========================================================================
+idpf Linux* Base Driver for the Intel(R) Infrastructure Data Path Function
+==========================================================================
+
+Intel idpf Linux driver.
+Copyright(C) 2023 Intel Corporation.
+
+.. contents::
+
+The idpf driver serves as both the Physical Function (PF) and Virtual Function
+(VF) driver for the Intel(R) Infrastructure Data Path Function.
+
+Driver information can be obtained using ethtool, lspci, and ip.
+
+For questions related to hardware requirements, refer to the documentation
+supplied with your Intel adapter. All hardware requirements listed apply to use
+with Linux.
+
+
+Identifying Your Adapter
+========================
+For information on how to identify your adapter, and for the latest Intel
+network drivers, refer to the Intel Support website:
+http://www.intel.com/support
+
+
+Additional Features and Configurations
+======================================
+
+ethtool
+-------
+The driver utilizes the ethtool interface for driver configuration and
+diagnostics, as well as displaying statistical information. The latest ethtool
+version is required for this functionality. If you don't have one yet, you can
+obtain it at:
+https://kernel.org/pub/software/network/ethtool/
+
+
+Viewing Link Messages
+---------------------
+Link messages will not be displayed to the console if the distribution is
+restricting system messages. In order to see network driver link messages on
+your console, set dmesg to eight by entering the following::
+
+ # dmesg -n 8
+
+.. note::
+ This setting is not saved across reboots.
+
+
+Jumbo Frames
+------------
+Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU)
+to a value larger than the default value of 1500.
+
+Use the ip command to increase the MTU size. For example, enter the following
+where <ethX> is the interface number::
+
+ # ip link set mtu 9000 dev <ethX>
+ # ip link set up dev <ethX>
+
+.. note::
+ The maximum MTU setting for jumbo frames is 9706. This corresponds to the
+ maximum jumbo frame size of 9728 bytes.
+
+.. note::
+ This driver will attempt to use multiple page sized buffers to receive
+ each jumbo packet. This should help to avoid buffer starvation issues when
+ allocating receive packets.
+
+.. note::
+ Packet loss may have a greater impact on throughput when you use jumbo
+ frames. If you observe a drop in performance after enabling jumbo frames,
+ enabling flow control may mitigate the issue.
+
+
+Performance Optimization
+========================
+Driver defaults are meant to fit a wide variety of workloads, but if further
+optimization is required, we recommend experimenting with the following
+settings.
+
+
+Interrupt Rate Limiting
+-----------------------
+This driver supports an adaptive interrupt throttle rate (ITR) mechanism that
+is tuned for general workloads. The user can customize the interrupt rate
+control for specific workloads, via ethtool, adjusting the number of
+microseconds between interrupts.
+
+To set the interrupt rate manually, you must disable adaptive mode::
+
+ # ethtool -C <ethX> adaptive-rx off adaptive-tx off
+
+For lower CPU utilization:
+ - Disable adaptive ITR and lower Rx and Tx interrupts. The examples below
+ affect every queue of the specified interface.
+
+ - Setting rx-usecs and tx-usecs to 80 will limit interrupts to about
+ 12,500 interrupts per second per queue::
+
+ # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 80
+ tx-usecs 80
+
+For reduced latency:
+ - Disable adaptive ITR and ITR by setting rx-usecs and tx-usecs to 0
+ using ethtool::
+
+ # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 0
+ tx-usecs 0
+
+Per-queue interrupt rate settings:
+ - The following examples are for queues 1 and 3, but you can adjust other
+ queues.
+
+ - To disable Rx adaptive ITR and set static Rx ITR to 10 microseconds or
+ about 100,000 interrupts/second, for queues 1 and 3::
+
+ # ethtool --per-queue <ethX> queue_mask 0xa --coalesce adaptive-rx off
+ rx-usecs 10
+
+ - To show the current coalesce settings for queues 1 and 3::
+
+ # ethtool --per-queue <ethX> queue_mask 0xa --show-coalesce
+
+
+
+Virtualized Environments
+------------------------
+In addition to the other suggestions in this section, the following may be
+helpful to optimize performance in VMs.
+
+ - Using the appropriate mechanism (vcpupin) in the VM, pin the CPUs to
+ individual LCPUs, making sure to use a set of CPUs included in the
+ device's local_cpulist: /sys/class/net/<ethX>/device/local_cpulist.
+
+ - Configure as many Rx/Tx queues in the VM as available. (See the idpf driver
+ documentation for the number of queues supported.) For example::
+
+ # ethtool -L <virt_interface> rx <max> tx <max>
+
+
+Support
+=======
+For general information, go to the Intel support website at:
+http://www.intel.com/support/
+
+If an issue is identified with the released source code on a supported kernel
+with a supported adapter, email the specific information related to the issue
+to intel-wired-lan@lists.osuosl.org.
+
+
+Trademarks
+==========
+Intel is a trademark or registered trademark of Intel Corporation or its
+subsidiaries in the United States and/or other countries.
+
+* Other names and brands may be claimed as the property of others.
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst
index 0a42c3395ffa..20d3b7e87049 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst
@@ -67,7 +67,7 @@ Enabling the driver and kconfig options
| Enables :ref:`IPSec XFRM cryptography-offload acceleration <xfrm_device>`.
-**CONFIG_MLX5_EN_MACSEC=(y/n)**
+**CONFIG_MLX5_MACSEC=(y/n)**
| Build support for MACsec cryptography-offload acceleration in the NIC.
diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 601eacaf12f3..2f0285a5bc80 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -8,7 +8,6 @@ Contents:
.. toctree::
:maxdepth: 2
- appletalk/index
atm/index
cable/index
can/index
diff --git a/Documentation/networking/devlink/i40e.rst b/Documentation/networking/devlink/i40e.rst
new file mode 100644
index 000000000000..d3cb5bb5197e
--- /dev/null
+++ b/Documentation/networking/devlink/i40e.rst
@@ -0,0 +1,59 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+i40e devlink support
+====================
+
+This document describes the devlink features implemented by the ``i40e``
+device driver.
+
+Info versions
+=============
+
+The ``i40e`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+ :widths: 5 5 5 90
+
+ * - Name
+ - Type
+ - Example
+ - Description
+ * - ``board.id``
+ - fixed
+ - K15190-000
+ - The Product Board Assembly (PBA) identifier of the board.
+ * - ``fw.mgmt``
+ - running
+ - 9.130
+ - 2-digit version number of the management firmware that controls the
+ PHY, link, etc.
+ * - ``fw.mgmt.api``
+ - running
+ - 1.15
+ - 2-digit version number of the API exported over the AdminQ by the
+ management firmware. Used by the driver to identify what commands
+ are supported.
+ * - ``fw.mgmt.build``
+ - running
+ - 73618
+ - Build number of the source for the management firmware.
+ * - ``fw.undi``
+ - running
+ - 1.3429.0
+ - Version of the Option ROM containing the UEFI driver. The version is
+ reported in ``major.minor.patch`` format. The major version is
+ incremented whenever a major breaking change occurs, or when the
+ minor version would overflow. The minor version is incremented for
+ non-breaking changes and reset to 1 when the major version is
+ incremented. The patch version is normally 0 but is incremented when
+ a fix is delivered as a patch against an older base Option ROM.
+ * - ``fw.psid.api``
+ - running
+ - 9.30
+ - Version defining the format of the flash contents.
+ * - ``fw.bundle_id``
+ - running
+ - 0x8000e5f3
+ - Unique identifier of the firmware image file that was loaded onto
+ the device. Also referred to as the EETRACK identifier of the NVM.
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index b49749e2b9a6..e14d7a701b72 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -18,6 +18,34 @@ netlink commands.
Drivers are encouraged to use the devlink instance lock for their own needs.
+Drivers need to be cautious when taking devlink instance lock and
+taking RTNL lock at the same time. Devlink instance lock needs to be taken
+first, only after that RTNL lock could be taken.
+
+Nested instances
+----------------
+
+Some objects, like linecards or port functions, could have another
+devlink instances created underneath. In that case, drivers should make
+sure to respect following rules:
+
+ - Lock ordering should be maintained. If driver needs to take instance
+ lock of both nested and parent instances at the same time, devlink
+ instance lock of the parent instance should be taken first, only then
+ instance lock of the nested instance could be taken.
+ - Driver should use object-specific helpers to setup the
+ nested relationship:
+
+ - ``devl_nested_devlink_set()`` - called to setup devlink -> nested
+ devlink relationship (could be user for multiple nested instances.
+ - ``devl_port_fn_devlink_set()`` - called to setup port function ->
+ nested devlink relationship.
+ - ``devlink_linecard_nested_dl_set()`` - called to setup linecard ->
+ nested devlink relationship.
+
+The nested devlink info is exposed to the userspace over object-specific
+attributes of devlink netlink.
+
Interface documentation
-----------------------
@@ -52,6 +80,7 @@ parameters, info versions, and other features it supports.
bnxt
etas_es58x
hns3
+ i40e
ionic
ice
mlx4
diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst
index f69da5074860..7d8c5380492f 100644
--- a/Documentation/networking/filter.rst
+++ b/Documentation/networking/filter.rst
@@ -650,8 +650,8 @@ before a conversion to the new layout is being done behind the scenes!
Currently, the classic BPF format is being used for JITing on most
32-bit architectures, whereas x86-64, aarch64, s390x, powerpc64,
-sparc64, arm32, riscv64, riscv32 perform JIT compilation from eBPF
-instruction set.
+sparc64, arm32, riscv64, riscv32, loongarch64 perform JIT compilation
+from eBPF instruction set.
Testing
-------
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 5b75c3f7a137..2ffc5ad10295 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -59,7 +59,6 @@ Contents:
gtp
ila
ioam6-sysctl
- ipddp
ip_dynaddr
ipsec
ip-sysctl
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index a66054d0763a..e7ec9026e5db 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -745,6 +745,13 @@ tcp_comp_sack_nr - INTEGER
Default : 44
+tcp_backlog_ack_defer - BOOLEAN
+ If set, user thread processing socket backlog tries sending
+ one ACK for the whole queue. This helps to avoid potential
+ long latencies at end of a TCP socket syscall.
+
+ Default : true
+
tcp_slow_start_after_idle - BOOLEAN
If set, provide RFC2861 behavior and time out the congestion
window after an idle period. An idle period is defined at
@@ -1176,6 +1183,19 @@ tcp_plb_cong_thresh - INTEGER
Default: 128
+tcp_pingpong_thresh - INTEGER
+ The number of estimated data replies sent for estimated incoming data
+ requests that must happen before TCP considers that a connection is a
+ "ping-pong" (request-response) connection for which delayed
+ acknowledgments can provide benefits.
+
+ This threshold is 1 by default, but some applications may need a higher
+ threshold for optimal performance.
+
+ Possible Values: 1 - 255
+
+ Default: 1
+
UDP variables
=============
@@ -2304,6 +2324,17 @@ accept_ra_pinfo - BOOLEAN
- enabled if accept_ra is enabled.
- disabled if accept_ra is disabled.
+ra_honor_pio_life - BOOLEAN
+ Whether to use RFC4862 Section 5.5.3e to determine the valid
+ lifetime of an address matching a prefix sent in a Router
+ Advertisement Prefix Information Option.
+
+ - If enabled, the PIO valid lifetime will always be honored.
+ - If disabled, RFC4862 section 5.5.3e is used to determine
+ the valid lifetime of the address.
+
+ Default: 0 (disabled)
+
accept_ra_rt_info_min_plen - INTEGER
Minimum prefix length of Route Information in RA.
diff --git a/Documentation/networking/ipddp.rst b/Documentation/networking/ipddp.rst
deleted file mode 100644
index be7091b77927..000000000000
--- a/Documentation/networking/ipddp.rst
+++ /dev/null
@@ -1,78 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-=========================================================
-AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation
-=========================================================
-
-Documentation ipddp.c
-
-This file is written by Jay Schulist <jschlst@samba.org>
-
-Introduction
-------------
-
-AppleTalk-IP (IPDDP) is the method computers connected to AppleTalk
-networks can use to communicate via IP. AppleTalk-IP is simply IP datagrams
-inside AppleTalk packets.
-
-Through this driver you can either allow your Linux box to communicate
-IP over an AppleTalk network or you can provide IP gatewaying functions
-for your AppleTalk users.
-
-You can currently encapsulate or decapsulate AppleTalk-IP on LocalTalk,
-EtherTalk and PPPTalk. The only limit on the protocol is that of what
-kernel AppleTalk layer and drivers are available.
-
-Each mode requires its own user space software.
-
-Compiling AppleTalk-IP Decapsulation/Encapsulation
-==================================================
-
-AppleTalk-IP decapsulation needs to be compiled into your kernel. You
-will need to turn on AppleTalk-IP driver support. Then you will need to
-select ONE of the two options; IP to AppleTalk-IP encapsulation support or
-AppleTalk-IP to IP decapsulation support. If you compile the driver
-statically you will only be able to use the driver for the function you have
-enabled in the kernel. If you compile the driver as a module you can
-select what mode you want it to run in via a module loading param.
-ipddp_mode=1 for AppleTalk-IP encapsulation and ipddp_mode=2 for
-AppleTalk-IP to IP decapsulation.
-
-Basic instructions for user space tools
-=======================================
-
-I will briefly describe the operation of the tools, but you will
-need to consult the supporting documentation for each set of tools.
-
-Decapsulation - You will need to download a software package called
-MacGate. In this distribution there will be a tool called MacRoute
-which enables you to add routes to the kernel for your Macs by hand.
-Also the tool MacRegGateWay is included to register the
-proper IP Gateway and IP addresses for your machine. Included in this
-distribution is a patch to netatalk-1.4b2+asun2.0a17.2 (available from
-ftp.u.washington.edu/pub/user-supported/asun/) this patch is optional
-but it allows automatic adding and deleting of routes for Macs. (Handy
-for locations with large Mac installations)
-
-Encapsulation - You will need to download a software daemon called ipddpd.
-This software expects there to be an AppleTalk-IP gateway on the network.
-You will also need to add the proper routes to route your Linux box's IP
-traffic out the ipddp interface.
-
-Common Uses of ipddp.c
-----------------------
-Of course AppleTalk-IP decapsulation and encapsulation, but specifically
-decapsulation is being used most for connecting LocalTalk networks to
-IP networks. Although it has been used on EtherTalk networks to allow
-Macs that are only able to tunnel IP over EtherTalk.
-
-Encapsulation has been used to allow a Linux box stuck on a LocalTalk
-network to use IP. It should work equally well if you are stuck on an
-EtherTalk only network.
-
-Further Assistance
--------------------
-You can contact me (Jay Schulist <jschlst@samba.org>) with any
-questions regarding decapsulation or encapsulation. Bradford W. Johnson
-<johns393@maroon.tc.umn.edu> originally wrote the ipddp.c driver for IP
-encapsulation in AppleTalk.
diff --git a/Documentation/networking/msg_zerocopy.rst b/Documentation/networking/msg_zerocopy.rst
index b3ea96af9b49..78fb70e748b7 100644
--- a/Documentation/networking/msg_zerocopy.rst
+++ b/Documentation/networking/msg_zerocopy.rst
@@ -7,7 +7,8 @@ Intro
=====
The MSG_ZEROCOPY flag enables copy avoidance for socket send calls.
-The feature is currently implemented for TCP and UDP sockets.
+The feature is currently implemented for TCP, UDP and VSOCK (with
+virtio transport) sockets.
Opportunity and Caveats
@@ -174,7 +175,9 @@ read_notification() call in the previous snippet. A notification
is encoded in the standard error format, sock_extended_err.
The level and type fields in the control data are protocol family
-specific, IP_RECVERR or IPV6_RECVERR.
+specific, IP_RECVERR or IPV6_RECVERR (for TCP or UDP socket).
+For VSOCK socket, cmsg_level will be SOL_VSOCK and cmsg_type will be
+VSOCK_RECVERR.
Error origin is the new type SO_EE_ORIGIN_ZEROCOPY. ee_errno is zero,
as explained before, to avoid blocking read and write system calls on
@@ -235,12 +238,15 @@ Implementation
Loopback
--------
+For TCP and UDP:
Data sent to local sockets can be queued indefinitely if the receive
process does not read its socket. Unbound notification latency is not
acceptable. For this reason all packets generated with MSG_ZEROCOPY
that are looped to a local socket will incur a deferred copy. This
includes looping onto packet sockets (e.g., tcpdump) and tun devices.
+For VSOCK:
+Data path sent to local sockets is the same as for non-local sockets.
Testing
=======
@@ -254,3 +260,6 @@ instance when run with msg_zerocopy.sh between a veth pair across
namespaces, the test will not show any improvement. For testing, the
loopback restriction can be temporarily relaxed by making
skb_orphan_frags_rx identical to skb_orphan_frags.
+
+For VSOCK type of socket example can be found in
+tools/testing/vsock/vsock_test_zerocopy.c.
diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst
index 7a9de0568e84..390730a74332 100644
--- a/Documentation/networking/netconsole.rst
+++ b/Documentation/networking/netconsole.rst
@@ -99,9 +99,6 @@ Dynamic reconfiguration:
Dynamic reconfigurability is a useful addition to netconsole that enables
remote logging targets to be dynamically added, removed, or have their
parameters reconfigured at runtime from a configfs-based userspace interface.
-[ Note that the parameters of netconsole targets that were specified/created
-from the boot/module option are not exposed via this interface, and hence
-cannot be modified dynamically. ]
To include this feature, select CONFIG_NETCONSOLE_DYNAMIC when building the
netconsole module (or kernel, if netconsole is built-in).
@@ -155,6 +152,25 @@ You can also update the local interface dynamically. This is especially
useful if you want to use interfaces that have newly come up (and may not
have existed when netconsole was loaded / initialized).
+Netconsole targets defined at boot time (or module load time) with the
+`netconsole=` param are assigned the name `cmdline<index>`. For example, the
+first target in the parameter is named `cmdline0`. You can control and modify
+these targets by creating configfs directories with the matching name.
+
+Let's suppose you have two netconsole targets defined at boot time::
+
+ netconsole=4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc;4444@10.0.0.1/eth1,9353@10.0.0.3/12:34:56:78:9a:bc
+
+You can modify these targets in runtime by creating the following targets::
+
+ mkdir cmdline0
+ cat cmdline0/remote_ip
+ 10.0.0.2
+
+ mkdir cmdline1
+ cat cmdline1/remote_ip
+ 10.0.0.3
+
Extended console:
=================
diff --git a/Documentation/networking/pktgen.rst b/Documentation/networking/pktgen.rst
index 1225f0f63ff0..c945218946e1 100644
--- a/Documentation/networking/pktgen.rst
+++ b/Documentation/networking/pktgen.rst
@@ -178,6 +178,7 @@ Examples::
IPSEC # IPsec encapsulation (needs CONFIG_XFRM)
NODE_ALLOC # node specific memory allocation
NO_TIMESTAMP # disable timestamping
+ SHARED # enable shared SKB
pgset 'flag ![name]' Clear a flag to determine behaviour.
Note that you might need to use single quote in
interactive mode, so that your shell wouldn't expand
@@ -288,6 +289,16 @@ To avoid breaking existing testbed scripts for using AH type and tunnel mode,
you can use "pgset spi SPI_VALUE" to specify which transformation mode
to employ.
+Disable shared SKB
+==================
+By default, SKBs sent by pktgen are shared (user count > 1).
+To test with non-shared SKBs, remove the "SHARED" flag by simply setting::
+
+ pg_set "flag !SHARED"
+
+However, if the "clone_skb" or "burst" parameters are configured, the skb
+still needs to be held by pktgen for further access. Hence the skb must be
+shared.
Current commands and configuration options
==========================================
@@ -357,6 +368,7 @@ Current commands and configuration options
IPSEC
NODE_ALLOC
NO_TIMESTAMP
+ SHARED
spi (ipsec)
diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 92c9fb46d6a2..03ae19a689fc 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -105,6 +105,48 @@ a separate CPU. For interrupt handling, HT has shown no benefit in
initial tests, so limit the number of queues to the number of CPU cores
in the system.
+Dedicated RSS contexts
+~~~~~~~~~~~~~~~~~~~~~~
+
+Modern NICs support creating multiple co-existing RSS configurations
+which are selected based on explicit matching rules. This can be very
+useful when application wants to constrain the set of queues receiving
+traffic for e.g. a particular destination port or IP address.
+The example below shows how to direct all traffic to TCP port 22
+to queues 0 and 1.
+
+To create an additional RSS context use::
+
+ # ethtool -X eth0 hfunc toeplitz context new
+ New RSS context is 1
+
+Kernel reports back the ID of the allocated context (the default, always
+present RSS context has ID of 0). The new context can be queried and
+modified using the same APIs as the default context::
+
+ # ethtool -x eth0 context 1
+ RX flow hash indirection table for eth0 with 13 RX ring(s):
+ 0: 0 1 2 3 4 5 6 7
+ 8: 8 9 10 11 12 0 1 2
+ [...]
+ # ethtool -X eth0 equal 2 context 1
+ # ethtool -x eth0 context 1
+ RX flow hash indirection table for eth0 with 13 RX ring(s):
+ 0: 0 1 0 1 0 1 0 1
+ 8: 0 1 0 1 0 1 0 1
+ [...]
+
+To make use of the new context direct traffic to it using an n-tuple
+filter::
+
+ # ethtool -N eth0 flow-type tcp6 dst-port 22 context 1
+ Added rule with ID 1023
+
+When done, remove the context and the rule::
+
+ # ethtool -N eth0 delete 1023
+ # ethtool -X eth0 context 1 delete
+
RPS: Receive Packet Steering
============================
diff --git a/Documentation/networking/sfp-phylink.rst b/Documentation/networking/sfp-phylink.rst
index 55b65f607a64..8054d33f449f 100644
--- a/Documentation/networking/sfp-phylink.rst
+++ b/Documentation/networking/sfp-phylink.rst
@@ -200,10 +200,12 @@ this documentation.
when the in-band link state changes - otherwise the link will never
come up.
- The :c:func:`validate` method should mask the supplied supported mask,
- and ``state->advertising`` with the supported ethtool link modes.
- These are the new ethtool link modes, so bitmask operations must be
- used. For an example, see ``drivers/net/ethernet/marvell/mvneta.c``.
+ The :c:func:`mac_get_caps` method is optional, and if provided should
+ return the phylink MAC capabilities that are supported for the passed
+ ``interface`` mode. In general, there is no need to implement this method.
+ Phylink will use these capabilities in combination with permissible
+ capabilities for ``interface`` to determine the allowable ethtool link
+ modes.
The :c:func:`mac_link_state` method is used to read the link state
from the MAC, and report back the settings that the MAC is currently
diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index 25ce72af81c2..205696780b78 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -105,6 +105,13 @@ bpf_tail_call
Adding programs that access metadata kfuncs to the ``BPF_MAP_TYPE_PROG_ARRAY``
is currently not supported.
+Supported Devices
+=================
+
+It is possible to query which kfunc the particular netdev implements via
+netlink. See ``xdp-rx-metadata-features`` attribute set in
+``Documentation/netlink/specs/netdev.yaml``.
+
Example
=======