summaryrefslogtreecommitdiff
path: root/Documentation/networking
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/networking')
-rw-r--r--Documentation/networking/bonding.rst13
-rw-r--r--Documentation/networking/filter.rst61
-rw-r--r--Documentation/networking/phy.rst5
-rw-r--r--Documentation/networking/snmp_counter.rst28
4 files changed, 82 insertions, 25 deletions
diff --git a/Documentation/networking/bonding.rst b/Documentation/networking/bonding.rst
index adc314639085..5f690f0ad0e4 100644
--- a/Documentation/networking/bonding.rst
+++ b/Documentation/networking/bonding.rst
@@ -951,6 +951,19 @@ xmit_hash_policy
packets will be distributed according to the encapsulated
flows.
+ vlan+srcmac
+
+ This policy uses a very rudimentary vlan ID and source mac
+ hash to load-balance traffic per-vlan, with failover
+ should one leg fail. The intended use case is for a bond
+ shared by multiple virtual machines, all configured to
+ use their own vlan, to give lacp-like functionality
+ without requiring lacp-capable switching hardware.
+
+ The formula for the hash is simply
+
+ hash = (vlan ID) XOR (source MAC vendor) XOR (source MAC dev)
+
The default value is layer2. This option was added in bonding
version 2.6.3. In earlier versions of bonding, this parameter
does not exist, and the layer2 policy is the only policy. The
diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst
index debb59e374de..f6d8f90e9a56 100644
--- a/Documentation/networking/filter.rst
+++ b/Documentation/networking/filter.rst
@@ -1006,13 +1006,13 @@ Size modifier is one of ...
Mode modifier is one of::
- BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
- BPF_ABS 0x20
- BPF_IND 0x40
- BPF_MEM 0x60
- BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */
- BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */
- BPF_XADD 0xc0 /* eBPF only, exclusive add */
+ BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
+ BPF_ABS 0x20
+ BPF_IND 0x40
+ BPF_MEM 0x60
+ BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */
+ BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */
+ BPF_ATOMIC 0xc0 /* eBPF only, atomic operations */
eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
(BPF_IND | <size> | BPF_LD) which are used to access packet data.
@@ -1044,11 +1044,50 @@ Unlike classic BPF instruction set, eBPF has generic load/store operations::
BPF_MEM | <size> | BPF_STX: *(size *) (dst_reg + off) = src_reg
BPF_MEM | <size> | BPF_ST: *(size *) (dst_reg + off) = imm32
BPF_MEM | <size> | BPF_LDX: dst_reg = *(size *) (src_reg + off)
- BPF_XADD | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
- BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
-Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and
-2 byte atomic increments are not supported.
+Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.
+
+It also includes atomic operations, which use the immediate field for extra
+encoding.
+
+ .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
+ .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
+
+The basic atomic operations supported are:
+
+ BPF_ADD
+ BPF_AND
+ BPF_OR
+ BPF_XOR
+
+Each having equivalent semantics with the ``BPF_ADD`` example, that is: the
+memory location addresed by ``dst_reg + off`` is atomically modified, with
+``src_reg`` as the other operand. If the ``BPF_FETCH`` flag is set in the
+immediate, then these operations also overwrite ``src_reg`` with the
+value that was in memory before it was modified.
+
+The more special operations are:
+
+ BPF_XCHG
+
+This atomically exchanges ``src_reg`` with the value addressed by ``dst_reg +
+off``.
+
+ BPF_CMPXCHG
+
+This atomically compares the value addressed by ``dst_reg + off`` with
+``R0``. If they match it is replaced with ``src_reg``, The value that was there
+before is loaded back to ``R0``.
+
+Note that 1 and 2 byte atomic operations are not supported.
+
+Except ``BPF_ADD`` _without_ ``BPF_FETCH`` (for legacy reasons), all 4 byte
+atomic operations require alu32 mode. Clang enables this mode by default in
+architecture v3 (``-mcpu=v3``). For older versions it can be enabled with
+``-Xclang -target-feature -Xclang +alu32``.
+
+You may encounter BPF_XADD - this is a legacy name for BPF_ATOMIC, referring to
+the exclusive-add operation encoded when the immediate field is zero.
eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists
of two consecutive ``struct bpf_insn`` 8-byte blocks and interpreted as single
diff --git a/Documentation/networking/phy.rst b/Documentation/networking/phy.rst
index b2f7ec794bc8..399f17976a6c 100644
--- a/Documentation/networking/phy.rst
+++ b/Documentation/networking/phy.rst
@@ -286,6 +286,11 @@ Some of the interface modes are described below:
Note: due to legacy usage, some 10GBASE-R usage incorrectly makes
use of this definition.
+``PHY_INTERFACE_MODE_100BASEX``
+ This defines IEEE 802.3 Clause 24. The link operates at a fixed data
+ rate of 125Mpbs using a 4B/5B encoding scheme, resulting in an underlying
+ data rate of 100Mpbs.
+
Pause frames / flow control
===========================
diff --git a/Documentation/networking/snmp_counter.rst b/Documentation/networking/snmp_counter.rst
index 4edd0d38779e..423d138b5ff3 100644
--- a/Documentation/networking/snmp_counter.rst
+++ b/Documentation/networking/snmp_counter.rst
@@ -314,7 +314,7 @@ https://lwn.net/Articles/576263/
* TcpExtTCPOrigDataSent
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
-explaination below::
+explanation below::
TCPOrigDataSent: number of outgoing packets with original data (excluding
retransmission but including data-in-SYN). This counter is different from
@@ -324,7 +324,7 @@ explaination below::
* TCPSynRetrans
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
-explaination below::
+explanation below::
TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down
retransmissions into SYN, fast-retransmits, timeout retransmits, etc.
@@ -332,7 +332,7 @@ explaination below::
* TCPFastOpenActiveFail
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
-explaination below::
+explanation below::
TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed because
the remote does not accept it or the attempts timed out.
@@ -382,7 +382,7 @@ Defined in `RFC1213 tcpAttemptFails`_.
Defined in `RFC1213 tcpOutRsts`_. The RFC says this counter indicates
the 'segments sent containing the RST flag', but in linux kernel, this
-couner indicates the segments kerenl tried to send. The sending
+counter indicates the segments kernel tried to send. The sending
process might be failed due to some errors (e.g. memory alloc failed).
.. _RFC1213 tcpOutRsts: https://tools.ietf.org/html/rfc1213#page-52
@@ -700,7 +700,7 @@ SACK option could have up to 4 blocks, they are checked
individually. E.g., if 3 blocks of a SACk is invalid, the
corresponding counter would be updated 3 times. The comment of the
`Add counters for discarded SACK blocks`_ patch has additional
-explaination:
+explanation:
.. _Add counters for discarded SACK blocks: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18f02545a9a16c9a89778b91a162ad16d510bb32
@@ -829,7 +829,7 @@ PAWS check fails or the received sequence number is out of window.
* TcpExtTCPACKSkippedTimeWait
-Tha ACK is skipped in Time-Wait status, the reason would be either
+The ACK is skipped in Time-Wait status, the reason would be either
PAWS check failed or the received sequence number is out of window.
* TcpExtTCPACKSkippedChallenge
@@ -984,7 +984,7 @@ TcpExtSyncookiesRecv counter wont be updated.
Challenge ACK
=============
-For details of challenge ACK, please refer the explaination of
+For details of challenge ACK, please refer the explanation of
TcpExtTCPACKSkippedChallenge.
* TcpExtTCPChallengeACK
@@ -1002,7 +1002,7 @@ prune
=====
When a socket is under memory pressure, the TCP stack will try to
reclaim memory from the receiving queue and out of order queue. One of
-the reclaiming method is 'collapse', which means allocate a big sbk,
+the reclaiming method is 'collapse', which means allocate a big skb,
copy the contiguous skbs to the single big skb, and free these
contiguous skbs.
@@ -1163,7 +1163,7 @@ The server side nstat output::
IpExtOutOctets 52 0.0
IpExtInNoECTPkts 1 0.0
-Input a string in nc client side again ('world' in our exmaple)::
+Input a string in nc client side again ('world' in our example)::
nstatuser@nstat-a:~$ nc -v nstat-b 9000
Connection to nstat-b 9000 port [tcp/*] succeeded!
@@ -1211,7 +1211,7 @@ replied an ACK. But kernel handled them in different ways. When the
TCP window scale option is not used, kernel will try to enable fast
path immediately when the connection comes into the established state,
but if the TCP window scale option is used, kernel will disable the
-fast path at first, and try to enable it after kerenl receives
+fast path at first, and try to enable it after kernel receives
packets. We could use the 'ss' command to verify whether the window
scale option is used. e.g. run below command on either server or
client::
@@ -1343,7 +1343,7 @@ Check TcpExtTCPAbortOnMemory on client::
nstatuser@nstat-a:~$ nstat | grep -i abort
TcpExtTCPAbortOnMemory 54 0.0
-Check orphane socket count on client::
+Check orphaned socket count on client::
nstatuser@nstat-a:~$ ss -s
Total: 131 (kernel 0)
@@ -1685,7 +1685,7 @@ Send 3 SYN repeatly to nstat-b::
nstatuser@nstat-a:~$ for i in {1..3}; do sudo tcpreplay -i ens3 /tmp/syn_fixcsum.pcap; done
-Check snmp cunter on nstat-b::
+Check snmp counter on nstat-b::
nstatuser@nstat-b:~$ nstat | grep -i skip
TcpExtTCPACKSkippedSynRecv 1 0.0
@@ -1770,7 +1770,7 @@ string 'foo' in our example::
Connection from nstat-a 42132 received!
foo
-On nstat-a, the tcpdump should have caputred the ACK. We should check
+On nstat-a, the tcpdump should have captured the ACK. We should check
the source port numbers of the two nc clients::
nstatuser@nstat-a:~$ ss -ta '( dport = :9000 || dport = :9001 )' | tee
@@ -1778,7 +1778,7 @@ the source port numbers of the two nc clients::
ESTAB 0 0 192.168.122.250:50208 192.168.122.251:9000
ESTAB 0 0 192.168.122.250:42132 192.168.122.251:9001
-Run tcprewrite, change port 9001 to port 9000, chagne port 42132 to
+Run tcprewrite, change port 9001 to port 9000, change port 42132 to
port 50208::
nstatuser@nstat-a:~$ tcprewrite --infile /tmp/seq_pre.pcap --outfile /tmp/seq.pcap -r 9001:9000 -r 42132:50208 --fixcsum