summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide')
-rw-r--r--Documentation/admin-guide/cgroup-v2.rst34
-rw-r--r--Documentation/admin-guide/cifs/todo.rst29
-rw-r--r--Documentation/admin-guide/hw-vuln/spectre.rst88
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt43
-rw-r--r--Documentation/admin-guide/laptops/thinkpad-acpi.rst23
-rw-r--r--Documentation/admin-guide/perf/imx-ddr.rst52
-rw-r--r--Documentation/admin-guide/sysctl/net.rst29
7 files changed, 238 insertions, 60 deletions
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 3b29005aa981..5f1c266131b0 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -951,6 +951,13 @@ controller implements weight and absolute bandwidth limit models for
normal scheduling policy and absolute bandwidth allocation model for
realtime scheduling policy.
+In all the above models, cycles distribution is defined only on a temporal
+base and it does not account for the frequency at which tasks are executed.
+The (optional) utilization clamping support allows to hint the schedutil
+cpufreq governor about the minimum desired frequency which should always be
+provided by a CPU, as well as the maximum desired frequency, which should not
+be exceeded by a CPU.
+
WARNING: cgroup2 doesn't yet support control of realtime processes and
the cpu controller can only be enabled when all RT processes are in
the root cgroup. Be aware that system management software may already
@@ -1016,6 +1023,33 @@ All time durations are in microseconds.
Shows pressure stall information for CPU. See
Documentation/accounting/psi.rst for details.
+ cpu.uclamp.min
+ A read-write single value file which exists on non-root cgroups.
+ The default is "0", i.e. no utilization boosting.
+
+ The requested minimum utilization (protection) as a percentage
+ rational number, e.g. 12.34 for 12.34%.
+
+ This interface allows reading and setting minimum utilization clamp
+ values similar to the sched_setattr(2). This minimum utilization
+ value is used to clamp the task specific minimum utilization clamp.
+
+ The requested minimum utilization (protection) is always capped by
+ the current value for the maximum utilization (limit), i.e.
+ `cpu.uclamp.max`.
+
+ cpu.uclamp.max
+ A read-write single value file which exists on non-root cgroups.
+ The default is "max". i.e. no utilization capping
+
+ The requested maximum utilization (limit) as a percentage rational
+ number, e.g. 98.76 for 98.76%.
+
+ This interface allows reading and setting maximum utilization clamp
+ values similar to the sched_setattr(2). This maximum utilization
+ value is used to clamp the task specific maximum utilization clamp.
+
+
Memory
------
diff --git a/Documentation/admin-guide/cifs/todo.rst b/Documentation/admin-guide/cifs/todo.rst
index 95f18e8c9b8a..084c25f92dcb 100644
--- a/Documentation/admin-guide/cifs/todo.rst
+++ b/Documentation/admin-guide/cifs/todo.rst
@@ -18,7 +18,8 @@ a) SMB3 (and SMB3.1.1) missing optional features:
- T10 copy offload ie "ODX" (copy chunk, and "Duplicate Extents" ioctl
currently the only two server side copy mechanisms supported)
-b) improved sparse file support
+b) improved sparse file support (fiemap and SEEK_HOLE are implemented
+ but additional features would be supportable by the protocol).
c) Directory entry caching relies on a 1 second timer, rather than
using Directory Leases, currently only the root file handle is cached longer
@@ -26,10 +27,14 @@ c) Directory entry caching relies on a 1 second timer, rather than
d) quota support (needs minor kernel change since quota calls
to make it to network filesystems or deviceless filesystems)
-e) Additional use cases where we use "compoounding" (e.g. open/query/close
- and open/setinfo/close) to reduce the number of roundtrips, and also
- open to reduce redundant opens (using deferred close and reference counts
- more).
+e) Additional use cases can be optimized to use "compounding" (e.g.
+ open/query/close and open/setinfo/close) to reduce the number of
+ roundtrips to the server and improve performance. Various cases
+ (stat, statfs, create, unlink, mkdir) already have been improved by
+ using compounding but more can be done. In addition we could
+ significantly reduce redundant opens by using deferred close (with
+ handle caching leases) and better using reference counters on file
+ handles.
f) Finish inotify support so kde and gnome file list windows
will autorefresh (partially complete by Asser). Needs minor kernel
@@ -49,18 +54,18 @@ j) Create UID mapping facility so server UIDs can be mapped on a per
exists. Also better integration with winbind for resolving SID owners
k) Add tools to take advantage of more smb3 specific ioctls and features
- (passthrough ioctl/fsctl for sending various SMB3 fsctls to the server
- is in progress, and a passthrough query_info call is already implemented
- in cifs.ko to allow smb3 info levels queries to be sent from userspace)
+ (passthrough ioctl/fsctl is now implemented in cifs.ko to allow
+ sending various SMB3 fsctls and query info and set info calls
+ directly from user space) Add tools to make setting various non-POSIX
+ metadata attributes easier from tools (e.g. extending what was done
+ in smb-info tool).
l) encrypted file support
m) improved stats gathering tools (perhaps integration with nfsometer?)
to extend and make easier to use what is currently in /proc/fs/cifs/Stats
-n) allow setting more NTFS/SMB3 file attributes remotely (currently limited to
- compressed file attribute via chflags) and improve user space tools for
- managing and viewing them.
+n) Add support for claims based ACLs ("DAC")
o) mount helper GUI (to simplify the various configuration options on mount)
@@ -88,6 +93,8 @@ v) POSIX Extensions for SMB3.1.1 (started, create and mkdir support added
w) Add support for additional strong encryption types, and additional spnego
authentication mechanisms (see MS-SMB2)
+x) Finish support for SMB3.1.1 compression
+
Known Bugs
==========
diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
index 25f3b2532198..e05e581af5cf 100644
--- a/Documentation/admin-guide/hw-vuln/spectre.rst
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -41,10 +41,11 @@ Related CVEs
The following CVE entries describe Spectre variants:
- ============= ======================= =================
+ ============= ======================= ==========================
CVE-2017-5753 Bounds check bypass Spectre variant 1
CVE-2017-5715 Branch target injection Spectre variant 2
- ============= ======================= =================
+ CVE-2019-1125 Spectre v1 swapgs Spectre variant 1 (swapgs)
+ ============= ======================= ==========================
Problem
-------
@@ -78,6 +79,13 @@ There are some extensions of Spectre variant 1 attacks for reading data
over the network, see :ref:`[12] <spec_ref12>`. However such attacks
are difficult, low bandwidth, fragile, and are considered low risk.
+Note that, despite "Bounds Check Bypass" name, Spectre variant 1 is not
+only about user-controlled array bounds checks. It can affect any
+conditional checks. The kernel entry code interrupt, exception, and NMI
+handlers all have conditional swapgs checks. Those may be problematic
+in the context of Spectre v1, as kernel code can speculatively run with
+a user GS.
+
Spectre variant 2 (Branch Target Injection)
-------------------------------------------
@@ -132,6 +140,9 @@ not cover all possible attack vectors.
1. A user process attacking the kernel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Spectre variant 1
+~~~~~~~~~~~~~~~~~
+
The attacker passes a parameter to the kernel via a register or
via a known address in memory during a syscall. Such parameter may
be used later by the kernel as an index to an array or to derive
@@ -144,7 +155,40 @@ not cover all possible attack vectors.
potentially be influenced for Spectre attacks, new "nospec" accessor
macros are used to prevent speculative loading of data.
- Spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
+Spectre variant 1 (swapgs)
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ An attacker can train the branch predictor to speculatively skip the
+ swapgs path for an interrupt or exception. If they initialize
+ the GS register to a user-space value, if the swapgs is speculatively
+ skipped, subsequent GS-related percpu accesses in the speculation
+ window will be done with the attacker-controlled GS value. This
+ could cause privileged memory to be accessed and leaked.
+
+ For example:
+
+ ::
+
+ if (coming from user space)
+ swapgs
+ mov %gs:<percpu_offset>, %reg
+ mov (%reg), %reg1
+
+ When coming from user space, the CPU can speculatively skip the
+ swapgs, and then do a speculative percpu load using the user GS
+ value. So the user can speculatively force a read of any kernel
+ value. If a gadget exists which uses the percpu value as an address
+ in another load/store, then the contents of the kernel value may
+ become visible via an L1 side channel attack.
+
+ A similar attack exists when coming from kernel space. The CPU can
+ speculatively do the swapgs, causing the user GS to get used for the
+ rest of the speculative window.
+
+Spectre variant 2
+~~~~~~~~~~~~~~~~~
+
+ A spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
target buffer (BTB) before issuing syscall to launch an attack.
After entering the kernel, the kernel could use the poisoned branch
target buffer on indirect jump and jump to gadget code in speculative
@@ -280,11 +324,18 @@ The sysfs file showing Spectre variant 1 mitigation status is:
The possible values in this file are:
- ======================================= =================================
- 'Mitigation: __user pointer sanitation' Protection in kernel on a case by
- case base with explicit pointer
- sanitation.
- ======================================= =================================
+ .. list-table::
+
+ * - 'Not affected'
+ - The processor is not vulnerable.
+ * - 'Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers'
+ - The swapgs protections are disabled; otherwise it has
+ protection in the kernel on a case by case base with explicit
+ pointer sanitation and usercopy LFENCE barriers.
+ * - 'Mitigation: usercopy/swapgs barriers and __user pointer sanitization'
+ - Protection in the kernel on a case by case base with explicit
+ pointer sanitation, usercopy LFENCE barriers, and swapgs LFENCE
+ barriers.
However, the protections are put in place on a case by case basis,
and there is no guarantee that all possible attack vectors for Spectre
@@ -366,12 +417,27 @@ Turning on mitigation for Spectre variant 1 and Spectre variant 2
1. Kernel mitigation
^^^^^^^^^^^^^^^^^^^^
+Spectre variant 1
+~~~~~~~~~~~~~~~~~
+
For the Spectre variant 1, vulnerable kernel code (as determined
by code audit or scanning tools) is annotated on a case by case
basis to use nospec accessor macros for bounds clipping :ref:`[2]
<spec_ref2>` to avoid any usable disclosure gadgets. However, it may
not cover all attack vectors for Spectre variant 1.
+ Copy-from-user code has an LFENCE barrier to prevent the access_ok()
+ check from being mis-speculated. The barrier is done by the
+ barrier_nospec() macro.
+
+ For the swapgs variant of Spectre variant 1, LFENCE barriers are
+ added to interrupt, exception and NMI entry where needed. These
+ barriers are done by the FENCE_SWAPGS_KERNEL_ENTRY and
+ FENCE_SWAPGS_USER_ENTRY macros.
+
+Spectre variant 2
+~~~~~~~~~~~~~~~~~
+
For Spectre variant 2 mitigation, the compiler turns indirect calls or
jumps in the kernel into equivalent return trampolines (retpolines)
:ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` to go to the target
@@ -473,6 +539,12 @@ Mitigation control on the kernel command line
Spectre variant 2 mitigation can be disabled or force enabled at the
kernel command line.
+ nospectre_v1
+
+ [X86,PPC] Disable mitigations for Spectre Variant 1
+ (bounds check bypass). With this option data leaks are
+ possible in the system.
+
nospectre_v2
[X86] Disable all mitigations for the Spectre variant 2
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ce3696d8256c..3e2d22eb5e59 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1736,6 +1736,11 @@
Note that using this option lowers the security
provided by tboot because it makes the system
vulnerable to DMA attacks.
+ nobounce [Default off]
+ Disable bounce buffer for unstrusted devices such as
+ the Thunderbolt devices. This will treat the untrusted
+ devices as the trusted ones, hence might expose security
+ risks of DMA attacks.
intel_idle.max_cstate= [KNL,HW,ACPI,X86]
0 disables intel_idle and fall back on acpi_idle.
@@ -1815,7 +1820,7 @@
synchronously.
iommu.passthrough=
- [ARM64] Configure DMA to bypass the IOMMU by default.
+ [ARM64, X86] Configure DMA to bypass the IOMMU by default.
Format: { "0" | "1" }
0 - Use IOMMU translation for DMA.
1 - Bypass the IOMMU for DMA.
@@ -2377,7 +2382,7 @@
machvec= [IA-64] Force the use of a particular machine-vector
(machvec) in a generic kernel.
- Example: machvec=hpzx1_swiotlb
+ Example: machvec=hpzx1
machtype= [Loongson] Share the same kernel image file between different
yeeloong laptop.
@@ -2549,7 +2554,7 @@
mem_encrypt=on: Activate SME
mem_encrypt=off: Do not activate SME
- Refer to Documentation/virtual/kvm/amd-memory-encryption.rst
+ Refer to Documentation/virt/kvm/amd-memory-encryption.rst
for details on when memory encryption can be activated.
mem_sleep_default= [SUSPEND] Default system suspend mode:
@@ -2608,7 +2613,7 @@
expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC]
kpti=0 [ARM64]
- nospectre_v1 [PPC]
+ nospectre_v1 [X86,PPC]
nobp=0 [S390]
nospectre_v2 [X86,PPC,S390,ARM64]
spectre_v2_user=off [X86]
@@ -2969,9 +2974,9 @@
nosmt=force: Force disable SMT, cannot be undone
via the sysfs control file.
- nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds
- check bypass). With this option data leaks are possible
- in the system.
+ nospectre_v1 [X86,PPC] Disable mitigations for Spectre Variant 1
+ (bounds check bypass). With this option data leaks are
+ possible in the system.
nospectre_v2 [X86,PPC_FSL_BOOK3E,ARM64] Disable all mitigations for
the Spectre variant 2 (indirect branch prediction)
@@ -3841,12 +3846,13 @@
RCU_BOOST is not set, valid values are 0-99 and
the default is zero (non-realtime operation).
- rcutree.rcu_nocb_leader_stride= [KNL]
- Set the number of NOCB kthread groups, which
- defaults to the square root of the number of
- CPUs. Larger numbers reduces the wakeup overhead
- on the per-CPU grace-period kthreads, but increases
- that same overhead on each group's leader.
+ rcutree.rcu_nocb_gp_stride= [KNL]
+ Set the number of NOCB callback kthreads in
+ each group, which defaults to the square root
+ of the number of CPUs. Larger numbers reduce
+ the wakeup overhead on the global grace-period
+ kthread, but increases that same overhead on
+ each group's NOCB grace-period kthread.
rcutree.qhimark= [KNL]
Set threshold of queued RCU callbacks beyond which
@@ -4051,6 +4057,10 @@
rcutorture.verbose= [KNL]
Enable additional printk() statements.
+ rcupdate.rcu_cpu_stall_ftrace_dump= [KNL]
+ Dump ftrace buffer after reporting RCU CPU
+ stall warning.
+
rcupdate.rcu_cpu_stall_suppress= [KNL]
Suppress RCU CPU stall warning messages.
@@ -4094,6 +4104,13 @@
Run specified binary instead of /init from the ramdisk,
used for early userspace startup. See initrd.
+ rdrand= [X86]
+ force - Override the decision by the kernel to hide the
+ advertisement of RDRAND support (this affects
+ certain AMD processors because of buggy BIOS
+ support, specifically around the suspend/resume
+ path).
+
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
diff --git a/Documentation/admin-guide/laptops/thinkpad-acpi.rst b/Documentation/admin-guide/laptops/thinkpad-acpi.rst
index adea0bf2acc5..822907dcc845 100644
--- a/Documentation/admin-guide/laptops/thinkpad-acpi.rst
+++ b/Documentation/admin-guide/laptops/thinkpad-acpi.rst
@@ -49,6 +49,7 @@ detailed description):
- Fan control and monitoring: fan speed, fan enable/disable
- WAN enable and disable
- UWB enable and disable
+ - LCD Shadow (PrivacyGuard) enable and disable
A compatibility table by model and feature is maintained on the web
site, http://ibm-acpi.sf.net/. I appreciate any success or failure
@@ -1409,6 +1410,28 @@ Sysfs notes
Documentation/driver-api/rfkill.rst for details.
+LCD Shadow control
+------------------
+
+procfs: /proc/acpi/ibm/lcdshadow
+
+Some newer T480s and T490s ThinkPads provide a feature called
+PrivacyGuard. By turning this feature on, the usable vertical and
+horizontal viewing angles of the LCD can be limited (as if some privacy
+screen was applied manually in front of the display).
+
+procfs notes
+^^^^^^^^^^^^
+
+The available commands are::
+
+ echo '0' >/proc/acpi/ibm/lcdshadow
+ echo '1' >/proc/acpi/ibm/lcdshadow
+
+The first command ensures the best viewing angle and the latter one turns
+on the feature, restricting the viewing angles.
+
+
EXPERIMENTAL: UWB
-----------------
diff --git a/Documentation/admin-guide/perf/imx-ddr.rst b/Documentation/admin-guide/perf/imx-ddr.rst
new file mode 100644
index 000000000000..517a205abad6
--- /dev/null
+++ b/Documentation/admin-guide/perf/imx-ddr.rst
@@ -0,0 +1,52 @@
+=====================================================
+Freescale i.MX8 DDR Performance Monitoring Unit (PMU)
+=====================================================
+
+There are no performance counters inside the DRAM controller, so performance
+signals are brought out to the edge of the controller where a set of 4 x 32 bit
+counters is implemented. This is controlled by the CSV modes programed in counter
+control register which causes a large number of PERF signals to be generated.
+
+Selection of the value for each counter is done via the config registers. There
+is one register for each counter. Counter 0 is special in that it always counts
+“time” and when expired causes a lock on itself and the other counters and an
+interrupt is raised. If any other counter overflows, it continues counting, and
+no interrupt is raised.
+
+The "format" directory describes format of the config (event ID) and config1
+(AXI filtering) fields of the perf_event_attr structure, see /sys/bus/event_source/
+devices/imx8_ddr0/format/. The "events" directory describes the events types
+hardware supported that can be used with perf tool, see /sys/bus/event_source/
+devices/imx8_ddr0/events/.
+ e.g.::
+ perf stat -a -e imx8_ddr0/cycles/ cmd
+ perf stat -a -e imx8_ddr0/read/,imx8_ddr0/write/ cmd
+
+AXI filtering is only used by CSV modes 0x41 (axid-read) and 0x42 (axid-write)
+to count reading or writing matches filter setting. Filter setting is various
+from different DRAM controller implementations, which is distinguished by quirks
+in the driver.
+
+* With DDR_CAP_AXI_ID_FILTER quirk.
+ Filter is defined with two configuration parts:
+ --AXI_ID defines AxID matching value.
+ --AXI_MASKING defines which bits of AxID are meaningful for the matching.
+ 0:corresponding bit is masked.
+ 1: corresponding bit is not masked, i.e. used to do the matching.
+
+ AXI_ID and AXI_MASKING are mapped on DPCR1 register in performance counter.
+ When non-masked bits are matching corresponding AXI_ID bits then counter is
+ incremented. Perf counter is incremented if
+ AxID && AXI_MASKING == AXI_ID && AXI_MASKING
+
+ This filter doesn't support filter different AXI ID for axid-read and axid-write
+ event at the same time as this filter is shared between counters.
+ e.g.::
+ perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xMMMM,axi_id=0xDDDD/ cmd
+ perf stat -a -e imx8_ddr0/axid-write,axi_mask=0xMMMM,axi_id=0xDDDD/ cmd
+
+ NOTE: axi_mask is inverted in userspace(i.e. set bits are bits to mask), and
+ it will be reverted in driver automatically. so that the user can just specify
+ axi_id to monitor a specific id, rather than having to specify axi_mask.
+ e.g.::
+ perf stat -a -e imx8_ddr0/axid-read,axi_id=0x12/ cmd, which will monitor ARID=0x12
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index a7d44e71019d..287b98708a40 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -39,7 +39,6 @@ Table : Subdirectories in /proc/sys/net
802 E802 protocol ax25 AX25
ethernet Ethernet protocol rose X.25 PLP layer
ipv4 IP version 4 x25 X.25 protocol
- ipx IPX token-ring IBM token ring
bridge Bridging decnet DEC net
ipv6 IP version 6 tipc TIPC
========= =================== = ========== ==================
@@ -401,33 +400,7 @@ interface.
(network) that the route leads to, the router (may be directly connected), the
route flags, and the device the route is using.
-
-5. IPX
-------
-
-The IPX protocol has no tunable values in proc/sys/net.
-
-The IPX protocol does, however, provide proc/net/ipx. This lists each IPX
-socket giving the local and remote addresses in Novell format (that is
-network:node:port). In accordance with the strange Novell tradition,
-everything but the port is in hex. Not_Connected is displayed for sockets that
-are not tied to a specific remote address. The Tx and Rx queue sizes indicate
-the number of bytes pending for transmission and reception. The state
-indicates the state the socket is in and the uid is the owning uid of the
-socket.
-
-The /proc/net/ipx_interface file lists all IPX interfaces. For each interface
-it gives the network number, the node number, and indicates if the network is
-the primary network. It also indicates which device it is bound to (or
-Internal for internal networks) and the Frame Type if appropriate. Linux
-supports 802.3, 802.2, 802.2 SNAP and DIX (Blue Book) ethernet framing for
-IPX.
-
-The /proc/net/ipx_route table holds a list of IPX routes. For each route it
-gives the destination network, the router node (or Directly) and the network
-address of the router (or Connected) for internal networks.
-
-6. TIPC
+5. TIPC
-------
tipc_rmem