summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide/cgroup-v2.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide/cgroup-v2.rst')
-rw-r--r--Documentation/admin-guide/cgroup-v2.rst73
1 files changed, 42 insertions, 31 deletions
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index f67c0829350b..4ef890191196 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1213,23 +1213,25 @@ PAGE_SIZE multiple when read back.
A read-write single value file which exists on non-root
cgroups. The default is "max".
- Memory usage throttle limit. This is the main mechanism to
- control memory usage of a cgroup. If a cgroup's usage goes
+ Memory usage throttle limit. If a cgroup's usage goes
over the high boundary, the processes of the cgroup are
throttled and put under heavy reclaim pressure.
Going over the high limit never invokes the OOM killer and
- under extreme conditions the limit may be breached.
+ under extreme conditions the limit may be breached. The high
+ limit should be used in scenarios where an external process
+ monitors the limited cgroup to alleviate heavy reclaim
+ pressure.
memory.max
A read-write single value file which exists on non-root
cgroups. The default is "max".
- Memory usage hard limit. This is the final protection
- mechanism. If a cgroup's memory usage reaches this limit and
- can't be reduced, the OOM killer is invoked in the cgroup.
- Under certain circumstances, the usage may go over the limit
- temporarily.
+ Memory usage hard limit. This is the main mechanism to limit
+ memory usage of a cgroup. If a cgroup's memory usage reaches
+ this limit and can't be reduced, the OOM killer is invoked in
+ the cgroup. Under certain circumstances, the usage may go
+ over the limit temporarily.
In default configuration regular 0-order allocations always
succeed unless OOM killer chooses current task as a victim.
@@ -1238,10 +1240,6 @@ PAGE_SIZE multiple when read back.
Caller could retry them differently, return into userspace
as -ENOMEM or silently ignore in cases like disk readahead.
- This is the ultimate protection mechanism. As long as the
- high limit is used and monitored properly, this limit's
- utility is limited to providing the final safety net.
-
memory.reclaim
A write-only nested-keyed file which exists for all cgroups.
@@ -1582,6 +1580,13 @@ PAGE_SIZE multiple when read back.
Healthy workloads are not expected to reach this limit.
+ memory.swap.peak
+ A read-only single value file which exists on non-root
+ cgroups.
+
+ The max swap usage recorded for the cgroup and its
+ descendants since the creation of the cgroup.
+
memory.swap.max
A read-write single value file which exists on non-root
cgroups. The default is "max".
@@ -2024,31 +2029,33 @@ that attribute:
no-change
Do not modify the I/O priority class.
- none-to-rt
- For requests that do not have an I/O priority class (NONE),
- change the I/O priority class into RT. Do not modify
- the I/O priority class of other requests.
+ promote-to-rt
+ For requests that have a non-RT I/O priority class, change it into RT.
+ Also change the priority level of these requests to 4. Do not modify
+ the I/O priority of requests that have priority class RT.
restrict-to-be
For requests that do not have an I/O priority class or that have I/O
- priority class RT, change it into BE. Do not modify the I/O priority
- class of requests that have priority class IDLE.
+ priority class RT, change it into BE. Also change the priority level
+ of these requests to 0. Do not modify the I/O priority class of
+ requests that have priority class IDLE.
idle
Change the I/O priority class of all requests into IDLE, the lowest
I/O priority class.
+ none-to-rt
+ Deprecated. Just an alias for promote-to-rt.
+
The following numerical values are associated with the I/O priority policies:
-+-------------+---+
-| no-change | 0 |
-+-------------+---+
-| none-to-rt | 1 |
-+-------------+---+
-| rt-to-be | 2 |
-+-------------+---+
-| all-to-idle | 3 |
-+-------------+---+
++----------------+---+
+| no-change | 0 |
++----------------+---+
+| rt-to-be | 2 |
++----------------+---+
+| all-to-idle | 3 |
++----------------+---+
The numerical value that corresponds to each I/O priority class is as follows:
@@ -2064,9 +2071,13 @@ The numerical value that corresponds to each I/O priority class is as follows:
The algorithm to set the I/O priority class for a request is as follows:
-- Translate the I/O priority class policy into a number.
-- Change the request I/O priority class into the maximum of the I/O priority
- class policy number and the numerical I/O priority class.
+- If I/O priority class policy is promote-to-rt, change the request I/O
+ priority class to IOPRIO_CLASS_RT and change the request I/O priority
+ level to 4.
+- If I/O priorityt class is not promote-to-rt, translate the I/O priority
+ class policy into a number, then change the request I/O priority class
+ into the maximum of the I/O priority class policy number and the numerical
+ I/O priority class.
PID
---
@@ -2439,7 +2450,7 @@ Miscellaneous controller provides 3 interface files. If two misc resources (res_
res_b 10
misc.current
- A read-only flat-keyed file shown in the non-root cgroups. It shows
+ A read-only flat-keyed file shown in the all cgroups. It shows
the current usage of the resources in the cgroup and its children.::
$ cat misc.current