summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorIngo Molnar <mingo@kernel.org>2018-07-17 10:16:02 +0300
committerIngo Molnar <mingo@kernel.org>2018-07-17 10:16:02 +0300
commitea73a5c6929b9c7d30b7b424414645641cb7d1d9 (patch)
tree19de3c2469c6476be82b143535a018d0053f9460
parent9d3cce1e8b8561fed5f383d22a4d6949db4eadbe (diff)
parent18952651dae8efcc6d565c97f8fe5629b399cb3e (diff)
downloadlinux-ea73a5c6929b9c7d30b7b424414645641cb7d1d9.tar.xz
Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney: - An optimization and a fix for RCU expedited grace periods, with the fix being from Boqun Feng. - Miscellaneous fixes, including a lockdep-annotation fix from Boqun Feng. - SRCU updates. - Updates to rcutorture and associated scripting. - Introduce grace-period sequence numbers to the RCU-bh, RCU-preempt, and RCU-sched flavors, replacing the old ->gpnum and ->completed pair of fields. This change allows lockless code to obtain the complete grace-period state with a single READ_ONCE(), which is needed to maintain tolerable lock contention during the upcoming consolidation of the three RCU flavors. Note that grace-period sequence numbers are already used by rcu_barrier(), expedited RCU grace periods, and SRCU, and are thus already heavily used and well-tested. Joel Fernandes contributed a number of excellent fixes and improvements. - Clean up some grace-period-reporting loose ends, including improving the handling of quiescent states from offline CPUs and fixing some false-positive WARN_ON_ONCE() invocations. (Strictly speaking, the WARN_ON_ONCE() invocations were quite correct, but their invariants were (harmlessly) violated by the earlier sloppy handling of quiescent states from offline CPUs.) In addition, improve grace-period forward-progress guarantees so as to allow removal of fail-safe checks that required otherwise needless lock acquisitions. Finally, add more diagnostics to help debug the upcoming consolidation of the RCU-bh, RCU-preempt, and RCU-sched flavors. - Additional miscellaneous fixes, including those contributed by Byungchul Park, Mauro Carvalho Chehab, Joe Perches, Joel Fernandes, Steven Rostedt, Andrea Parri, and Neil Brown. - Additional torture-test changes, including several contributed by Arnd Bergmann and Joel Fernandes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
-rw-r--r--Documentation/RCU/Design/Data-Structures/Data-Structures.html118
-rw-r--r--Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html22
-rw-r--r--Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-cleanup.svg123
-rw-r--r--Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-1.svg16
-rw-r--r--Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-3.svg56
-rw-r--r--Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg237
-rw-r--r--Documentation/RCU/Design/Memory-Ordering/TreeRCU-qs.svg12
-rw-r--r--Documentation/RCU/stallwarn.txt24
-rw-r--r--Documentation/RCU/whatisRCU.txt18
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt4
-rw-r--r--MAINTAINERS9
-rw-r--r--include/linux/rculist.h19
-rw-r--r--include/linux/rcupdate.h20
-rw-r--r--include/linux/rcutiny.h2
-rw-r--r--include/linux/srcu.h17
-rw-r--r--include/linux/torture.h4
-rw-r--r--include/trace/events/rcu.h112
-rw-r--r--kernel/locking/locktorture.c5
-rw-r--r--kernel/rcu/rcu.h104
-rw-r--r--kernel/rcu/rcuperf.c57
-rw-r--r--kernel/rcu/rcutorture.c462
-rw-r--r--kernel/rcu/srcutree.c39
-rw-r--r--kernel/rcu/tiny.c4
-rw-r--r--kernel/rcu/tree.c1019
-rw-r--r--kernel/rcu/tree.h71
-rw-r--r--kernel/rcu/tree_exp.h14
-rw-r--r--kernel/rcu/tree_plugin.h176
-rw-r--r--kernel/rcu/update.c45
-rw-r--r--kernel/torture.c15
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/configinit.sh26
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/kvm-build.sh11
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh1
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/kvm-recheck.sh1
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh5
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/kvm.sh2
-rwxr-xr-xtools/testing/selftests/rcutorture/bin/parse-console.sh7
-rw-r--r--tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot4
-rw-r--r--tools/testing/selftests/rcutorture/configs/rcu/TREE08-T.boot1
-rw-r--r--tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh2
39 files changed, 1675 insertions, 1209 deletions
diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
index 6c06e10bd04b..f5120a00f511 100644
--- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html
+++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
@@ -380,31 +380,26 @@ and therefore need no protection.
as follows:
<pre>
- 1 unsigned long gpnum;
- 2 unsigned long completed;
+ 1 unsigned long gp_seq;
</pre>
<p>RCU grace periods are numbered, and
-the <tt>-&gt;gpnum</tt> field contains the number of the grace
-period that started most recently.
-The <tt>-&gt;completed</tt> field contains the number of the
-grace period that completed most recently.
-If the two fields are equal, the RCU grace period that most recently
-started has already completed, and therefore the corresponding
-flavor of RCU is idle.
-If <tt>-&gt;gpnum</tt> is one greater than <tt>-&gt;completed</tt>,
-then <tt>-&gt;gpnum</tt> gives the number of the current RCU
-grace period, which has not yet completed.
-Any other combination of values indicates that something is broken.
-These two fields are protected by the root <tt>rcu_node</tt>'s
+the <tt>-&gt;gp_seq</tt> field contains the current grace-period
+sequence number.
+The bottom two bits are the state of the current grace period,
+which can be zero for not yet started or one for in progress.
+In other words, if the bottom two bits of <tt>-&gt;gp_seq</tt> are
+zero, the corresponding flavor of RCU is idle.
+Any other value in the bottom two bits indicates that something is broken.
+This field is protected by the root <tt>rcu_node</tt> structure's
<tt>-&gt;lock</tt> field.
-</p><p>There are <tt>-&gt;gpnum</tt> and <tt>-&gt;completed</tt> fields
+</p><p>There are <tt>-&gt;gp_seq</tt> fields
in the <tt>rcu_node</tt> and <tt>rcu_data</tt> structures
as well.
The fields in the <tt>rcu_state</tt> structure represent the
-most current values, and those of the other structures are compared
-in order to detect the start of a new grace period in a distributed
+most current value, and those of the other structures are compared
+in order to detect the beginnings and ends of grace periods in a distributed
fashion.
The values flow from <tt>rcu_state</tt> to <tt>rcu_node</tt>
(down the tree from the root to the leaves) to <tt>rcu_data</tt>.
@@ -512,27 +507,47 @@ than to be heisenbugged out of existence.
as follows:
<pre>
- 1 unsigned long gpnum;
- 2 unsigned long completed;
+ 1 unsigned long gp_seq;
+ 2 unsigned long gp_seq_needed;
</pre>
-<p>These fields are the counterparts of the fields of the same name in
-the <tt>rcu_state</tt> structure.
-They each may lag up to one behind their <tt>rcu_state</tt>
-counterparts.
-If a given <tt>rcu_node</tt> structure's <tt>-&gt;gpnum</tt> and
-<tt>-&gt;complete</tt> fields are equal, then this <tt>rcu_node</tt>
+<p>The <tt>rcu_node</tt> structures' <tt>-&gt;gp_seq</tt> fields are
+the counterparts of the field of the same name in the <tt>rcu_state</tt>
+structure.
+They each may lag up to one step behind their <tt>rcu_state</tt>
+counterpart.
+If the bottom two bits of a given <tt>rcu_node</tt> structure's
+<tt>-&gt;gp_seq</tt> field is zero, then this <tt>rcu_node</tt>
structure believes that RCU is idle.
-Otherwise, as with the <tt>rcu_state</tt> structure,
-the <tt>-&gt;gpnum</tt> field will be one greater than the
-<tt>-&gt;complete</tt> fields, with <tt>-&gt;gpnum</tt>
-indicating which grace period this <tt>rcu_node</tt> believes
-is still being waited for.
+</p><p>The <tt>&gt;gp_seq</tt> field of each <tt>rcu_node</tt>
+structure is updated at the beginning and the end
+of each grace period.
+
+<p>The <tt>-&gt;gp_seq_needed</tt> fields record the
+furthest-in-the-future grace period request seen by the corresponding
+<tt>rcu_node</tt> structure. The request is considered fulfilled when
+the value of the <tt>-&gt;gp_seq</tt> field equals or exceeds that of
+the <tt>-&gt;gp_seq_needed</tt> field.
-</p><p>The <tt>&gt;gpnum</tt> field of each <tt>rcu_node</tt>
-structure is updated at the beginning
-of each grace period, and the <tt>-&gt;completed</tt> fields are
-updated at the end of each grace period.
+<table>
+<tr><th>&nbsp;</th></tr>
+<tr><th align="left">Quick Quiz:</th></tr>
+<tr><td>
+ Suppose that this <tt>rcu_node</tt> structure doesn't see
+ a request for a very long time.
+ Won't wrapping of the <tt>-&gt;gp_seq</tt> field cause
+ problems?
+</td></tr>
+<tr><th align="left">Answer:</th></tr>
+<tr><td bgcolor="#ffffff"><font color="ffffff">
+ No, because if the <tt>-&gt;gp_seq_needed</tt> field lags behind the
+ <tt>-&gt;gp_seq</tt> field, the <tt>-&gt;gp_seq_needed</tt> field
+ will be updated at the end of the grace period.
+ Modulo-arithmetic comparisons therefore will always get the
+ correct answer, even with wrapping.
+</font></td></tr>
+<tr><td>&nbsp;</td></tr>
+</table>
<h5>Quiescent-State Tracking</h5>
@@ -626,9 +641,8 @@ normal and expedited grace periods, respectively.
</ol>
<p><font color="ffffff">So the locking is absolutely required in
- order to coordinate
- clearing of the bits with the grace-period numbers in
- <tt>-&gt;gpnum</tt> and <tt>-&gt;completed</tt>.
+ order to coordinate clearing of the bits with updating of the
+ grace-period sequence number in <tt>-&gt;gp_seq</tt>.
</font></td></tr>
<tr><td>&nbsp;</td></tr>
</table>
@@ -1038,15 +1052,15 @@ out any <tt>rcu_data</tt> structure for which this flag is not set.
as follows:
<pre>
- 1 unsigned long completed;
- 2 unsigned long gpnum;
+ 1 unsigned long gp_seq;
+ 2 unsigned long gp_seq_needed;
3 bool cpu_no_qs;
4 bool core_needs_qs;
5 bool gpwrap;
6 unsigned long rcu_qs_ctr_snap;
</pre>
-<p>The <tt>completed</tt> and <tt>gpnum</tt>
+<p>The <tt>-&gt;gp_seq</tt> and <tt>-&gt;gp_seq_needed</tt>
fields are the counterparts of the fields of the same name
in the <tt>rcu_state</tt> and <tt>rcu_node</tt> structures.
They may each lag up to one behind their <tt>rcu_node</tt>
@@ -1054,15 +1068,9 @@ counterparts, but in <tt>CONFIG_NO_HZ_IDLE</tt> and
<tt>CONFIG_NO_HZ_FULL</tt> kernels can lag
arbitrarily far behind for CPUs in dyntick-idle mode (but these counters
will catch up upon exit from dyntick-idle mode).
-If a given <tt>rcu_data</tt> structure's <tt>-&gt;gpnum</tt> and
-<tt>-&gt;complete</tt> fields are equal, then this <tt>rcu_data</tt>
+If the lower two bits of a given <tt>rcu_data</tt> structure's
+<tt>-&gt;gp_seq</tt> are zero, then this <tt>rcu_data</tt>
structure believes that RCU is idle.
-Otherwise, as with the <tt>rcu_state</tt> and <tt>rcu_node</tt>
-structure,
-the <tt>-&gt;gpnum</tt> field will be one greater than the
-<tt>-&gt;complete</tt> fields, with <tt>-&gt;gpnum</tt>
-indicating which grace period this <tt>rcu_data</tt> believes
-is still being waited for.
<table>
<tr><th>&nbsp;</th></tr>
@@ -1070,13 +1078,13 @@ is still being waited for.
<tr><td>
All this replication of the grace period numbers can only cause
massive confusion.
- Why not just keep a global pair of counters and be done with it???
+ Why not just keep a global sequence number and be done with it???
</td></tr>
<tr><th align="left">Answer:</th></tr>
<tr><td bgcolor="#ffffff"><font color="ffffff">
- Because if there was only a single global pair of grace-period
+ Because if there was only a single global sequence
numbers, there would need to be a single global lock to allow
- safely accessing and updating them.
+ safely accessing and updating it.
And if we are not going to have a single global lock, we need
to carefully manage the numbers on a per-node basis.
Recall from the answer to a previous Quick Quiz that the consequences
@@ -1091,8 +1099,8 @@ CPU has not yet passed through a quiescent state,
while the <tt>-&gt;core_needs_qs</tt> flag indicates that the
RCU core needs a quiescent state from the corresponding CPU.
The <tt>-&gt;gpwrap</tt> field indicates that the corresponding
-CPU has remained idle for so long that the <tt>completed</tt>
-and <tt>gpnum</tt> counters are in danger of overflow, which
+CPU has remained idle for so long that the
+<tt>gp_seq</tt> counter is in danger of overflow, which
will cause the CPU to disregard the values of its counters on
its next exit from idle.
Finally, the <tt>rcu_qs_ctr_snap</tt> field is used to detect
@@ -1130,10 +1138,10 @@ The CPU advances the callbacks in its <tt>rcu_data</tt> structure
whenever it notices that another RCU grace period has completed.
The CPU detects the completion of an RCU grace period by noticing
that the value of its <tt>rcu_data</tt> structure's
-<tt>-&gt;completed</tt> field differs from that of its leaf
+<tt>-&gt;gp_seq</tt> field differs from that of its leaf
<tt>rcu_node</tt> structure.
Recall that each <tt>rcu_node</tt> structure's
-<tt>-&gt;completed</tt> field is updated at the end of each
+<tt>-&gt;gp_seq</tt> field is updated at the beginnings and ends of each
grace period.
<p>
diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
index 8651b0b4fd79..a346ce0116eb 100644
--- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
+++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
@@ -357,7 +357,7 @@ parts, starting in this section with the various phases of
grace-period initialization.
<p>The first ordering-related grace-period initialization action is to
-increment the <tt>rcu_state</tt> structure's <tt>-&gt;gpnum</tt>
+advance the <tt>rcu_state</tt> structure's <tt>-&gt;gp_seq</tt>
grace-period-number counter, as shown below:
</p><p><img src="TreeRCU-gp-init-1.svg" alt="TreeRCU-gp-init-1.svg" width="75%">
@@ -388,7 +388,7 @@ its last CPU and if the next <tt>rcu_node</tt> structure has no online CPUs).
<p>The final <tt>rcu_gp_init()</tt> pass through the <tt>rcu_node</tt>
tree traverses breadth-first, setting each <tt>rcu_node</tt> structure's
-<tt>-&gt;gpnum</tt> field to the newly incremented value from the
+<tt>-&gt;gp_seq</tt> field to the newly advanced value from the
<tt>rcu_state</tt> structure, as shown in the following diagram.
</p><p><img src="TreeRCU-gp-init-3.svg" alt="TreeRCU-gp-init-1.svg" width="75%">
@@ -398,9 +398,9 @@ tree traverses breadth-first, setting each <tt>rcu_node</tt> structure's
to notice that a new grace period has started, as described in the next
section.
But because the grace-period kthread started the grace period at the
-root (with the increment of the <tt>rcu_state</tt> structure's
-<tt>-&gt;gpnum</tt> field) before setting each leaf <tt>rcu_node</tt>
-structure's <tt>-&gt;gpnum</tt> field, each CPU's observation of
+root (with the advancing of the <tt>rcu_state</tt> structure's
+<tt>-&gt;gp_seq</tt> field) before setting each leaf <tt>rcu_node</tt>
+structure's <tt>-&gt;gp_seq</tt> field, each CPU's observation of
the start of the grace period will happen after the actual start
of the grace period.
@@ -466,7 +466,7 @@ section that the grace period must wait on.
<tr><td>
But a RCU read-side critical section might have started
after the beginning of the grace period
- (the <tt>-&gt;gpnum++</tt> from earlier), so why should
+ (the advancing of <tt>-&gt;gp_seq</tt> from earlier), so why should
the grace period wait on such a critical section?
</td></tr>
<tr><th align="left">Answer:</th></tr>
@@ -609,10 +609,8 @@ states outstanding from other CPUs.
<h4><a name="Grace-Period Cleanup">Grace-Period Cleanup</a></h4>
<p>Grace-period cleanup first scans the <tt>rcu_node</tt> tree
-breadth-first setting all the <tt>-&gt;completed</tt> fields equal
-to the number of the newly completed grace period, then it sets
-the <tt>rcu_state</tt> structure's <tt>-&gt;completed</tt> field,
-again to the number of the newly completed grace period.
+breadth-first advancing all the <tt>-&gt;gp_seq</tt> fields, then it
+advances the <tt>rcu_state</tt> structure's <tt>-&gt;gp_seq</tt> field.
The ordering effects are shown below:
</p><p><img src="TreeRCU-gp-cleanup.svg" alt="TreeRCU-gp-cleanup.svg" width="75%">
@@ -634,7 +632,7 @@ grace-period cleanup is complete, the next grace period can begin.
CPU has reported its quiescent state, but it may be some
milliseconds before RCU becomes aware of this.
The latest reasonable candidate is once the <tt>rcu_state</tt>
- structure's <tt>-&gt;completed</tt> field has been updated,
+ structure's <tt>-&gt;gp_seq</tt> field has been updated,
but it is quite possible that some CPUs have already completed
phase two of their updates by that time.
In short, if you are going to work with RCU, you need to
@@ -647,7 +645,7 @@ grace-period cleanup is complete, the next grace period can begin.
<h4><a name="Callback Invocation">Callback Invocation</a></h4>
<p>Once a given CPU's leaf <tt>rcu_node</tt> structure's
-<tt>-&gt;completed</tt> field has been updated, that CPU can begin
+<tt>-&gt;gp_seq</tt> field has been updated, that CPU can begin
invoking its RCU callbacks that were waiting for this grace period
to end.
These callbacks are identified by <tt>rcu_advance_cbs()</tt>,
diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-cleanup.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-cleanup.svg
index 754f426b297a..bf84fbab27ee 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-cleanup.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-cleanup.svg
@@ -384,11 +384,11 @@
inkscape:window-height="1144"
id="namedview208"
showgrid="true"
- inkscape:zoom="0.70710678"
- inkscape:cx="617.89017"
- inkscape:cy="542.52419"
- inkscape:window-x="86"
- inkscape:window-y="28"
+ inkscape:zoom="0.78716603"
+ inkscape:cx="513.06403"
+ inkscape:cy="623.1214"
+ inkscape:window-x="102"
+ inkscape:window-y="38"
inkscape:window-maximized="0"
inkscape:current-layer="g3188-3"
fit-margin-top="5"
@@ -417,13 +417,15 @@
id="g3188">
<text
xml:space="preserve"
- x="3199.1516"
+ x="3145.9592"
y="13255.592"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3143">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
<g
id="g3107"
transform="translate(947.90548,11584.029)">
@@ -502,13 +504,15 @@
</g>
<text
xml:space="preserve"
- x="5324.5371"
- y="15414.598"
+ x="5264.4731"
+ y="15428.84"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-753"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
+ id="text202-36-7"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-5">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<g
style="fill:none;stroke-width:0.025in"
@@ -547,15 +551,6 @@
sodipodi:linespacing="125%"><tspan
style="font-size:159.57754517px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;font-family:Liberation Sans;-inkscape-font-specification:Liberation Sans"
id="tspan3104-6-5-6-0">Leaf</tspan></text>
- <text
- xml:space="preserve"
- x="7479.5796"
- y="17699.943"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-9"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
<path
sodipodi:nodetypes="cc"
inkscape:connector-curvature="0"
@@ -566,15 +561,6 @@
style="fill:none;stroke-width:0.025in"
transform="translate(-737.93887,7732.6672)"
id="g3188-3">
- <text
- xml:space="preserve"
- x="3225.7478"
- y="13175.802"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-60"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">rsp-&gt;completed =</text>
<g
id="g3107-62"
transform="translate(947.90548,11584.029)">
@@ -607,15 +593,6 @@
sodipodi:linespacing="125%"><tspan
style="font-size:159.57754517px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;font-family:Liberation Sans;-inkscape-font-specification:Liberation Sans"
id="tspan3104-6-5-7">Root</tspan></text>
- <text
- xml:space="preserve"
- x="3225.7478"
- y="13390.038"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-60-3"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"> rnp-&gt;completed</text>
<flowRoot
xml:space="preserve"
id="flowRoot3356"
@@ -627,7 +604,18 @@
height="63.63961"
x="332.34018"
y="681.87292" /></flowRegion><flowPara
- id="flowPara3362" /></flowRoot> </g>
+ id="flowPara3362" /></flowRoot> <text
+ xml:space="preserve"
+ x="3156.6121"
+ y="13317.754"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-36-6"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-0">rcu_seq_end(&amp;rsp-&gt;gp_seq)</tspan></text>
+ </g>
<g
style="fill:none;stroke-width:0.025in"
transform="translate(-858.40227,7769.0342)"
@@ -859,6 +847,17 @@
id="path3414-8-3-6-6"
inkscape:connector-curvature="0"
sodipodi:nodetypes="cc" />
+ <text
+ xml:space="preserve"
+ x="7418.769"
+ y="17646.104"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-36-70"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-93">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<g
transform="translate(-1642.5377,-11611.245)"
@@ -887,13 +886,15 @@
</g>
<text
xml:space="preserve"
- x="5327.3057"
+ x="5274.1133"
y="15428.84"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202-36"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<g
transform="translate(-151.71746,-11647.612)"
@@ -972,13 +973,15 @@
id="tspan3104-6-5-6-0-92">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7486.4907"
- y="17670.119"
+ x="7408.5918"
+ y="17619.504"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-6"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
+ id="text202-36-2"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-9">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<g
transform="translate(-6817.1997,-11647.612)"
@@ -1019,13 +1022,15 @@
id="tspan3104-6-5-6-0-1">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7474.1382"
- y="17688.926"
+ x="7416.8003"
+ y="17619.504"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-5"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
+ id="text202-36-3"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-56">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<path
style="fill:none;stroke:#000000;stroke-width:13.29812908px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#Arrow2Lend)"
@@ -1059,15 +1064,6 @@
id="path3414-8-3-6"
inkscape:connector-curvature="0"
sodipodi:nodetypes="cc" />
- <text
- xml:space="preserve"
- x="7318.9653"
- y="6031.6353"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-2"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
<g
style="fill:none;stroke-width:0.025in"
id="g4504-3-9"
@@ -1123,4 +1119,15 @@
id="path3134-9-0-3-5"
d="m 6875.6003,15833.906 1595.7755,0"
style="fill:none;stroke:#969696;stroke-width:53.19251633;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;marker-end:url(#Arrow1Send-36)" />
+ <text
+ xml:space="preserve"
+ x="7275.2612"
+ y="5971.8916"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-36-1"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-2">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</svg>
diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-1.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-1.svg
index 0161262904ec..8c207550818f 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-1.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-1.svg
@@ -272,13 +272,13 @@
inkscape:window-height="1144"
id="namedview208"
showgrid="true"
- inkscape:zoom="0.70710678"
- inkscape:cx="617.89019"
- inkscape:cy="636.57143"
- inkscape:window-x="697"
+ inkscape:zoom="2.6330492"
+ inkscape:cx="524.82797"
+ inkscape:cy="519.31194"
+ inkscape:window-x="79"
inkscape:window-y="28"
inkscape:window-maximized="0"
- inkscape:current-layer="svg2"
+ inkscape:current-layer="g3188"
fit-margin-top="5"
fit-margin-right="5"
fit-margin-left="5"
@@ -305,13 +305,15 @@
id="g3188">
<text
xml:space="preserve"
- x="3305.5364"
+ x="3119.363"
y="13255.592"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">rsp-&gt;gpnum++</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3071">rcu_seq_start(rsp-&gt;gp_seq)</tspan></text>
<g
id="g3107"
transform="translate(947.90548,11584.029)">
diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-3.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-3.svg
index de6ecc51b00e..d24d7d555dbc 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-3.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-init-3.svg
@@ -19,7 +19,7 @@
id="svg2"
version="1.1"
inkscape:version="0.48.4 r9939"
- sodipodi:docname="TreeRCU-gp-init-2.svg">
+ sodipodi:docname="TreeRCU-gp-init-3.svg">
<metadata
id="metadata212">
<rdf:RDF>
@@ -257,18 +257,22 @@
inkscape:window-width="1087"
inkscape:window-height="1144"
id="namedview208"
- showgrid="false"
- inkscape:zoom="0.70710678"
+ showgrid="true"
+ inkscape:zoom="0.68224756"
inkscape:cx="617.89019"
inkscape:cy="625.84293"
- inkscape:window-x="697"
+ inkscape:window-x="54"
inkscape:window-y="28"
inkscape:window-maximized="0"
- inkscape:current-layer="svg2"
+ inkscape:current-layer="g3153"
fit-margin-top="5"
fit-margin-right="5"
fit-margin-left="5"
- fit-margin-bottom="5" />
+ fit-margin-bottom="5">
+ <inkscape:grid
+ type="xygrid"
+ id="grid3090" />
+ </sodipodi:namedview>
<path
sodipodi:nodetypes="cccccccccccccccccccccccc"
inkscape:connector-curvature="0"
@@ -281,13 +285,13 @@
id="g3188">
<text
xml:space="preserve"
- x="3305.5364"
+ x="3145.9592"
y="13255.592"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
<g
id="g3107"
transform="translate(947.90548,11584.029)">
@@ -366,13 +370,13 @@
</g>
<text
xml:space="preserve"
- x="5392.3345"
- y="15407.104"
+ x="5253.6904"
+ y="15407.032"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202-6"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<g
style="fill:none;stroke-width:0.025in"
@@ -413,13 +417,13 @@
id="tspan3104-6-5-6-0">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7536.4883"
- y="17640.934"
+ x="7415.4365"
+ y="17670.572"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202-9"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<g
transform="translate(-1642.5375,-11610.962)"
@@ -448,13 +452,13 @@
</g>
<text
xml:space="preserve"
- x="5378.4146"
- y="15436.927"
+ x="5258.0688"
+ y="15412.313"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202-3"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<g
transform="translate(-151.71726,-11647.329)"
@@ -533,13 +537,13 @@
id="tspan3104-6-5-6-0-92">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7520.1294"
- y="17673.639"
+ x="7405.2607"
+ y="17670.572"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202-35"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<g
transform="translate(-6817.1998,-11647.329)"
@@ -580,13 +584,13 @@
id="tspan3104-6-5-6-0-1">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7521.4663"
- y="17666.062"
+ x="7413.4688"
+ y="17670.566"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202-75"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<path
style="fill:none;stroke:#000000;stroke-width:13.29812908px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#Arrow2Lend)"
@@ -622,11 +626,11 @@
sodipodi:nodetypes="cc" />
<text
xml:space="preserve"
- x="7370.856"
- y="5997.5972"
+ x="7271.9297"
+ y="6023.2412"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202-62"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</svg>
diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
index b13b7b01bb3a..acd73c7ad0f4 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
@@ -1070,13 +1070,13 @@
inkscape:window-height="1144"
id="namedview208"
showgrid="true"
- inkscape:zoom="0.6004608"
- inkscape:cx="826.65969"
- inkscape:cy="483.3047"
- inkscape:window-x="66"
- inkscape:window-y="28"
+ inkscape:zoom="0.81932583"
+ inkscape:cx="840.45848"
+ inkscape:cy="5052.4242"
+ inkscape:window-x="787"
+ inkscape:window-y="24"
inkscape:window-maximized="0"
- inkscape:current-layer="svg2"
+ inkscape:current-layer="g4"
fit-margin-top="5"
fit-margin-right="5"
fit-margin-left="5"
@@ -1543,15 +1543,6 @@
style="fill:none;stroke-width:0.025in"
transform="translate(1749.0282,658.72243)"
id="g3188">
- <text
- xml:space="preserve"
- x="3305.5364"
- y="13255.592"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-5"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">rsp-&gt;gpnum++</text>
<g
id="g3107-62"
transform="translate(947.90548,11584.029)">
@@ -1584,6 +1575,17 @@
sodipodi:linespacing="125%"><tspan
style="font-size:159.57754517px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;font-family:Liberation Sans;-inkscape-font-specification:Liberation Sans"
id="tspan3104-6-5-7">Root</tspan></text>
+ <text
+ xml:space="preserve"
+ x="3137.9988"
+ y="13271.316"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-626"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3071">rcu_seq_start(rsp-&gt;gp_seq)</tspan></text>
</g>
<rect
ry="0"
@@ -2318,15 +2320,6 @@
style="fill:none;stroke-width:0.025in"
transform="translate(1739.0986,17188.625)"
id="g3188-6">
- <text
- xml:space="preserve"
- x="3305.5364"
- y="13255.592"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-1"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
<g
id="g3107-5"
transform="translate(947.90548,11584.029)">
@@ -2359,6 +2352,15 @@
sodipodi:linespacing="125%"><tspan
style="font-size:159.57754517px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;font-family:Liberation Sans;-inkscape-font-specification:Liberation Sans"
id="tspan3104-6-5-1">Root</tspan></text>
+ <text
+ xml:space="preserve"
+ x="3147.9268"
+ y="13240.524"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-1"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<g
style="fill:none;stroke-width:0.025in"
@@ -2387,13 +2389,13 @@
</g>
<text
xml:space="preserve"
- x="5392.3345"
- y="15407.104"
+ x="5263.1094"
+ y="15411.646"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-6-7"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ id="text202-92"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<g
style="fill:none;stroke-width:0.025in"
@@ -2434,13 +2436,13 @@
id="tspan3104-6-5-6-0-94">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7536.4883"
- y="17640.934"
+ x="7417.4053"
+ y="17655.502"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-9"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ id="text202-759"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<g
transform="translate(-2353.8462,17224.992)"
@@ -2469,13 +2471,13 @@
</g>
<text
xml:space="preserve"
- x="5378.4146"
- y="15436.927"
+ x="5246.1548"
+ y="15411.648"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-3"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ id="text202-87"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<g
transform="translate(-863.02613,17188.625)"
@@ -2554,13 +2556,13 @@
id="tspan3104-6-5-6-0-92-6">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7520.1294"
- y="17673.639"
+ x="7433.8257"
+ y="17682.098"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-35"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ id="text202-2"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<g
transform="translate(-7528.5085,17188.625)"
@@ -2601,13 +2603,13 @@
id="tspan3104-6-5-6-0-1-8">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7521.4663"
- y="17666.062"
+ x="7415.4404"
+ y="17682.098"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-75-1"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
+ id="text202-0"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
<path
style="fill:none;stroke:#000000;stroke-width:13.29812813px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#Arrow2Lend)"
@@ -2641,15 +2643,6 @@
id="path3414-8-3-6-4"
inkscape:connector-curvature="0"
sodipodi:nodetypes="cc" />
- <text
- xml:space="preserve"
- x="6659.5469"
- y="34833.551"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-62"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gpnum = rsp-&gt;gpnum</text>
<path
sodipodi:nodetypes="ccc"
inkscape:connector-curvature="0"
@@ -3844,7 +3837,7 @@
font-weight="bold"
font-size="192"
id="text202-6-6-5"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rdp-&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rdp-&gt;gp_seq</text>
<text
xml:space="preserve"
x="5035.4155"
@@ -4284,15 +4277,6 @@
style="fill:none;stroke-width:0.025in"
transform="translate(1874.038,53203.538)"
id="g3188-7">
- <text
- xml:space="preserve"
- x="3199.1516"
- y="13255.592"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-82"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
<g
id="g3107-53"
transform="translate(947.90548,11584.029)">
@@ -4325,6 +4309,17 @@
sodipodi:linespacing="125%"><tspan
style="font-size:159.57754517px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;font-family:Liberation Sans;-inkscape-font-specification:Liberation Sans"
id="tspan3104-6-5-19">Root</tspan></text>
+ <text
+ xml:space="preserve"
+ x="3175.896"
+ y="13240.11"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-36-3"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<rect
ry="0"
@@ -4371,13 +4366,15 @@
</g>
<text
xml:space="preserve"
- x="5324.5371"
- y="15414.598"
+ x="5264.4829"
+ y="15411.231"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-753"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
+ id="text202-36-7"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-5">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<g
style="fill:none;stroke-width:0.025in"
@@ -4412,30 +4409,12 @@
sodipodi:linespacing="125%"><tspan
style="font-size:159.57754517px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;font-family:Liberation Sans;-inkscape-font-specification:Liberation Sans"
id="tspan3104-6-5-6-0-4">Leaf</tspan></text>
- <text
- xml:space="preserve"
- x="10084.225"
- y="70903.312"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-9-0"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
<path
sodipodi:nodetypes="ccc"
inkscape:connector-curvature="0"
id="path3134-9-0-3-9"
d="m 6315.6122,72629.054 -20.9533,8108.684 1648.968,0"
style="fill:none;stroke:#969696;stroke-width:53.19251251;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;marker-end:url(#Arrow1Send)" />
- <text
- xml:space="preserve"
- x="5092.4683"
- y="74111.672"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-60"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rsp-&gt;completed =</text>
<g
style="fill:none;stroke-width:0.025in"
id="g3107-62-6"
@@ -4469,15 +4448,6 @@
sodipodi:linespacing="125%"><tspan
style="font-size:159.57754517px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;font-family:Liberation Sans;-inkscape-font-specification:Liberation Sans"
id="tspan3104-6-5-7-7">Root</tspan></text>
- <text
- xml:space="preserve"
- x="5092.4683"
- y="74325.906"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-60-3"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"> rnp-&gt;completed</text>
<g
style="fill:none;stroke-width:0.025in"
transform="translate(1746.2528,60972.572)"
@@ -4736,13 +4706,15 @@
</g>
<text
xml:space="preserve"
- x="5327.3057"
- y="15428.84"
+ x="5274.1216"
+ y="15411.231"
font-style="normal"
font-weight="bold"
font-size="192"
id="text202-36"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-6">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<g
transform="translate(-728.08545,53203.538)"
@@ -4821,13 +4793,15 @@
id="tspan3104-6-5-6-0-92-5">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7486.4907"
- y="17670.119"
+ x="7435.1987"
+ y="17708.281"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-6-2"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
+ id="text202-36-9"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-1">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<g
transform="translate(-7393.5687,53203.538)"
@@ -4868,13 +4842,15 @@
id="tspan3104-6-5-6-0-1-5">Leaf</tspan></text>
<text
xml:space="preserve"
- x="7474.1382"
- y="17688.926"
+ x="7416.8125"
+ y="17708.281"
font-style="normal"
font-weight="bold"
font-size="192"
- id="text202-5-1"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
+ id="text202-36-35"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-62">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
</g>
<path
style="fill:none;stroke:#000000;stroke-width:13.29812813px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#Arrow2Lend)"
@@ -4908,15 +4884,6 @@
id="path3414-8-3-6-67"
inkscape:connector-curvature="0"
sodipodi:nodetypes="cc" />
- <text
- xml:space="preserve"
- x="6742.6001"
- y="70882.617"
- font-style="normal"
- font-weight="bold"
- font-size="192"
- id="text202-2"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;completed = -&gt;gpnum</text>
<g
style="fill:none;stroke-width:0.025in"
id="g4504-3-9-6"
@@ -5131,5 +5098,47 @@
font-size="192"
id="text202-7-9-6-6-7"
style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_do_batch()</text>
+ <text
+ xml:space="preserve"
+ x="6698.9019"
+ y="70885.211"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-36-2"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-7">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
+ <text
+ xml:space="preserve"
+ x="10023.457"
+ y="70885.234"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-36-0"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-9">rcu_seq_end(&amp;rnp-&gt;gp_seq)</tspan></text>
+ <text
+ xml:space="preserve"
+ x="5023.3389"
+ y="74209.773"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-36-36"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"><tspan
+ style="font-size:172.87567139px"
+ id="tspan3166-0">rcu_seq_end(&amp;rsp-&gt;gp_seq)</tspan></text>
+ <text
+ xml:space="preserve"
+ x="6562.5884"
+ y="34870.727"
+ font-style="normal"
+ font-weight="bold"
+ font-size="192"
+ id="text202-3"
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">-&gt;gp_seq = rsp-&gt;gp_seq</text>
</g>
</svg>
diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-qs.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-qs.svg
index de3992f4cbe1..149bec2a4493 100644
--- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-qs.svg
+++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-qs.svg
@@ -300,13 +300,13 @@
inkscape:window-height="1144"
id="namedview208"
showgrid="true"
- inkscape:zoom="0.70710678"
- inkscape:cx="616.47598"
- inkscape:cy="595.41964"
- inkscape:window-x="813"
+ inkscape:zoom="0.96484375"
+ inkscape:cx="507.0191"
+ inkscape:cy="885.62207"
+ inkscape:window-x="47"
inkscape:window-y="28"
inkscape:window-maximized="0"
- inkscape:current-layer="g4405"
+ inkscape:current-layer="g3115"
fit-margin-top="5"
fit-margin-right="5"
fit-margin-left="5"
@@ -710,7 +710,7 @@
font-weight="bold"
font-size="192"
id="text202-6-6"
- style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rdp-&gt;gpnum</text>
+ style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rdp-&gt;gp_seq</text>
<text
xml:space="preserve"
x="5035.4155"
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 4259f95c3261..f99cf11b314b 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -172,7 +172,7 @@ it will print a message similar to the following:
INFO: rcu_sched detected stalls on CPUs/tasks:
2-...: (3 GPs behind) idle=06c/0/0 softirq=1453/1455 fqs=0
16-...: (0 ticks this GP) idle=81c/0/0 softirq=764/764 fqs=0
- (detected by 32, t=2603 jiffies, g=7073, c=7072, q=625)
+ (detected by 32, t=2603 jiffies, g=7075, q=625)
This message indicates that CPU 32 detected that CPUs 2 and 16 were both
causing stalls, and that the stall was affecting RCU-sched. This message
@@ -215,11 +215,10 @@ CPU since the last time that this CPU noted the beginning of a grace
period.
The "detected by" line indicates which CPU detected the stall (in this
-case, CPU 32), how many jiffies have elapsed since the start of the
-grace period (in this case 2603), the number of the last grace period
-to start and to complete (7073 and 7072, respectively), and an estimate
-of the total number of RCU callbacks queued across all CPUs (625 in
-this case).
+case, CPU 32), how many jiffies have elapsed since the start of the grace
+period (in this case 2603), the grace-period sequence number (7075), and
+an estimate of the total number of RCU callbacks queued across all CPUs
+(625 in this case).
In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed
for each CPU:
@@ -266,15 +265,16 @@ If the relevant grace-period kthread has been unable to run prior to
the stall warning, as was the case in the "All QSes seen" line above,
the following additional line is printed:
- kthread starved for 23807 jiffies! g7073 c7072 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
+ kthread starved for 23807 jiffies! g7075 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5
Starving the grace-period kthreads of CPU time can of course result
in RCU CPU stall warnings even when all CPUs and tasks have passed
-through the required quiescent states. The "g" and "c" numbers flag the
-number of the last grace period started and completed, respectively,
-the "f" precedes the ->gp_flags command to the grace-period kthread,
-the "RCU_GP_WAIT_FQS" indicates that the kthread is waiting for a short
-timeout, and the "state" precedes value of the task_struct ->state field.
+through the required quiescent states. The "g" number shows the current
+grace-period sequence number, the "f" precedes the ->gp_flags command
+to the grace-period kthread, the "RCU_GP_WAIT_FQS" indicates that the
+kthread is waiting for a short timeout, the "state" precedes value of the
+task_struct ->state field, and the "cpu" indicates that the grace-period
+kthread last ran on CPU 5.
Multiple Warnings From One Stall
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 65eb856526b7..c2a7facf7ff9 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -588,6 +588,7 @@ It is extremely simple:
void synchronize_rcu(void)
{
write_lock(&rcu_gp_mutex);
+ smp_mb__after_spinlock();
write_unlock(&rcu_gp_mutex);
}
@@ -609,12 +610,15 @@ don't forget about them when submitting patches making use of RCU!]
The rcu_read_lock() and rcu_read_unlock() primitive read-acquire
and release a global reader-writer lock. The synchronize_rcu()
-primitive write-acquires this same lock, then immediately releases
-it. This means that once synchronize_rcu() exits, all RCU read-side
-critical sections that were in progress before synchronize_rcu() was
-called are guaranteed to have completed -- there is no way that
-synchronize_rcu() would have been able to write-acquire the lock
-otherwise.
+primitive write-acquires this same lock, then releases it. This means
+that once synchronize_rcu() exits, all RCU read-side critical sections
+that were in progress before synchronize_rcu() was called are guaranteed
+to have completed -- there is no way that synchronize_rcu() would have
+been able to write-acquire the lock otherwise. The smp_mb__after_spinlock()
+promotes synchronize_rcu() to a full memory barrier in compliance with
+the "Memory-Barrier Guarantees" listed in:
+
+ Documentation/RCU/Design/Requirements/Requirements.html.
It is possible to nest rcu_read_lock(), since reader-writer locks may
be recursively acquired. Note also that rcu_read_lock() is immune
@@ -816,11 +820,13 @@ RCU list traversal:
list_next_rcu
list_for_each_entry_rcu
list_for_each_entry_continue_rcu
+ list_for_each_entry_from_rcu
hlist_first_rcu
hlist_next_rcu
hlist_pprev_rcu
hlist_for_each_entry_rcu
hlist_for_each_entry_rcu_bh
+ hlist_for_each_entry_from_rcu
hlist_for_each_entry_continue_rcu
hlist_for_each_entry_continue_rcu_bh
hlist_nulls_first_rcu
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 533ff5c68970..c370f5f0eb38 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3632,8 +3632,8 @@
Set time (s) after boot for CPU-hotplug testing.
rcutorture.onoff_interval= [KNL]
- Set time (s) between CPU-hotplug operations, or
- zero to disable CPU-hotplug testing.
+ Set time (jiffies) between CPU-hotplug operations,
+ or zero to disable CPU-hotplug testing.
rcutorture.shuffle_interval= [KNL]
Set task-shuffle interval (s). Shuffling tasks
diff --git a/MAINTAINERS b/MAINTAINERS
index 192d7f73fd01..16b07255574d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12038,9 +12038,9 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
F: Documentation/RCU/
X: Documentation/RCU/torture.txt
F: include/linux/rcu*
-X: include/linux/srcu.h
+X: include/linux/srcu*.h
F: kernel/rcu/
-X: kernel/torture.c
+X: kernel/rcu/srcu*.c
REAL TIME CLOCK (RTC) SUBSYSTEM
M: Alessandro Zummo <a.zummo@towertech.it>
@@ -13077,8 +13077,8 @@ L: linux-kernel@vger.kernel.org
W: http://www.rdrop.com/users/paulmck/RCU/
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
-F: include/linux/srcu.h
-F: kernel/rcu/srcu.c
+F: include/linux/srcu*.h
+F: kernel/rcu/srcu*.c
SERIAL LOW-POWER INTER-CHIP MEDIA BUS (SLIMbus)
M: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
@@ -14437,6 +14437,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
F: Documentation/RCU/torture.txt
F: kernel/torture.c
F: kernel/rcu/rcutorture.c
+F: kernel/rcu/rcuperf.c
F: kernel/locking/locktorture.c
TOSHIBA ACPI EXTRAS DRIVER
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 36df6ccbc874..4786c2235b98 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -396,7 +396,16 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
* @member: the name of the list_head within the struct.
*
* Continue to iterate over list of given type, continuing after
- * the current position.
+ * the current position which must have been in the list when the RCU read
+ * lock was taken.
+ * This would typically require either that you obtained the node from a
+ * previous walk of the list in the same RCU read-side critical section, or
+ * that you held some sort of non-RCU reference (such as a reference count)
+ * to keep the node alive *and* in the list.
+ *
+ * This iterator is similar to list_for_each_entry_from_rcu() except
+ * this starts after the given position and that one starts at the given
+ * position.
*/
#define list_for_each_entry_continue_rcu(pos, head, member) \
for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
@@ -411,6 +420,14 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
*
* Iterate over the tail of a list starting from a given position,
* which must have been in the list when the RCU read lock was taken.
+ * This would typically require either that you obtained the node from a
+ * previous walk of the list in the same RCU read-side critical section, or
+ * that you held some sort of non-RCU reference (such as a reference count)
+ * to keep the node alive *and* in the list.
+ *
+ * This iterator is similar to list_for_each_entry_continue_rcu() except
+ * this starts from the given position and that one starts from the position
+ * after the given position.
*/
#define list_for_each_entry_from_rcu(pos, head, member) \
for (; &(pos)->member != (head); \
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 65163aa0bb04..75e5b393cf44 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -64,7 +64,6 @@ void rcu_barrier_tasks(void);
void __rcu_read_lock(void);
void __rcu_read_unlock(void);
-void rcu_read_unlock_special(struct task_struct *t);
void synchronize_rcu(void);
/*
@@ -159,11 +158,11 @@ static inline void rcu_init_nohz(void) { }
} while (0)
/*
- * Note a voluntary context switch for RCU-tasks benefit. This is a
- * macro rather than an inline function to avoid #include hell.
+ * Note a quasi-voluntary context switch for RCU-tasks's benefit.
+ * This is a macro rather than an inline function to avoid #include hell.
*/
#ifdef CONFIG_TASKS_RCU
-#define rcu_note_voluntary_context_switch_lite(t) \
+#define rcu_tasks_qs(t) \
do { \
if (READ_ONCE((t)->rcu_tasks_holdout)) \
WRITE_ONCE((t)->rcu_tasks_holdout, false); \
@@ -171,14 +170,14 @@ static inline void rcu_init_nohz(void) { }
#define rcu_note_voluntary_context_switch(t) \
do { \
rcu_all_qs(); \
- rcu_note_voluntary_context_switch_lite(t); \
+ rcu_tasks_qs(t); \
} while (0)
void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
void synchronize_rcu_tasks(void);
void exit_tasks_rcu_start(void);
void exit_tasks_rcu_finish(void);
#else /* #ifdef CONFIG_TASKS_RCU */
-#define rcu_note_voluntary_context_switch_lite(t) do { } while (0)
+#define rcu_tasks_qs(t) do { } while (0)
#define rcu_note_voluntary_context_switch(t) rcu_all_qs()
#define call_rcu_tasks call_rcu_sched
#define synchronize_rcu_tasks synchronize_sched
@@ -195,8 +194,8 @@ static inline void exit_tasks_rcu_finish(void) { }
*/
#define cond_resched_tasks_rcu_qs() \
do { \
- if (!cond_resched()) \
- rcu_note_voluntary_context_switch_lite(current); \
+ rcu_tasks_qs(current); \
+ cond_resched(); \
} while (0)
/*
@@ -567,8 +566,8 @@ static inline void rcu_preempt_sleep_check(void) { }
* This is simply an identity function, but it documents where a pointer
* is handed off from RCU to some other synchronization mechanism, for
* example, reference counting or locking. In C11, it would map to
- * kill_dependency(). It could be used as follows:
- * ``
+ * kill_dependency(). It could be used as follows::
+ *
* rcu_read_lock();
* p = rcu_dereference(gp);
* long_lived = is_long_lived(p);
@@ -579,7 +578,6 @@ static inline void rcu_preempt_sleep_check(void) { }
* p = rcu_pointer_handoff(p);
* }
* rcu_read_unlock();
- *``
*/
#define rcu_pointer_handoff(p) (p)
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 7b3c82e8a625..8d9a0ea8f0b5 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -93,7 +93,7 @@ static inline void kfree_call_rcu(struct rcu_head *head,
#define rcu_note_context_switch(preempt) \
do { \
rcu_sched_qs(); \
- rcu_note_voluntary_context_switch_lite(current); \
+ rcu_tasks_qs(current); \
} while (0)
static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 91494d7e8e41..3e72a291c401 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -195,6 +195,16 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
return retval;
}
+/* Used by tracing, cannot be traced and cannot invoke lockdep. */
+static inline notrace int
+srcu_read_lock_notrace(struct srcu_struct *sp) __acquires(sp)
+{
+ int retval;
+
+ retval = __srcu_read_lock(sp);
+ return retval;
+}
+
/**
* srcu_read_unlock - unregister a old reader from an SRCU-protected structure.
* @sp: srcu_struct in which to unregister the old reader.
@@ -209,6 +219,13 @@ static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
__srcu_read_unlock(sp, idx);
}
+/* Used by tracing, cannot be traced and cannot call lockdep. */
+static inline notrace void
+srcu_read_unlock_notrace(struct srcu_struct *sp, int idx) __releases(sp)
+{
+ __srcu_read_unlock(sp, idx);
+}
+
/**
* smp_mb__after_srcu_read_unlock - ensure full ordering after srcu_read_unlock
*
diff --git a/include/linux/torture.h b/include/linux/torture.h
index 66272862070b..61dfd93b6ee4 100644
--- a/include/linux/torture.h
+++ b/include/linux/torture.h
@@ -64,6 +64,8 @@ struct torture_random_state {
long trs_count;
};
#define DEFINE_TORTURE_RANDOM(name) struct torture_random_state name = { 0, 0 }
+#define DEFINE_TORTURE_RANDOM_PERCPU(name) \
+ DEFINE_PER_CPU(struct torture_random_state, name)
unsigned long torture_random(struct torture_random_state *trsp);
/* Task shuffler, which causes CPUs to occasionally go idle. */
@@ -79,7 +81,7 @@ void stutter_wait(const char *title);
int torture_stutter_init(int s);
/* Initialization and cleanup. */
-bool torture_init_begin(char *ttype, bool v);
+bool torture_init_begin(char *ttype, int v);
void torture_init_end(void);
bool torture_cleanup_begin(void);
void torture_cleanup_end(void);
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 5936aac357ab..a8d07feff6a0 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -52,6 +52,7 @@ TRACE_EVENT(rcu_utilization,
* "cpuqs": CPU passes through a quiescent state.
* "cpuonl": CPU comes online.
* "cpuofl": CPU goes offline.
+ * "cpuofl-bgp": CPU goes offline while blocking a grace period.
* "reqwait": GP kthread sleeps waiting for grace-period request.
* "reqwaitsig": GP kthread awakened by signal from reqwait state.
* "fqswait": GP kthread waiting until time to force quiescent states.
@@ -63,24 +64,24 @@ TRACE_EVENT(rcu_utilization,
*/
TRACE_EVENT(rcu_grace_period,
- TP_PROTO(const char *rcuname, unsigned long gpnum, const char *gpevent),
+ TP_PROTO(const char *rcuname, unsigned long gp_seq, const char *gpevent),
- TP_ARGS(rcuname, gpnum, gpevent),
+ TP_ARGS(rcuname, gp_seq, gpevent),
TP_STRUCT__entry(
__field(const char *, rcuname)
- __field(unsigned long, gpnum)
+ __field(unsigned long, gp_seq)
__field(const char *, gpevent)
),
TP_fast_assign(
__entry->rcuname = rcuname;
- __entry->gpnum = gpnum;
+ __entry->gp_seq = gp_seq;
__entry->gpevent = gpevent;
),
TP_printk("%s %lu %s",
- __entry->rcuname, __entry->gpnum, __entry->gpevent)
+ __entry->rcuname, __entry->gp_seq, __entry->gpevent)
);
/*
@@ -90,8 +91,8 @@ TRACE_EVENT(rcu_grace_period,
*
* "Startleaf": Request a grace period based on leaf-node data.
* "Prestarted": Someone beat us to the request
- * "Startedleaf": Leaf-node start proved sufficient.
- * "Startedleafroot": Leaf-node start proved sufficient after checking root.
+ * "Startedleaf": Leaf node marked for future GP.
+ * "Startedleafroot": All nodes from leaf to root marked for future GP.
* "Startedroot": Requested a nocb grace period based on root-node data.
* "NoGPkthread": The RCU grace-period kthread has not yet started.
* "StartWait": Start waiting for the requested grace period.
@@ -102,17 +103,16 @@ TRACE_EVENT(rcu_grace_period,
*/
TRACE_EVENT(rcu_future_grace_period,
- TP_PROTO(const char *rcuname, unsigned long gpnum, unsigned long completed,
- unsigned long c, u8 level, int grplo, int grphi,
+ TP_PROTO(const char *rcuname, unsigned long gp_seq,
+ unsigned long gp_seq_req, u8 level, int grplo, int grphi,
const char *gpevent),
- TP_ARGS(rcuname, gpnum, completed, c, level, grplo, grphi, gpevent),
+ TP_ARGS(rcuname, gp_seq, gp_seq_req, level, grplo, grphi, gpevent),
TP_STRUCT__entry(
__field(const char *, rcuname)
- __field(unsigned long, gpnum)
- __field(unsigned long, completed)
- __field(unsigned long, c)
+ __field(unsigned long, gp_seq)
+ __field(unsigned long, gp_seq_req)
__field(u8, level)
__field(int, grplo)
__field(int, grphi)
@@ -121,19 +121,17 @@ TRACE_EVENT(rcu_future_grace_period,
TP_fast_assign(
__entry->rcuname = rcuname;
- __entry->gpnum = gpnum;
- __entry->completed = completed;
- __entry->c = c;
+ __entry->gp_seq = gp_seq;
+ __entry->gp_seq_req = gp_seq_req;
__entry->level = level;
__entry->grplo = grplo;
__entry->grphi = grphi;
__entry->gpevent = gpevent;
),
- TP_printk("%s %lu %lu %lu %u %d %d %s",
- __entry->rcuname, __entry->gpnum, __entry->completed,
- __entry->c, __entry->level, __entry->grplo, __entry->grphi,
- __entry->gpevent)
+ TP_printk("%s %lu %lu %u %d %d %s",
+ __entry->rcuname, __entry->gp_seq, __entry->gp_seq_req, __entry->level,
+ __entry->grplo, __entry->grphi, __entry->gpevent)
);
/*
@@ -145,14 +143,14 @@ TRACE_EVENT(rcu_future_grace_period,
*/
TRACE_EVENT(rcu_grace_period_init,
- TP_PROTO(const char *rcuname, unsigned long gpnum, u8 level,
+ TP_PROTO(const char *rcuname, unsigned long gp_seq, u8 level,
int grplo, int grphi, unsigned long qsmask),
- TP_ARGS(rcuname, gpnum, level, grplo, grphi, qsmask),
+ TP_ARGS(rcuname, gp_seq, level, grplo, grphi, qsmask),
TP_STRUCT__entry(
__field(const char *, rcuname)
- __field(unsigned long, gpnum)
+ __field(unsigned long, gp_seq)
__field(u8, level)
__field(int, grplo)
__field(int, grphi)
@@ -161,7 +159,7 @@ TRACE_EVENT(rcu_grace_period_init,
TP_fast_assign(
__entry->rcuname = rcuname;
- __entry->gpnum = gpnum;
+ __entry->gp_seq = gp_seq;
__entry->level = level;
__entry->grplo = grplo;
__entry->grphi = grphi;
@@ -169,7 +167,7 @@ TRACE_EVENT(rcu_grace_period_init,
),
TP_printk("%s %lu %u %d %d %lx",
- __entry->rcuname, __entry->gpnum, __entry->level,
+ __entry->rcuname, __entry->gp_seq, __entry->level,
__entry->grplo, __entry->grphi, __entry->qsmask)
);
@@ -301,24 +299,24 @@ TRACE_EVENT(rcu_nocb_wake,
*/
TRACE_EVENT(rcu_preempt_task,
- TP_PROTO(const char *rcuname, int pid, unsigned long gpnum),
+ TP_PROTO(const char *rcuname, int pid, unsigned long gp_seq),
- TP_ARGS(rcuname, pid, gpnum),
+ TP_ARGS(rcuname, pid, gp_seq),
TP_STRUCT__entry(
__field(const char *, rcuname)
- __field(unsigned long, gpnum)
+ __field(unsigned long, gp_seq)
__field(int, pid)
),
TP_fast_assign(
__entry->rcuname = rcuname;
- __entry->gpnum = gpnum;
+ __entry->gp_seq = gp_seq;
__entry->pid = pid;
),
TP_printk("%s %lu %d",
- __entry->rcuname, __entry->gpnum, __entry->pid)
+ __entry->rcuname, __entry->gp_seq, __entry->pid)
);
/*
@@ -328,23 +326,23 @@ TRACE_EVENT(rcu_preempt_task,
*/
TRACE_EVENT(rcu_unlock_preempted_task,
- TP_PROTO(const char *rcuname, unsigned long gpnum, int pid),
+ TP_PROTO(const char *rcuname, unsigned long gp_seq, int pid),
- TP_ARGS(rcuname, gpnum, pid),
+ TP_ARGS(rcuname, gp_seq, pid),
TP_STRUCT__entry(
__field(const char *, rcuname)
- __field(unsigned long, gpnum)
+ __field(unsigned long, gp_seq)
__field(int, pid)
),
TP_fast_assign(
__entry->rcuname = rcuname;
- __entry->gpnum = gpnum;
+ __entry->gp_seq = gp_seq;
__entry->pid = pid;
),
- TP_printk("%s %lu %d", __entry->rcuname, __entry->gpnum, __entry->pid)
+ TP_printk("%s %lu %d", __entry->rcuname, __entry->gp_seq, __entry->pid)
);
/*
@@ -357,15 +355,15 @@ TRACE_EVENT(rcu_unlock_preempted_task,
*/
TRACE_EVENT(rcu_quiescent_state_report,
- TP_PROTO(const char *rcuname, unsigned long gpnum,
+ TP_PROTO(const char *rcuname, unsigned long gp_seq,
unsigned long mask, unsigned long qsmask,
u8 level, int grplo, int grphi, int gp_tasks),
- TP_ARGS(rcuname, gpnum, mask, qsmask, level, grplo, grphi, gp_tasks),
+ TP_ARGS(rcuname, gp_seq, mask, qsmask, level, grplo, grphi, gp_tasks),
TP_STRUCT__entry(
__field(const char *, rcuname)
- __field(unsigned long, gpnum)
+ __field(unsigned long, gp_seq)
__field(unsigned long, mask)
__field(unsigned long, qsmask)
__field(u8, level)
@@ -376,7 +374,7 @@ TRACE_EVENT(rcu_quiescent_state_report,
TP_fast_assign(
__entry->rcuname = rcuname;
- __entry->gpnum = gpnum;
+ __entry->gp_seq = gp_seq;
__entry->mask = mask;
__entry->qsmask = qsmask;
__entry->level = level;
@@ -386,41 +384,41 @@ TRACE_EVENT(rcu_quiescent_state_report,
),
TP_printk("%s %lu %lx>%lx %u %d %d %u",
- __entry->rcuname, __entry->gpnum,
+ __entry->rcuname, __entry->gp_seq,
__entry->mask, __entry->qsmask, __entry->level,
__entry->grplo, __entry->grphi, __entry->gp_tasks)
);
/*
* Tracepoint for quiescent states detected by force_quiescent_state().
- * These trace events include the type of RCU, the grace-period number that
- * was blocked by the CPU, the CPU itself, and the type of quiescent state,
- * which can be "dti" for dyntick-idle mode, "ofl" for CPU offline, "kick"
- * when kicking a CPU that has been in dyntick-idle mode for too long, or
- * "rqc" if the CPU got a quiescent state via its rcu_qs_ctr.
+ * These trace events include the type of RCU, the grace-period number
+ * that was blocked by the CPU, the CPU itself, and the type of quiescent
+ * state, which can be "dti" for dyntick-idle mode, "kick" when kicking
+ * a CPU that has been in dyntick-idle mode for too long, or "rqc" if the
+ * CPU got a quiescent state via its rcu_qs_ctr.
*/
TRACE_EVENT(rcu_fqs,
- TP_PROTO(const char *rcuname, unsigned long gpnum, int cpu, const char *qsevent),
+ TP_PROTO(const char *rcuname, unsigned long gp_seq, int cpu, const char *qsevent),
- TP_ARGS(rcuname, gpnum, cpu, qsevent),
+ TP_ARGS(rcuname, gp_seq, cpu, qsevent),
TP_STRUCT__entry(
__field(const char *, rcuname)
- __field(unsigned long, gpnum)
+ __field(unsigned long, gp_seq)
__field(int, cpu)
__field(const char *, qsevent)
),
TP_fast_assign(
__entry->rcuname = rcuname;
- __entry->gpnum = gpnum;
+ __entry->gp_seq = gp_seq;
__entry->cpu = cpu;
__entry->qsevent = qsevent;
),
TP_printk("%s %lu %d %s",
- __entry->rcuname, __entry->gpnum,
+ __entry->rcuname, __entry->gp_seq,
__entry->cpu, __entry->qsevent)
);
@@ -753,23 +751,23 @@ TRACE_EVENT(rcu_barrier,
#else /* #ifdef CONFIG_RCU_TRACE */
-#define trace_rcu_grace_period(rcuname, gpnum, gpevent) do { } while (0)
-#define trace_rcu_future_grace_period(rcuname, gpnum, completed, c, \
+#define trace_rcu_grace_period(rcuname, gp_seq, gpevent) do { } while (0)
+#define trace_rcu_future_grace_period(rcuname, gp_seq, gp_seq_req, \
level, grplo, grphi, event) \
do { } while (0)
-#define trace_rcu_grace_period_init(rcuname, gpnum, level, grplo, grphi, \
+#define trace_rcu_grace_period_init(rcuname, gp_seq, level, grplo, grphi, \
qsmask) do { } while (0)
#define trace_rcu_exp_grace_period(rcuname, gqseq, gpevent) \
do { } while (0)
#define trace_rcu_exp_funnel_lock(rcuname, level, grplo, grphi, gpevent) \
do { } while (0)
#define trace_rcu_nocb_wake(rcuname, cpu, reason) do { } while (0)
-#define trace_rcu_preempt_task(rcuname, pid, gpnum) do { } while (0)
-#define trace_rcu_unlock_preempted_task(rcuname, gpnum, pid) do { } while (0)
-#define trace_rcu_quiescent_state_report(rcuname, gpnum, mask, qsmask, level, \
+#define trace_rcu_preempt_task(rcuname, pid, gp_seq) do { } while (0)
+#define trace_rcu_unlock_preempted_task(rcuname, gp_seq, pid) do { } while (0)
+#define trace_rcu_quiescent_state_report(rcuname, gp_seq, mask, qsmask, level, \
grplo, grphi, gp_tasks) do { } \
while (0)
-#define trace_rcu_fqs(rcuname, gpnum, cpu, qsevent) do { } while (0)
+#define trace_rcu_fqs(rcuname, gp_seq, cpu, qsevent) do { } while (0)
#define trace_rcu_dyntick(polarity, oldnesting, newnesting, dyntick) do { } while (0)
#define trace_rcu_callback(rcuname, rhp, qlen_lazy, qlen) do { } while (0)
#define trace_rcu_kfree_callback(rcuname, rhp, offset, qlen_lazy, qlen) \
diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index 8402b3349dca..57bef4fbfb31 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -21,6 +21,9 @@
* Davidlohr Bueso <dave@stgolabs.net>
* Based on kernel/rcu/torture.c.
*/
+
+#define pr_fmt(fmt) fmt
+
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kthread.h>
@@ -57,7 +60,7 @@ torture_param(int, shutdown_secs, 0, "Shutdown time (j), <= zero to disable.");
torture_param(int, stat_interval, 60,
"Number of seconds between stats printk()s");
torture_param(int, stutter, 5, "Number of jiffies to run/halt test, 0=disable");
-torture_param(bool, verbose, true,
+torture_param(int, verbose, 1,
"Enable verbose debugging printk()s");
static char *torture_type = "spin_lock";
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 40cea6735c2d..4d04683c31b2 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -91,7 +91,17 @@ static inline void rcu_seq_end(unsigned long *sp)
WRITE_ONCE(*sp, rcu_seq_endval(sp));
}
-/* Take a snapshot of the update side's sequence number. */
+/*
+ * rcu_seq_snap - Take a snapshot of the update side's sequence number.
+ *
+ * This function returns the earliest value of the grace-period sequence number
+ * that will indicate that a full grace period has elapsed since the current
+ * time. Once the grace-period sequence number has reached this value, it will
+ * be safe to invoke all callbacks that have been registered prior to the
+ * current time. This value is the current grace-period number plus two to the
+ * power of the number of low-order bits reserved for state, then rounded up to
+ * the next value in which the state bits are all zero.
+ */
static inline unsigned long rcu_seq_snap(unsigned long *sp)
{
unsigned long s;
@@ -108,6 +118,15 @@ static inline unsigned long rcu_seq_current(unsigned long *sp)
}
/*
+ * Given a snapshot from rcu_seq_snap(), determine whether or not the
+ * corresponding update-side operation has started.
+ */
+static inline bool rcu_seq_started(unsigned long *sp, unsigned long s)
+{
+ return ULONG_CMP_LT((s - 1) & ~RCU_SEQ_STATE_MASK, READ_ONCE(*sp));
+}
+
+/*
* Given a snapshot from rcu_seq_snap(), determine whether or not a
* full update-side operation has occurred.
*/
@@ -117,6 +136,45 @@ static inline bool rcu_seq_done(unsigned long *sp, unsigned long s)
}
/*
+ * Has a grace period completed since the time the old gp_seq was collected?
+ */
+static inline bool rcu_seq_completed_gp(unsigned long old, unsigned long new)
+{
+ return ULONG_CMP_LT(old, new & ~RCU_SEQ_STATE_MASK);
+}
+
+/*
+ * Has a grace period started since the time the old gp_seq was collected?
+ */
+static inline bool rcu_seq_new_gp(unsigned long old, unsigned long new)
+{
+ return ULONG_CMP_LT((old + RCU_SEQ_STATE_MASK) & ~RCU_SEQ_STATE_MASK,
+ new);
+}
+
+/*
+ * Roughly how many full grace periods have elapsed between the collection
+ * of the two specified grace periods?
+ */
+static inline unsigned long rcu_seq_diff(unsigned long new, unsigned long old)
+{
+ unsigned long rnd_diff;
+
+ if (old == new)
+ return 0;
+ /*
+ * Compute the number of grace periods (still shifted up), plus
+ * one if either of new and old is not an exact grace period.
+ */
+ rnd_diff = (new & ~RCU_SEQ_STATE_MASK) -
+ ((old + RCU_SEQ_STATE_MASK) & ~RCU_SEQ_STATE_MASK) +
+ ((new & RCU_SEQ_STATE_MASK) || (old & RCU_SEQ_STATE_MASK));
+ if (ULONG_CMP_GE(RCU_SEQ_STATE_MASK, rnd_diff))
+ return 1; /* Definitely no grace period has elapsed. */
+ return ((rnd_diff - RCU_SEQ_STATE_MASK - 1) >> RCU_SEQ_CTR_SHIFT) + 2;
+}
+
+/*
* debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally
* by call_rcu() and rcu callback execution, and are therefore not part of the
* RCU API. Leaving in rcupdate.h because they are used by all RCU flavors.
@@ -276,6 +334,9 @@ static inline void rcu_init_levelspread(int *levelspread, const int *levelcnt)
/* Is this rcu_node a leaf? */
#define rcu_is_leaf_node(rnp) ((rnp)->level == rcu_num_lvls - 1)
+/* Is this rcu_node the last leaf? */
+#define rcu_is_last_leaf_node(rsp, rnp) ((rnp) == &(rsp)->node[rcu_num_nodes - 1])
+
/*
* Do a full breadth-first scan of the rcu_node structures for the
* specified rcu_state structure.
@@ -405,8 +466,7 @@ enum rcutorture_type {
#if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU)
void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
- unsigned long *gpnum, unsigned long *completed);
-void rcutorture_record_test_transition(void);
+ unsigned long *gp_seq);
void rcutorture_record_progress(unsigned long vernum);
void do_trace_rcu_torture_read(const char *rcutorturename,
struct rcu_head *rhp,
@@ -415,15 +475,11 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
unsigned long c);
#else
static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
- int *flags,
- unsigned long *gpnum,
- unsigned long *completed)
+ int *flags, unsigned long *gp_seq)
{
*flags = 0;
- *gpnum = 0;
- *completed = 0;
+ *gp_seq = 0;
}
-static inline void rcutorture_record_test_transition(void) { }
static inline void rcutorture_record_progress(unsigned long vernum) { }
#ifdef CONFIG_RCU_TRACE
void do_trace_rcu_torture_read(const char *rcutorturename,
@@ -441,31 +497,26 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
static inline void srcutorture_get_gp_data(enum rcutorture_type test_type,
struct srcu_struct *sp, int *flags,
- unsigned long *gpnum,
- unsigned long *completed)
+ unsigned long *gp_seq)
{
if (test_type != SRCU_FLAVOR)
return;
*flags = 0;
- *completed = sp->srcu_idx;
- *gpnum = *completed;
+ *gp_seq = sp->srcu_idx;
}
#elif defined(CONFIG_TREE_SRCU)
void srcutorture_get_gp_data(enum rcutorture_type test_type,
struct srcu_struct *sp, int *flags,
- unsigned long *gpnum, unsigned long *completed);
+ unsigned long *gp_seq);
#endif
#ifdef CONFIG_TINY_RCU
-static inline unsigned long rcu_batches_started(void) { return 0; }
-static inline unsigned long rcu_batches_started_bh(void) { return 0; }
-static inline unsigned long rcu_batches_started_sched(void) { return 0; }
-static inline unsigned long rcu_batches_completed(void) { return 0; }
-static inline unsigned long rcu_batches_completed_bh(void) { return 0; }
-static inline unsigned long rcu_batches_completed_sched(void) { return 0; }
+static inline unsigned long rcu_get_gp_seq(void) { return 0; }
+static inline unsigned long rcu_bh_get_gp_seq(void) { return 0; }
+static inline unsigned long rcu_sched_get_gp_seq(void) { return 0; }
static inline unsigned long rcu_exp_batches_completed(void) { return 0; }
static inline unsigned long rcu_exp_batches_completed_sched(void) { return 0; }
static inline unsigned long
@@ -474,19 +525,16 @@ static inline void rcu_force_quiescent_state(void) { }
static inline void rcu_bh_force_quiescent_state(void) { }
static inline void rcu_sched_force_quiescent_state(void) { }
static inline void show_rcu_gp_kthreads(void) { }
+static inline int rcu_get_gp_kthreads_prio(void) { return 0; }
#else /* #ifdef CONFIG_TINY_RCU */
-extern unsigned long rcutorture_testseq;
-extern unsigned long rcutorture_vernum;
-unsigned long rcu_batches_started(void);
-unsigned long rcu_batches_started_bh(void);
-unsigned long rcu_batches_started_sched(void);
-unsigned long rcu_batches_completed(void);
-unsigned long rcu_batches_completed_bh(void);
-unsigned long rcu_batches_completed_sched(void);
+unsigned long rcu_get_gp_seq(void);
+unsigned long rcu_bh_get_gp_seq(void);
+unsigned long rcu_sched_get_gp_seq(void);
unsigned long rcu_exp_batches_completed(void);
unsigned long rcu_exp_batches_completed_sched(void);
unsigned long srcu_batches_completed(struct srcu_struct *sp);
void show_rcu_gp_kthreads(void);
+int rcu_get_gp_kthreads_prio(void);
void rcu_force_quiescent_state(void);
void rcu_bh_force_quiescent_state(void);
void rcu_sched_force_quiescent_state(void);
diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index e232846516b3..34244523550e 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -19,6 +19,9 @@
*
* Authors: Paul E. McKenney <paulmck@us.ibm.com>
*/
+
+#define pr_fmt(fmt) fmt
+
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/init.h>
@@ -88,7 +91,7 @@ torture_param(int, nreaders, -1, "Number of RCU reader threads");
torture_param(int, nwriters, -1, "Number of RCU updater threads");
torture_param(bool, shutdown, !IS_ENABLED(MODULE),
"Shutdown at end of performance tests.");
-torture_param(bool, verbose, true, "Enable verbose debugging printk()s");
+torture_param(int, verbose, 1, "Enable verbose debugging printk()s");
torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to disable");
static char *perf_type = "rcu";
@@ -135,8 +138,8 @@ struct rcu_perf_ops {
void (*cleanup)(void);
int (*readlock)(void);
void (*readunlock)(int idx);
- unsigned long (*started)(void);
- unsigned long (*completed)(void);
+ unsigned long (*get_gp_seq)(void);
+ unsigned long (*gp_diff)(unsigned long new, unsigned long old);
unsigned long (*exp_completed)(void);
void (*async)(struct rcu_head *head, rcu_callback_t func);
void (*gp_barrier)(void);
@@ -176,8 +179,8 @@ static struct rcu_perf_ops rcu_ops = {
.init = rcu_sync_perf_init,
.readlock = rcu_perf_read_lock,
.readunlock = rcu_perf_read_unlock,
- .started = rcu_batches_started,
- .completed = rcu_batches_completed,
+ .get_gp_seq = rcu_get_gp_seq,
+ .gp_diff = rcu_seq_diff,
.exp_completed = rcu_exp_batches_completed,
.async = call_rcu,
.gp_barrier = rcu_barrier,
@@ -206,8 +209,8 @@ static struct rcu_perf_ops rcu_bh_ops = {
.init = rcu_sync_perf_init,
.readlock = rcu_bh_perf_read_lock,
.readunlock = rcu_bh_perf_read_unlock,
- .started = rcu_batches_started_bh,
- .completed = rcu_batches_completed_bh,
+ .get_gp_seq = rcu_bh_get_gp_seq,
+ .gp_diff = rcu_seq_diff,
.exp_completed = rcu_exp_batches_completed_sched,
.async = call_rcu_bh,
.gp_barrier = rcu_barrier_bh,
@@ -263,8 +266,8 @@ static struct rcu_perf_ops srcu_ops = {
.init = rcu_sync_perf_init,
.readlock = srcu_perf_read_lock,
.readunlock = srcu_perf_read_unlock,
- .started = NULL,
- .completed = srcu_perf_completed,
+ .get_gp_seq = srcu_perf_completed,
+ .gp_diff = rcu_seq_diff,
.exp_completed = srcu_perf_completed,
.async = srcu_call_rcu,
.gp_barrier = srcu_rcu_barrier,
@@ -292,8 +295,8 @@ static struct rcu_perf_ops srcud_ops = {
.cleanup = srcu_sync_perf_cleanup,
.readlock = srcu_perf_read_lock,
.readunlock = srcu_perf_read_unlock,
- .started = NULL,
- .completed = srcu_perf_completed,
+ .get_gp_seq = srcu_perf_completed,
+ .gp_diff = rcu_seq_diff,
.exp_completed = srcu_perf_completed,
.async = srcu_call_rcu,
.gp_barrier = srcu_rcu_barrier,
@@ -322,8 +325,8 @@ static struct rcu_perf_ops sched_ops = {
.init = rcu_sync_perf_init,
.readlock = sched_perf_read_lock,
.readunlock = sched_perf_read_unlock,
- .started = rcu_batches_started_sched,
- .completed = rcu_batches_completed_sched,
+ .get_gp_seq = rcu_sched_get_gp_seq,
+ .gp_diff = rcu_seq_diff,
.exp_completed = rcu_exp_batches_completed_sched,
.async = call_rcu_sched,
.gp_barrier = rcu_barrier_sched,
@@ -350,8 +353,8 @@ static struct rcu_perf_ops tasks_ops = {
.init = rcu_sync_perf_init,
.readlock = tasks_perf_read_lock,
.readunlock = tasks_perf_read_unlock,
- .started = rcu_no_completed,
- .completed = rcu_no_completed,
+ .get_gp_seq = rcu_no_completed,
+ .gp_diff = rcu_seq_diff,
.async = call_rcu_tasks,
.gp_barrier = rcu_barrier_tasks,
.sync = synchronize_rcu_tasks,
@@ -359,9 +362,11 @@ static struct rcu_perf_ops tasks_ops = {
.name = "tasks"
};
-static bool __maybe_unused torturing_tasks(void)
+static unsigned long rcuperf_seq_diff(unsigned long new, unsigned long old)
{
- return cur_ops == &tasks_ops;
+ if (!cur_ops->gp_diff)
+ return new - old;
+ return cur_ops->gp_diff(new, old);
}
/*
@@ -444,8 +449,7 @@ rcu_perf_writer(void *arg)
b_rcu_perf_writer_started =
cur_ops->exp_completed() / 2;
} else {
- b_rcu_perf_writer_started =
- cur_ops->completed();
+ b_rcu_perf_writer_started = cur_ops->get_gp_seq();
}
}
@@ -502,7 +506,7 @@ retry:
cur_ops->exp_completed() / 2;
} else {
b_rcu_perf_writer_finished =
- cur_ops->completed();
+ cur_ops->get_gp_seq();
}
if (shutdown) {
smp_mb(); /* Assign before wake. */
@@ -527,7 +531,7 @@ retry:
return 0;
}
-static inline void
+static void
rcu_perf_print_module_parms(struct rcu_perf_ops *cur_ops, const char *tag)
{
pr_alert("%s" PERF_FLAG
@@ -582,8 +586,8 @@ rcu_perf_cleanup(void)
t_rcu_perf_writer_finished -
t_rcu_perf_writer_started,
ngps,
- b_rcu_perf_writer_finished -
- b_rcu_perf_writer_started);
+ rcuperf_seq_diff(b_rcu_perf_writer_finished,
+ b_rcu_perf_writer_started));
for (i = 0; i < nrealwriters; i++) {
if (!writer_durations)
break;
@@ -671,12 +675,11 @@ rcu_perf_init(void)
break;
}
if (i == ARRAY_SIZE(perf_ops)) {
- pr_alert("rcu-perf: invalid perf type: \"%s\"\n",
- perf_type);
+ pr_alert("rcu-perf: invalid perf type: \"%s\"\n", perf_type);
pr_alert("rcu-perf types:");
for (i = 0; i < ARRAY_SIZE(perf_ops); i++)
- pr_alert(" %s", perf_ops[i]->name);
- pr_alert("\n");
+ pr_cont(" %s", perf_ops[i]->name);
+ pr_cont("\n");
firsterr = -EINVAL;
goto unwind;
}
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 42fcb7f05fac..c596c6f1e457 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -22,6 +22,9 @@
*
* See also: Documentation/RCU/torture.txt
*/
+
+#define pr_fmt(fmt) fmt
+
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/init.h>
@@ -52,6 +55,7 @@
#include <linux/torture.h>
#include <linux/vmalloc.h>
#include <linux/sched/debug.h>
+#include <linux/sched/sysctl.h>
#include "rcu.h"
@@ -59,6 +63,19 @@ MODULE_LICENSE("GPL");
MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com> and Josh Triplett <josh@joshtriplett.org>");
+/* Bits for ->extendables field, extendables param, and related definitions. */
+#define RCUTORTURE_RDR_SHIFT 8 /* Put SRCU index in upper bits. */
+#define RCUTORTURE_RDR_MASK ((1 << RCUTORTURE_RDR_SHIFT) - 1)
+#define RCUTORTURE_RDR_BH 0x1 /* Extend readers by disabling bh. */
+#define RCUTORTURE_RDR_IRQ 0x2 /* ... disabling interrupts. */
+#define RCUTORTURE_RDR_PREEMPT 0x4 /* ... disabling preemption. */
+#define RCUTORTURE_RDR_RCU 0x8 /* ... entering another RCU reader. */
+#define RCUTORTURE_RDR_NBITS 4 /* Number of bits defined above. */
+#define RCUTORTURE_MAX_EXTEND (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | \
+ RCUTORTURE_RDR_PREEMPT)
+#define RCUTORTURE_RDR_MAX_LOOPS 0x7 /* Maximum reader extensions. */
+ /* Must be power of two minus one. */
+
torture_param(int, cbflood_inter_holdoff, HZ,
"Holdoff between floods (jiffies)");
torture_param(int, cbflood_intra_holdoff, 1,
@@ -66,6 +83,8 @@ torture_param(int, cbflood_intra_holdoff, 1,
torture_param(int, cbflood_n_burst, 3, "# bursts in flood, zero to disable");
torture_param(int, cbflood_n_per_burst, 20000,
"# callbacks per burst in flood");
+torture_param(int, extendables, RCUTORTURE_MAX_EXTEND,
+ "Extend readers by disabling bh (1), irqs (2), or preempt (4)");
torture_param(int, fqs_duration, 0,
"Duration of fqs bursts (us), 0 to disable");
torture_param(int, fqs_holdoff, 0, "Holdoff time within fqs bursts (us)");
@@ -84,7 +103,7 @@ torture_param(int, object_debug, 0,
"Enable debug-object double call_rcu() testing");
torture_param(int, onoff_holdoff, 0, "Time after boot before CPU hotplugs (s)");
torture_param(int, onoff_interval, 0,
- "Time between CPU hotplugs (s), 0=disable");
+ "Time between CPU hotplugs (jiffies), 0=disable");
torture_param(int, shuffle_interval, 3, "Number of seconds between shuffles");
torture_param(int, shutdown_secs, 0, "Shutdown time (s), <= zero to disable.");
torture_param(int, stall_cpu, 0, "Stall duration (s), zero to disable.");
@@ -101,7 +120,7 @@ torture_param(int, test_boost_interval, 7,
"Interval between boost tests, seconds.");
torture_param(bool, test_no_idle_hz, true,
"Test support for tickless idle CPUs");
-torture_param(bool, verbose, true,
+torture_param(int, verbose, 1,
"Enable verbose debugging printk()s");
static char *torture_type = "rcu";
@@ -148,9 +167,9 @@ static long n_rcu_torture_boost_ktrerror;
static long n_rcu_torture_boost_rterror;
static long n_rcu_torture_boost_failure;
static long n_rcu_torture_boosts;
-static long n_rcu_torture_timers;
+static atomic_long_t n_rcu_torture_timers;
static long n_barrier_attempts;
-static long n_barrier_successes;
+static long n_barrier_successes; /* did rcu_barrier test succeed? */
static atomic_long_t n_cbfloods;
static struct list_head rcu_torture_removed;
@@ -261,8 +280,8 @@ struct rcu_torture_ops {
int (*readlock)(void);
void (*read_delay)(struct torture_random_state *rrsp);
void (*readunlock)(int idx);
- unsigned long (*started)(void);
- unsigned long (*completed)(void);
+ unsigned long (*get_gp_seq)(void);
+ unsigned long (*gp_diff)(unsigned long new, unsigned long old);
void (*deferred_free)(struct rcu_torture *p);
void (*sync)(void);
void (*exp_sync)(void);
@@ -274,6 +293,8 @@ struct rcu_torture_ops {
void (*stats)(void);
int irq_capable;
int can_boost;
+ int extendables;
+ int ext_irq_conflict;
const char *name;
};
@@ -302,10 +323,10 @@ static void rcu_read_delay(struct torture_random_state *rrsp)
* force_quiescent_state. */
if (!(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms))) {
- started = cur_ops->completed();
+ started = cur_ops->get_gp_seq();
ts = rcu_trace_clock_local();
mdelay(longdelay_ms);
- completed = cur_ops->completed();
+ completed = cur_ops->get_gp_seq();
do_trace_rcu_torture_read(cur_ops->name, NULL, ts,
started, completed);
}
@@ -397,8 +418,8 @@ static struct rcu_torture_ops rcu_ops = {
.readlock = rcu_torture_read_lock,
.read_delay = rcu_read_delay,
.readunlock = rcu_torture_read_unlock,
- .started = rcu_batches_started,
- .completed = rcu_batches_completed,
+ .get_gp_seq = rcu_get_gp_seq,
+ .gp_diff = rcu_seq_diff,
.deferred_free = rcu_torture_deferred_free,
.sync = synchronize_rcu,
.exp_sync = synchronize_rcu_expedited,
@@ -439,8 +460,8 @@ static struct rcu_torture_ops rcu_bh_ops = {
.readlock = rcu_bh_torture_read_lock,
.read_delay = rcu_read_delay, /* just reuse rcu's version. */
.readunlock = rcu_bh_torture_read_unlock,
- .started = rcu_batches_started_bh,
- .completed = rcu_batches_completed_bh,
+ .get_gp_seq = rcu_bh_get_gp_seq,
+ .gp_diff = rcu_seq_diff,
.deferred_free = rcu_bh_torture_deferred_free,
.sync = synchronize_rcu_bh,
.exp_sync = synchronize_rcu_bh_expedited,
@@ -449,6 +470,8 @@ static struct rcu_torture_ops rcu_bh_ops = {
.fqs = rcu_bh_force_quiescent_state,
.stats = NULL,
.irq_capable = 1,
+ .extendables = (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ),
+ .ext_irq_conflict = RCUTORTURE_RDR_RCU,
.name = "rcu_bh"
};
@@ -483,8 +506,7 @@ static struct rcu_torture_ops rcu_busted_ops = {
.readlock = rcu_torture_read_lock,
.read_delay = rcu_read_delay, /* just reuse rcu's version. */
.readunlock = rcu_torture_read_unlock,
- .started = rcu_no_completed,
- .completed = rcu_no_completed,
+ .get_gp_seq = rcu_no_completed,
.deferred_free = rcu_busted_torture_deferred_free,
.sync = synchronize_rcu_busted,
.exp_sync = synchronize_rcu_busted,
@@ -572,8 +594,7 @@ static struct rcu_torture_ops srcu_ops = {
.readlock = srcu_torture_read_lock,
.read_delay = srcu_read_delay,
.readunlock = srcu_torture_read_unlock,
- .started = NULL,
- .completed = srcu_torture_completed,
+ .get_gp_seq = srcu_torture_completed,
.deferred_free = srcu_torture_deferred_free,
.sync = srcu_torture_synchronize,
.exp_sync = srcu_torture_synchronize_expedited,
@@ -610,8 +631,7 @@ static struct rcu_torture_ops srcud_ops = {
.readlock = srcu_torture_read_lock,
.read_delay = srcu_read_delay,
.readunlock = srcu_torture_read_unlock,
- .started = NULL,
- .completed = srcu_torture_completed,
+ .get_gp_seq = srcu_torture_completed,
.deferred_free = srcu_torture_deferred_free,
.sync = srcu_torture_synchronize,
.exp_sync = srcu_torture_synchronize_expedited,
@@ -622,6 +642,26 @@ static struct rcu_torture_ops srcud_ops = {
.name = "srcud"
};
+/* As above, but broken due to inappropriate reader extension. */
+static struct rcu_torture_ops busted_srcud_ops = {
+ .ttype = SRCU_FLAVOR,
+ .init = srcu_torture_init,
+ .cleanup = srcu_torture_cleanup,
+ .readlock = srcu_torture_read_lock,
+ .read_delay = rcu_read_delay,
+ .readunlock = srcu_torture_read_unlock,
+ .get_gp_seq = srcu_torture_completed,
+ .deferred_free = srcu_torture_deferred_free,
+ .sync = srcu_torture_synchronize,
+ .exp_sync = srcu_torture_synchronize_expedited,
+ .call = srcu_torture_call,
+ .cb_barrier = srcu_torture_barrier,
+ .stats = srcu_torture_stats,
+ .irq_capable = 1,
+ .extendables = RCUTORTURE_MAX_EXTEND,
+ .name = "busted_srcud"
+};
+
/*
* Definitions for sched torture testing.
*/
@@ -648,8 +688,8 @@ static struct rcu_torture_ops sched_ops = {
.readlock = sched_torture_read_lock,
.read_delay = rcu_read_delay, /* just reuse rcu's version. */
.readunlock = sched_torture_read_unlock,
- .started = rcu_batches_started_sched,
- .completed = rcu_batches_completed_sched,
+ .get_gp_seq = rcu_sched_get_gp_seq,
+ .gp_diff = rcu_seq_diff,
.deferred_free = rcu_sched_torture_deferred_free,
.sync = synchronize_sched,
.exp_sync = synchronize_sched_expedited,
@@ -660,6 +700,7 @@ static struct rcu_torture_ops sched_ops = {
.fqs = rcu_sched_force_quiescent_state,
.stats = NULL,
.irq_capable = 1,
+ .extendables = RCUTORTURE_MAX_EXTEND,
.name = "sched"
};
@@ -687,8 +728,7 @@ static struct rcu_torture_ops tasks_ops = {
.readlock = tasks_torture_read_lock,
.read_delay = rcu_read_delay, /* just reuse rcu's version. */
.readunlock = tasks_torture_read_unlock,
- .started = rcu_no_completed,
- .completed = rcu_no_completed,
+ .get_gp_seq = rcu_no_completed,
.deferred_free = rcu_tasks_torture_deferred_free,
.sync = synchronize_rcu_tasks,
.exp_sync = synchronize_rcu_tasks,
@@ -700,6 +740,13 @@ static struct rcu_torture_ops tasks_ops = {
.name = "tasks"
};
+static unsigned long rcutorture_seq_diff(unsigned long new, unsigned long old)
+{
+ if (!cur_ops->gp_diff)
+ return new - old;
+ return cur_ops->gp_diff(new, old);
+}
+
static bool __maybe_unused torturing_tasks(void)
{
return cur_ops == &tasks_ops;
@@ -726,6 +773,44 @@ static void rcu_torture_boost_cb(struct rcu_head *head)
smp_store_release(&rbip->inflight, 0);
}
+static int old_rt_runtime = -1;
+
+static void rcu_torture_disable_rt_throttle(void)
+{
+ /*
+ * Disable RT throttling so that rcutorture's boost threads don't get
+ * throttled. Only possible if rcutorture is built-in otherwise the
+ * user should manually do this by setting the sched_rt_period_us and
+ * sched_rt_runtime sysctls.
+ */
+ if (!IS_BUILTIN(CONFIG_RCU_TORTURE_TEST) || old_rt_runtime != -1)
+ return;
+
+ old_rt_runtime = sysctl_sched_rt_runtime;
+ sysctl_sched_rt_runtime = -1;
+}
+
+static void rcu_torture_enable_rt_throttle(void)
+{
+ if (!IS_BUILTIN(CONFIG_RCU_TORTURE_TEST) || old_rt_runtime == -1)
+ return;
+
+ sysctl_sched_rt_runtime = old_rt_runtime;
+ old_rt_runtime = -1;
+}
+
+static bool rcu_torture_boost_failed(unsigned long start, unsigned long end)
+{
+ if (end - start > test_boost_duration * HZ - HZ / 2) {
+ VERBOSE_TOROUT_STRING("rcu_torture_boost boosting failed");
+ n_rcu_torture_boost_failure++;
+
+ return true; /* failed */
+ }
+
+ return false; /* passed */
+}
+
static int rcu_torture_boost(void *arg)
{
unsigned long call_rcu_time;
@@ -746,6 +831,21 @@ static int rcu_torture_boost(void *arg)
init_rcu_head_on_stack(&rbi.rcu);
/* Each pass through the following loop does one boost-test cycle. */
do {
+ /* Track if the test failed already in this test interval? */
+ bool failed = false;
+
+ /* Increment n_rcu_torture_boosts once per boost-test */
+ while (!kthread_should_stop()) {
+ if (mutex_trylock(&boost_mutex)) {
+ n_rcu_torture_boosts++;
+ mutex_unlock(&boost_mutex);
+ break;
+ }
+ schedule_timeout_uninterruptible(1);
+ }
+ if (kthread_should_stop())
+ goto checkwait;
+
/* Wait for the next test interval. */
oldstarttime = boost_starttime;
while (ULONG_CMP_LT(jiffies, oldstarttime)) {
@@ -764,11 +864,10 @@ static int rcu_torture_boost(void *arg)
/* RCU core before ->inflight = 1. */
smp_store_release(&rbi.inflight, 1);
call_rcu(&rbi.rcu, rcu_torture_boost_cb);
- if (jiffies - call_rcu_time >
- test_boost_duration * HZ - HZ / 2) {
- VERBOSE_TOROUT_STRING("rcu_torture_boost boosting failed");
- n_rcu_torture_boost_failure++;
- }
+ /* Check if the boost test failed */
+ failed = failed ||
+ rcu_torture_boost_failed(call_rcu_time,
+ jiffies);
call_rcu_time = jiffies;
}
stutter_wait("rcu_torture_boost");
@@ -777,6 +876,14 @@ static int rcu_torture_boost(void *arg)
}
/*
+ * If boost never happened, then inflight will always be 1, in
+ * this case the boost check would never happen in the above
+ * loop so do another one here.
+ */
+ if (!failed && smp_load_acquire(&rbi.inflight))
+ rcu_torture_boost_failed(call_rcu_time, jiffies);
+
+ /*
* Set the start time of the next test interval.
* Yes, this is vulnerable to long delays, but such
* delays simply cause a false negative for the next
@@ -788,7 +895,6 @@ static int rcu_torture_boost(void *arg)
if (mutex_trylock(&boost_mutex)) {
boost_starttime = jiffies +
test_boost_interval * HZ;
- n_rcu_torture_boosts++;
mutex_unlock(&boost_mutex);
break;
}
@@ -1010,7 +1116,7 @@ rcu_torture_writer(void *arg)
break;
}
}
- rcutorture_record_progress(++rcu_torture_current_version);
+ rcu_torture_current_version++;
/* Cycle through nesting levels of rcu_expedite_gp() calls. */
if (can_expedite &&
!(torture_random(&rand) & 0xff & (!!expediting - 1))) {
@@ -1084,27 +1190,133 @@ static void rcu_torture_timer_cb(struct rcu_head *rhp)
}
/*
- * RCU torture reader from timer handler. Dereferences rcu_torture_current,
- * incrementing the corresponding element of the pipeline array. The
- * counter in the element should never be greater than 1, otherwise, the
- * RCU implementation is broken.
+ * Do one extension of an RCU read-side critical section using the
+ * current reader state in readstate (set to zero for initial entry
+ * to extended critical section), set the new state as specified by
+ * newstate (set to zero for final exit from extended critical section),
+ * and random-number-generator state in trsp. If this is neither the
+ * beginning or end of the critical section and if there was actually a
+ * change, do a ->read_delay().
*/
-static void rcu_torture_timer(struct timer_list *unused)
+static void rcutorture_one_extend(int *readstate, int newstate,
+ struct torture_random_state *trsp)
+{
+ int idxnew = -1;
+ int idxold = *readstate;
+ int statesnew = ~*readstate & newstate;
+ int statesold = *readstate & ~newstate;
+
+ WARN_ON_ONCE(idxold < 0);
+ WARN_ON_ONCE((idxold >> RCUTORTURE_RDR_SHIFT) > 1);
+
+ /* First, put new protection in place to avoid critical-section gap. */
+ if (statesnew & RCUTORTURE_RDR_BH)
+ local_bh_disable();
+ if (statesnew & RCUTORTURE_RDR_IRQ)
+ local_irq_disable();
+ if (statesnew & RCUTORTURE_RDR_PREEMPT)
+ preempt_disable();
+ if (statesnew & RCUTORTURE_RDR_RCU)
+ idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;
+
+ /* Next, remove old protection, irq first due to bh conflict. */
+ if (statesold & RCUTORTURE_RDR_IRQ)
+ local_irq_enable();
+ if (statesold & RCUTORTURE_RDR_BH)
+ local_bh_enable();
+ if (statesold & RCUTORTURE_RDR_PREEMPT)
+ preempt_enable();
+ if (statesold & RCUTORTURE_RDR_RCU)
+ cur_ops->readunlock(idxold >> RCUTORTURE_RDR_SHIFT);
+
+ /* Delay if neither beginning nor end and there was a change. */
+ if ((statesnew || statesold) && *readstate && newstate)
+ cur_ops->read_delay(trsp);
+
+ /* Update the reader state. */
+ if (idxnew == -1)
+ idxnew = idxold & ~RCUTORTURE_RDR_MASK;
+ WARN_ON_ONCE(idxnew < 0);
+ WARN_ON_ONCE((idxnew >> RCUTORTURE_RDR_SHIFT) > 1);
+ *readstate = idxnew | newstate;
+ WARN_ON_ONCE((*readstate >> RCUTORTURE_RDR_SHIFT) < 0);
+ WARN_ON_ONCE((*readstate >> RCUTORTURE_RDR_SHIFT) > 1);
+}
+
+/* Return the biggest extendables mask given current RCU and boot parameters. */
+static int rcutorture_extend_mask_max(void)
+{
+ int mask;
+
+ WARN_ON_ONCE(extendables & ~RCUTORTURE_MAX_EXTEND);
+ mask = extendables & RCUTORTURE_MAX_EXTEND & cur_ops->extendables;
+ mask = mask | RCUTORTURE_RDR_RCU;
+ return mask;
+}
+
+/* Return a random protection state mask, but with at least one bit set. */
+static int
+rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
+{
+ int mask = rcutorture_extend_mask_max();
+ unsigned long randmask1 = torture_random(trsp) >> 8;
+ unsigned long randmask2 = randmask1 >> 1;
+
+ WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
+ /* Half the time lots of bits, half the time only one bit. */
+ if (randmask1 & 0x1)
+ mask = mask & randmask2;
+ else
+ mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
+ if ((mask & RCUTORTURE_RDR_IRQ) &&
+ !(mask & RCUTORTURE_RDR_BH) &&
+ (oldmask & RCUTORTURE_RDR_BH))
+ mask |= RCUTORTURE_RDR_BH; /* Can't enable bh w/irq disabled. */
+ if ((mask & RCUTORTURE_RDR_IRQ) &&
+ !(mask & cur_ops->ext_irq_conflict) &&
+ (oldmask & cur_ops->ext_irq_conflict))
+ mask |= cur_ops->ext_irq_conflict; /* Or if readers object. */
+ return mask ?: RCUTORTURE_RDR_RCU;
+}
+
+/*
+ * Do a randomly selected number of extensions of an existing RCU read-side
+ * critical section.
+ */
+static void rcutorture_loop_extend(int *readstate,
+ struct torture_random_state *trsp)
+{
+ int i;
+ int mask = rcutorture_extend_mask_max();
+
+ WARN_ON_ONCE(!*readstate); /* -Existing- RCU read-side critsect! */
+ if (!((mask - 1) & mask))
+ return; /* Current RCU flavor not extendable. */
+ i = (torture_random(trsp) >> 3) & RCUTORTURE_RDR_MAX_LOOPS;
+ while (i--) {
+ mask = rcutorture_extend_mask(*readstate, trsp);
+ rcutorture_one_extend(readstate, mask, trsp);
+ }
+}
+
+/*
+ * Do one read-side critical section, returning false if there was
+ * no data to read. Can be invoked both from process context and
+ * from a timer handler.
+ */
+static bool rcu_torture_one_read(struct torture_random_state *trsp)
{
- int idx;
unsigned long started;
unsigned long completed;
- static DEFINE_TORTURE_RANDOM(rand);
- static DEFINE_SPINLOCK(rand_lock);
+ int newstate;
struct rcu_torture *p;
int pipe_count;
+ int readstate = 0;
unsigned long long ts;
- idx = cur_ops->readlock();
- if (cur_ops->started)
- started = cur_ops->started();
- else
- started = cur_ops->completed();
+ newstate = rcutorture_extend_mask(readstate, trsp);
+ rcutorture_one_extend(&readstate, newstate, trsp);
+ started = cur_ops->get_gp_seq();
ts = rcu_trace_clock_local();
p = rcu_dereference_check(rcu_torture_current,
rcu_read_lock_bh_held() ||
@@ -1112,39 +1324,50 @@ static void rcu_torture_timer(struct timer_list *unused)
srcu_read_lock_held(srcu_ctlp) ||
torturing_tasks());
if (p == NULL) {
- /* Leave because rcu_torture_writer is not yet underway */
- cur_ops->readunlock(idx);
- return;
+ /* Wait for rcu_torture_writer to get underway */
+ rcutorture_one_extend(&readstate, 0, trsp);
+ return false;
}
if (p->rtort_mbtest == 0)
atomic_inc(&n_rcu_torture_mberror);
- spin_lock(&rand_lock);
- cur_ops->read_delay(&rand);
- n_rcu_torture_timers++;
- spin_unlock(&rand_lock);
+ rcutorture_loop_extend(&readstate, trsp);
preempt_disable();
pipe_count = p->rtort_pipe_count;
if (pipe_count > RCU_TORTURE_PIPE_LEN) {
/* Should not happen, but... */
pipe_count = RCU_TORTURE_PIPE_LEN;
}
- completed = cur_ops->completed();
+ completed = cur_ops->get_gp_seq();
if (pipe_count > 1) {
- do_trace_rcu_torture_read(cur_ops->name, &p->rtort_rcu, ts,
- started, completed);
+ do_trace_rcu_torture_read(cur_ops->name, &p->rtort_rcu,
+ ts, started, completed);
rcu_ftrace_dump(DUMP_ALL);
}
__this_cpu_inc(rcu_torture_count[pipe_count]);
- completed = completed - started;
- if (cur_ops->started)
- completed++;
+ completed = rcutorture_seq_diff(completed, started);
if (completed > RCU_TORTURE_PIPE_LEN) {
/* Should not happen, but... */
completed = RCU_TORTURE_PIPE_LEN;
}
__this_cpu_inc(rcu_torture_batch[completed]);
preempt_enable();
- cur_ops->readunlock(idx);
+ rcutorture_one_extend(&readstate, 0, trsp);
+ WARN_ON_ONCE(readstate & RCUTORTURE_RDR_MASK);
+ return true;
+}
+
+static DEFINE_TORTURE_RANDOM_PERCPU(rcu_torture_timer_rand);
+
+/*
+ * RCU torture reader from timer handler. Dereferences rcu_torture_current,
+ * incrementing the corresponding element of the pipeline array. The
+ * counter in the element should never be greater than 1, otherwise, the
+ * RCU implementation is broken.
+ */
+static void rcu_torture_timer(struct timer_list *unused)
+{
+ atomic_long_inc(&n_rcu_torture_timers);
+ (void)rcu_torture_one_read(this_cpu_ptr(&rcu_torture_timer_rand));
/* Test call_rcu() invocation from interrupt handler. */
if (cur_ops->call) {
@@ -1164,14 +1387,8 @@ static void rcu_torture_timer(struct timer_list *unused)
static int
rcu_torture_reader(void *arg)
{
- unsigned long started;
- unsigned long completed;
- int idx;
DEFINE_TORTURE_RANDOM(rand);
- struct rcu_torture *p;
- int pipe_count;
struct timer_list t;
- unsigned long long ts;
VERBOSE_TOROUT_STRING("rcu_torture_reader task started");
set_user_nice(current, MAX_NICE);
@@ -1183,49 +1400,8 @@ rcu_torture_reader(void *arg)
if (!timer_pending(&t))
mod_timer(&t, jiffies + 1);
}
- idx = cur_ops->readlock();
- if (cur_ops->started)
- started = cur_ops->started();
- else
- started = cur_ops->completed();
- ts = rcu_trace_clock_local();
- p = rcu_dereference_check(rcu_torture_current,
- rcu_read_lock_bh_held() ||
- rcu_read_lock_sched_held() ||
- srcu_read_lock_held(srcu_ctlp) ||
- torturing_tasks());
- if (p == NULL) {
- /* Wait for rcu_torture_writer to get underway */
- cur_ops->readunlock(idx);
+ if (!rcu_torture_one_read(&rand))
schedule_timeout_interruptible(HZ);
- continue;
- }
- if (p->rtort_mbtest == 0)
- atomic_inc(&n_rcu_torture_mberror);
- cur_ops->read_delay(&rand);
- preempt_disable();
- pipe_count = p->rtort_pipe_count;
- if (pipe_count > RCU_TORTURE_PIPE_LEN) {
- /* Should not happen, but... */
- pipe_count = RCU_TORTURE_PIPE_LEN;
- }
- completed = cur_ops->completed();
- if (pipe_count > 1) {
- do_trace_rcu_torture_read(cur_ops->name, &p->rtort_rcu,
- ts, started, completed);
- rcu_ftrace_dump(DUMP_ALL);
- }
- __this_cpu_inc(rcu_torture_count[pipe_count]);
- completed = completed - started;
- if (cur_ops->started)
- completed++;
- if (completed > RCU_TORTURE_PIPE_LEN) {
- /* Should not happen, but... */
- completed = RCU_TORTURE_PIPE_LEN;
- }
- __this_cpu_inc(rcu_torture_batch[completed]);
- preempt_enable();
- cur_ops->readunlock(idx);
stutter_wait("rcu_torture_reader");
} while (!torture_must_stop());
if (irqreader && cur_ops->irq_capable) {
@@ -1282,7 +1458,7 @@ rcu_torture_stats_print(void)
pr_cont("rtbf: %ld rtb: %ld nt: %ld ",
n_rcu_torture_boost_failure,
n_rcu_torture_boosts,
- n_rcu_torture_timers);
+ atomic_long_read(&n_rcu_torture_timers));
torture_onoff_stats();
pr_cont("barrier: %ld/%ld:%ld ",
n_barrier_successes,
@@ -1324,18 +1500,16 @@ rcu_torture_stats_print(void)
if (rtcv_snap == rcu_torture_current_version &&
rcu_torture_current != NULL) {
int __maybe_unused flags = 0;
- unsigned long __maybe_unused gpnum = 0;
- unsigned long __maybe_unused completed = 0;
+ unsigned long __maybe_unused gp_seq = 0;
rcutorture_get_gp_data(cur_ops->ttype,
- &flags, &gpnum, &completed);
+ &flags, &gp_seq);
srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp,
- &flags, &gpnum, &completed);
+ &flags, &gp_seq);
wtp = READ_ONCE(writer_task);
- pr_alert("??? Writer stall state %s(%d) g%lu c%lu f%#x ->state %#lx cpu %d\n",
+ pr_alert("??? Writer stall state %s(%d) g%lu f%#x ->state %#lx cpu %d\n",
rcu_torture_writer_state_getname(),
- rcu_torture_writer_state,
- gpnum, completed, flags,
+ rcu_torture_writer_state, gp_seq, flags,
wtp == NULL ? ~0UL : wtp->state,
wtp == NULL ? -1 : (int)task_cpu(wtp));
if (!splatted && wtp) {
@@ -1365,7 +1539,7 @@ rcu_torture_stats(void *arg)
return 0;
}
-static inline void
+static void
rcu_torture_print_module_parms(struct rcu_torture_ops *cur_ops, const char *tag)
{
pr_alert("%s" TORTURE_FLAG
@@ -1397,6 +1571,7 @@ static int rcutorture_booster_cleanup(unsigned int cpu)
mutex_lock(&boost_mutex);
t = boost_tasks[cpu];
boost_tasks[cpu] = NULL;
+ rcu_torture_enable_rt_throttle();
mutex_unlock(&boost_mutex);
/* This must be outside of the mutex, otherwise deadlock! */
@@ -1413,6 +1588,7 @@ static int rcutorture_booster_init(unsigned int cpu)
/* Don't allow time recalculation while creating a new task. */
mutex_lock(&boost_mutex);
+ rcu_torture_disable_rt_throttle();
VERBOSE_TOROUT_STRING("Creating rcu_torture_boost task");
boost_tasks[cpu] = kthread_create_on_node(rcu_torture_boost, NULL,
cpu_to_node(cpu),
@@ -1446,7 +1622,7 @@ static int rcu_torture_stall(void *args)
VERBOSE_TOROUT_STRING("rcu_torture_stall end holdoff");
}
if (!kthread_should_stop()) {
- stop_at = get_seconds() + stall_cpu;
+ stop_at = ktime_get_seconds() + stall_cpu;
/* RCU CPU stall is expected behavior in following code. */
rcu_read_lock();
if (stall_cpu_irqsoff)
@@ -1455,7 +1631,8 @@ static int rcu_torture_stall(void *args)
preempt_disable();
pr_alert("rcu_torture_stall start on CPU %d.\n",
smp_processor_id());
- while (ULONG_CMP_LT(get_seconds(), stop_at))
+ while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(),
+ stop_at))
continue; /* Induce RCU CPU stall warning. */
if (stall_cpu_irqsoff)
local_irq_enable();
@@ -1546,8 +1723,9 @@ static int rcu_torture_barrier(void *arg)
atomic_read(&barrier_cbs_invoked),
n_barrier_cbs);
WARN_ON_ONCE(1);
+ } else {
+ n_barrier_successes++;
}
- n_barrier_successes++;
schedule_timeout_interruptible(HZ / 10);
} while (!torture_must_stop());
torture_kthread_stopping("rcu_torture_barrier");
@@ -1610,17 +1788,39 @@ static void rcu_torture_barrier_cleanup(void)
}
}
+static bool rcu_torture_can_boost(void)
+{
+ static int boost_warn_once;
+ int prio;
+
+ if (!(test_boost == 1 && cur_ops->can_boost) && test_boost != 2)
+ return false;
+
+ prio = rcu_get_gp_kthreads_prio();
+ if (!prio)
+ return false;
+
+ if (prio < 2) {
+ if (boost_warn_once == 1)
+ return false;
+
+ pr_alert("%s: WARN: RCU kthread priority too low to test boosting. Skipping RCU boost test. Try passing rcutree.kthread_prio > 1 on the kernel command line.\n", KBUILD_MODNAME);
+ boost_warn_once = 1;
+ return false;
+ }
+
+ return true;
+}
+
static enum cpuhp_state rcutor_hp;
static void
rcu_torture_cleanup(void)
{
int flags = 0;
- unsigned long gpnum = 0;
- unsigned long completed = 0;
+ unsigned long gp_seq = 0;
int i;
- rcutorture_record_test_transition();
if (torture_cleanup_begin()) {
if (cur_ops->cb_barrier != NULL)
cur_ops->cb_barrier();
@@ -1648,17 +1848,15 @@ rcu_torture_cleanup(void)
fakewriter_tasks = NULL;
}
- rcutorture_get_gp_data(cur_ops->ttype, &flags, &gpnum, &completed);
- srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp,
- &flags, &gpnum, &completed);
- pr_alert("%s: End-test grace-period state: g%lu c%lu f%#x\n",
- cur_ops->name, gpnum, completed, flags);
+ rcutorture_get_gp_data(cur_ops->ttype, &flags, &gp_seq);
+ srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp, &flags, &gp_seq);
+ pr_alert("%s: End-test grace-period state: g%lu f%#x\n",
+ cur_ops->name, gp_seq, flags);
torture_stop_kthread(rcu_torture_stats, stats_task);
torture_stop_kthread(rcu_torture_fqs, fqs_task);
for (i = 0; i < ncbflooders; i++)
torture_stop_kthread(rcu_torture_cbflood, cbflood_task[i]);
- if ((test_boost == 1 && cur_ops->can_boost) ||
- test_boost == 2)
+ if (rcu_torture_can_boost())
cpuhp_remove_state(rcutor_hp);
/*
@@ -1746,7 +1944,7 @@ rcu_torture_init(void)
int firsterr = 0;
static struct rcu_torture_ops *torture_ops[] = {
&rcu_ops, &rcu_bh_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops,
- &sched_ops, &tasks_ops,
+ &busted_srcud_ops, &sched_ops, &tasks_ops,
};
if (!torture_init_begin(torture_type, verbose))
@@ -1763,8 +1961,8 @@ rcu_torture_init(void)
torture_type);
pr_alert("rcu-torture types:");
for (i = 0; i < ARRAY_SIZE(torture_ops); i++)
- pr_alert(" %s", torture_ops[i]->name);
- pr_alert("\n");
+ pr_cont(" %s", torture_ops[i]->name);
+ pr_cont("\n");
firsterr = -EINVAL;
goto unwind;
}
@@ -1882,8 +2080,7 @@ rcu_torture_init(void)
test_boost_interval = 1;
if (test_boost_duration < 2)
test_boost_duration = 2;
- if ((test_boost == 1 && cur_ops->can_boost) ||
- test_boost == 2) {
+ if (rcu_torture_can_boost()) {
boost_starttime = jiffies + test_boost_interval * HZ;
@@ -1897,7 +2094,7 @@ rcu_torture_init(void)
firsterr = torture_shutdown_init(shutdown_secs, rcu_torture_cleanup);
if (firsterr)
goto unwind;
- firsterr = torture_onoff_init(onoff_holdoff * HZ, onoff_interval * HZ);
+ firsterr = torture_onoff_init(onoff_holdoff * HZ, onoff_interval);
if (firsterr)
goto unwind;
firsterr = rcu_torture_stall_init();
@@ -1926,7 +2123,6 @@ rcu_torture_init(void)
goto unwind;
}
}
- rcutorture_record_test_transition();
torture_init_end();
return 0;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index b4123d7a2cec..6c9866a854b1 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -26,6 +26,8 @@
*
*/
+#define pr_fmt(fmt) "rcu: " fmt
+
#include <linux/export.h>
#include <linux/mutex.h>
#include <linux/percpu.h>
@@ -390,7 +392,8 @@ void _cleanup_srcu_struct(struct srcu_struct *sp, bool quiesced)
}
if (WARN_ON(rcu_seq_state(READ_ONCE(sp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
WARN_ON(srcu_readers_active(sp))) {
- pr_info("%s: Active srcu_struct %p state: %d\n", __func__, sp, rcu_seq_state(READ_ONCE(sp->srcu_gp_seq)));
+ pr_info("%s: Active srcu_struct %p state: %d\n",
+ __func__, sp, rcu_seq_state(READ_ONCE(sp->srcu_gp_seq)));
return; /* Caller forgot to stop doing call_srcu()? */
}
free_percpu(sp->sda);
@@ -641,6 +644,9 @@ static void srcu_funnel_exp_start(struct srcu_struct *sp, struct srcu_node *snp,
* period s. Losers must either ensure that their desired grace-period
* number is recorded on at least their leaf srcu_node structure, or they
* must take steps to invoke their own callbacks.
+ *
+ * Note that this function also does the work of srcu_funnel_exp_start(),
+ * in some cases by directly invoking it.
*/
static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
unsigned long s, bool do_norm)
@@ -823,17 +829,17 @@ static void srcu_leak_callback(struct rcu_head *rhp)
* more than one CPU, this means that when "func()" is invoked, each CPU
* is guaranteed to have executed a full memory barrier since the end of
* its last corresponding SRCU read-side critical section whose beginning
- * preceded the call to call_rcu(). It also means that each CPU executing
+ * preceded the call to call_srcu(). It also means that each CPU executing
* an SRCU read-side critical section that continues beyond the start of
- * "func()" must have executed a memory barrier after the call_rcu()
+ * "func()" must have executed a memory barrier after the call_srcu()
* but before the beginning of that SRCU read-side critical section.
* Note that these guarantees include CPUs that are offline, idle, or
* executing in user mode, as well as CPUs that are executing in the kernel.
*
- * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the
+ * Furthermore, if CPU A invoked call_srcu() and CPU B invoked the
* resulting SRCU callback function "func()", then both CPU A and CPU
* B are guaranteed to execute a full memory barrier during the time
- * interval between the call to call_rcu() and the invocation of "func()".
+ * interval between the call to call_srcu() and the invocation of "func()".
* This guarantee applies even if CPU A and CPU B are the same CPU (but
* again only if the system has more than one CPU).
*
@@ -1246,13 +1252,12 @@ static void process_srcu(struct work_struct *work)
void srcutorture_get_gp_data(enum rcutorture_type test_type,
struct srcu_struct *sp, int *flags,
- unsigned long *gpnum, unsigned long *completed)
+ unsigned long *gp_seq)
{
if (test_type != SRCU_FLAVOR)
return;
*flags = 0;
- *completed = rcu_seq_ctr(sp->srcu_gp_seq);
- *gpnum = rcu_seq_ctr(sp->srcu_gp_seq_needed);
+ *gp_seq = rcu_seq_current(&sp->srcu_gp_seq);
}
EXPORT_SYMBOL_GPL(srcutorture_get_gp_data);
@@ -1263,16 +1268,17 @@ void srcu_torture_stats_print(struct srcu_struct *sp, char *tt, char *tf)
unsigned long s0 = 0, s1 = 0;
idx = sp->srcu_idx & 0x1;
- pr_alert("%s%s Tree SRCU per-CPU(idx=%d):", tt, tf, idx);
+ pr_alert("%s%s Tree SRCU g%ld per-CPU(idx=%d):",
+ tt, tf, rcu_seq_current(&sp->srcu_gp_seq), idx);
for_each_possible_cpu(cpu) {
unsigned long l0, l1;
unsigned long u0, u1;
long c0, c1;
- struct srcu_data *counts;
+ struct srcu_data *sdp;
- counts = per_cpu_ptr(sp->sda, cpu);
- u0 = counts->srcu_unlock_count[!idx];
- u1 = counts->srcu_unlock_count[idx];
+ sdp = per_cpu_ptr(sp->sda, cpu);
+ u0 = sdp->srcu_unlock_count[!idx];
+ u1 = sdp->srcu_unlock_count[idx];
/*
* Make sure that a lock is always counted if the corresponding
@@ -1280,12 +1286,13 @@ void srcu_torture_stats_print(struct srcu_struct *sp, char *tt, char *tf)
*/
smp_rmb();
- l0 = counts->srcu_lock_count[!idx];
- l1 = counts->srcu_lock_count[idx];
+ l0 = sdp->srcu_lock_count[!idx];
+ l1 = sdp->srcu_lock_count[idx];
c0 = l0 - u0;
c1 = l1 - u1;
- pr_cont(" %d(%ld,%ld)", cpu, c0, c1);
+ pr_cont(" %d(%ld,%ld %1p)",
+ cpu, c0, c1, rcu_segcblist_head(&sdp->srcu_cblist));
s0 += c0;
s1 += c1;
}
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index a64eee0db39e..befc9321a89c 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -122,10 +122,8 @@ void rcu_check_callbacks(int user)
{
if (user)
rcu_sched_qs();
- else if (!in_softirq())
+ if (user || !in_softirq())
rcu_bh_qs();
- if (user)
- rcu_note_voluntary_context_switch(current);
}
/*
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index aa7cade1b9f3..6930934e8b9f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -27,6 +27,9 @@
* For detailed explanation of Read-Copy Update mechanism see -
* Documentation/RCU
*/
+
+#define pr_fmt(fmt) "rcu: " fmt
+
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/init.h>
@@ -95,13 +98,13 @@ struct rcu_state sname##_state = { \
.rda = &sname##_data, \
.call = cr, \
.gp_state = RCU_GP_IDLE, \
- .gpnum = 0UL - 300UL, \
- .completed = 0UL - 300UL, \
+ .gp_seq = (0UL - 300UL) << RCU_SEQ_CTR_SHIFT, \
.barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \
.name = RCU_STATE_NAME(sname), \
.abbr = sabbr, \
.exp_mutex = __MUTEX_INITIALIZER(sname##_state.exp_mutex), \
.exp_wake_mutex = __MUTEX_INITIALIZER(sname##_state.exp_wake_mutex), \
+ .ofl_lock = __SPIN_LOCK_UNLOCKED(sname##_state.ofl_lock), \
}
RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu_sched);
@@ -155,6 +158,9 @@ EXPORT_SYMBOL_GPL(rcu_scheduler_active);
*/
static int rcu_scheduler_fully_active __read_mostly;
+static void
+rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
+ struct rcu_node *rnp, unsigned long gps, unsigned long flags);
static void rcu_init_new_rnp(struct rcu_node *rnp_leaf);
static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf);
static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu);
@@ -177,6 +183,13 @@ module_param(gp_init_delay, int, 0444);
static int gp_cleanup_delay;
module_param(gp_cleanup_delay, int, 0444);
+/* Retreive RCU kthreads priority for rcutorture */
+int rcu_get_gp_kthreads_prio(void)
+{
+ return kthread_prio;
+}
+EXPORT_SYMBOL_GPL(rcu_get_gp_kthreads_prio);
+
/*
* Number of grace periods between delays, normalized by the duration of
* the delay. The longer the delay, the more the grace periods between
@@ -189,18 +202,6 @@ module_param(gp_cleanup_delay, int, 0444);
#define PER_RCU_NODE_PERIOD 3 /* Number of grace periods between delays. */
/*
- * Track the rcutorture test sequence number and the update version
- * number within a given test. The rcutorture_testseq is incremented
- * on every rcutorture module load and unload, so has an odd value
- * when a test is running. The rcutorture_vernum is set to zero
- * when rcutorture starts and is incremented on each rcutorture update.
- * These variables enable correlating rcutorture output with the
- * RCU tracing information.
- */
-unsigned long rcutorture_testseq;
-unsigned long rcutorture_vernum;
-
-/*
* Compute the mask of online CPUs for the specified rcu_node structure.
* This will not be stable unless the rcu_node structure's ->lock is
* held, but the bit corresponding to the current CPU will be stable
@@ -218,7 +219,7 @@ unsigned long rcu_rnp_online_cpus(struct rcu_node *rnp)
*/
static int rcu_gp_in_progress(struct rcu_state *rsp)
{
- return READ_ONCE(rsp->completed) != READ_ONCE(rsp->gpnum);
+ return rcu_seq_state(rcu_seq_current(&rsp->gp_seq));
}
/*
@@ -233,7 +234,7 @@ void rcu_sched_qs(void)
if (!__this_cpu_read(rcu_sched_data.cpu_no_qs.s))
return;
trace_rcu_grace_period(TPS("rcu_sched"),
- __this_cpu_read(rcu_sched_data.gpnum),
+ __this_cpu_read(rcu_sched_data.gp_seq),
TPS("cpuqs"));
__this_cpu_write(rcu_sched_data.cpu_no_qs.b.norm, false);
if (!__this_cpu_read(rcu_sched_data.cpu_no_qs.b.exp))
@@ -248,7 +249,7 @@ void rcu_bh_qs(void)
RCU_LOCKDEP_WARN(preemptible(), "rcu_bh_qs() invoked with preemption enabled!!!");
if (__this_cpu_read(rcu_bh_data.cpu_no_qs.s)) {
trace_rcu_grace_period(TPS("rcu_bh"),
- __this_cpu_read(rcu_bh_data.gpnum),
+ __this_cpu_read(rcu_bh_data.gp_seq),
TPS("cpuqs"));
__this_cpu_write(rcu_bh_data.cpu_no_qs.b.norm, false);
}
@@ -380,20 +381,6 @@ static bool rcu_dynticks_in_eqs_since(struct rcu_dynticks *rdtp, int snap)
}
/*
- * Do a double-increment of the ->dynticks counter to emulate a
- * momentary idle-CPU quiescent state.
- */
-static void rcu_dynticks_momentary_idle(void)
-{
- struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
- int special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR,
- &rdtp->dynticks);
-
- /* It is illegal to call this from idle state. */
- WARN_ON_ONCE(!(special & RCU_DYNTICK_CTRL_CTR));
-}
-
-/*
* Set the special (bottom) bit of the specified CPU so that it
* will take special action (such as flushing its TLB) on the
* next exit from an extended quiescent state. Returns true if
@@ -424,12 +411,17 @@ bool rcu_eqs_special_set(int cpu)
*
* We inform the RCU core by emulating a zero-duration dyntick-idle period.
*
- * The caller must have disabled interrupts.
+ * The caller must have disabled interrupts and must not be idle.
*/
static void rcu_momentary_dyntick_idle(void)
{
+ struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
+ int special;
+
raw_cpu_write(rcu_dynticks.rcu_need_heavy_qs, false);
- rcu_dynticks_momentary_idle();
+ special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks);
+ /* It is illegal to call this from idle state. */
+ WARN_ON_ONCE(!(special & RCU_DYNTICK_CTRL_CTR));
}
/*
@@ -451,7 +443,7 @@ void rcu_note_context_switch(bool preempt)
rcu_momentary_dyntick_idle();
this_cpu_inc(rcu_dynticks.rcu_qs_ctr);
if (!preempt)
- rcu_note_voluntary_context_switch_lite(current);
+ rcu_tasks_qs(current);
out:
trace_rcu_utilization(TPS("End context switch"));
barrier(); /* Avoid RCU read-side critical sections leaking up. */
@@ -513,8 +505,38 @@ static ulong jiffies_till_first_fqs = ULONG_MAX;
static ulong jiffies_till_next_fqs = ULONG_MAX;
static bool rcu_kick_kthreads;
-module_param(jiffies_till_first_fqs, ulong, 0644);
-module_param(jiffies_till_next_fqs, ulong, 0644);
+static int param_set_first_fqs_jiffies(const char *val, const struct kernel_param *kp)
+{
+ ulong j;
+ int ret = kstrtoul(val, 0, &j);
+
+ if (!ret)
+ WRITE_ONCE(*(ulong *)kp->arg, (j > HZ) ? HZ : j);
+ return ret;
+}
+
+static int param_set_next_fqs_jiffies(const char *val, const struct kernel_param *kp)
+{
+ ulong j;
+ int ret = kstrtoul(val, 0, &j);
+
+ if (!ret)
+ WRITE_ONCE(*(ulong *)kp->arg, (j > HZ) ? HZ : (j ?: 1));
+ return ret;
+}
+
+static struct kernel_param_ops first_fqs_jiffies_ops = {
+ .set = param_set_first_fqs_jiffies,
+ .get = param_get_ulong,
+};
+
+static struct kernel_param_ops next_fqs_jiffies_ops = {
+ .set = param_set_next_fqs_jiffies,
+ .get = param_get_ulong,
+};
+
+module_param_cb(jiffies_till_first_fqs, &first_fqs_jiffies_ops, &jiffies_till_first_fqs, 0644);
+module_param_cb(jiffies_till_next_fqs, &next_fqs_jiffies_ops, &jiffies_till_next_fqs, 0644);
module_param(rcu_kick_kthreads, bool, 0644);
/*
@@ -529,58 +551,31 @@ static void force_quiescent_state(struct rcu_state *rsp);
static int rcu_pending(void);
/*
- * Return the number of RCU batches started thus far for debug & stats.
+ * Return the number of RCU GPs completed thus far for debug & stats.
*/
-unsigned long rcu_batches_started(void)
+unsigned long rcu_get_gp_seq(void)
{
- return rcu_state_p->gpnum;
+ return READ_ONCE(rcu_state_p->gp_seq);
}
-EXPORT_SYMBOL_GPL(rcu_batches_started);
+EXPORT_SYMBOL_GPL(rcu_get_gp_seq);
/*
- * Return the number of RCU-sched batches started thus far for debug & stats.
+ * Return the number of RCU-sched GPs completed thus far for debug & stats.
*/
-unsigned long rcu_batches_started_sched(void)
+unsigned long rcu_sched_get_gp_seq(void)
{
- return rcu_sched_state.gpnum;
+ return READ_ONCE(rcu_sched_state.gp_seq);
}
-EXPORT_SYMBOL_GPL(rcu_batches_started_sched);
+EXPORT_SYMBOL_GPL(rcu_sched_get_gp_seq);
/*
- * Return the number of RCU BH batches started thus far for debug & stats.
+ * Return the number of RCU-bh GPs completed thus far for debug & stats.
*/
-unsigned long rcu_batches_started_bh(void)
+unsigned long rcu_bh_get_gp_seq(void)
{
- return rcu_bh_state.gpnum;
+ return READ_ONCE(rcu_bh_state.gp_seq);
}
-EXPORT_SYMBOL_GPL(rcu_batches_started_bh);
-
-/*
- * Return the number of RCU batches completed thus far for debug & stats.
- */
-unsigned long rcu_batches_completed(void)
-{
- return rcu_state_p->completed;
-}
-EXPORT_SYMBOL_GPL(rcu_batches_completed);
-
-/*
- * Return the number of RCU-sched batches completed thus far for debug & stats.
- */
-unsigned long rcu_batches_completed_sched(void)
-{
- return rcu_sched_state.completed;
-}
-EXPORT_SYMBOL_GPL(rcu_batches_completed_sched);
-
-/*
- * Return the number of RCU BH batches completed thus far for debug & stats.
- */
-unsigned long rcu_batches_completed_bh(void)
-{
- return rcu_bh_state.completed;
-}
-EXPORT_SYMBOL_GPL(rcu_batches_completed_bh);
+EXPORT_SYMBOL_GPL(rcu_bh_get_gp_seq);
/*
* Return the number of RCU expedited batches completed thus far for
@@ -636,35 +631,42 @@ EXPORT_SYMBOL_GPL(rcu_sched_force_quiescent_state);
*/
void show_rcu_gp_kthreads(void)
{
+ int cpu;
+ struct rcu_data *rdp;
+ struct rcu_node *rnp;
struct rcu_state *rsp;
for_each_rcu_flavor(rsp) {
pr_info("%s: wait state: %d ->state: %#lx\n",
rsp->name, rsp->gp_state, rsp->gp_kthread->state);
+ rcu_for_each_node_breadth_first(rsp, rnp) {
+ if (ULONG_CMP_GE(rsp->gp_seq, rnp->gp_seq_needed))
+ continue;
+ pr_info("\trcu_node %d:%d ->gp_seq %lu ->gp_seq_needed %lu\n",
+ rnp->grplo, rnp->grphi, rnp->gp_seq,
+ rnp->gp_seq_needed);
+ if (!rcu_is_leaf_node(rnp))
+ continue;
+ for_each_leaf_node_possible_cpu(rnp, cpu) {
+ rdp = per_cpu_ptr(rsp->rda, cpu);
+ if (rdp->gpwrap ||
+ ULONG_CMP_GE(rsp->gp_seq,
+ rdp->gp_seq_needed))
+ continue;
+ pr_info("\tcpu %d ->gp_seq_needed %lu\n",
+ cpu, rdp->gp_seq_needed);
+ }
+ }
/* sched_show_task(rsp->gp_kthread); */
}
}
EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
/*
- * Record the number of times rcutorture tests have been initiated and
- * terminated. This information allows the debugfs tracing stats to be
- * correlated to the rcutorture messages, even when the rcutorture module
- * is being repeatedly loaded and unloaded. In other words, we cannot
- * store this state in rcutorture itself.
- */
-void rcutorture_record_test_transition(void)
-{
- rcutorture_testseq++;
- rcutorture_vernum = 0;
-}
-EXPORT_SYMBOL_GPL(rcutorture_record_test_transition);
-
-/*
* Send along grace-period-related data for rcutorture diagnostics.
*/
void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
- unsigned long *gpnum, unsigned long *completed)
+ unsigned long *gp_seq)
{
struct rcu_state *rsp = NULL;
@@ -684,23 +686,11 @@ void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
if (rsp == NULL)
return;
*flags = READ_ONCE(rsp->gp_flags);
- *gpnum = READ_ONCE(rsp->gpnum);
- *completed = READ_ONCE(rsp->completed);
+ *gp_seq = rcu_seq_current(&rsp->gp_seq);
}
EXPORT_SYMBOL_GPL(rcutorture_get_gp_data);
/*
- * Record the number of writer passes through the current rcutorture test.
- * This is also used to correlate debugfs tracing stats with the rcutorture
- * messages.
- */
-void rcutorture_record_progress(unsigned long vernum)
-{
- rcutorture_vernum++;
-}
-EXPORT_SYMBOL_GPL(rcutorture_record_progress);
-
-/*
* Return the root node of the specified rcu_state structure.
*/
static struct rcu_node *rcu_get_root(struct rcu_state *rsp)
@@ -1059,41 +1049,41 @@ void rcu_request_urgent_qs_task(struct task_struct *t)
#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU)
/*
- * Is the current CPU online? Disable preemption to avoid false positives
- * that could otherwise happen due to the current CPU number being sampled,
- * this task being preempted, its old CPU being taken offline, resuming
- * on some other CPU, then determining that its old CPU is now offline.
- * It is OK to use RCU on an offline processor during initial boot, hence
- * the check for rcu_scheduler_fully_active. Note also that it is OK
- * for a CPU coming online to use RCU for one jiffy prior to marking itself
- * online in the cpu_online_mask. Similarly, it is OK for a CPU going
- * offline to continue to use RCU for one jiffy after marking itself
- * offline in the cpu_online_mask. This leniency is necessary given the
- * non-atomic nature of the online and offline processing, for example,
- * the fact that a CPU enters the scheduler after completing the teardown
- * of the CPU.
+ * Is the current CPU online as far as RCU is concerned?
*
- * This is also why RCU internally marks CPUs online during in the
- * preparation phase and offline after the CPU has been taken down.
+ * Disable preemption to avoid false positives that could otherwise
+ * happen due to the current CPU number being sampled, this task being
+ * preempted, its old CPU being taken offline, resuming on some other CPU,
+ * then determining that its old CPU is now offline. Because there are
+ * multiple flavors of RCU, and because this function can be called in the
+ * midst of updating the flavors while a given CPU coming online or going
+ * offline, it is necessary to check all flavors. If any of the flavors
+ * believe that given CPU is online, it is considered to be online.
*
- * Disable checking if in an NMI handler because we cannot safely report
- * errors from NMI handlers anyway.
+ * Disable checking if in an NMI handler because we cannot safely
+ * report errors from NMI handlers anyway. In addition, it is OK to use
+ * RCU on an offline processor during initial boot, hence the check for
+ * rcu_scheduler_fully_active.
*/
bool rcu_lockdep_current_cpu_online(void)
{
struct rcu_data *rdp;
struct rcu_node *rnp;
- bool ret;
+ struct rcu_state *rsp;
- if (in_nmi())
+ if (in_nmi() || !rcu_scheduler_fully_active)
return true;
preempt_disable();
- rdp = this_cpu_ptr(&rcu_sched_data);
- rnp = rdp->mynode;
- ret = (rdp->grpmask & rcu_rnp_online_cpus(rnp)) ||
- !rcu_scheduler_fully_active;
+ for_each_rcu_flavor(rsp) {
+ rdp = this_cpu_ptr(rsp->rda);
+ rnp = rdp->mynode;
+ if (rdp->grpmask & rcu_rnp_online_cpus(rnp)) {
+ preempt_enable();
+ return true;
+ }
+ }
preempt_enable();
- return ret;
+ return false;
}
EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
@@ -1115,17 +1105,18 @@ static int rcu_is_cpu_rrupt_from_idle(void)
/*
* We are reporting a quiescent state on behalf of some other CPU, so
* it is our responsibility to check for and handle potential overflow
- * of the rcu_node ->gpnum counter with respect to the rcu_data counters.
+ * of the rcu_node ->gp_seq counter with respect to the rcu_data counters.
* After all, the CPU might be in deep idle state, and thus executing no
* code whatsoever.
*/
static void rcu_gpnum_ovf(struct rcu_node *rnp, struct rcu_data *rdp)
{
raw_lockdep_assert_held_rcu_node(rnp);
- if (ULONG_CMP_LT(READ_ONCE(rdp->gpnum) + ULONG_MAX / 4, rnp->gpnum))
+ if (ULONG_CMP_LT(rcu_seq_current(&rdp->gp_seq) + ULONG_MAX / 4,
+ rnp->gp_seq))
WRITE_ONCE(rdp->gpwrap, true);
- if (ULONG_CMP_LT(rdp->rcu_iw_gpnum + ULONG_MAX / 4, rnp->gpnum))
- rdp->rcu_iw_gpnum = rnp->gpnum + ULONG_MAX / 4;
+ if (ULONG_CMP_LT(rdp->rcu_iw_gp_seq + ULONG_MAX / 4, rnp->gp_seq))
+ rdp->rcu_iw_gp_seq = rnp->gp_seq + ULONG_MAX / 4;
}
/*
@@ -1137,7 +1128,7 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp)
{
rdp->dynticks_snap = rcu_dynticks_snap(rdp->dynticks);
if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) {
- trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, TPS("dti"));
+ trace_rcu_fqs(rdp->rsp->name, rdp->gp_seq, rdp->cpu, TPS("dti"));
rcu_gpnum_ovf(rdp->mynode, rdp);
return 1;
}
@@ -1159,7 +1150,7 @@ static void rcu_iw_handler(struct irq_work *iwp)
rnp = rdp->mynode;
raw_spin_lock_rcu_node(rnp);
if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
- rdp->rcu_iw_gpnum = rnp->gpnum;
+ rdp->rcu_iw_gp_seq = rnp->gp_seq;
rdp->rcu_iw_pending = false;
}
raw_spin_unlock_rcu_node(rnp);
@@ -1187,7 +1178,7 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
* of the current RCU grace period.
*/
if (rcu_dynticks_in_eqs_since(rdp->dynticks, rdp->dynticks_snap)) {
- trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, TPS("dti"));
+ trace_rcu_fqs(rdp->rsp->name, rdp->gp_seq, rdp->cpu, TPS("dti"));
rdp->dynticks_fqs++;
rcu_gpnum_ovf(rnp, rdp);
return 1;
@@ -1203,8 +1194,8 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
ruqp = per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, rdp->cpu);
if (time_after(jiffies, rdp->rsp->gp_start + jtsq) &&
READ_ONCE(rdp->rcu_qs_ctr_snap) != per_cpu(rcu_dynticks.rcu_qs_ctr, rdp->cpu) &&
- READ_ONCE(rdp->gpnum) == rnp->gpnum && !rdp->gpwrap) {
- trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, TPS("rqc"));
+ rcu_seq_current(&rdp->gp_seq) == rnp->gp_seq && !rdp->gpwrap) {
+ trace_rcu_fqs(rdp->rsp->name, rdp->gp_seq, rdp->cpu, TPS("rqc"));
rcu_gpnum_ovf(rnp, rdp);
return 1;
} else if (time_after(jiffies, rdp->rsp->gp_start + jtsq)) {
@@ -1212,12 +1203,25 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
smp_store_release(ruqp, true);
}
- /* Check for the CPU being offline. */
- if (!(rdp->grpmask & rcu_rnp_online_cpus(rnp))) {
- trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, TPS("ofl"));
- rdp->offline_fqs++;
- rcu_gpnum_ovf(rnp, rdp);
- return 1;
+ /* If waiting too long on an offline CPU, complain. */
+ if (!(rdp->grpmask & rcu_rnp_online_cpus(rnp)) &&
+ time_after(jiffies, rdp->rsp->gp_start + HZ)) {
+ bool onl;
+ struct rcu_node *rnp1;
+
+ WARN_ON(1); /* Offline CPUs are supposed to report QS! */
+ pr_info("%s: grp: %d-%d level: %d ->gp_seq %ld ->completedqs %ld\n",
+ __func__, rnp->grplo, rnp->grphi, rnp->level,
+ (long)rnp->gp_seq, (long)rnp->completedqs);
+ for (rnp1 = rnp; rnp1; rnp1 = rnp1->parent)
+ pr_info("%s: %d:%d ->qsmask %#lx ->qsmaskinit %#lx ->qsmaskinitnext %#lx ->rcu_gp_init_mask %#lx\n",
+ __func__, rnp1->grplo, rnp1->grphi, rnp1->qsmask, rnp1->qsmaskinit, rnp1->qsmaskinitnext, rnp1->rcu_gp_init_mask);
+ onl = !!(rdp->grpmask & rcu_rnp_online_cpus(rnp));
+ pr_info("%s %d: %c online: %ld(%d) offline: %ld(%d)\n",
+ __func__, rdp->cpu, ".o"[onl],
+ (long)rdp->rcu_onl_gp_seq, rdp->rcu_onl_gp_flags,
+ (long)rdp->rcu_ofl_gp_seq, rdp->rcu_ofl_gp_flags);
+ return 1; /* Break things loose after complaining. */
}
/*
@@ -1256,11 +1260,11 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
if (jiffies - rdp->rsp->gp_start > rcu_jiffies_till_stall_check() / 2) {
resched_cpu(rdp->cpu);
if (IS_ENABLED(CONFIG_IRQ_WORK) &&
- !rdp->rcu_iw_pending && rdp->rcu_iw_gpnum != rnp->gpnum &&
+ !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq &&
(rnp->ffmask & rdp->grpmask)) {
init_irq_work(&rdp->rcu_iw, rcu_iw_handler);
rdp->rcu_iw_pending = true;
- rdp->rcu_iw_gpnum = rnp->gpnum;
+ rdp->rcu_iw_gp_seq = rnp->gp_seq;
irq_work_queue_on(&rdp->rcu_iw, rdp->cpu);
}
}
@@ -1274,9 +1278,9 @@ static void record_gp_stall_check_time(struct rcu_state *rsp)
unsigned long j1;
rsp->gp_start = j;
- smp_wmb(); /* Record start time before stall time. */
j1 = rcu_jiffies_till_stall_check();
- WRITE_ONCE(rsp->jiffies_stall, j + j1);
+ /* Record ->gp_start before ->jiffies_stall. */
+ smp_store_release(&rsp->jiffies_stall, j + j1); /* ^^^ */
rsp->jiffies_resched = j + j1 / 2;
rsp->n_force_qs_gpstart = READ_ONCE(rsp->n_force_qs);
}
@@ -1302,9 +1306,9 @@ static void rcu_check_gp_kthread_starvation(struct rcu_state *rsp)
j = jiffies;
gpa = READ_ONCE(rsp->gp_activity);
if (j - gpa > 2 * HZ) {
- pr_err("%s kthread starved for %ld jiffies! g%lu c%lu f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
+ pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
rsp->name, j - gpa,
- rsp->gpnum, rsp->completed,
+ (long)rcu_seq_current(&rsp->gp_seq),
rsp->gp_flags,
gp_state_getname(rsp->gp_state), rsp->gp_state,
rsp->gp_kthread ? rsp->gp_kthread->state : ~0,
@@ -1359,16 +1363,15 @@ static void rcu_stall_kick_kthreads(struct rcu_state *rsp)
}
}
-static inline void panic_on_rcu_stall(void)
+static void panic_on_rcu_stall(void)
{
if (sysctl_panic_on_rcu_stall)
panic("RCU Stall\n");
}
-static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
+static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq)
{
int cpu;
- long delta;
unsigned long flags;
unsigned long gpa;
unsigned long j;
@@ -1381,25 +1384,12 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
if (rcu_cpu_stall_suppress)
return;
- /* Only let one CPU complain about others per time interval. */
-
- raw_spin_lock_irqsave_rcu_node(rnp, flags);
- delta = jiffies - READ_ONCE(rsp->jiffies_stall);
- if (delta < RCU_STALL_RAT_DELAY || !rcu_gp_in_progress(rsp)) {
- raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
- return;
- }
- WRITE_ONCE(rsp->jiffies_stall,
- jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
- raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
-
/*
* OK, time to rat on our buddy...
* See Documentation/RCU/stallwarn.txt for info on how to debug
* RCU CPU stall warnings.
*/
- pr_err("INFO: %s detected stalls on CPUs/tasks:",
- rsp->name);
+ pr_err("INFO: %s detected stalls on CPUs/tasks:", rsp->name);
print_cpu_stall_info_begin();
rcu_for_each_leaf_node(rsp, rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
@@ -1418,17 +1408,16 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
for_each_possible_cpu(cpu)
totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(rsp->rda,
cpu)->cblist);
- pr_cont("(detected by %d, t=%ld jiffies, g=%ld, c=%ld, q=%lu)\n",
+ pr_cont("(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n",
smp_processor_id(), (long)(jiffies - rsp->gp_start),
- (long)rsp->gpnum, (long)rsp->completed, totqlen);
+ (long)rcu_seq_current(&rsp->gp_seq), totqlen);
if (ndetected) {
rcu_dump_cpu_stacks(rsp);
/* Complain about tasks blocking the grace period. */
rcu_print_detail_task_stall(rsp);
} else {
- if (READ_ONCE(rsp->gpnum) != gpnum ||
- READ_ONCE(rsp->completed) == gpnum) {
+ if (rcu_seq_current(&rsp->gp_seq) != gp_seq) {
pr_err("INFO: Stall ended before state dump start\n");
} else {
j = jiffies;
@@ -1441,6 +1430,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
sched_show_task(current);
}
}
+ /* Rewrite if needed in case of slow consoles. */
+ if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall)))
+ WRITE_ONCE(rsp->jiffies_stall,
+ jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
rcu_check_gp_kthread_starvation(rsp);
@@ -1476,15 +1469,16 @@ static void print_cpu_stall(struct rcu_state *rsp)
for_each_possible_cpu(cpu)
totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(rsp->rda,
cpu)->cblist);
- pr_cont(" (t=%lu jiffies g=%ld c=%ld q=%lu)\n",
+ pr_cont(" (t=%lu jiffies g=%ld q=%lu)\n",
jiffies - rsp->gp_start,
- (long)rsp->gpnum, (long)rsp->completed, totqlen);
+ (long)rcu_seq_current(&rsp->gp_seq), totqlen);
rcu_check_gp_kthread_starvation(rsp);
rcu_dump_cpu_stacks(rsp);
raw_spin_lock_irqsave_rcu_node(rnp, flags);
+ /* Rewrite if needed in case of slow consoles. */
if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall)))
WRITE_ONCE(rsp->jiffies_stall,
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
@@ -1504,10 +1498,11 @@ static void print_cpu_stall(struct rcu_state *rsp)
static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
{
- unsigned long completed;
- unsigned long gpnum;
+ unsigned long gs1;
+ unsigned long gs2;
unsigned long gps;
unsigned long j;
+ unsigned long jn;
unsigned long js;
struct rcu_node *rnp;
@@ -1520,43 +1515,46 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
/*
* Lots of memory barriers to reject false positives.
*
- * The idea is to pick up rsp->gpnum, then rsp->jiffies_stall,
- * then rsp->gp_start, and finally rsp->completed. These values
- * are updated in the opposite order with memory barriers (or
- * equivalent) during grace-period initialization and cleanup.
- * Now, a false positive can occur if we get an new value of
- * rsp->gp_start and a old value of rsp->jiffies_stall. But given
- * the memory barriers, the only way that this can happen is if one
- * grace period ends and another starts between these two fetches.
- * Detect this by comparing rsp->completed with the previous fetch
- * from rsp->gpnum.
+ * The idea is to pick up rsp->gp_seq, then rsp->jiffies_stall,
+ * then rsp->gp_start, and finally another copy of rsp->gp_seq.
+ * These values are updated in the opposite order with memory
+ * barriers (or equivalent) during grace-period initialization
+ * and cleanup. Now, a false positive can occur if we get an new
+ * value of rsp->gp_start and a old value of rsp->jiffies_stall.
+ * But given the memory barriers, the only way that this can happen
+ * is if one grace period ends and another starts between these
+ * two fetches. This is detected by comparing the second fetch
+ * of rsp->gp_seq with the previous fetch from rsp->gp_seq.
*
* Given this check, comparisons of jiffies, rsp->jiffies_stall,
* and rsp->gp_start suffice to forestall false positives.
*/
- gpnum = READ_ONCE(rsp->gpnum);
- smp_rmb(); /* Pick up ->gpnum first... */
+ gs1 = READ_ONCE(rsp->gp_seq);
+ smp_rmb(); /* Pick up ->gp_seq first... */
js = READ_ONCE(rsp->jiffies_stall);
smp_rmb(); /* ...then ->jiffies_stall before the rest... */
gps = READ_ONCE(rsp->gp_start);
- smp_rmb(); /* ...and finally ->gp_start before ->completed. */
- completed = READ_ONCE(rsp->completed);
- if (ULONG_CMP_GE(completed, gpnum) ||
+ smp_rmb(); /* ...and finally ->gp_start before ->gp_seq again. */
+ gs2 = READ_ONCE(rsp->gp_seq);
+ if (gs1 != gs2 ||
ULONG_CMP_LT(j, js) ||
ULONG_CMP_GE(gps, js))
return; /* No stall or GP completed since entering function. */
rnp = rdp->mynode;
+ jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
if (rcu_gp_in_progress(rsp) &&
- (READ_ONCE(rnp->qsmask) & rdp->grpmask)) {
+ (READ_ONCE(rnp->qsmask) & rdp->grpmask) &&
+ cmpxchg(&rsp->jiffies_stall, js, jn) == js) {
/* We haven't checked in, so go dump stack. */
print_cpu_stall(rsp);
} else if (rcu_gp_in_progress(rsp) &&
- ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) {
+ ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
+ cmpxchg(&rsp->jiffies_stall, js, jn) == js) {
/* They had a few time units to dump stack, so complain. */
- print_other_cpu_stall(rsp, gpnum);
+ print_other_cpu_stall(rsp, gs2);
}
}
@@ -1577,123 +1575,99 @@ void rcu_cpu_stall_reset(void)
WRITE_ONCE(rsp->jiffies_stall, jiffies + ULONG_MAX / 2);
}
-/*
- * Determine the value that ->completed will have at the end of the
- * next subsequent grace period. This is used to tag callbacks so that
- * a CPU can invoke callbacks in a timely fashion even if that CPU has
- * been dyntick-idle for an extended period with callbacks under the
- * influence of RCU_FAST_NO_HZ.
- *
- * The caller must hold rnp->lock with interrupts disabled.
- */
-static unsigned long rcu_cbs_completed(struct rcu_state *rsp,
- struct rcu_node *rnp)
-{
- raw_lockdep_assert_held_rcu_node(rnp);
-
- /*
- * If RCU is idle, we just wait for the next grace period.
- * But we can only be sure that RCU is idle if we are looking
- * at the root rcu_node structure -- otherwise, a new grace
- * period might have started, but just not yet gotten around
- * to initializing the current non-root rcu_node structure.
- */
- if (rcu_get_root(rsp) == rnp && rnp->gpnum == rnp->completed)
- return rnp->completed + 1;
-
- /*
- * If the current rcu_node structure believes that RCU is
- * idle, and if the rcu_state structure does not yet reflect
- * the start of a new grace period, then the next grace period
- * will suffice. The memory barrier is needed to accurately
- * sample the rsp->gpnum, and pairs with the second lock
- * acquisition in rcu_gp_init(), which is augmented with
- * smp_mb__after_unlock_lock() for this purpose.
- */
- if (rnp->gpnum == rnp->completed) {
- smp_mb(); /* See above block comment. */
- if (READ_ONCE(rsp->gpnum) == rnp->completed)
- return rnp->completed + 1;
- }
-
- /*
- * Otherwise, wait for a possible partial grace period and
- * then the subsequent full grace period.
- */
- return rnp->completed + 2;
-}
-
/* Trace-event wrapper function for trace_rcu_future_grace_period. */
static void trace_rcu_this_gp(struct rcu_node *rnp, struct rcu_data *rdp,
- unsigned long c, const char *s)
+ unsigned long gp_seq_req, const char *s)
{
- trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum,
- rnp->completed, c, rnp->level,
- rnp->grplo, rnp->grphi, s);
+ trace_rcu_future_grace_period(rdp->rsp->name, rnp->gp_seq, gp_seq_req,
+ rnp->level, rnp->grplo, rnp->grphi, s);
}
/*
+ * rcu_start_this_gp - Request the start of a particular grace period
+ * @rnp_start: The leaf node of the CPU from which to start.
+ * @rdp: The rcu_data corresponding to the CPU from which to start.
+ * @gp_seq_req: The gp_seq of the grace period to start.
+ *
* Start the specified grace period, as needed to handle newly arrived
* callbacks. The required future grace periods are recorded in each
- * rcu_node structure's ->need_future_gp[] field. Returns true if there
+ * rcu_node structure's ->gp_seq_needed field. Returns true if there
* is reason to awaken the grace-period kthread.
*
* The caller must hold the specified rcu_node structure's ->lock, which
* is why the caller is responsible for waking the grace-period kthread.
+ *
+ * Returns true if the GP thread needs to be awakened else false.
*/
-static bool rcu_start_this_gp(struct rcu_node *rnp, struct rcu_data *rdp,
- unsigned long c)
+static bool rcu_start_this_gp(struct rcu_node *rnp_start, struct rcu_data *rdp,
+ unsigned long gp_seq_req)
{
bool ret = false;
struct rcu_state *rsp = rdp->rsp;
- struct rcu_node *rnp_root;
+ struct rcu_node *rnp;
/*
* Use funnel locking to either acquire the root rcu_node
* structure's lock or bail out if the need for this grace period
- * has already been recorded -- or has already started. If there
- * is already a grace period in progress in a non-leaf node, no
- * recording is needed because the end of the grace period will
- * scan the leaf rcu_node structures. Note that rnp->lock must
- * not be released.
+ * has already been recorded -- or if that grace period has in
+ * fact already started. If there is already a grace period in
+ * progress in a non-leaf node, no recording is needed because the
+ * end of the grace period will scan the leaf rcu_node structures.
+ * Note that rnp_start->lock must not be released.
*/
- raw_lockdep_assert_held_rcu_node(rnp);
- trace_rcu_this_gp(rnp, rdp, c, TPS("Startleaf"));
- for (rnp_root = rnp; 1; rnp_root = rnp_root->parent) {
- if (rnp_root != rnp)
- raw_spin_lock_rcu_node(rnp_root);
- WARN_ON_ONCE(ULONG_CMP_LT(rnp_root->gpnum +
- need_future_gp_mask(), c));
- if (need_future_gp_element(rnp_root, c) ||
- ULONG_CMP_GE(rnp_root->gpnum, c) ||
- (rnp != rnp_root &&
- rnp_root->gpnum != rnp_root->completed)) {
- trace_rcu_this_gp(rnp_root, rdp, c, TPS("Prestarted"));
+ raw_lockdep_assert_held_rcu_node(rnp_start);
+ trace_rcu_this_gp(rnp_start, rdp, gp_seq_req, TPS("Startleaf"));
+ for (rnp = rnp_start; 1; rnp = rnp->parent) {
+ if (rnp != rnp_start)
+ raw_spin_lock_rcu_node(rnp);
+ if (ULONG_CMP_GE(rnp->gp_seq_needed, gp_seq_req) ||
+ rcu_seq_started(&rnp->gp_seq, gp_seq_req) ||
+ (rnp != rnp_start &&
+ rcu_seq_state(rcu_seq_current(&rnp->gp_seq)))) {
+ trace_rcu_this_gp(rnp, rdp, gp_seq_req,
+ TPS("Prestarted"));
goto unlock_out;
}
- need_future_gp_element(rnp_root, c) = true;
- if (rnp_root != rnp && rnp_root->parent != NULL)
- raw_spin_unlock_rcu_node(rnp_root);
- if (!rnp_root->parent)
+ rnp->gp_seq_needed = gp_seq_req;
+ if (rcu_seq_state(rcu_seq_current(&rnp->gp_seq))) {
+ /*
+ * We just marked the leaf or internal node, and a
+ * grace period is in progress, which means that
+ * rcu_gp_cleanup() will see the marking. Bail to
+ * reduce contention.
+ */
+ trace_rcu_this_gp(rnp_start, rdp, gp_seq_req,
+ TPS("Startedleaf"));
+ goto unlock_out;
+ }
+ if (rnp != rnp_start && rnp->parent != NULL)
+ raw_spin_unlock_rcu_node(rnp);
+ if (!rnp->parent)
break; /* At root, and perhaps also leaf. */
}
/* If GP already in progress, just leave, otherwise start one. */
- if (rnp_root->gpnum != rnp_root->completed) {
- trace_rcu_this_gp(rnp_root, rdp, c, TPS("Startedleafroot"));
+ if (rcu_gp_in_progress(rsp)) {
+ trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("Startedleafroot"));
goto unlock_out;
}
- trace_rcu_this_gp(rnp_root, rdp, c, TPS("Startedroot"));
+ trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("Startedroot"));
WRITE_ONCE(rsp->gp_flags, rsp->gp_flags | RCU_GP_FLAG_INIT);
+ rsp->gp_req_activity = jiffies;
if (!rsp->gp_kthread) {
- trace_rcu_this_gp(rnp_root, rdp, c, TPS("NoGPkthread"));
+ trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("NoGPkthread"));
goto unlock_out;
}
- trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gpnum), TPS("newreq"));
+ trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gp_seq), TPS("newreq"));
ret = true; /* Caller must wake GP kthread. */
unlock_out:
- if (rnp != rnp_root)
- raw_spin_unlock_rcu_node(rnp_root);
+ /* Push furthest requested GP to leaf node and rcu_data structure. */
+ if (ULONG_CMP_LT(gp_seq_req, rnp->gp_seq_needed)) {
+ rnp_start->gp_seq_needed = rnp->gp_seq_needed;
+ rdp->gp_seq_needed = rnp->gp_seq_needed;
+ }
+ if (rnp != rnp_start)
+ raw_spin_unlock_rcu_node(rnp);
return ret;
}
@@ -1703,13 +1677,13 @@ unlock_out:
*/
static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
{
- unsigned long c = rnp->completed;
bool needmore;
struct rcu_data *rdp = this_cpu_ptr(rsp->rda);
- need_future_gp_element(rnp, c) = false;
- needmore = need_any_future_gp(rnp);
- trace_rcu_this_gp(rnp, rdp, c,
+ needmore = ULONG_CMP_LT(rnp->gp_seq, rnp->gp_seq_needed);
+ if (!needmore)
+ rnp->gp_seq_needed = rnp->gp_seq; /* Avoid counter wrap. */
+ trace_rcu_this_gp(rnp, rdp, rnp->gp_seq,
needmore ? TPS("CleanupMore") : TPS("Cleanup"));
return needmore;
}
@@ -1731,21 +1705,21 @@ static void rcu_gp_kthread_wake(struct rcu_state *rsp)
}
/*
- * If there is room, assign a ->completed number to any callbacks on
- * this CPU that have not already been assigned. Also accelerate any
- * callbacks that were previously assigned a ->completed number that has
- * since proven to be too conservative, which can happen if callbacks get
- * assigned a ->completed number while RCU is idle, but with reference to
- * a non-root rcu_node structure. This function is idempotent, so it does
- * not hurt to call it repeatedly. Returns an flag saying that we should
- * awaken the RCU grace-period kthread.
+ * If there is room, assign a ->gp_seq number to any callbacks on this
+ * CPU that have not already been assigned. Also accelerate any callbacks
+ * that were previously assigned a ->gp_seq number that has since proven
+ * to be too conservative, which can happen if callbacks get assigned a
+ * ->gp_seq number while RCU is idle, but with reference to a non-root
+ * rcu_node structure. This function is idempotent, so it does not hurt
+ * to call it repeatedly. Returns an flag saying that we should awaken
+ * the RCU grace-period kthread.
*
* The caller must hold rnp->lock with interrupts disabled.
*/
static bool rcu_accelerate_cbs(struct rcu_state *rsp, struct rcu_node *rnp,
struct rcu_data *rdp)
{
- unsigned long c;
+ unsigned long gp_seq_req;
bool ret = false;
raw_lockdep_assert_held_rcu_node(rnp);
@@ -1764,22 +1738,50 @@ static bool rcu_accelerate_cbs(struct rcu_state *rsp, struct rcu_node *rnp,
* accelerating callback invocation to an earlier grace-period
* number.
*/
- c = rcu_cbs_completed(rsp, rnp);
- if (rcu_segcblist_accelerate(&rdp->cblist, c))
- ret = rcu_start_this_gp(rnp, rdp, c);
+ gp_seq_req = rcu_seq_snap(&rsp->gp_seq);
+ if (rcu_segcblist_accelerate(&rdp->cblist, gp_seq_req))
+ ret = rcu_start_this_gp(rnp, rdp, gp_seq_req);
/* Trace depending on how much we were able to accelerate. */
if (rcu_segcblist_restempty(&rdp->cblist, RCU_WAIT_TAIL))
- trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("AccWaitCB"));
+ trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("AccWaitCB"));
else
- trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("AccReadyCB"));
+ trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("AccReadyCB"));
return ret;
}
/*
+ * Similar to rcu_accelerate_cbs(), but does not require that the leaf
+ * rcu_node structure's ->lock be held. It consults the cached value
+ * of ->gp_seq_needed in the rcu_data structure, and if that indicates
+ * that a new grace-period request be made, invokes rcu_accelerate_cbs()
+ * while holding the leaf rcu_node structure's ->lock.
+ */
+static void rcu_accelerate_cbs_unlocked(struct rcu_state *rsp,
+ struct rcu_node *rnp,
+ struct rcu_data *rdp)
+{
+ unsigned long c;
+ bool needwake;
+
+ lockdep_assert_irqs_disabled();
+ c = rcu_seq_snap(&rsp->gp_seq);
+ if (!rdp->gpwrap && ULONG_CMP_GE(rdp->gp_seq_needed, c)) {
+ /* Old request still live, so mark recent callbacks. */
+ (void)rcu_segcblist_accelerate(&rdp->cblist, c);
+ return;
+ }
+ raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
+ needwake = rcu_accelerate_cbs(rsp, rnp, rdp);
+ raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
+ if (needwake)
+ rcu_gp_kthread_wake(rsp);
+}
+
+/*
* Move any callbacks whose grace period has completed to the
* RCU_DONE_TAIL sublist, then compact the remaining sublists and
- * assign ->completed numbers to any callbacks in the RCU_NEXT_TAIL
+ * assign ->gp_seq numbers to any callbacks in the RCU_NEXT_TAIL
* sublist. This function is idempotent, so it does not hurt to
* invoke it repeatedly. As long as it is not invoked -too- often...
* Returns true if the RCU grace-period kthread needs to be awakened.
@@ -1796,10 +1798,10 @@ static bool rcu_advance_cbs(struct rcu_state *rsp, struct rcu_node *rnp,
return false;
/*
- * Find all callbacks whose ->completed numbers indicate that they
+ * Find all callbacks whose ->gp_seq numbers indicate that they
* are ready to invoke, and put them into the RCU_DONE_TAIL sublist.
*/
- rcu_segcblist_advance(&rdp->cblist, rnp->completed);
+ rcu_segcblist_advance(&rdp->cblist, rnp->gp_seq);
/* Classify any remaining callbacks. */
return rcu_accelerate_cbs(rsp, rnp, rdp);
@@ -1819,39 +1821,38 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp,
raw_lockdep_assert_held_rcu_node(rnp);
- /* Handle the ends of any preceding grace periods first. */
- if (rdp->completed == rnp->completed &&
- !unlikely(READ_ONCE(rdp->gpwrap))) {
-
- /* No grace period end, so just accelerate recent callbacks. */
- ret = rcu_accelerate_cbs(rsp, rnp, rdp);
+ if (rdp->gp_seq == rnp->gp_seq)
+ return false; /* Nothing to do. */
+ /* Handle the ends of any preceding grace periods first. */
+ if (rcu_seq_completed_gp(rdp->gp_seq, rnp->gp_seq) ||
+ unlikely(READ_ONCE(rdp->gpwrap))) {
+ ret = rcu_advance_cbs(rsp, rnp, rdp); /* Advance callbacks. */
+ trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("cpuend"));
} else {
-
- /* Advance callbacks. */
- ret = rcu_advance_cbs(rsp, rnp, rdp);
-
- /* Remember that we saw this grace-period completion. */
- rdp->completed = rnp->completed;
- trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("cpuend"));
+ ret = rcu_accelerate_cbs(rsp, rnp, rdp); /* Recent callbacks. */
}
- if (rdp->gpnum != rnp->gpnum || unlikely(READ_ONCE(rdp->gpwrap))) {
+ /* Now handle the beginnings of any new-to-this-CPU grace periods. */
+ if (rcu_seq_new_gp(rdp->gp_seq, rnp->gp_seq) ||
+ unlikely(READ_ONCE(rdp->gpwrap))) {
/*
* If the current grace period is waiting for this CPU,
* set up to detect a quiescent state, otherwise don't
* go looking for one.
*/
- rdp->gpnum = rnp->gpnum;
- trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("cpustart"));
+ trace_rcu_grace_period(rsp->name, rnp->gp_seq, TPS("cpustart"));
need_gp = !!(rnp->qsmask & rdp->grpmask);
rdp->cpu_no_qs.b.norm = need_gp;
rdp->rcu_qs_ctr_snap = __this_cpu_read(rcu_dynticks.rcu_qs_ctr);
rdp->core_needs_qs = need_gp;
zero_cpu_stall_ticks(rdp);
- WRITE_ONCE(rdp->gpwrap, false);
- rcu_gpnum_ovf(rnp, rdp);
}
+ rdp->gp_seq = rnp->gp_seq; /* Remember new grace-period state. */
+ if (ULONG_CMP_GE(rnp->gp_seq_needed, rdp->gp_seq_needed) || rdp->gpwrap)
+ rdp->gp_seq_needed = rnp->gp_seq_needed;
+ WRITE_ONCE(rdp->gpwrap, false);
+ rcu_gpnum_ovf(rnp, rdp);
return ret;
}
@@ -1863,8 +1864,7 @@ static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp)
local_irq_save(flags);
rnp = rdp->mynode;
- if ((rdp->gpnum == READ_ONCE(rnp->gpnum) &&
- rdp->completed == READ_ONCE(rnp->completed) &&
+ if ((rdp->gp_seq == rcu_seq_current(&rnp->gp_seq) &&
!unlikely(READ_ONCE(rdp->gpwrap))) || /* w/out lock. */
!raw_spin_trylock_rcu_node(rnp)) { /* irqs already off, so later. */
local_irq_restore(flags);
@@ -1879,7 +1879,8 @@ static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp)
static void rcu_gp_slow(struct rcu_state *rsp, int delay)
{
if (delay > 0 &&
- !(rsp->gpnum % (rcu_num_nodes * PER_RCU_NODE_PERIOD * delay)))
+ !(rcu_seq_ctr(rsp->gp_seq) %
+ (rcu_num_nodes * PER_RCU_NODE_PERIOD * delay)))
schedule_timeout_uninterruptible(delay);
}
@@ -1888,7 +1889,9 @@ static void rcu_gp_slow(struct rcu_state *rsp, int delay)
*/
static bool rcu_gp_init(struct rcu_state *rsp)
{
+ unsigned long flags;
unsigned long oldmask;
+ unsigned long mask;
struct rcu_data *rdp;
struct rcu_node *rnp = rcu_get_root(rsp);
@@ -1912,9 +1915,9 @@ static bool rcu_gp_init(struct rcu_state *rsp)
/* Advance to a new grace period and initialize state. */
record_gp_stall_check_time(rsp);
- /* Record GP times before starting GP, hence smp_store_release(). */
- smp_store_release(&rsp->gpnum, rsp->gpnum + 1);
- trace_rcu_grace_period(rsp->name, rsp->gpnum, TPS("start"));
+ /* Record GP times before starting GP, hence rcu_seq_start(). */
+ rcu_seq_start(&rsp->gp_seq);
+ trace_rcu_grace_period(rsp->name, rsp->gp_seq, TPS("start"));
raw_spin_unlock_irq_rcu_node(rnp);
/*
@@ -1923,13 +1926,15 @@ static bool rcu_gp_init(struct rcu_state *rsp)
* for subsequent online CPUs, and that quiescent-state forcing
* will handle subsequent offline CPUs.
*/
+ rsp->gp_state = RCU_GP_ONOFF;
rcu_for_each_leaf_node(rsp, rnp) {
- rcu_gp_slow(rsp, gp_preinit_delay);
+ spin_lock(&rsp->ofl_lock);
raw_spin_lock_irq_rcu_node(rnp);
if (rnp->qsmaskinit == rnp->qsmaskinitnext &&
!rnp->wait_blkd_tasks) {
/* Nothing to do on this leaf rcu_node structure. */
raw_spin_unlock_irq_rcu_node(rnp);
+ spin_unlock(&rsp->ofl_lock);
continue;
}
@@ -1939,12 +1944,14 @@ static bool rcu_gp_init(struct rcu_state *rsp)
/* If zero-ness of ->qsmaskinit changed, propagate up tree. */
if (!oldmask != !rnp->qsmaskinit) {
- if (!oldmask) /* First online CPU for this rcu_node. */
- rcu_init_new_rnp(rnp);
- else if (rcu_preempt_has_tasks(rnp)) /* blocked tasks */
- rnp->wait_blkd_tasks = true;
- else /* Last offline CPU and can propagate. */
+ if (!oldmask) { /* First online CPU for rcu_node. */
+ if (!rnp->wait_blkd_tasks) /* Ever offline? */
+ rcu_init_new_rnp(rnp);
+ } else if (rcu_preempt_has_tasks(rnp)) {
+ rnp->wait_blkd_tasks = true; /* blocked tasks */
+ } else { /* Last offline CPU and can propagate. */
rcu_cleanup_dead_rnp(rnp);
+ }
}
/*
@@ -1953,18 +1960,19 @@ static bool rcu_gp_init(struct rcu_state *rsp)
* still offline, propagate up the rcu_node tree and
* clear ->wait_blkd_tasks. Otherwise, if one of this
* rcu_node structure's CPUs has since come back online,
- * simply clear ->wait_blkd_tasks (but rcu_cleanup_dead_rnp()
- * checks for this, so just call it unconditionally).
+ * simply clear ->wait_blkd_tasks.
*/
if (rnp->wait_blkd_tasks &&
- (!rcu_preempt_has_tasks(rnp) ||
- rnp->qsmaskinit)) {
+ (!rcu_preempt_has_tasks(rnp) || rnp->qsmaskinit)) {
rnp->wait_blkd_tasks = false;
- rcu_cleanup_dead_rnp(rnp);
+ if (!rnp->qsmaskinit)
+ rcu_cleanup_dead_rnp(rnp);
}
raw_spin_unlock_irq_rcu_node(rnp);
+ spin_unlock(&rsp->ofl_lock);
}
+ rcu_gp_slow(rsp, gp_preinit_delay); /* Races with CPU hotplug. */
/*
* Set the quiescent-state-needed bits in all the rcu_node
@@ -1978,22 +1986,27 @@ static bool rcu_gp_init(struct rcu_state *rsp)
* The grace period cannot complete until the initialization
* process finishes, because this kthread handles both.
*/
+ rsp->gp_state = RCU_GP_INIT;
rcu_for_each_node_breadth_first(rsp, rnp) {
rcu_gp_slow(rsp, gp_init_delay);
- raw_spin_lock_irq_rcu_node(rnp);
+ raw_spin_lock_irqsave_rcu_node(rnp, flags);
rdp = this_cpu_ptr(rsp->rda);
- rcu_preempt_check_blocked_tasks(rnp);
+ rcu_preempt_check_blocked_tasks(rsp, rnp);
rnp->qsmask = rnp->qsmaskinit;
- WRITE_ONCE(rnp->gpnum, rsp->gpnum);
- if (WARN_ON_ONCE(rnp->completed != rsp->completed))
- WRITE_ONCE(rnp->completed, rsp->completed);
+ WRITE_ONCE(rnp->gp_seq, rsp->gp_seq);
if (rnp == rdp->mynode)
(void)__note_gp_changes(rsp, rnp, rdp);
rcu_preempt_boost_start_gp(rnp);
- trace_rcu_grace_period_init(rsp->name, rnp->gpnum,
+ trace_rcu_grace_period_init(rsp->name, rnp->gp_seq,
rnp->level, rnp->grplo,
rnp->grphi, rnp->qsmask);
- raw_spin_unlock_irq_rcu_node(rnp);
+ /* Quiescent states for tasks on any now-offline CPUs. */
+ mask = rnp->qsmask & ~rnp->qsmaskinitnext;
+ rnp->rcu_gp_init_mask = mask;
+ if ((mask || rnp->wait_blkd_tasks) && rcu_is_leaf_node(rnp))
+ rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags);
+ else
+ raw_spin_unlock_irq_rcu_node(rnp);
cond_resched_tasks_rcu_qs();
WRITE_ONCE(rsp->gp_activity, jiffies);
}
@@ -2053,6 +2066,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
{
unsigned long gp_duration;
bool needgp = false;
+ unsigned long new_gp_seq;
struct rcu_data *rdp;
struct rcu_node *rnp = rcu_get_root(rsp);
struct swait_queue_head *sq;
@@ -2074,19 +2088,22 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
raw_spin_unlock_irq_rcu_node(rnp);
/*
- * Propagate new ->completed value to rcu_node structures so
- * that other CPUs don't have to wait until the start of the next
- * grace period to process their callbacks. This also avoids
- * some nasty RCU grace-period initialization races by forcing
- * the end of the current grace period to be completely recorded in
- * all of the rcu_node structures before the beginning of the next
- * grace period is recorded in any of the rcu_node structures.
+ * Propagate new ->gp_seq value to rcu_node structures so that
+ * other CPUs don't have to wait until the start of the next grace
+ * period to process their callbacks. This also avoids some nasty
+ * RCU grace-period initialization races by forcing the end of
+ * the current grace period to be completely recorded in all of
+ * the rcu_node structures before the beginning of the next grace
+ * period is recorded in any of the rcu_node structures.
*/
+ new_gp_seq = rsp->gp_seq;
+ rcu_seq_end(&new_gp_seq);
rcu_for_each_node_breadth_first(rsp, rnp) {
raw_spin_lock_irq_rcu_node(rnp);
- WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp));
+ if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)))
+ dump_blkd_tasks(rsp, rnp, 10);
WARN_ON_ONCE(rnp->qsmask);
- WRITE_ONCE(rnp->completed, rsp->gpnum);
+ WRITE_ONCE(rnp->gp_seq, new_gp_seq);
rdp = this_cpu_ptr(rsp->rda);
if (rnp == rdp->mynode)
needgp = __note_gp_changes(rsp, rnp, rdp) || needgp;
@@ -2100,26 +2117,28 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
rcu_gp_slow(rsp, gp_cleanup_delay);
}
rnp = rcu_get_root(rsp);
- raw_spin_lock_irq_rcu_node(rnp); /* Order GP before ->completed update. */
+ raw_spin_lock_irq_rcu_node(rnp); /* GP before rsp->gp_seq update. */
/* Declare grace period done. */
- WRITE_ONCE(rsp->completed, rsp->gpnum);
- trace_rcu_grace_period(rsp->name, rsp->completed, TPS("end"));
+ rcu_seq_end(&rsp->gp_seq);
+ trace_rcu_grace_period(rsp->name, rsp->gp_seq, TPS("end"));
rsp->gp_state = RCU_GP_IDLE;
/* Check for GP requests since above loop. */
rdp = this_cpu_ptr(rsp->rda);
- if (need_any_future_gp(rnp)) {
- trace_rcu_this_gp(rnp, rdp, rsp->completed - 1,
+ if (!needgp && ULONG_CMP_LT(rnp->gp_seq, rnp->gp_seq_needed)) {
+ trace_rcu_this_gp(rnp, rdp, rnp->gp_seq_needed,
TPS("CleanupMore"));
needgp = true;
}
/* Advance CBs to reduce false positives below. */
if (!rcu_accelerate_cbs(rsp, rnp, rdp) && needgp) {
WRITE_ONCE(rsp->gp_flags, RCU_GP_FLAG_INIT);
- trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gpnum),
+ rsp->gp_req_activity = jiffies;
+ trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gp_seq),
TPS("newreq"));
+ } else {
+ WRITE_ONCE(rsp->gp_flags, rsp->gp_flags & RCU_GP_FLAG_INIT);
}
- WRITE_ONCE(rsp->gp_flags, rsp->gp_flags & RCU_GP_FLAG_INIT);
raw_spin_unlock_irq_rcu_node(rnp);
}
@@ -2141,7 +2160,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
/* Handle grace-period start. */
for (;;) {
trace_rcu_grace_period(rsp->name,
- READ_ONCE(rsp->gpnum),
+ READ_ONCE(rsp->gp_seq),
TPS("reqwait"));
rsp->gp_state = RCU_GP_WAIT_GPS;
swait_event_idle(rsp->gp_wq, READ_ONCE(rsp->gp_flags) &
@@ -2154,17 +2173,13 @@ static int __noreturn rcu_gp_kthread(void *arg)
WRITE_ONCE(rsp->gp_activity, jiffies);
WARN_ON(signal_pending(current));
trace_rcu_grace_period(rsp->name,
- READ_ONCE(rsp->gpnum),
+ READ_ONCE(rsp->gp_seq),
TPS("reqwaitsig"));
}
/* Handle quiescent-state forcing. */
first_gp_fqs = true;
j = jiffies_till_first_fqs;
- if (j > HZ) {
- j = HZ;
- jiffies_till_first_fqs = HZ;
- }
ret = 0;
for (;;) {
if (!ret) {
@@ -2173,7 +2188,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
jiffies + 3 * j);
}
trace_rcu_grace_period(rsp->name,
- READ_ONCE(rsp->gpnum),
+ READ_ONCE(rsp->gp_seq),
TPS("fqswait"));
rsp->gp_state = RCU_GP_WAIT_FQS;
ret = swait_event_idle_timeout(rsp->gp_wq,
@@ -2188,31 +2203,24 @@ static int __noreturn rcu_gp_kthread(void *arg)
if (ULONG_CMP_GE(jiffies, rsp->jiffies_force_qs) ||
(gf & RCU_GP_FLAG_FQS)) {
trace_rcu_grace_period(rsp->name,
- READ_ONCE(rsp->gpnum),
+ READ_ONCE(rsp->gp_seq),
TPS("fqsstart"));
rcu_gp_fqs(rsp, first_gp_fqs);
first_gp_fqs = false;
trace_rcu_grace_period(rsp->name,
- READ_ONCE(rsp->gpnum),
+ READ_ONCE(rsp->gp_seq),
TPS("fqsend"));
cond_resched_tasks_rcu_qs();
WRITE_ONCE(rsp->gp_activity, jiffies);
ret = 0; /* Force full wait till next FQS. */
j = jiffies_till_next_fqs;
- if (j > HZ) {
- j = HZ;
- jiffies_till_next_fqs = HZ;
- } else if (j < 1) {
- j = 1;
- jiffies_till_next_fqs = 1;
- }
} else {
/* Deal with stray signal. */
cond_resched_tasks_rcu_qs();
WRITE_ONCE(rsp->gp_activity, jiffies);
WARN_ON(signal_pending(current));
trace_rcu_grace_period(rsp->name,
- READ_ONCE(rsp->gpnum),
+ READ_ONCE(rsp->gp_seq),
TPS("fqswaitsig"));
ret = 1; /* Keep old FQS timing. */
j = jiffies;
@@ -2256,8 +2264,12 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
* must be represented by the same rcu_node structure (which need not be a
* leaf rcu_node structure, though it often will be). The gps parameter
* is the grace-period snapshot, which means that the quiescent states
- * are valid only if rnp->gpnum is equal to gps. That structure's lock
+ * are valid only if rnp->gp_seq is equal to gps. That structure's lock
* must be held upon entry, and it is released before return.
+ *
+ * As a special case, if mask is zero, the bit-already-cleared check is
+ * disabled. This allows propagating quiescent state due to resumed tasks
+ * during grace-period initialization.
*/
static void
rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
@@ -2271,7 +2283,7 @@ rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
/* Walk up the rcu_node hierarchy. */
for (;;) {
- if (!(rnp->qsmask & mask) || rnp->gpnum != gps) {
+ if ((!(rnp->qsmask & mask) && mask) || rnp->gp_seq != gps) {
/*
* Our bit has already been cleared, or the
@@ -2284,7 +2296,7 @@ rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
WARN_ON_ONCE(!rcu_is_leaf_node(rnp) &&
rcu_preempt_blocked_readers_cgp(rnp));
rnp->qsmask &= ~mask;
- trace_rcu_quiescent_state_report(rsp->name, rnp->gpnum,
+ trace_rcu_quiescent_state_report(rsp->name, rnp->gp_seq,
mask, rnp->qsmask, rnp->level,
rnp->grplo, rnp->grphi,
!!rnp->gp_tasks);
@@ -2294,6 +2306,7 @@ rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
+ rnp->completedqs = rnp->gp_seq;
mask = rnp->grpmask;
if (rnp->parent == NULL) {
@@ -2323,8 +2336,9 @@ rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
* irqs disabled, and this lock is released upon return, but irqs remain
* disabled.
*/
-static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
- struct rcu_node *rnp, unsigned long flags)
+static void __maybe_unused
+rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
+ struct rcu_node *rnp, unsigned long flags)
__releases(rnp->lock)
{
unsigned long gps;
@@ -2332,12 +2346,15 @@ static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
struct rcu_node *rnp_p;
raw_lockdep_assert_held_rcu_node(rnp);
- if (rcu_state_p == &rcu_sched_state || rsp != rcu_state_p ||
- rnp->qsmask != 0 || rcu_preempt_blocked_readers_cgp(rnp)) {
+ if (WARN_ON_ONCE(rcu_state_p == &rcu_sched_state) ||
+ WARN_ON_ONCE(rsp != rcu_state_p) ||
+ WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)) ||
+ rnp->qsmask != 0) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return; /* Still need more quiescent states! */
}
+ rnp->completedqs = rnp->gp_seq;
rnp_p = rnp->parent;
if (rnp_p == NULL) {
/*
@@ -2348,8 +2365,8 @@ static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
return;
}
- /* Report up the rest of the hierarchy, tracking current ->gpnum. */
- gps = rnp->gpnum;
+ /* Report up the rest of the hierarchy, tracking current ->gp_seq. */
+ gps = rnp->gp_seq;
mask = rnp->grpmask;
raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
raw_spin_lock_rcu_node(rnp_p); /* irqs already disabled. */
@@ -2370,8 +2387,8 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp)
rnp = rdp->mynode;
raw_spin_lock_irqsave_rcu_node(rnp, flags);
- if (rdp->cpu_no_qs.b.norm || rdp->gpnum != rnp->gpnum ||
- rnp->completed == rnp->gpnum || rdp->gpwrap) {
+ if (rdp->cpu_no_qs.b.norm || rdp->gp_seq != rnp->gp_seq ||
+ rdp->gpwrap) {
/*
* The grace period in which this quiescent state was
@@ -2396,7 +2413,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp)
*/
needwake = rcu_accelerate_cbs(rsp, rnp, rdp);
- rcu_report_qs_rnp(mask, rsp, rnp, rnp->gpnum, flags);
+ rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags);
/* ^^^ Released rnp->lock */
if (needwake)
rcu_gp_kthread_wake(rsp);
@@ -2441,17 +2458,16 @@ rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp)
*/
static void rcu_cleanup_dying_cpu(struct rcu_state *rsp)
{
- RCU_TRACE(unsigned long mask;)
+ RCU_TRACE(bool blkd;)
RCU_TRACE(struct rcu_data *rdp = this_cpu_ptr(rsp->rda);)
RCU_TRACE(struct rcu_node *rnp = rdp->mynode;)
if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
return;
- RCU_TRACE(mask = rdp->grpmask;)
- trace_rcu_grace_period(rsp->name,
- rnp->gpnum + 1 - !!(rnp->qsmask & mask),
- TPS("cpuofl"));
+ RCU_TRACE(blkd = !!(rnp->qsmask & rdp->grpmask);)
+ trace_rcu_grace_period(rsp->name, rnp->gp_seq,
+ blkd ? TPS("cpuofl") : TPS("cpuofl-bgp"));
}
/*
@@ -2463,7 +2479,7 @@ static void rcu_cleanup_dying_cpu(struct rcu_state *rsp)
* This function therefore goes up the tree of rcu_node structures,
* clearing the corresponding bits in the ->qsmaskinit fields. Note that
* the leaf rcu_node structure's ->qsmaskinit field has already been
- * updated
+ * updated.
*
* This function does check that the specified rcu_node structure has
* all CPUs offline and no blocked tasks, so it is OK to invoke it
@@ -2476,9 +2492,10 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
long mask;
struct rcu_node *rnp = rnp_leaf;
- raw_lockdep_assert_held_rcu_node(rnp);
+ raw_lockdep_assert_held_rcu_node(rnp_leaf);
if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) ||
- rnp->qsmaskinit || rcu_preempt_has_tasks(rnp))
+ WARN_ON_ONCE(rnp_leaf->qsmaskinit) ||
+ WARN_ON_ONCE(rcu_preempt_has_tasks(rnp_leaf)))
return;
for (;;) {
mask = rnp->grpmask;
@@ -2487,7 +2504,8 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
break;
raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
rnp->qsmaskinit &= ~mask;
- rnp->qsmask &= ~mask;
+ /* Between grace periods, so better already be zero! */
+ WARN_ON_ONCE(rnp->qsmask);
if (rnp->qsmaskinit) {
raw_spin_unlock_rcu_node(rnp);
/* irqs remain disabled. */
@@ -2630,6 +2648,7 @@ void rcu_check_callbacks(int user)
rcu_sched_qs();
rcu_bh_qs();
+ rcu_note_voluntary_context_switch(current);
} else if (!in_softirq()) {
@@ -2645,8 +2664,7 @@ void rcu_check_callbacks(int user)
rcu_preempt_check_callbacks();
if (rcu_pending())
invoke_rcu_core();
- if (user)
- rcu_note_voluntary_context_switch(current);
+
trace_rcu_utilization(TPS("End scheduler-tick"));
}
@@ -2681,17 +2699,8 @@ static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp))
/* rcu_initiate_boost() releases rnp->lock */
continue;
}
- if (rnp->parent &&
- (rnp->parent->qsmask & rnp->grpmask)) {
- /*
- * Race between grace-period
- * initialization and task exiting RCU
- * read-side critical section: Report.
- */
- rcu_report_unblock_qs_rnp(rsp, rnp, flags);
- /* rcu_report_unblock_qs_rnp() rlses ->lock */
- continue;
- }
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+ continue;
}
for_each_leaf_node_possible_cpu(rnp, cpu) {
unsigned long bit = leaf_node_cpu_bit(rnp, cpu);
@@ -2701,8 +2710,8 @@ static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp))
}
}
if (mask != 0) {
- /* Idle/offline CPUs, report (releases rnp->lock. */
- rcu_report_qs_rnp(mask, rsp, rnp, rnp->gpnum, flags);
+ /* Idle/offline CPUs, report (releases rnp->lock). */
+ rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags);
} else {
/* Nothing to do here, so just drop the lock. */
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
@@ -2747,6 +2756,65 @@ static void force_quiescent_state(struct rcu_state *rsp)
}
/*
+ * This function checks for grace-period requests that fail to motivate
+ * RCU to come out of its idle mode.
+ */
+static void
+rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp,
+ struct rcu_data *rdp)
+{
+ const unsigned long gpssdelay = rcu_jiffies_till_stall_check() * HZ;
+ unsigned long flags;
+ unsigned long j;
+ struct rcu_node *rnp_root = rcu_get_root(rsp);
+ static atomic_t warned = ATOMIC_INIT(0);
+
+ if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress(rsp) ||
+ ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed))
+ return;
+ j = jiffies; /* Expensive access, and in common case don't get here. */
+ if (time_before(j, READ_ONCE(rsp->gp_req_activity) + gpssdelay) ||
+ time_before(j, READ_ONCE(rsp->gp_activity) + gpssdelay) ||
+ atomic_read(&warned))
+ return;
+
+ raw_spin_lock_irqsave_rcu_node(rnp, flags);
+ j = jiffies;
+ if (rcu_gp_in_progress(rsp) ||
+ ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
+ time_before(j, READ_ONCE(rsp->gp_req_activity) + gpssdelay) ||
+ time_before(j, READ_ONCE(rsp->gp_activity) + gpssdelay) ||
+ atomic_read(&warned)) {
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+ return;
+ }
+ /* Hold onto the leaf lock to make others see warned==1. */
+
+ if (rnp_root != rnp)
+ raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */
+ j = jiffies;
+ if (rcu_gp_in_progress(rsp) ||
+ ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
+ time_before(j, rsp->gp_req_activity + gpssdelay) ||
+ time_before(j, rsp->gp_activity + gpssdelay) ||
+ atomic_xchg(&warned, 1)) {
+ raw_spin_unlock_rcu_node(rnp_root); /* irqs remain disabled. */
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+ return;
+ }
+ pr_alert("%s: g%ld->%ld gar:%lu ga:%lu f%#x gs:%d %s->state:%#lx\n",
+ __func__, (long)READ_ONCE(rsp->gp_seq),
+ (long)READ_ONCE(rnp_root->gp_seq_needed),
+ j - rsp->gp_req_activity, j - rsp->gp_activity,
+ rsp->gp_flags, rsp->gp_state, rsp->name,
+ rsp->gp_kthread ? rsp->gp_kthread->state : 0x1ffffL);
+ WARN_ON(1);
+ if (rnp_root != rnp)
+ raw_spin_unlock_rcu_node(rnp_root);
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+}
+
+/*
* This does the RCU core processing work for the specified rcu_state
* and rcu_data structures. This may be called only from the CPU to
* whom the rdp belongs.
@@ -2755,9 +2823,8 @@ static void
__rcu_process_callbacks(struct rcu_state *rsp)
{
unsigned long flags;
- bool needwake;
struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
- struct rcu_node *rnp;
+ struct rcu_node *rnp = rdp->mynode;
WARN_ON_ONCE(!rdp->beenonline);
@@ -2768,18 +2835,13 @@ __rcu_process_callbacks(struct rcu_state *rsp)
if (!rcu_gp_in_progress(rsp) &&
rcu_segcblist_is_enabled(&rdp->cblist)) {
local_irq_save(flags);
- if (rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL)) {
- local_irq_restore(flags);
- } else {
- rnp = rdp->mynode;
- raw_spin_lock_rcu_node(rnp); /* irqs disabled. */
- needwake = rcu_accelerate_cbs(rsp, rnp, rdp);
- raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
- if (needwake)
- rcu_gp_kthread_wake(rsp);
- }
+ if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL))
+ rcu_accelerate_cbs_unlocked(rsp, rnp, rdp);
+ local_irq_restore(flags);
}
+ rcu_check_gp_start_stall(rsp, rnp, rdp);
+
/* If there are callbacks ready, invoke them. */
if (rcu_segcblist_ready_cbs(&rdp->cblist))
invoke_rcu_callbacks(rsp, rdp);
@@ -2833,8 +2895,6 @@ static void invoke_rcu_core(void)
static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp,
struct rcu_head *head, unsigned long flags)
{
- bool needwake;
-
/*
* If called from an extended quiescent state, invoke the RCU
* core in order to force a re-evaluation of RCU's idleness.
@@ -2861,13 +2921,7 @@ static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp,
/* Start a new grace period if one not already started. */
if (!rcu_gp_in_progress(rsp)) {
- struct rcu_node *rnp = rdp->mynode;
-
- raw_spin_lock_rcu_node(rnp);
- needwake = rcu_accelerate_cbs(rsp, rnp, rdp);
- raw_spin_unlock_rcu_node(rnp);
- if (needwake)
- rcu_gp_kthread_wake(rsp);
+ rcu_accelerate_cbs_unlocked(rsp, rdp->mynode, rdp);
} else {
/* Give the grace period a kick. */
rdp->blimit = LONG_MAX;
@@ -3037,7 +3091,7 @@ EXPORT_SYMBOL_GPL(kfree_call_rcu);
* when there was in fact only one the whole time, as this just adds
* some overhead: RCU still operates correctly.
*/
-static inline int rcu_blocking_is_gp(void)
+static int rcu_blocking_is_gp(void)
{
int ret;
@@ -3136,16 +3190,10 @@ unsigned long get_state_synchronize_rcu(void)
{
/*
* Any prior manipulation of RCU-protected data must happen
- * before the load from ->gpnum.
+ * before the load from ->gp_seq.
*/
smp_mb(); /* ^^^ */
-
- /*
- * Make sure this load happens before the purportedly
- * time-consuming work between get_state_synchronize_rcu()
- * and cond_synchronize_rcu().
- */
- return smp_load_acquire(&rcu_state_p->gpnum);
+ return rcu_seq_snap(&rcu_state_p->gp_seq);
}
EXPORT_SYMBOL_GPL(get_state_synchronize_rcu);
@@ -3165,15 +3213,10 @@ EXPORT_SYMBOL_GPL(get_state_synchronize_rcu);
*/
void cond_synchronize_rcu(unsigned long oldstate)
{
- unsigned long newstate;
-
- /*
- * Ensure that this load happens before any RCU-destructive
- * actions the caller might carry out after we return.
- */
- newstate = smp_load_acquire(&rcu_state_p->completed);
- if (ULONG_CMP_GE(oldstate, newstate))
+ if (!rcu_seq_done(&rcu_state_p->gp_seq, oldstate))
synchronize_rcu();
+ else
+ smp_mb(); /* Ensure GP ends before subsequent accesses. */
}
EXPORT_SYMBOL_GPL(cond_synchronize_rcu);
@@ -3188,16 +3231,10 @@ unsigned long get_state_synchronize_sched(void)
{
/*
* Any prior manipulation of RCU-protected data must happen
- * before the load from ->gpnum.
+ * before the load from ->gp_seq.
*/
smp_mb(); /* ^^^ */
-
- /*
- * Make sure this load happens before the purportedly
- * time-consuming work between get_state_synchronize_sched()
- * and cond_synchronize_sched().
- */
- return smp_load_acquire(&rcu_sched_state.gpnum);
+ return rcu_seq_snap(&rcu_sched_state.gp_seq);
}
EXPORT_SYMBOL_GPL(get_state_synchronize_sched);
@@ -3217,15 +3254,10 @@ EXPORT_SYMBOL_GPL(get_state_synchronize_sched);
*/
void cond_synchronize_sched(unsigned long oldstate)
{
- unsigned long newstate;
-
- /*
- * Ensure that this load happens before any RCU-destructive
- * actions the caller might carry out after we return.
- */
- newstate = smp_load_acquire(&rcu_sched_state.completed);
- if (ULONG_CMP_GE(oldstate, newstate))
+ if (!rcu_seq_done(&rcu_sched_state.gp_seq, oldstate))
synchronize_sched();
+ else
+ smp_mb(); /* Ensure GP ends before subsequent accesses. */
}
EXPORT_SYMBOL_GPL(cond_synchronize_sched);
@@ -3261,12 +3293,8 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL))
return 1;
- /* Has another RCU grace period completed? */
- if (READ_ONCE(rnp->completed) != rdp->completed) /* outside lock */
- return 1;
-
- /* Has a new RCU grace period started? */
- if (READ_ONCE(rnp->gpnum) != rdp->gpnum ||
+ /* Have RCU grace period completed or started? */
+ if (rcu_seq_current(&rnp->gp_seq) != rdp->gp_seq ||
unlikely(READ_ONCE(rdp->gpwrap))) /* outside lock */
return 1;
@@ -3298,7 +3326,7 @@ static int rcu_pending(void)
* non-NULL, store an indication of whether all callbacks are lazy.
* (If there are no callbacks, all of them are deemed to be lazy.)
*/
-static bool __maybe_unused rcu_cpu_has_callbacks(bool *all_lazy)
+static bool rcu_cpu_has_callbacks(bool *all_lazy)
{
bool al = true;
bool hc = false;
@@ -3484,17 +3512,22 @@ EXPORT_SYMBOL_GPL(rcu_barrier_sched);
static void rcu_init_new_rnp(struct rcu_node *rnp_leaf)
{
long mask;
+ long oldmask;
struct rcu_node *rnp = rnp_leaf;
- raw_lockdep_assert_held_rcu_node(rnp);
+ raw_lockdep_assert_held_rcu_node(rnp_leaf);
+ WARN_ON_ONCE(rnp->wait_blkd_tasks);
for (;;) {
mask = rnp->grpmask;
rnp = rnp->parent;
if (rnp == NULL)
return;
raw_spin_lock_rcu_node(rnp); /* Interrupts already disabled. */
+ oldmask = rnp->qsmaskinit;
rnp->qsmaskinit |= mask;
raw_spin_unlock_rcu_node(rnp); /* Interrupts remain disabled. */
+ if (oldmask)
+ return;
}
}
@@ -3511,6 +3544,10 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
rdp->dynticks = &per_cpu(rcu_dynticks, cpu);
WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != 1);
WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp->dynticks)));
+ rdp->rcu_ofl_gp_seq = rsp->gp_seq;
+ rdp->rcu_ofl_gp_flags = RCU_GP_CLEANED;
+ rdp->rcu_onl_gp_seq = rsp->gp_seq;
+ rdp->rcu_onl_gp_flags = RCU_GP_CLEANED;
rdp->cpu = cpu;
rdp->rsp = rsp;
rcu_boot_init_nocb_percpu_data(rdp);
@@ -3518,9 +3555,9 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
/*
* Initialize a CPU's per-CPU RCU data. Note that only one online or
- * offline event can be happening at a given time. Note also that we
- * can accept some slop in the rsp->completed access due to the fact
- * that this CPU cannot possibly have any RCU callbacks in flight yet.
+ * offline event can be happening at a given time. Note also that we can
+ * accept some slop in the rsp->gp_seq access due to the fact that this
+ * CPU cannot possibly have any RCU callbacks in flight yet.
*/
static void
rcu_init_percpu_data(int cpu, struct rcu_state *rsp)
@@ -3549,14 +3586,14 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp)
rnp = rdp->mynode;
raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
rdp->beenonline = true; /* We have now been online. */
- rdp->gpnum = rnp->completed; /* Make CPU later note any new GP. */
- rdp->completed = rnp->completed;
+ rdp->gp_seq = rnp->gp_seq;
+ rdp->gp_seq_needed = rnp->gp_seq;
rdp->cpu_no_qs.b.norm = true;
rdp->rcu_qs_ctr_snap = per_cpu(rcu_dynticks.rcu_qs_ctr, cpu);
rdp->core_needs_qs = false;
rdp->rcu_iw_pending = false;
- rdp->rcu_iw_gpnum = rnp->gpnum - 1;
- trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("cpuonl"));
+ rdp->rcu_iw_gp_seq = rnp->gp_seq - 1;
+ trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("cpuonl"));
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
@@ -3705,7 +3742,15 @@ void rcu_cpu_starting(unsigned int cpu)
nbits = bitmap_weight(&oldmask, BITS_PER_LONG);
/* Allow lockless access for expedited grace periods. */
smp_store_release(&rsp->ncpus, rsp->ncpus + nbits); /* ^^^ */
- raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+ rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
+ rdp->rcu_onl_gp_seq = READ_ONCE(rsp->gp_seq);
+ rdp->rcu_onl_gp_flags = READ_ONCE(rsp->gp_flags);
+ if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */
+ /* Report QS -after- changing ->qsmaskinitnext! */
+ rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags);
+ } else {
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+ }
}
smp_mb(); /* Ensure RCU read-side usage follows above initialization. */
}
@@ -3713,7 +3758,7 @@ void rcu_cpu_starting(unsigned int cpu)
#ifdef CONFIG_HOTPLUG_CPU
/*
* The CPU is exiting the idle loop into the arch_cpu_idle_dead()
- * function. We now remove it from the rcu_node tree's ->qsmaskinit
+ * function. We now remove it from the rcu_node tree's ->qsmaskinitnext
* bit masks.
*/
static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp)
@@ -3725,9 +3770,18 @@ static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp)
/* Remove outgoing CPU from mask in the leaf rcu_node structure. */
mask = rdp->grpmask;
+ spin_lock(&rsp->ofl_lock);
raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */
+ rdp->rcu_ofl_gp_seq = READ_ONCE(rsp->gp_seq);
+ rdp->rcu_ofl_gp_flags = READ_ONCE(rsp->gp_flags);
+ if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */
+ /* Report quiescent state -before- changing ->qsmaskinitnext! */
+ rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags);
+ raw_spin_lock_irqsave_rcu_node(rnp, flags);
+ }
rnp->qsmaskinitnext &= ~mask;
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+ spin_unlock(&rsp->ofl_lock);
}
/*
@@ -3839,12 +3893,16 @@ static int __init rcu_spawn_gp_kthread(void)
struct task_struct *t;
/* Force priority into range. */
- if (IS_ENABLED(CONFIG_RCU_BOOST) && kthread_prio < 1)
+ if (IS_ENABLED(CONFIG_RCU_BOOST) && kthread_prio < 2
+ && IS_BUILTIN(CONFIG_RCU_TORTURE_TEST))
+ kthread_prio = 2;
+ else if (IS_ENABLED(CONFIG_RCU_BOOST) && kthread_prio < 1)
kthread_prio = 1;
else if (kthread_prio < 0)
kthread_prio = 0;
else if (kthread_prio > 99)
kthread_prio = 99;
+
if (kthread_prio != kthread_prio_in)
pr_alert("rcu_spawn_gp_kthread(): Limited prio to %d from %d\n",
kthread_prio, kthread_prio_in);
@@ -3928,8 +3986,9 @@ static void __init rcu_init_one(struct rcu_state *rsp)
raw_spin_lock_init(&rnp->fqslock);
lockdep_set_class_and_name(&rnp->fqslock,
&rcu_fqs_class[i], fqs[i]);
- rnp->gpnum = rsp->gpnum;
- rnp->completed = rsp->completed;
+ rnp->gp_seq = rsp->gp_seq;
+ rnp->gp_seq_needed = rsp->gp_seq;
+ rnp->completedqs = rsp->gp_seq;
rnp->qsmask = 0;
rnp->qsmaskinit = 0;
rnp->grplo = j * cpustride;
@@ -3997,7 +4056,7 @@ static void __init rcu_init_geometry(void)
if (rcu_fanout_leaf == RCU_FANOUT_LEAF &&
nr_cpu_ids == NR_CPUS)
return;
- pr_info("RCU: Adjusting geometry for rcu_fanout_leaf=%d, nr_cpu_ids=%u\n",
+ pr_info("Adjusting geometry for rcu_fanout_leaf=%d, nr_cpu_ids=%u\n",
rcu_fanout_leaf, nr_cpu_ids);
/*
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 78e051dffc5b..4e74df768c57 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -81,18 +81,16 @@ struct rcu_node {
raw_spinlock_t __private lock; /* Root rcu_node's lock protects */
/* some rcu_state fields as well as */
/* following. */
- unsigned long gpnum; /* Current grace period for this node. */
- /* This will either be equal to or one */
- /* behind the root rcu_node's gpnum. */
- unsigned long completed; /* Last GP completed for this node. */
- /* This will either be equal to or one */
- /* behind the root rcu_node's gpnum. */
+ unsigned long gp_seq; /* Track rsp->rcu_gp_seq. */
+ unsigned long gp_seq_needed; /* Track rsp->rcu_gp_seq_needed. */
+ unsigned long completedqs; /* All QSes done for this node. */
unsigned long qsmask; /* CPUs or groups that need to switch in */
/* order for current grace period to proceed.*/
/* In leaf rcu_node, each bit corresponds to */
/* an rcu_data structure, otherwise, each */
/* bit corresponds to a child rcu_node */
/* structure. */
+ unsigned long rcu_gp_init_mask; /* Mask of offline CPUs at GP init. */
unsigned long qsmaskinit;
/* Per-GP initial value for qsmask. */
/* Initialized from ->qsmaskinitnext at the */
@@ -158,7 +156,6 @@ struct rcu_node {
struct swait_queue_head nocb_gp_wq[2];
/* Place for rcu_nocb_kthread() to wait GP. */
#endif /* #ifdef CONFIG_RCU_NOCB_CPU */
- u8 need_future_gp[4]; /* Counts of upcoming GP requests. */
raw_spinlock_t fqslock ____cacheline_internodealigned_in_smp;
spinlock_t exp_lock ____cacheline_internodealigned_in_smp;
@@ -168,22 +165,6 @@ struct rcu_node {
bool exp_need_flush; /* Need to flush workitem? */
} ____cacheline_internodealigned_in_smp;
-/* Accessors for ->need_future_gp[] array. */
-#define need_future_gp_mask() \
- (ARRAY_SIZE(((struct rcu_node *)NULL)->need_future_gp) - 1)
-#define need_future_gp_element(rnp, c) \
- ((rnp)->need_future_gp[(c) & need_future_gp_mask()])
-#define need_any_future_gp(rnp) \
-({ \
- int __i; \
- bool __nonzero = false; \
- \
- for (__i = 0; __i < ARRAY_SIZE((rnp)->need_future_gp); __i++) \
- __nonzero = __nonzero || \
- READ_ONCE((rnp)->need_future_gp[__i]); \
- __nonzero; \
-})
-
/*
* Bitmasks in an rcu_node cover the interval [grplo, grphi] of CPU IDs, and
* are indexed relative to this interval rather than the global CPU ID space.
@@ -206,16 +187,14 @@ union rcu_noqs {
/* Per-CPU data for read-copy update. */
struct rcu_data {
/* 1) quiescent-state and grace-period handling : */
- unsigned long completed; /* Track rsp->completed gp number */
- /* in order to detect GP end. */
- unsigned long gpnum; /* Highest gp number that this CPU */
- /* is aware of having started. */
+ unsigned long gp_seq; /* Track rsp->rcu_gp_seq counter. */
+ unsigned long gp_seq_needed; /* Track rsp->rcu_gp_seq_needed ctr. */
unsigned long rcu_qs_ctr_snap;/* Snapshot of rcu_qs_ctr to check */
/* for rcu_all_qs() invocations. */
union rcu_noqs cpu_no_qs; /* No QSes yet for this CPU. */
bool core_needs_qs; /* Core waits for quiesc state. */
bool beenonline; /* CPU online at least once. */
- bool gpwrap; /* Possible gpnum/completed wrap. */
+ bool gpwrap; /* Possible ->gp_seq wrap. */
struct rcu_node *mynode; /* This CPU's leaf of hierarchy */
unsigned long grpmask; /* Mask to apply to leaf qsmask. */
unsigned long ticks_this_gp; /* The number of scheduling-clock */
@@ -239,7 +218,6 @@ struct rcu_data {
/* 4) reasons this CPU needed to be kicked by force_quiescent_state */
unsigned long dynticks_fqs; /* Kicked due to dynticks idle. */
- unsigned long offline_fqs; /* Kicked due to being offline. */
unsigned long cond_resched_completed;
/* Grace period that needs help */
/* from cond_resched(). */
@@ -278,12 +256,16 @@ struct rcu_data {
/* Leader CPU takes GP-end wakeups. */
#endif /* #ifdef CONFIG_RCU_NOCB_CPU */
- /* 7) RCU CPU stall data. */
+ /* 7) Diagnostic data, including RCU CPU stall warnings. */
unsigned int softirq_snap; /* Snapshot of softirq activity. */
/* ->rcu_iw* fields protected by leaf rcu_node ->lock. */
struct irq_work rcu_iw; /* Check for non-irq activity. */
bool rcu_iw_pending; /* Is ->rcu_iw pending? */
- unsigned long rcu_iw_gpnum; /* ->gpnum associated with ->rcu_iw. */
+ unsigned long rcu_iw_gp_seq; /* ->gp_seq associated with ->rcu_iw. */
+ unsigned long rcu_ofl_gp_seq; /* ->gp_seq at last offline. */
+ short rcu_ofl_gp_flags; /* ->gp_flags at last offline. */
+ unsigned long rcu_onl_gp_seq; /* ->gp_seq at last online. */
+ short rcu_onl_gp_flags; /* ->gp_flags at last online. */
int cpu;
struct rcu_state *rsp;
@@ -340,8 +322,7 @@ struct rcu_state {
u8 boost ____cacheline_internodealigned_in_smp;
/* Subject to priority boost. */
- unsigned long gpnum; /* Current gp number. */
- unsigned long completed; /* # of last completed gp. */
+ unsigned long gp_seq; /* Grace-period sequence #. */
struct task_struct *gp_kthread; /* Task for grace periods. */
struct swait_queue_head gp_wq; /* Where GP task waits. */
short gp_flags; /* Commands for GP task. */
@@ -373,6 +354,8 @@ struct rcu_state {
/* but in jiffies. */
unsigned long gp_activity; /* Time of last GP kthread */
/* activity in jiffies. */
+ unsigned long gp_req_activity; /* Time of last GP request */
+ /* in jiffies. */
unsigned long jiffies_stall; /* Time at which to check */
/* for CPU stalls. */
unsigned long jiffies_resched; /* Time at which to resched */
@@ -384,6 +367,10 @@ struct rcu_state {
const char *name; /* Name of structure. */
char abbr; /* Abbreviated name. */
struct list_head flavors; /* List of RCU flavors. */
+
+ spinlock_t ofl_lock ____cacheline_internodealigned_in_smp;
+ /* Synchronize offline with */
+ /* GP pre-initialization. */
};
/* Values for rcu_state structure's gp_flags field. */
@@ -394,16 +381,20 @@ struct rcu_state {
#define RCU_GP_IDLE 0 /* Initial state and no GP in progress. */
#define RCU_GP_WAIT_GPS 1 /* Wait for grace-period start. */
#define RCU_GP_DONE_GPS 2 /* Wait done for grace-period start. */
-#define RCU_GP_WAIT_FQS 3 /* Wait for force-quiescent-state time. */
-#define RCU_GP_DOING_FQS 4 /* Wait done for force-quiescent-state time. */
-#define RCU_GP_CLEANUP 5 /* Grace-period cleanup started. */
-#define RCU_GP_CLEANED 6 /* Grace-period cleanup complete. */
+#define RCU_GP_ONOFF 3 /* Grace-period initialization hotplug. */
+#define RCU_GP_INIT 4 /* Grace-period initialization. */
+#define RCU_GP_WAIT_FQS 5 /* Wait for force-quiescent-state time. */
+#define RCU_GP_DOING_FQS 6 /* Wait done for force-quiescent-state time. */
+#define RCU_GP_CLEANUP 7 /* Grace-period cleanup started. */
+#define RCU_GP_CLEANED 8 /* Grace-period cleanup complete. */
#ifndef RCU_TREE_NONCORE
static const char * const gp_state_names[] = {
"RCU_GP_IDLE",
"RCU_GP_WAIT_GPS",
"RCU_GP_DONE_GPS",
+ "RCU_GP_ONOFF",
+ "RCU_GP_INIT",
"RCU_GP_WAIT_FQS",
"RCU_GP_DOING_FQS",
"RCU_GP_CLEANUP",
@@ -449,10 +440,13 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
static void rcu_print_detail_task_stall(struct rcu_state *rsp);
static int rcu_print_task_stall(struct rcu_node *rnp);
static int rcu_print_task_exp_stall(struct rcu_node *rnp);
-static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp);
+static void rcu_preempt_check_blocked_tasks(struct rcu_state *rsp,
+ struct rcu_node *rnp);
static void rcu_preempt_check_callbacks(void);
void call_rcu(struct rcu_head *head, rcu_callback_t func);
static void __init __rcu_init_preempt(void);
+static void dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp,
+ int ncheck);
static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags);
static void rcu_preempt_boost_start_gp(struct rcu_node *rnp);
static void invoke_rcu_callbacks_kthread(void);
@@ -489,7 +483,6 @@ static void __init rcu_spawn_nocb_kthreads(void);
#ifdef CONFIG_RCU_NOCB_CPU
static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp);
#endif /* #ifdef CONFIG_RCU_NOCB_CPU */
-static void __maybe_unused rcu_kick_nohz_cpu(int cpu);
static bool init_nocb_callback_list(struct rcu_data *rdp);
static void rcu_bind_gp_kthread(void);
static bool rcu_nohz_full_cpu(struct rcu_state *rsp);
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index d40708e8c5d6..b3df3b770afb 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -472,6 +472,7 @@ retry_ipi:
static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
smp_call_func_t func)
{
+ int cpu;
struct rcu_node *rnp;
trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("reset"));
@@ -486,13 +487,20 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
rnp->rew.rew_func = func;
rnp->rew.rew_rsp = rsp;
if (!READ_ONCE(rcu_par_gp_wq) ||
- rcu_scheduler_active != RCU_SCHEDULER_RUNNING) {
- /* No workqueues yet. */
+ rcu_scheduler_active != RCU_SCHEDULER_RUNNING ||
+ rcu_is_last_leaf_node(rsp, rnp)) {
+ /* No workqueues yet or last leaf, do direct call. */
sync_rcu_exp_select_node_cpus(&rnp->rew.rew_work);
continue;
}
INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
- queue_work_on(rnp->grplo, rcu_par_gp_wq, &rnp->rew.rew_work);
+ preempt_disable();
+ cpu = cpumask_next(rnp->grplo - 1, cpu_online_mask);
+ /* If all offline, queue the work on an unbound CPU. */
+ if (unlikely(cpu > rnp->grphi))
+ cpu = WORK_CPU_UNBOUND;
+ queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
+ preempt_enable();
rnp->exp_need_flush = true;
}
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 7fd12039e512..c1b17f5b9361 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -74,8 +74,8 @@ static void __init rcu_bootup_announce_oddness(void)
pr_info("\tRCU event tracing is enabled.\n");
if ((IS_ENABLED(CONFIG_64BIT) && RCU_FANOUT != 64) ||
(!IS_ENABLED(CONFIG_64BIT) && RCU_FANOUT != 32))
- pr_info("\tCONFIG_RCU_FANOUT set to non-default value of %d\n",
- RCU_FANOUT);
+ pr_info("\tCONFIG_RCU_FANOUT set to non-default value of %d.\n",
+ RCU_FANOUT);
if (rcu_fanout_exact)
pr_info("\tHierarchical RCU autobalancing is disabled.\n");
if (IS_ENABLED(CONFIG_RCU_FAST_NO_HZ))
@@ -88,11 +88,13 @@ static void __init rcu_bootup_announce_oddness(void)
pr_info("\tBuild-time adjustment of leaf fanout to %d.\n",
RCU_FANOUT_LEAF);
if (rcu_fanout_leaf != RCU_FANOUT_LEAF)
- pr_info("\tBoot-time adjustment of leaf fanout to %d.\n", rcu_fanout_leaf);
+ pr_info("\tBoot-time adjustment of leaf fanout to %d.\n",
+ rcu_fanout_leaf);
if (nr_cpu_ids != NR_CPUS)
pr_info("\tRCU restricting CPUs from NR_CPUS=%d to nr_cpu_ids=%u.\n", NR_CPUS, nr_cpu_ids);
#ifdef CONFIG_RCU_BOOST
- pr_info("\tRCU priority boosting: priority %d delay %d ms.\n", kthread_prio, CONFIG_RCU_BOOST_DELAY);
+ pr_info("\tRCU priority boosting: priority %d delay %d ms.\n",
+ kthread_prio, CONFIG_RCU_BOOST_DELAY);
#endif
if (blimit != DEFAULT_RCU_BLIMIT)
pr_info("\tBoot-time adjustment of callback invocation limit to %ld.\n", blimit);
@@ -127,6 +129,7 @@ static struct rcu_data __percpu *const rcu_data_p = &rcu_preempt_data;
static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp,
bool wake);
+static void rcu_read_unlock_special(struct task_struct *t);
/*
* Tell them what RCU they are running.
@@ -183,6 +186,9 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
raw_lockdep_assert_held_rcu_node(rnp);
WARN_ON_ONCE(rdp->mynode != rnp);
WARN_ON_ONCE(!rcu_is_leaf_node(rnp));
+ /* RCU better not be waiting on newly onlined CPUs! */
+ WARN_ON_ONCE(rnp->qsmaskinitnext & ~rnp->qsmaskinit & rnp->qsmask &
+ rdp->grpmask);
/*
* Decide where to queue the newly blocked task. In theory,
@@ -260,8 +266,10 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
* ->exp_tasks pointers, respectively, to reference the newly
* blocked tasks.
*/
- if (!rnp->gp_tasks && (blkd_state & RCU_GP_BLKD))
+ if (!rnp->gp_tasks && (blkd_state & RCU_GP_BLKD)) {
rnp->gp_tasks = &t->rcu_node_entry;
+ WARN_ON_ONCE(rnp->completedqs == rnp->gp_seq);
+ }
if (!rnp->exp_tasks && (blkd_state & RCU_EXP_BLKD))
rnp->exp_tasks = &t->rcu_node_entry;
WARN_ON_ONCE(!(blkd_state & RCU_GP_BLKD) !=
@@ -286,20 +294,24 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
}
/*
- * Record a preemptible-RCU quiescent state for the specified CPU. Note
- * that this just means that the task currently running on the CPU is
- * not in a quiescent state. There might be any number of tasks blocked
- * while in an RCU read-side critical section.
+ * Record a preemptible-RCU quiescent state for the specified CPU.
+ * Note that this does not necessarily mean that the task currently running
+ * on the CPU is in a quiescent state: Instead, it means that the current
+ * grace period need not wait on any RCU read-side critical section that
+ * starts later on this CPU. It also means that if the current task is
+ * in an RCU read-side critical section, it has already added itself to
+ * some leaf rcu_node structure's ->blkd_tasks list. In addition to the
+ * current task, there might be any number of other tasks blocked while
+ * in an RCU read-side critical section.
*
- * As with the other rcu_*_qs() functions, callers to this function
- * must disable preemption.
+ * Callers to this function must disable preemption.
*/
static void rcu_preempt_qs(void)
{
RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_qs() invoked with preemption enabled!!!\n");
if (__this_cpu_read(rcu_data_p->cpu_no_qs.s)) {
trace_rcu_grace_period(TPS("rcu_preempt"),
- __this_cpu_read(rcu_data_p->gpnum),
+ __this_cpu_read(rcu_data_p->gp_seq),
TPS("cpuqs"));
__this_cpu_write(rcu_data_p->cpu_no_qs.b.norm, false);
barrier(); /* Coordinate with rcu_preempt_check_callbacks(). */
@@ -348,8 +360,8 @@ static void rcu_preempt_note_context_switch(bool preempt)
trace_rcu_preempt_task(rdp->rsp->name,
t->pid,
(rnp->qsmask & rdp->grpmask)
- ? rnp->gpnum
- : rnp->gpnum + 1);
+ ? rnp->gp_seq
+ : rcu_seq_snap(&rnp->gp_seq));
rcu_preempt_ctxt_queue(rnp, rdp);
} else if (t->rcu_read_lock_nesting < 0 &&
t->rcu_read_unlock_special.s) {
@@ -456,7 +468,7 @@ static bool rcu_preempt_has_tasks(struct rcu_node *rnp)
* notify RCU core processing or task having blocked during the RCU
* read-side critical section.
*/
-void rcu_read_unlock_special(struct task_struct *t)
+static void rcu_read_unlock_special(struct task_struct *t)
{
bool empty_exp;
bool empty_norm;
@@ -535,13 +547,15 @@ void rcu_read_unlock_special(struct task_struct *t)
WARN_ON_ONCE(rnp != t->rcu_blocked_node);
WARN_ON_ONCE(!rcu_is_leaf_node(rnp));
empty_norm = !rcu_preempt_blocked_readers_cgp(rnp);
+ WARN_ON_ONCE(rnp->completedqs == rnp->gp_seq &&
+ (!empty_norm || rnp->qsmask));
empty_exp = sync_rcu_preempt_exp_done(rnp);
smp_mb(); /* ensure expedited fastpath sees end of RCU c-s. */
np = rcu_next_node_entry(t, rnp);
list_del_init(&t->rcu_node_entry);
t->rcu_blocked_node = NULL;
trace_rcu_unlock_preempted_task(TPS("rcu_preempt"),
- rnp->gpnum, t->pid);
+ rnp->gp_seq, t->pid);
if (&t->rcu_node_entry == rnp->gp_tasks)
rnp->gp_tasks = np;
if (&t->rcu_node_entry == rnp->exp_tasks)
@@ -562,7 +576,7 @@ void rcu_read_unlock_special(struct task_struct *t)
empty_exp_now = sync_rcu_preempt_exp_done(rnp);
if (!empty_norm && !rcu_preempt_blocked_readers_cgp(rnp)) {
trace_rcu_quiescent_state_report(TPS("preempt_rcu"),
- rnp->gpnum,
+ rnp->gp_seq,
0, rnp->qsmask,
rnp->level,
rnp->grplo,
@@ -686,24 +700,27 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
* Check that the list of blocked tasks for the newly completed grace
* period is in fact empty. It is a serious bug to complete a grace
* period that still has RCU readers blocked! This function must be
- * invoked -before- updating this rnp's ->gpnum, and the rnp's ->lock
+ * invoked -before- updating this rnp's ->gp_seq, and the rnp's ->lock
* must be held by the caller.
*
* Also, if there are blocked tasks on the list, they automatically
* block the newly created grace period, so set up ->gp_tasks accordingly.
*/
-static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+static void
+rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp)
{
struct task_struct *t;
RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n");
- WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp));
- if (rcu_preempt_has_tasks(rnp)) {
+ if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)))
+ dump_blkd_tasks(rsp, rnp, 10);
+ if (rcu_preempt_has_tasks(rnp) &&
+ (rnp->qsmaskinit || rnp->wait_blkd_tasks)) {
rnp->gp_tasks = rnp->blkd_tasks.next;
t = container_of(rnp->gp_tasks, struct task_struct,
rcu_node_entry);
trace_rcu_unlock_preempted_task(TPS("rcu_preempt-GPS"),
- rnp->gpnum, t->pid);
+ rnp->gp_seq, t->pid);
}
WARN_ON_ONCE(rnp->qsmask);
}
@@ -717,6 +734,7 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
*/
static void rcu_preempt_check_callbacks(void)
{
+ struct rcu_state *rsp = &rcu_preempt_state;
struct task_struct *t = current;
if (t->rcu_read_lock_nesting == 0) {
@@ -725,7 +743,9 @@ static void rcu_preempt_check_callbacks(void)
}
if (t->rcu_read_lock_nesting > 0 &&
__this_cpu_read(rcu_data_p->core_needs_qs) &&
- __this_cpu_read(rcu_data_p->cpu_no_qs.b.norm))
+ __this_cpu_read(rcu_data_p->cpu_no_qs.b.norm) &&
+ !t->rcu_read_unlock_special.b.need_qs &&
+ time_after(jiffies, rsp->gp_start + HZ))
t->rcu_read_unlock_special.b.need_qs = true;
}
@@ -841,6 +861,47 @@ void exit_rcu(void)
__rcu_read_unlock();
}
+/*
+ * Dump the blocked-tasks state, but limit the list dump to the
+ * specified number of elements.
+ */
+static void
+dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, int ncheck)
+{
+ int cpu;
+ int i;
+ struct list_head *lhp;
+ bool onl;
+ struct rcu_data *rdp;
+ struct rcu_node *rnp1;
+
+ raw_lockdep_assert_held_rcu_node(rnp);
+ pr_info("%s: grp: %d-%d level: %d ->gp_seq %ld ->completedqs %ld\n",
+ __func__, rnp->grplo, rnp->grphi, rnp->level,
+ (long)rnp->gp_seq, (long)rnp->completedqs);
+ for (rnp1 = rnp; rnp1; rnp1 = rnp1->parent)
+ pr_info("%s: %d:%d ->qsmask %#lx ->qsmaskinit %#lx ->qsmaskinitnext %#lx\n",
+ __func__, rnp1->grplo, rnp1->grphi, rnp1->qsmask, rnp1->qsmaskinit, rnp1->qsmaskinitnext);
+ pr_info("%s: ->gp_tasks %p ->boost_tasks %p ->exp_tasks %p\n",
+ __func__, rnp->gp_tasks, rnp->boost_tasks, rnp->exp_tasks);
+ pr_info("%s: ->blkd_tasks", __func__);
+ i = 0;
+ list_for_each(lhp, &rnp->blkd_tasks) {
+ pr_cont(" %p", lhp);
+ if (++i >= 10)
+ break;
+ }
+ pr_cont("\n");
+ for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++) {
+ rdp = per_cpu_ptr(rsp->rda, cpu);
+ onl = !!(rdp->grpmask & rcu_rnp_online_cpus(rnp));
+ pr_info("\t%d: %c online: %ld(%d) offline: %ld(%d)\n",
+ cpu, ".o"[onl],
+ (long)rdp->rcu_onl_gp_seq, rdp->rcu_onl_gp_flags,
+ (long)rdp->rcu_ofl_gp_seq, rdp->rcu_ofl_gp_flags);
+ }
+}
+
#else /* #ifdef CONFIG_PREEMPT_RCU */
static struct rcu_state *const rcu_state_p = &rcu_sched_state;
@@ -911,7 +972,8 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
* so there is no need to check for blocked tasks. So check only for
* bogus qsmask values.
*/
-static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+static void
+rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp)
{
WARN_ON_ONCE(rnp->qsmask);
}
@@ -949,6 +1011,15 @@ void exit_rcu(void)
{
}
+/*
+ * Dump the guaranteed-empty blocked-tasks state. Trust but verify.
+ */
+static void
+dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, int ncheck)
+{
+ WARN_ON_ONCE(!list_empty(&rnp->blkd_tasks));
+}
+
#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
#ifdef CONFIG_RCU_BOOST
@@ -1433,7 +1504,8 @@ static bool __maybe_unused rcu_try_advance_all_cbs(void)
* completed since we last checked and there are
* callbacks not yet ready to invoke.
*/
- if ((rdp->completed != rnp->completed ||
+ if ((rcu_seq_completed_gp(rdp->gp_seq,
+ rcu_seq_current(&rnp->gp_seq)) ||
unlikely(READ_ONCE(rdp->gpwrap))) &&
rcu_segcblist_pend_cbs(&rdp->cblist))
note_gp_changes(rsp, rdp);
@@ -1720,16 +1792,16 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu)
*/
touch_nmi_watchdog();
- if (rsp->gpnum == rdp->gpnum) {
+ ticks_value = rcu_seq_ctr(rsp->gp_seq - rdp->gp_seq);
+ if (ticks_value) {
+ ticks_title = "GPs behind";
+ } else {
ticks_title = "ticks this GP";
ticks_value = rdp->ticks_this_gp;
- } else {
- ticks_title = "GPs behind";
- ticks_value = rsp->gpnum - rdp->gpnum;
}
print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
- delta = rdp->mynode->gpnum - rdp->rcu_iw_gpnum;
- pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%ld softirq=%u/%u fqs=%ld %s\n",
+ delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
+ pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n",
cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)],
@@ -1817,7 +1889,7 @@ static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp)
{
- return &rnp->nocb_gp_wq[rnp->completed & 0x1];
+ return &rnp->nocb_gp_wq[rcu_seq_ctr(rnp->gp_seq) & 0x1];
}
static void rcu_init_one_nocb(struct rcu_node *rnp)
@@ -2069,12 +2141,17 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp)
bool needwake;
struct rcu_node *rnp = rdp->mynode;
- raw_spin_lock_irqsave_rcu_node(rnp, flags);
- c = rcu_cbs_completed(rdp->rsp, rnp);
- needwake = rcu_start_this_gp(rnp, rdp, c);
- raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
- if (needwake)
- rcu_gp_kthread_wake(rdp->rsp);
+ local_irq_save(flags);
+ c = rcu_seq_snap(&rdp->rsp->gp_seq);
+ if (!rdp->gpwrap && ULONG_CMP_GE(rdp->gp_seq_needed, c)) {
+ local_irq_restore(flags);
+ } else {
+ raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
+ needwake = rcu_start_this_gp(rnp, rdp, c);
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+ if (needwake)
+ rcu_gp_kthread_wake(rdp->rsp);
+ }
/*
* Wait for the grace period. Do so interruptibly to avoid messing
@@ -2083,8 +2160,8 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp)
trace_rcu_this_gp(rnp, rdp, c, TPS("StartWait"));
for (;;) {
swait_event_interruptible(
- rnp->nocb_gp_wq[c & 0x1],
- (d = ULONG_CMP_GE(READ_ONCE(rnp->completed), c)));
+ rnp->nocb_gp_wq[rcu_seq_ctr(c) & 0x1],
+ (d = rcu_seq_done(&rnp->gp_seq, c)));
if (likely(d))
break;
WARN_ON(signal_pending(current));
@@ -2569,23 +2646,6 @@ static bool init_nocb_callback_list(struct rcu_data *rdp)
#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
/*
- * An adaptive-ticks CPU can potentially execute in kernel mode for an
- * arbitrarily long period of time with the scheduling-clock tick turned
- * off. RCU will be paying attention to this CPU because it is in the
- * kernel, but the CPU cannot be guaranteed to be executing the RCU state
- * machine because the scheduling-clock tick has been disabled. Therefore,
- * if an adaptive-ticks CPU is failing to respond to the current grace
- * period and has not be idle from an RCU perspective, kick it.
- */
-static void __maybe_unused rcu_kick_nohz_cpu(int cpu)
-{
-#ifdef CONFIG_NO_HZ_FULL
- if (tick_nohz_full_cpu(cpu))
- smp_send_reschedule(cpu);
-#endif /* #ifdef CONFIG_NO_HZ_FULL */
-}
-
-/*
* Is this CPU a NO_HZ_FULL CPU that should ignore RCU so that the
* grace-period kthread will do force_quiescent_state() processing?
* The idea is to avoid waking up RCU core processing on such a
@@ -2610,8 +2670,6 @@ static bool rcu_nohz_full_cpu(struct rcu_state *rsp)
*/
static void rcu_bind_gp_kthread(void)
{
- int __maybe_unused cpu;
-
if (!tick_nohz_full_enabled())
return;
housekeeping_affine(current, HK_FLAG_RCU);
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 4c230a60ece4..39cb23d22109 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -507,14 +507,15 @@ early_initcall(check_cpu_stall_init);
#ifdef CONFIG_TASKS_RCU
/*
- * Simple variant of RCU whose quiescent states are voluntary context switch,
- * user-space execution, and idle. As such, grace periods can take one good
- * long time. There are no read-side primitives similar to rcu_read_lock()
- * and rcu_read_unlock() because this implementation is intended to get
- * the system into a safe state for some of the manipulations involved in
- * tracing and the like. Finally, this implementation does not support
- * high call_rcu_tasks() rates from multiple CPUs. If this is required,
- * per-CPU callback lists will be needed.
+ * Simple variant of RCU whose quiescent states are voluntary context
+ * switch, cond_resched_rcu_qs(), user-space execution, and idle.
+ * As such, grace periods can take one good long time. There are no
+ * read-side primitives similar to rcu_read_lock() and rcu_read_unlock()
+ * because this implementation is intended to get the system into a safe
+ * state for some of the manipulations involved in tracing and the like.
+ * Finally, this implementation does not support high call_rcu_tasks()
+ * rates from multiple CPUs. If this is required, per-CPU callback lists
+ * will be needed.
*/
/* Global list of callbacks and associated lock. */
@@ -542,11 +543,11 @@ static struct task_struct *rcu_tasks_kthread_ptr;
* period elapses, in other words after all currently executing RCU
* read-side critical sections have completed. call_rcu_tasks() assumes
* that the read-side critical sections end at a voluntary context
- * switch (not a preemption!), entry into idle, or transition to usermode
- * execution. As such, there are no read-side primitives analogous to
- * rcu_read_lock() and rcu_read_unlock() because this primitive is intended
- * to determine that all tasks have passed through a safe state, not so
- * much for data-strcuture synchronization.
+ * switch (not a preemption!), cond_resched_rcu_qs(), entry into idle,
+ * or transition to usermode execution. As such, there are no read-side
+ * primitives analogous to rcu_read_lock() and rcu_read_unlock() because
+ * this primitive is intended to determine that all tasks have passed
+ * through a safe state, not so much for data-strcuture synchronization.
*
* See the description of call_rcu() for more detailed information on
* memory ordering guarantees.
@@ -667,6 +668,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
struct rcu_head *list;
struct rcu_head *next;
LIST_HEAD(rcu_tasks_holdouts);
+ int fract;
/* Run on housekeeping CPUs by default. Sysadm can move if desired. */
housekeeping_affine(current, HK_FLAG_RCU);
@@ -748,13 +750,25 @@ static int __noreturn rcu_tasks_kthread(void *arg)
* holdouts. When the list is empty, we are done.
*/
lastreport = jiffies;
- while (!list_empty(&rcu_tasks_holdouts)) {
+
+ /* Start off with HZ/10 wait and slowly back off to 1 HZ wait*/
+ fract = 10;
+
+ for (;;) {
bool firstreport;
bool needreport;
int rtst;
struct task_struct *t1;
- schedule_timeout_interruptible(HZ);
+ if (list_empty(&rcu_tasks_holdouts))
+ break;
+
+ /* Slowly back off waiting for holdouts */
+ schedule_timeout_interruptible(HZ/fract);
+
+ if (fract > 1)
+ fract--;
+
rtst = READ_ONCE(rcu_task_stall_timeout);
needreport = rtst > 0 &&
time_after(jiffies, lastreport + rtst);
@@ -800,6 +814,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
list = next;
cond_resched();
}
+ /* Paranoid sleep to keep this from entering a tight loop */
schedule_timeout_uninterruptible(HZ/10);
}
}
diff --git a/kernel/torture.c b/kernel/torture.c
index 3de1efbecd6a..1ac24a826589 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -20,6 +20,9 @@
* Author: Paul E. McKenney <paulmck@us.ibm.com>
* Based on kernel/rcu/torture.c.
*/
+
+#define pr_fmt(fmt) fmt
+
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/init.h>
@@ -53,7 +56,7 @@ MODULE_LICENSE("GPL");
MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com>");
static char *torture_type;
-static bool verbose;
+static int verbose;
/* Mediate rmmod and system shutdown. Concurrent rmmod & shutdown illegal! */
#define FULLSTOP_DONTSTOP 0 /* Normal operation. */
@@ -98,7 +101,7 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
if (!cpu_online(cpu) || !cpu_is_hotpluggable(cpu))
return false;
- if (verbose)
+ if (verbose > 1)
pr_alert("%s" TORTURE_FLAG
"torture_onoff task: offlining %d\n",
torture_type, cpu);
@@ -111,7 +114,7 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
"torture_onoff task: offline %d failed: errno %d\n",
torture_type, cpu, ret);
} else {
- if (verbose)
+ if (verbose > 1)
pr_alert("%s" TORTURE_FLAG
"torture_onoff task: offlined %d\n",
torture_type, cpu);
@@ -147,7 +150,7 @@ bool torture_online(int cpu, long *n_onl_attempts, long *n_onl_successes,
if (cpu_online(cpu) || !cpu_is_hotpluggable(cpu))
return false;
- if (verbose)
+ if (verbose > 1)
pr_alert("%s" TORTURE_FLAG
"torture_onoff task: onlining %d\n",
torture_type, cpu);
@@ -160,7 +163,7 @@ bool torture_online(int cpu, long *n_onl_attempts, long *n_onl_successes,
"torture_onoff task: online %d failed: errno %d\n",
torture_type, cpu, ret);
} else {
- if (verbose)
+ if (verbose > 1)
pr_alert("%s" TORTURE_FLAG
"torture_onoff task: onlined %d\n",
torture_type, cpu);
@@ -647,7 +650,7 @@ static void torture_stutter_cleanup(void)
* The runnable parameter points to a flag that controls whether or not
* the test is currently runnable. If there is no such flag, pass in NULL.
*/
-bool torture_init_begin(char *ttype, bool v)
+bool torture_init_begin(char *ttype, int v)
{
mutex_lock(&fullstop_mutex);
if (torture_type != NULL) {
diff --git a/tools/testing/selftests/rcutorture/bin/configinit.sh b/tools/testing/selftests/rcutorture/bin/configinit.sh
index c15f270e121d..65541c21a544 100755
--- a/tools/testing/selftests/rcutorture/bin/configinit.sh
+++ b/tools/testing/selftests/rcutorture/bin/configinit.sh
@@ -1,6 +1,6 @@
#!/bin/bash
#
-# Usage: configinit.sh config-spec-file [ build output dir ]
+# Usage: configinit.sh config-spec-file build-output-dir results-dir
#
# Create a .config file from the spec file. Run from the kernel source tree.
# Exits with 0 if all went well, with 1 if all went well but the config
@@ -40,20 +40,18 @@ mkdir $T
c=$1
buildloc=$2
+resdir=$3
builddir=
-if test -n $buildloc
+if echo $buildloc | grep -q '^O='
then
- if echo $buildloc | grep -q '^O='
+ builddir=`echo $buildloc | sed -e 's/^O=//'`
+ if test ! -d $builddir
then
- builddir=`echo $buildloc | sed -e 's/^O=//'`
- if test ! -d $builddir
- then
- mkdir $builddir
- fi
- else
- echo Bad build directory: \"$buildloc\"
- exit 2
+ mkdir $builddir
fi
+else
+ echo Bad build directory: \"$buildloc\"
+ exit 2
fi
sed -e 's/^\(CONFIG[0-9A-Z_]*\)=.*$/grep -v "^# \1" |/' < $c > $T/u.sh
@@ -61,12 +59,12 @@ sed -e 's/^\(CONFIG[0-9A-Z_]*=\).*$/grep -v \1 |/' < $c >> $T/u.sh
grep '^grep' < $T/u.sh > $T/upd.sh
echo "cat - $c" >> $T/upd.sh
make mrproper
-make $buildloc distclean > $builddir/Make.distclean 2>&1
-make $buildloc $TORTURE_DEFCONFIG > $builddir/Make.defconfig.out 2>&1
+make $buildloc distclean > $resdir/Make.distclean 2>&1
+make $buildloc $TORTURE_DEFCONFIG > $resdir/Make.defconfig.out 2>&1
mv $builddir/.config $builddir/.config.sav
sh $T/upd.sh < $builddir/.config.sav > $builddir/.config
cp $builddir/.config $builddir/.config.new
-yes '' | make $buildloc oldconfig > $builddir/Make.oldconfig.out 2> $builddir/Make.oldconfig.err
+yes '' | make $buildloc oldconfig > $resdir/Make.oldconfig.out 2> $resdir/Make.oldconfig.err
# verify new config matches specification.
configcheck.sh $builddir/.config $c
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-build.sh b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
index 34d126734cde..9115fcdb5617 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-build.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
@@ -2,7 +2,7 @@
#
# Build a kvm-ready Linux kernel from the tree in the current directory.
#
-# Usage: kvm-build.sh config-template build-dir
+# Usage: kvm-build.sh config-template build-dir resdir
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@@ -29,6 +29,7 @@ then
exit 1
fi
builddir=${2}
+resdir=${3}
T=${TMPDIR-/tmp}/test-linux.sh.$$
trap 'rm -rf $T' 0
@@ -41,19 +42,19 @@ CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_CONSOLE=y
___EOF___
-configinit.sh $T/config O=$builddir
+configinit.sh $T/config O=$builddir $resdir
retval=$?
if test $retval -gt 1
then
exit 2
fi
ncpus=`cpus2use.sh`
-make O=$builddir -j$ncpus $TORTURE_KMAKE_ARG > $builddir/Make.out 2>&1
+make O=$builddir -j$ncpus $TORTURE_KMAKE_ARG > $resdir/Make.out 2>&1
retval=$?
-if test $retval -ne 0 || grep "rcu[^/]*": < $builddir/Make.out | egrep -q "Stop|Error|error:|warning:" || egrep -q "Stop|Error|error:" < $builddir/Make.out
+if test $retval -ne 0 || grep "rcu[^/]*": < $resdir/Make.out | egrep -q "Stop|Error|error:|warning:" || egrep -q "Stop|Error|error:" < $resdir/Make.out
then
echo Kernel build error
- egrep "Stop|Error|error:|warning:" < $builddir/Make.out
+ egrep "Stop|Error|error:|warning:" < $resdir/Make.out
echo Run aborted.
exit 3
fi
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
index 477ecb1293ab..0fa8a61ccb7b 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
@@ -70,4 +70,5 @@ else
else
print_warning $nclosecalls "Reader Batch close calls in" $(($dur/60)) minute run: $i
fi
+ echo $nclosecalls "Reader Batch close calls in" $(($dur/60)) minute run: $i > $i/console.log.rcu.diags
fi
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
index c27e97824163..c9bab57a77eb 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
@@ -39,6 +39,7 @@ do
head -1 $resdir/log
fi
TORTURE_SUITE="`cat $i/../TORTURE_SUITE`"
+ rm -f $i/console.log.*.diags
kvm-recheck-${TORTURE_SUITE}.sh $i
if test -f "$i/console.log"
then
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
index c5b0f94341d9..f7247ee00514 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
@@ -98,14 +98,15 @@ then
ln -s $base_resdir/.config $resdir # for kvm-recheck.sh
# Arch-independent indicator
touch $resdir/builtkernel
-elif kvm-build.sh $T/Kc2 $builddir
+elif kvm-build.sh $T/Kc2 $builddir $resdir
then
# Had to build a kernel for this test.
QEMU="`identify_qemu $builddir/vmlinux`"
BOOT_IMAGE="`identify_boot_image $QEMU`"
- cp $builddir/Make*.out $resdir
cp $builddir/vmlinux $resdir
cp $builddir/.config $resdir
+ cp $builddir/Module.symvers $resdir > /dev/null || :
+ cp $builddir/System.map $resdir > /dev/null || :
if test -n "$BOOT_IMAGE"
then
cp $builddir/$BOOT_IMAGE $resdir
diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
index 56610dbbdf73..5a7a62d76a50 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
@@ -347,7 +347,7 @@ function dump(first, pastlast, batchnum)
print "needqemurun="
jn=1
for (j = first; j < pastlast; j++) {
- builddir=KVM "/b" jn
+ builddir=KVM "/b1"
cpusr[jn] = cpus[j];
if (cfrep[cf[j]] == "") {
cfr[jn] = cf[j];
diff --git a/tools/testing/selftests/rcutorture/bin/parse-console.sh b/tools/testing/selftests/rcutorture/bin/parse-console.sh
index 17293436f551..84933f6aed77 100755
--- a/tools/testing/selftests/rcutorture/bin/parse-console.sh
+++ b/tools/testing/selftests/rcutorture/bin/parse-console.sh
@@ -163,6 +163,13 @@ then
print_warning Summary: $summary
cat $T.diags >> $file.diags
fi
+for i in $file.*.diags
+do
+ if test -f "$i"
+ then
+ cat $i >> $file.diags
+ fi
+done
if ! test -s $file.diags
then
rm -f $file.diags
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot
index 5d2cc0bd50a0..5c3213cc3ad7 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot
@@ -1,5 +1,5 @@
-rcutorture.onoff_interval=1 rcutorture.onoff_holdoff=30
-rcutree.gp_preinit_delay=3
+rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30
+rcutree.gp_preinit_delay=12
rcutree.gp_init_delay=3
rcutree.gp_cleanup_delay=3
rcutree.kthread_prio=2
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T.boot
deleted file mode 100644
index 883149b5f2d1..000000000000
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE08-T.boot
+++ /dev/null
@@ -1 +0,0 @@
-rcutree.rcu_fanout_exact=1
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh b/tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh
index 24ec91041957..7bab8246392b 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh
+++ b/tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh
@@ -39,7 +39,7 @@ rcutorture_param_onoff () {
if ! bootparam_hotplug_cpu "$1" && configfrag_hotplug_cpu "$2"
then
echo CPU-hotplug kernel, adding rcutorture onoff. 1>&2
- echo rcutorture.onoff_interval=3 rcutorture.onoff_holdoff=30
+ echo rcutorture.onoff_interval=1000 rcutorture.onoff_holdoff=30
fi
}