diff options
Diffstat (limited to 'Documentation/RCU/Design')
-rw-r--r-- | Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst | 35 | ||||
-rw-r--r-- | Documentation/RCU/Design/Requirements/Requirements.rst | 8 |
2 files changed, 37 insertions, 6 deletions
diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst index a648b423ba0e..eeb351296df1 100644 --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst @@ -21,7 +21,7 @@ Any code that happens after the end of a given RCU grace period is guaranteed to see the effects of all accesses prior to the beginning of that grace period that are within RCU read-side critical sections. Similarly, any code that happens before the beginning of a given RCU grace -period is guaranteed to see the effects of all accesses following the end +period is guaranteed to not see the effects of all accesses following the end of that grace period that are within RCU read-side critical sections. Note well that RCU-sched read-side critical sections include any region @@ -112,6 +112,35 @@ on PowerPC. The ``smp_mb__after_unlock_lock()`` invocations prevent this ``WARN_ON()`` from triggering. ++-----------------------------------------------------------------------+ +| **Quick Quiz**: | ++-----------------------------------------------------------------------+ +| But the chain of rcu_node-structure lock acquisitions guarantees | +| that new readers will see all of the updater's pre-grace-period | +| accesses and also guarantees that the updater's post-grace-period | +| accesses will see all of the old reader's accesses. So why do we | +| need all of those calls to smp_mb__after_unlock_lock()? | ++-----------------------------------------------------------------------+ +| **Answer**: | ++-----------------------------------------------------------------------+ +| Because we must provide ordering for RCU's polling grace-period | +| primitives, for example, get_state_synchronize_rcu() and | +| poll_state_synchronize_rcu(). Consider this code:: | +| | +| CPU 0 CPU 1 | +| ---- ---- | +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) | +| g = get_state_synchronize_rcu() smp_mb() | +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) | +| continue; | +| r0 = READ_ONCE(Y) | +| | +| RCU guarantees that the outcome r0 == 0 && r1 == 0 will not | +| happen, even if CPU 1 is in an RCU extended quiescent state | +| (idle or offline) and thus won't interact directly with the RCU | +| core processing at all. | ++-----------------------------------------------------------------------+ + This approach must be extended to include idle CPUs, which need RCU's grace-period memory ordering guarantee to extend to any RCU read-side critical sections preceding and following the current @@ -339,14 +368,14 @@ The diagram below shows the path of ordering if the leftmost leftmost ``rcu_node`` structure offlines its last CPU and if the next ``rcu_node`` structure has no online CPUs). -.. kernel-figure:: TreeRCU-gp-init-1.svg +.. kernel-figure:: TreeRCU-gp-init-2.svg The final ``rcu_gp_init()`` pass through the ``rcu_node`` tree traverses breadth-first, setting each ``rcu_node`` structure's ``->gp_seq`` field to the newly advanced value from the ``rcu_state`` structure, as shown in the following diagram. -.. kernel-figure:: TreeRCU-gp-init-1.svg +.. kernel-figure:: TreeRCU-gp-init-3.svg This change will also cause each CPU's next call to ``__note_gp_changes()`` to notice that a new grace period has started, diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst index 38a39476fc24..45278e2974c0 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -362,9 +362,8 @@ do_something_gp() uses rcu_dereference() to fetch from ``gp``: 12 } The rcu_dereference() uses volatile casts and (for DEC Alpha) memory -barriers in the Linux kernel. Should a `high-quality implementation of -C11 ``memory_order_consume`` -[PDF] <http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf>`__ +barriers in the Linux kernel. Should a |high-quality implementation of +C11 memory_order_consume [PDF]|_ ever appear, then rcu_dereference() could be implemented as a ``memory_order_consume`` load. Regardless of the exact implementation, a pointer fetched by rcu_dereference() may not be used outside of the @@ -374,6 +373,9 @@ element has been passed from RCU to some other synchronization mechanism, most commonly locking or `reference counting <https://www.kernel.org/doc/Documentation/RCU/rcuref.txt>`__. +.. |high-quality implementation of C11 memory_order_consume [PDF]| replace:: high-quality implementation of C11 ``memory_order_consume`` [PDF] +.. _high-quality implementation of C11 memory_order_consume [PDF]: http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf + In short, updaters use rcu_assign_pointer() and readers use rcu_dereference(), and these two RCU API elements work together to ensure that readers have a consistent view of newly added data elements. |