atomic: remove all traces of READ_ONCE_CTRL() and atomic*_read_ctrl()

This seems to be a mis-reading of how alpha memory ordering works, and
is not backed up by the alpha architecture manual.  The helper functions
don't do anything special on any other architectures, and the arguments
that support them being safe on other architectures also argue that they
are safe on alpha.

Basically, the "control dependency" is between a previous read and a
subsequent write that is dependent on the value read.  Even if the
subsequent write is actually done speculatively, there is no way that
such a speculative write could be made visible to other cpu's until it
has been committed, which requires validating the speculation.

Note that most weakely ordered architectures (very much including alpha)
do not guarantee any ordering relationship between two loads that depend
on each other on a control dependency:

    read A
    if (val == 1)
        read B

because the conditional may be predicted, and the "read B" may be
speculatively moved up to before reading the value A.  So we require the
user to insert a smp_rmb() between the two accesses to be correct:

    read A;
    if (A == 1)
        smp_rmb()
        read B

Alpha is further special in that it can break that ordering even if the
*address* of B depends on the read of A, because the cacheline that is
read later may be stale unless you have a memory barrier in between the
pointer read and the read of the value behind a pointer:

    read ptr
    read offset(ptr)

whereas all other weakly ordered architectures guarantee that the data
dependency (as opposed to just a control dependency) will order the two
accesses.  As a result, alpha needs a "smp_read_barrier_depends()" in
between those two reads for them to be ordered.

The coontrol dependency that "READ_ONCE_CTRL()" and "atomic_read_ctrl()"
had was a control dependency to a subsequent *write*, however, and
nobody can finalize such a subsequent write without having actually done
the read.  And were you to write such a value to a "stale" cacheline
(the way the unordered reads came to be), that would seem to lose the
write entirely.

So the things that make alpha able to re-order reads even more
aggressively than other weak architectures do not seem to be relevant
for a subsequent write.  Alpha memory ordering may be strange, but
there's no real indication that it is *that* strange.

Also, the alpha architecture reference manual very explicitly talks
about the definition of "Dependence Constraints" in section 5.6.1.7,
where a preceding read dominates a subsequent write.

Such a dependence constraint admittedly does not impose a BEFORE (alpha
architecture term for globally visible ordering), but it does guarantee
that there can be no "causal loop".  I don't see how you could avoid
such a loop if another cpu could see the stored value and then impact
the value of the first read.  Put another way: the read and the write
could not be seen as being out of order wrt other cpus.

So I do not see how these "x_ctrl()" functions can currently be necessary.

I may have to eat my words at some point, but in the absense of clear
proof that alpha actually needs this, or indeed even an explanation of
how alpha could _possibly_ need it, I do not believe these functions are
called for.

And if it turns out that alpha really _does_ need a barrier for this
case, that barrier still should not be "smp_read_barrier_depends()".
We'd have to make up some new speciality barrier just for alpha, along
with the documentation for why it really is necessary.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul E McKenney <paulmck@us.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
Linus Torvalds 2015-11-03 17:22:17 -08:00
parent d63a978865
commit 105ff3cbf2
5 changed files with 23 additions and 68 deletions

View file

@ -617,16 +617,16 @@ case what's actually required is:
However, stores are not speculated. This means that ordering -is- provided However, stores are not speculated. This means that ordering -is- provided
for load-store control dependencies, as in the following example: for load-store control dependencies, as in the following example:
q = READ_ONCE_CTRL(a); q = READ_ONCE(a);
if (q) { if (q) {
WRITE_ONCE(b, p); WRITE_ONCE(b, p);
} }
Control dependencies pair normally with other types of barriers. That Control dependencies pair normally with other types of barriers. That
said, please note that READ_ONCE_CTRL() is not optional! Without the said, please note that READ_ONCE() is not optional! Without the
READ_ONCE_CTRL(), the compiler might combine the load from 'a' with READ_ONCE(), the compiler might combine the load from 'a' with other
other loads from 'a', and the store to 'b' with other stores to 'b', loads from 'a', and the store to 'b' with other stores to 'b', with
with possible highly counterintuitive effects on ordering. possible highly counterintuitive effects on ordering.
Worse yet, if the compiler is able to prove (say) that the value of Worse yet, if the compiler is able to prove (say) that the value of
variable 'a' is always non-zero, it would be well within its rights variable 'a' is always non-zero, it would be well within its rights
@ -636,16 +636,12 @@ as follows:
q = a; q = a;
b = p; /* BUG: Compiler and CPU can both reorder!!! */ b = p; /* BUG: Compiler and CPU can both reorder!!! */
Finally, the READ_ONCE_CTRL() includes an smp_read_barrier_depends() So don't leave out the READ_ONCE().
that DEC Alpha needs in order to respect control depedencies. Alternatively
use one of atomic{,64}_read_ctrl().
So don't leave out the READ_ONCE_CTRL().
It is tempting to try to enforce ordering on identical stores on both It is tempting to try to enforce ordering on identical stores on both
branches of the "if" statement as follows: branches of the "if" statement as follows:
q = READ_ONCE_CTRL(a); q = READ_ONCE(a);
if (q) { if (q) {
barrier(); barrier();
WRITE_ONCE(b, p); WRITE_ONCE(b, p);
@ -659,7 +655,7 @@ branches of the "if" statement as follows:
Unfortunately, current compilers will transform this as follows at high Unfortunately, current compilers will transform this as follows at high
optimization levels: optimization levels:
q = READ_ONCE_CTRL(a); q = READ_ONCE(a);
barrier(); barrier();
WRITE_ONCE(b, p); /* BUG: No ordering vs. load from a!!! */ WRITE_ONCE(b, p); /* BUG: No ordering vs. load from a!!! */
if (q) { if (q) {
@ -689,7 +685,7 @@ memory barriers, for example, smp_store_release():
In contrast, without explicit memory barriers, two-legged-if control In contrast, without explicit memory barriers, two-legged-if control
ordering is guaranteed only when the stores differ, for example: ordering is guaranteed only when the stores differ, for example:
q = READ_ONCE_CTRL(a); q = READ_ONCE(a);
if (q) { if (q) {
WRITE_ONCE(b, p); WRITE_ONCE(b, p);
do_something(); do_something();
@ -698,14 +694,14 @@ ordering is guaranteed only when the stores differ, for example:
do_something_else(); do_something_else();
} }
The initial READ_ONCE_CTRL() is still required to prevent the compiler The initial READ_ONCE() is still required to prevent the compiler from
from proving the value of 'a'. proving the value of 'a'.
In addition, you need to be careful what you do with the local variable 'q', In addition, you need to be careful what you do with the local variable 'q',
otherwise the compiler might be able to guess the value and again remove otherwise the compiler might be able to guess the value and again remove
the needed conditional. For example: the needed conditional. For example:
q = READ_ONCE_CTRL(a); q = READ_ONCE(a);
if (q % MAX) { if (q % MAX) {
WRITE_ONCE(b, p); WRITE_ONCE(b, p);
do_something(); do_something();
@ -718,7 +714,7 @@ If MAX is defined to be 1, then the compiler knows that (q % MAX) is
equal to zero, in which case the compiler is within its rights to equal to zero, in which case the compiler is within its rights to
transform the above code into the following: transform the above code into the following:
q = READ_ONCE_CTRL(a); q = READ_ONCE(a);
WRITE_ONCE(b, p); WRITE_ONCE(b, p);
do_something_else(); do_something_else();
@ -729,7 +725,7 @@ is gone, and the barrier won't bring it back. Therefore, if you are
relying on this ordering, you should make sure that MAX is greater than relying on this ordering, you should make sure that MAX is greater than
one, perhaps as follows: one, perhaps as follows:
q = READ_ONCE_CTRL(a); q = READ_ONCE(a);
BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */ BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
if (q % MAX) { if (q % MAX) {
WRITE_ONCE(b, p); WRITE_ONCE(b, p);
@ -746,7 +742,7 @@ of the 'if' statement.
You must also be careful not to rely too much on boolean short-circuit You must also be careful not to rely too much on boolean short-circuit
evaluation. Consider this example: evaluation. Consider this example:
q = READ_ONCE_CTRL(a); q = READ_ONCE(a);
if (q || 1 > 0) if (q || 1 > 0)
WRITE_ONCE(b, 1); WRITE_ONCE(b, 1);
@ -754,7 +750,7 @@ Because the first condition cannot fault and the second condition is
always true, the compiler can transform this example as following, always true, the compiler can transform this example as following,
defeating control dependency: defeating control dependency:
q = READ_ONCE_CTRL(a); q = READ_ONCE(a);
WRITE_ONCE(b, 1); WRITE_ONCE(b, 1);
This example underscores the need to ensure that the compiler cannot This example underscores the need to ensure that the compiler cannot
@ -768,7 +764,7 @@ x and y both being zero:
CPU 0 CPU 1 CPU 0 CPU 1
======================= ======================= ======================= =======================
r1 = READ_ONCE_CTRL(x); r2 = READ_ONCE_CTRL(y); r1 = READ_ONCE(x); r2 = READ_ONCE(y);
if (r1 > 0) if (r2 > 0) if (r1 > 0) if (r2 > 0)
WRITE_ONCE(y, 1); WRITE_ONCE(x, 1); WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
@ -797,11 +793,6 @@ site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html.
In summary: In summary:
(*) Control dependencies must be headed by READ_ONCE_CTRL(),
atomic{,64}_read_ctrl(). Or, as a much less preferable alternative,
interpose smp_read_barrier_depends() between a READ_ONCE() and the
control-dependent write.
(*) Control dependencies can order prior loads against later stores. (*) Control dependencies can order prior loads against later stores.
However, they do -not- guarantee any other sort of ordering: However, they do -not- guarantee any other sort of ordering:
Not prior loads against later loads, nor prior stores against Not prior loads against later loads, nor prior stores against
@ -817,14 +808,13 @@ In summary:
between the prior load and the subsequent store, and this between the prior load and the subsequent store, and this
conditional must involve the prior load. If the compiler is able conditional must involve the prior load. If the compiler is able
to optimize the conditional away, it will have also optimized to optimize the conditional away, it will have also optimized
away the ordering. Careful use of READ_ONCE_CTRL() READ_ONCE(), away the ordering. Careful use of READ_ONCE() and WRITE_ONCE()
and WRITE_ONCE() can help to preserve the needed conditional. can help to preserve the needed conditional.
(*) Control dependencies require that the compiler avoid reordering the (*) Control dependencies require that the compiler avoid reordering the
dependency into nonexistence. Careful use of READ_ONCE_CTRL(), dependency into nonexistence. Careful use of READ_ONCE() or
atomic{,64}_read_ctrl() or smp_read_barrier_depends() can help to atomic{,64}_read() can help to preserve your control dependency.
preserve your control dependency. Please see the Compiler Barrier Please see the Compiler Barrier section for more information.
section for more information.
(*) Control dependencies pair normally with other types of barriers. (*) Control dependencies pair normally with other types of barriers.

View file

@ -43,7 +43,6 @@ static inline long atomic_long_read##mo(const atomic_long_t *l) \
} }
ATOMIC_LONG_READ_OP() ATOMIC_LONG_READ_OP()
ATOMIC_LONG_READ_OP(_acquire) ATOMIC_LONG_READ_OP(_acquire)
ATOMIC_LONG_READ_OP(_ctrl)
#undef ATOMIC_LONG_READ_OP #undef ATOMIC_LONG_READ_OP

View file

@ -4,15 +4,6 @@
#include <asm/atomic.h> #include <asm/atomic.h>
#include <asm/barrier.h> #include <asm/barrier.h>
#ifndef atomic_read_ctrl
static inline int atomic_read_ctrl(const atomic_t *v)
{
int val = atomic_read(v);
smp_read_barrier_depends(); /* Enforce control dependency. */
return val;
}
#endif
/* /*
* Relaxed variants of xchg, cmpxchg and some atomic operations. * Relaxed variants of xchg, cmpxchg and some atomic operations.
* *
@ -561,15 +552,6 @@ static inline int atomic_dec_if_positive(atomic_t *v)
#include <asm-generic/atomic64.h> #include <asm-generic/atomic64.h>
#endif #endif
#ifndef atomic64_read_ctrl
static inline long long atomic64_read_ctrl(const atomic64_t *v)
{
long long val = atomic64_read(v);
smp_read_barrier_depends(); /* Enforce control dependency. */
return val;
}
#endif
#ifndef atomic64_andnot #ifndef atomic64_andnot
static inline void atomic64_andnot(long long i, atomic64_t *v) static inline void atomic64_andnot(long long i, atomic64_t *v)
{ {

View file

@ -299,22 +299,6 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
__u.__val; \ __u.__val; \
}) })
/**
* READ_ONCE_CTRL - Read a value heading a control dependency
* @x: The value to be read, heading the control dependency
*
* Control dependencies are tricky. See Documentation/memory-barriers.txt
* for important information on how to use them. Note that in many cases,
* use of smp_load_acquire() will be much simpler. Control dependencies
* should be avoided except on the hottest of hotpaths.
*/
#define READ_ONCE_CTRL(x) \
({ \
typeof(x) __val = READ_ONCE(x); \
smp_read_barrier_depends(); /* Enforce control dependency. */ \
__val; \
})
#endif /* __KERNEL__ */ #endif /* __KERNEL__ */
#endif /* __ASSEMBLY__ */ #endif /* __ASSEMBLY__ */

View file

@ -141,7 +141,7 @@ int perf_output_begin(struct perf_output_handle *handle,
perf_output_get_handle(handle); perf_output_get_handle(handle);
do { do {
tail = READ_ONCE_CTRL(rb->user_page->data_tail); tail = READ_ONCE(rb->user_page->data_tail);
offset = head = local_read(&rb->head); offset = head = local_read(&rb->head);
if (!rb->overwrite && if (!rb->overwrite &&
unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size)) unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))