Reorder relaxed atomic operations on the same object


http://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync

Assuming x is initially 0:

-Thread 1-
x.store (1, memory_order_relaxed)
x.store (2, memory_order_relaxed)

-Thread 2-
y = x.load (memory_order_relaxed)
z = x.load (memory_order_relaxed)
assert (y <= z)

The assert cannot fail.

I don't understand why the two loads cannot be reordered so that z is read before y, which could give z = 1 and y = 2 and make y <= z fail.


There are 3 answers

Answer by Peter Cordes (best answer)

Operations on the same object by the same thread cannot be reordered; the modification order of each object, taken separately, is some interleaving of program order across threads.

Real hardware does that for free, so C++ makes that guarantee available for free. (It also doesn't allow compile-time reordering, which could leave the long-term value wrong once the dust settles.)

If the later store were temporarily visible first, the value would have to change twice to end up with the right final value. Again, real hardware isn't that weak / crazy, thanks to cache coherency. See http://eel.is/c++draft/intro.races#19
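As a rough illustration, the question's example can be turned into a small stress test. This is only a sketch: `count_violations` and its `trials` parameter are hypothetical names that do not appear in the original post. A clean run proves nothing by itself, because the guarantee comes from the standard's coherence rules, not from testing.

```cpp
#include <atomic>
#include <cassert>  // for the assert in the usage note below
#include <thread>

// Hypothetical helper (not from the original post): runs the question's
// two-thread example `trials` times and counts how often y > z is observed.
int count_violations(int trials) {
    int violations = 0;
    for (int i = 0; i < trials; ++i) {
        std::atomic<int> x{0};
        int y = 0;
        int z = 0;

        std::thread writer([&x] {
            x.store(1, std::memory_order_relaxed);
            x.store(2, std::memory_order_relaxed);
        });
        std::thread reader([&x, &y, &z] {
            y = x.load(std::memory_order_relaxed);  // sequenced before the next load
            z = x.load(std::memory_order_relaxed);
        });

        writer.join();
        reader.join();  // join() synchronizes, so reading y and z here is race-free

        if (y > z) {
            ++violations;
        }
    }
    return violations;
}
```

Calling `assert(count_violations(1000) == 0);` from a test or from `main` should never fire on a conforming implementation.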

Answer by Jan Schultke

Standard Perspective

From a C++ standard perspective, the reason is that every atomic object has a modification order, which is a total order in which all modifications to that object occur. The two statements ...

x.store (1, memory_order_relaxed)
x.store (2, memory_order_relaxed)

... are sequenced. Namely, store(1) is sequenced before store(2), and the two load()s on Thread 2 are sequenced in the same way. They are sequenced because they are in separate statements, and std::memory_order::relaxed doesn't affect sequencing within one thread.

The modification order consists of at least:

  1. initialization of x
  2. x.store(1)
  3. x.store(2)

std::memory_order_relaxed doesn't break these rules. It doesn't allow another thread to "time travel" and first see 3., forget about it, and then see 2.

If a value computation A of an atomic object M happens before a value computation B of M, and A takes its value from a side effect X on M, then the value computed by B shall either be the value stored by X or the value stored by a side effect Y on M, where Y follows X in the modification order of M.

- [intro.races] p16

In your case, the first load() happens before the second load(), so the second load() must take its value either from the same side effect as the first load() or from a side effect that is later in the modification order.

std::memory_order_relaxed means that each x.load() on Thread 2 may see any of 1., 2., or 3., and it may skip some of them entirely, but once Thread 2 has seen 3. it cannot ignore it and go back to an earlier value.
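To make the rule concrete, here is a small model that lists every (y, z) pair the two loads may produce. It is an illustration only: `kModificationOrder` and `allowed_outcomes` are hypothetical names, and the model assumes the modification order is exactly the three entries listed above.

```cpp
#include <cassert>  // for the assert in the usage note below
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model (not a standard API): the values x takes, in
// modification order: initialization to 0, then store(1), then store(2).
const std::vector<int> kModificationOrder = {0, 1, 2};

// Enumerates every (y, z) pair the two sequenced relaxed loads may observe.
// The second load reads from the same side effect as the first, or from one
// that is later in the modification order; it never reads an earlier one.
std::vector<std::pair<int, int>> allowed_outcomes() {
    std::vector<std::pair<int, int>> outcomes;
    for (std::size_t first = 0; first < kModificationOrder.size(); ++first) {
        for (std::size_t second = first; second < kModificationOrder.size(); ++second) {
            outcomes.emplace_back(kModificationOrder[first], kModificationOrder[second]);
        }
    }
    return outcomes;
}
```

Under this model, `allowed_outcomes()` yields (0,0), (0,1), (0,2), (1,1), (1,2) and (2,2). The pair y = 2, z = 1 from the question is not among them, so a check such as `assert(p.first <= p.second)` holds for every pair `p`.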

Hardware Perspective

As @PeterCordes has already pointed out, even though the two stores can't be reordered by the compiler, hardware reordering can still take place. However, any reordering that is done by the CPU cannot change the semantics of the assembly that the compiler has emitted.

Answer by Brian Bi

A C++ compiler is not permitted to reorder operations in a way that would change the observable behaviour of a program.

However, when using atomic variables, you may observe results that appear to be caused by "reordering". What is actually happening is that different threads need not see side effects become visible in the same order. This is illustrated by the following well-known example:

#include <atomic>
#include <cassert>

std::atomic<int> x{0};
std::atomic<int> y{0};
// thread 1
x.store(1, std::memory_order_relaxed);
y.store(1, std::memory_order_relaxed);
// thread 2
if (y.load(std::memory_order_relaxed) == 1) {
    assert(x.load(std::memory_order_relaxed) == 1);  // can fail
}

Thread 2 might observe the result of the side effect on y (i.e. that its value has been changed to 1) yet still observe x as if the side effect on x had not happened yet. The assertion may fail for this reason, not because the two stores somehow got reordered.

Again, when two different atomic variables are written to, then two threads might not see the results of those two writes become visible in the same order.
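For anyone who wants to try this, a self-contained harness might look like the sketch below. `count_stale_x` and `trials` are hypothetical names, not part of the original answer. The count is not deterministic and may well be zero on a given run: x86 hardware, for instance, does not reorder stores with other stores or loads with other loads, so there a stale read can only come from the compiler reordering the relaxed operations.

```cpp
#include <atomic>
#include <cassert>  // for the assert in the usage note below
#include <thread>

// Hypothetical helper: counts the trials in which the reader saw y == 1
// but did not yet see x == 1. The standard permits this with relaxed ordering.
int count_stale_x(int trials) {
    int stale = 0;
    for (int i = 0; i < trials; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        bool saw_stale = false;

        std::thread writer([&x, &y] {
            x.store(1, std::memory_order_relaxed);
            y.store(1, std::memory_order_relaxed);
        });
        std::thread reader([&x, &y, &saw_stale] {
            if (y.load(std::memory_order_relaxed) == 1) {
                // Relaxed operations give no ordering across different objects,
                // so this load is allowed to return 0.
                saw_stale = (x.load(std::memory_order_relaxed) != 1);
            }
        });

        writer.join();
        reader.join();  // join() synchronizes, so reading saw_stale is race-free

        if (saw_stale) {
            ++stale;
        }
    }
    return stale;
}
```

The only thing that can be asserted portably is the range of the result, for example `assert(count_stale_x(500) <= 500);`. Any value from 0 up to the number of trials is a conforming outcome.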

However, once a write to one single atomic variable has become visible to a particular thread, any subsequent reads from that variable by that thread will also behave as if that write is visible, meaning that all previous writes to that variable have been superseded and the old value will not be observed. In the OP's example, if y is read as 2, then the side effect that changed x's value from 1 to 2 has already become visible to thread 2, so the later load into z cannot return 1.