Friday, 20 September 2019

Relaxed atomics and the reorder buffer

Modern processors execute instructions out of order, as per Tomasulo's algorithm, and the purpose of the reorder buffer is to ensure that instruction results are written out in program order via its ISSUE and COMMIT pointers. So if I have instructions X, Y, Z, the actual execution order could be Y, Z, X, but the order in which results are written back (committed/retired) is X, Y, Z.
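To make the X, Y, Z example concrete, here is a minimal toy model of in-order retirement. It is a sketch under simplifying assumptions, not how any real core is built: every instruction is issued at cycle 0, there are no dependencies, and the names `Instr`, `Trace` and `simulate_rob` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model: an instruction is just a name plus a latency.
struct Instr {
    std::string name;
    int latency;  // cycles until the result is ready
};

struct Trace {
    std::vector<std::string> completed;  // order in which execution finished
    std::vector<std::string> committed;  // order in which results were retired
};

Trace simulate_rob(const std::vector<Instr>& program) {
    Trace trace;
    std::vector<bool> done(program.size(), false);
    std::size_t head = 0;  // oldest entry not yet retired (the COMMIT pointer)

    for (int cycle = 1; head < program.size(); ++cycle) {
        // Assumption: everything was issued at cycle 0, so an instruction
        // finishes as soon as its latency has elapsed.
        for (std::size_t i = 0; i < program.size(); ++i) {
            if (!done[i] && program[i].latency <= cycle) {
                done[i] = true;
                trace.completed.push_back(program[i].name);
            }
        }
        // Retire only from the head, so commit order equals program order.
        while (head < program.size() && done[head]) {
            trace.committed.push_back(program[head].name);
            ++head;
        }
    }
    assert(trace.committed.size() == program.size());
    return trace;
}
```

Calling `simulate_rob({{"X", 3}, {"Y", 1}, {"Z", 2}})` gives a `completed` order of Y, Z, X and a `committed` order of X, Y, Z, which is the situation described above.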

Now, I am confused as to how this ties in with C++11 relaxed atomics. First, Wikipedia tells me that (emphasis mine):

Memory ordering describes the order of accesses to computer memory by a CPU. The term can refer either to the memory ordering generated by the compiler during compile time, or to the memory ordering generated by a CPU during runtime.

...and that with relaxed consistency:

Loads can be reordered after loads (for better working of cache coherency, better scaling)

Loads can be reordered after stores

Stores can be reordered after stores

Stores can be reordered after loads
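At the C++ level, the store-then-load case in this list corresponds to the classic "store buffering" litmus test. The following is a minimal sketch with a hypothetical helper name (`store_buffering_once`). With `std::memory_order_relaxed`, the C++ memory model permits both loads to return 0; if all four operations used `std::memory_order_seq_cst`, that outcome would be forbidden. Whether the (0, 0) outcome is actually observed on a given run depends on the hardware and on timing, so the code only checks that each result is one of the values that were stored.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical helper: runs the store-buffering test once and returns (r1, r2).
std::pair<int, int> store_buffering_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);
        // Relaxed ordering does not require this load to observe the other
        // thread's store, even if that store was issued earlier in time.
        r1 = y.load(std::memory_order_relaxed);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_relaxed);
        r2 = x.load(std::memory_order_relaxed);
    });
    t1.join();
    t2.join();

    // Each load sees either the initial value or the stored value.
    assert((r1 == 0 || r1 == 1) && (r2 == 0 || r2 == 1));
    return {r1, r2};
}
```

Running `store_buffering_once()` many times in a loop and counting the (0, 0) results is one way to see whether a particular machine exhibits the reordering.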

So I don't understand what relaxed atomics are actually giving me, since any instruction can be executed out-of-order, but always has to be "committed/retired" in order as per the reorder buffer.

ASSUMPTION: Surely relaxed consistency cannot allow the reorder buffer to commit/retire instructions in an order other than the one the assembly dictates (otherwise, why have the reorder buffer?). So the only thing I can think of, which I have read numerous times (and reasonably understand), is that it affects the order in which other cores "see" writes to memory locations. As a simple example, a relaxed write to variable X on Core1 might not be immediately visible to Core2, whereas an acquire/release or sequentially consistent write would be.
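For reference, this is the kind of code where the choice of memory order matters at the C++ level: a minimal message-passing sketch, with hypothetical names (`payload`, `ready`, `message_passing_demo`). Note that what the C++ standard specifies for these orders is which operations are ordered relative to the atomic access, not how quickly a store reaches other cores. Here the release store paired with the acquire load guarantees that the consumer reads 42 from `payload`. If both were `std::memory_order_relaxed`, the read of the non-atomic `payload` would be a data race.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: publish a plain int through an atomic flag.
int message_passing_demo() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // release: the write to payload is ordered before this store for any
        // thread whose acquire load reads `true` from `ready`.
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        // acquire: once this load reads `true`, the payload write is visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });
    producer.join();
    consumer.join();

    // Holds because of the release/acquire pairing above.
    assert(seen == 42);
    return seen;
}
```

The function returns the value the consumer observed, so calling `message_passing_demo()` from `main` and printing the result is enough to try it out.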


1) Can someone clarify if my assumption here is correct?

2) When a "store word" instruction is retired by the reorder buffer, what is the difference at the hardware level between a store that is "relaxed" and one that is "sequentially consistent"? Surely it must have something to do with the caches? Something like "push this value out to other caches when you feel like it" as opposed to "push this out to all other caches now" (or is that wrong?)


I have searched extensively for information on this before posting my question, FYI.
