Regarding memory_order_relaxed, it is often written that this ordering provides no synchronization. In practice, however, every write to an atomic appears to propagate immediately to other threads, even with memory_order_relaxed. In fact, the example given on https://en.cppreference.com/w/cpp/atomic/memory_order seems to depend entirely on this:
#include <atomic>
#include <iostream>
#include <thread>
#include <vector>

std::atomic<int> cnt = {0};

void f()
{
    for (int n = 0; n < 1000; ++n) {
        cnt.fetch_add(1, std::memory_order_relaxed);
    }
}

int main()
{
    std::vector<std::thread> v;
    for (int n = 0; n < 10; ++n) {
        v.emplace_back(f);
    }
    for (auto& t : v) {
        t.join();
    }
    std::cout << "Final counter value is " << cnt << '\n';
}
The final value of cnt is always 10 000. The atomicity of fetch_add alone does not seem sufficient to guarantee this unless the value is synchronized between threads before each fetch_add: if a writing thread had not synchronized with the thread that wrote the latest value, the two modifications could be applied to the same value, resulting in cnt < 10 000 at the end. It appears that every atomic write is immediately shared and made visible to other threads.
So my questions are:
1] Is this immediate visibility/synchronization guaranteed by the C++ standard?
2] Is there any synchronization happening before calls to join()?
3] If there is no synchronization, how can the result be always 10 000?
The closest thing I found on cppreference is this sentence: 'All modifications to any particular atomic variable occur in a total order that is specific to this one atomic variable.' Does this mean that in this case memory_order_relaxed behaves exactly like memory_order_seq_cst? (My limited benchmark showed identical performance for the two.) That is, there exists a total order, and the writing thread always gets the latest value in this order.
I would ask to refrain from explanations at the hardware level; I'm interested purely in the C++ standard.