samedi 17 décembre 2022

When is ok to break up atomic Read-Modify-Write operations into constituent relaxed operations + barrier?

In the model-checked implementation of the CL-Deque, they use the following strategy to decrement the bottom pointer:

size_t b = atomic_load_explicit(&q->bottom, memory_order_relaxed) - 1;
Array *a = (Array *) atomic_load_explicit(&q->array, memory_order_relaxed);
atomic_store_explicit(&q->bottom, b, memory_order_relaxed);
atomic_thread_fence(memory_order_seq_cst);

So they load the bottom pointer, decrement it locally, then store it. Why is it valid to do this? Wouldn't a concurrent thief be able to see either one of the bottom values?

An alternative way to perform this would be to combine the read-modify-write operation into a single atomic_fetch_sub, like so:

Array *a = (Array *) atomic_load_explicit(&q->array, memory_order_relaxed);
size_t b = atomic_fetch_sub_explicit(&q->bottom, 1, memory_order_seq_cst) - 1;

This would remove that possible race condition.

I think that the broken-up version is valid because of the CAS inside of the steal and take functions resolves this race later, only if the deque is small enough for it to matter. But is this a general technique?

Aucun commentaire:

Enregistrer un commentaire