I am migrating a project that was run on bare-bone to linux, and need to eliminate some {disable,enable}_scheduler
calls. :)
So I need a lock-free sync solution in a single writer, multiple readers scenario, where the writer thread cannot be blocked. I came up with the following solution (pseudo-code), which does not fit to the usual acquire-release ordering:
class RWSync {
std::atomic<int> version; // incremented after every modification
std::atomic_bool invalid; // true during write
public:
template<typename F> void sync(F lambda) {
KWu32 currentVersion;
do {
do { // wait until the object is valid
currentVersion = version.load(std::memory_order_acquire);
} while (invalid.load(std::memory_order_acquire));
lambda();
std::atomic_thread_fence(std::memory_order_seq_cst);
// check if something changed
} while (version.load(std::memory_order_acquire) != currentVersion
|| invalid.load(std::memory_order_acquire));
}
void beginWrite() {
invalid.store(true, std::memory_order_relaxed);
std::atomic_thread_fence(std::memory_order_seq_cst);
}
void endWrite() {
std::atomic_thread_fence(std::memory_order_seq_cst);
version.fetch_add(1, std::memory_order_release);
invalid.store(false, std::memory_order_release);
}
}
I hope the intent is clear: I wrap the modification of a (non-atomic) payload between beginWrite/endWrite
, and read the payload only inside the lambda function passed to sync()
.
As you can see, here I have an atomic store in beginWrite()
where no writes after the store operation can be reordered before the store. I did not find suitable examples, and I am not experienced in this field at all, so I'd like some confirmation that it is OK (verification through testing is not easy either).
-
Is this code race-free and work as I expect?
-
If I use std::memory_order_seq_cst in every atomic operation, can I omit the fences? (Even if yes, I guess the performance would be worse)
-
Can I drop the fence in endWrite()?
-
Can I use memory_order_acq_rel in the fences? I don't really get the difference -- the single total order concept is not clear to me.
-
Is there any simplification / optimization opportunity?
+1. I happily accept any better idea as the name of this class :)
Aucun commentaire:
Enregistrer un commentaire