In all likelihood, a lockless implementation is already overkill for the purposes of my application, but I wanted to look into memory barriers and lockless-ness anyways in case I ever actually need to use these concepts in the future.
From what I can tell:
-
an "InterlockedAcquire" function performs an atomic operation while preventing the compiler from moving code statements after the InterlockedAcquire to before the InterlockedAcquire.
-
an "InterlockedRelease" function performs an atomic operation while preventing the compiler from moving code statements before the InterlockedRelease to after the InterlockedRelease.
-
a vanilla "Interlocked" function performs an atomic operation while preventing the compiler from moving code statements in either direction across the Interlocked call.
My question is, if a function is structured such that the compiler can't reorder any of the code anyways because doing so would affect single-threaded behavior, is there a difference between any of the variants of an Interlocked function, or all they all effectively the same? Is the only difference between them how they interact with code reordering?
For a more concrete example, here's my current application - the produce() function as part of what will eventually be a multiple producer, single consumer queue built using a circular buffer:
template <typename T>
class Queue {
private:
long headIndex;
long tailIndex;
T* array[MAXQUEUESIZE];
public:
Queue() {
headIndex = 0;
tailIndex = 0;
memset(array, 0, MAXQUEUESIZE*sizeof(void*);
}
~Queue() {
}
bool produce(T value) {
//1) prevents concurrent calls to produce() from causing corruption:
long indexRetVal;
long reservedIndex = tailIndex;
do {
indexRetVal = InterlockedCompareExchange64(&tailIndex, (reservedIndex + 1) % MAXQUEUESIZE, reserved);
} while (indexRetVal != reservedIndex);
//2) allocates the node.
T* newValPtr = (T*) malloc(sizeof(T));
if (newValPtr == null) {
OutputDebugString("Queue: malloc returned null");
return false;
}
*newValPtr = value;
//3) prevents a concurrent call to consume from causing corruption by atomically replacing the old pointer:
T* valPtrRetVal = InterlockedCompareExchangePointer(array + reservedIndex, newValPtr, null);
//if the previous value wasn't null, then our circular buffer overflowed:
if (valPtrRetVal != null) {
OutputDebugString("Queue: circular buffer overflowed");
return false;
}
//otherwise, everything worked fine
return true;
}
};
As I understand it, 3) will occur after 1) and 2) regardless of what I do anyways, but I should change 1) to an InterlockedRelease because I don't care whether it occurs before or after 2) and I should let the compiler decide.
Aucun commentaire:
Enregistrer un commentaire