I have been looking at a lock free single producer/single consumer circular buffer when I thought about speculative execution and its effect on the simple code.
With this implementation, there is only a unique thread which can call the push() function and another unique thread which can call the pop() function.
Here is the Producer code:
bool push(const Element& item)
{
const auto current_tail = _tail.load(std::memory_order_relaxed); //(1)
const auto next_tail = increment(current_tail);
if(next_tail != _head.load(std::memory_order_acquire)) //(2)
{
_array[current_tail] = item; //(3)
_tail.store(next_tail, std::memory_order_release); //(4)
return true;
}
return false; // full queue
}
Here is the Consumer code:
bool pop(Element& item)
{
const auto current_head = _head.load(std::memory_order_relaxed); //(1)
if(current_head == _tail.load(std::memory_order_acquire)) //(2)
return false; // empty queue
item = _array[current_head]; //(3)
_head.store(increment(current_head), std::memory_order_release); //(4)
return true;
}
The Question
What if the push() would be compiled as the following function due to speculative execution:
bool push(const Element& item)
{
const auto current_tail = _tail.load(std::memory_order_relaxed); // 1
const auto next_tail = increment(current_tail);
//The load is performed before the test, it is valid
const auto head = _head.load(std::memory_order_acquire);
//Here is the speculation, the CPU speculate that the test will succeed
//store due to speculative execution AND it respects the memory order due to read-acquire
_array[current_tail] = item;
_tail.store(next_tail, std::memory_order_release);
//Note that in this case the test checks if you it has to restore the memory back
if(next_tail == head)//the code was next_tail != _head.load(std::memory_order_acquire)
{
//We restore the memory back but the pop may have been called before and see an invalid memory
_array[current_tail - 1] = item;
_tail.store(next_tail - 1, std::memory_order_release);
return true;
}
return false; // full queue
}
To me, to be perfectly valid the push function should make sure the barrier is issued after the condition success:
bool push(const Element& item)
{
const auto current_tail = _tail.load(std::memory_order_relaxed); // 1
const auto next_tail = increment(current_tail);
if(next_tail != _head.load(std::memory_order_relaxed)) // 2
{
//Here we are sure that nothing can be reordered before the condition
std::atomic_thread_fence(std::memory_order_acquire); //2.1
_array[current_tail] = item; // 3
_tail.store(next_tail, std::memory_order_release); // 4
return true;
}
return false; // full queue
}
Aucun commentaire:
Enregistrer un commentaire