Monday, March 18, 2019

x86 mfence and C++ memory barrier

I'm checking how the compiler emits instructions for multi-core memory barriers on x86_64. The code below is what I'm testing, using gcc_x86_64_8.3.

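#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> flag {false};
int any_value {0};

// Writer: set any_value, then publish it with a release store on the flag.
void set()
{
  any_value = 10;
  flag.store(true, std::memory_order_release);
}

// Reader: spin until the acquire load observes the flag, then check the value.
void get()
{
  while (!flag.load(std::memory_order_acquire));
  assert(any_value == 10);
}

int main()
{
  std::thread a {set};
  get();
  a.join();
}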

When I use std::memory_order_seq_cst, I can see that the MFENCE instruction is emitted at every optimization level (-O1, -O2, -O3). This instruction makes sure the store buffer is flushed, so that its data is committed to the L1D cache (with the MESI protocol making sure other threads can see the effect).

However, when I use std::memory_order_release/acquire with no optimizations, the MFENCE instruction is also emitted, but it is omitted at -O1, -O2, and -O3, and I don't see any other instruction that flushes the buffers.
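To compare the two cases side by side, the stores can be isolated into separate functions, as in the minimal sketch below. The variable and function names (flag_release, flag_seq_cst, store_release, store_seq_cst) are hypothetical and exist only for this illustration. The comments restate the observations above for gcc_x86_64_8.3; other compilers or versions may emit different instructions (for example XCHG instead of MOV plus MFENCE).

```cpp
#include <atomic>

// Hypothetical globals, one per memory order, so each store is easy to find
// in the generated assembly.
std::atomic<bool> flag_release {false};
std::atomic<bool> flag_seq_cst {false};

// Release store: the case where MFENCE disappears at -O1 and above.
// On x86_64 an optimized build typically needs only a plain MOV here,
// because the architecture does not reorder a store with earlier stores.
void store_release()
{
  flag_release.store(true, std::memory_order_release);
}

// Sequentially consistent store: the case where MFENCE was observed at
// every optimization level. The extra fence (or an XCHG) is what prevents
// a later load from being reordered before this store.
void store_seq_cst()
{
  flag_seq_cst.store(true, std::memory_order_seq_cst);
}
```

Compiling this file to assembly, for example with g++ -O2 -S, and comparing the two function bodies should show the difference without the noise of the thread and assert code.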

In the case where MFENCE is not used, what makes sure the store buffer data is committed to cache memory to ensure the memory order semantics?
