Tuesday, 28 March 2017

Is it possible to force a reload of a thread_local variable after migration across kernel threads?

I am implementing user threading on top of kernel threads and have observed that, when a user thread migrates between kernel threads, thread_local variables are still read from the previous kernel thread's TLS location, even when the variables are also declared volatile.

Here is some assembly code generated when the variable is volatile. The value in rax is reloaded from memory after the call, but the address it is loaded from (kept in rbx) was computed before the call to swapcontext.

  404df8:   64 48 8b 04 25 00 00    mov    %fs:0x0,%rax
  404dff:   00 00
  404e01:   48 8d 80 e0 ff ff ff    lea    -0x20(%rax),%rax
  404e08:   48 8b 10                mov    (%rax),%rdx
  404e0b:   48 89 c3                mov    %rax,%rbx
  404e0e:   48 39 d5                cmp    %rdx,%rbp
  404e11:   0f 84 c5 00 00 00       je     404edc <_ZN7Arachne8dispatchEv+0x3cc>
  404e17:   48 8d 72 08             lea    0x8(%rdx),%rsi
  404e1b:   48 8d 7d 08             lea    0x8(%rbp),%rdi
  404e1f:   48 89 2b                mov    %rbp,(%rbx)
  # Here we have a user-level context switch.
  404e22:   e8 e9 ec ff ff          callq  403b10 <_ZN7Arachne11swapcontextEPPvS1_> 
  # When we return, we would like to reload %fs:0x0 from the current value of fs.
  404e27:   48 8b 03                mov    (%rbx),%rax
  404e2a:   48 89 04 24             mov    %rax,(%rsp)
  404e2e:   48 8b 04 24             mov    (%rsp),%rax
  404e32:   48 c7 40 10 ff ff ff    movq   $0xffffffffffffffff,0x10(%rax)

Is there a way to force a reload of the address of the variable as an offset from the current value of the fs register?

I am fine with using gcc-specific hacks if those exist.
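
For illustration, one gcc-specific workaround that may apply here is to never name the thread_local variable directly in code that can span a context switch, and instead go through a non-inlined accessor that recomputes the address from the current fs on every call. The sketch below is only an assumption about how that could look: `ThreadContext`, `loadedContext`, `loadedContextSlot`, `setLoaded` and `getLoaded` are hypothetical names, not taken from the code that produced the listing above. The empty asm with a "memory" clobber is there to keep gcc from treating the accessor as a pure function and merging the calls made before and after swapcontext. Link-time optimization or cloning could still defeat it, so the generated assembly should be checked.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-user-thread state; not from the original code.
struct ThreadContext {
    long wakeupTime;
};

// Hypothetical thread_local pointer to the context loaded on this kernel thread.
thread_local ThreadContext* loadedContext = nullptr;

// Returns the address of loadedContext for whichever kernel thread executes
// the call. noinline keeps the %fs-relative address computation inside this
// function, so a caller cannot hoist it above a context switch. The empty asm
// statement with a "memory" clobber is intended to stop gcc from classifying
// the function as const/pure and reusing an earlier result.
__attribute__((noinline)) ThreadContext** loadedContextSlot() {
    asm volatile("" ::: "memory");
    return &loadedContext;
}

// Every access goes through the accessor, so the slot address is looked up
// again after any call that might have migrated the user thread.
inline void setLoaded(ThreadContext* ctx) {
    *loadedContextSlot() = ctx;
}

inline ThreadContext* getLoaded() {
    return *loadedContextSlot();
}
```

In the dispatch path this would mean calling `setLoaded(...)` before swapcontext and `getLoaded()` after it, rather than reading `loadedContext` directly on both sides of the call.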

