Wednesday, 28 December 2016

Why are multiple shared_future objects needed to synchronize data?

A pointer to a data structure is shared with multiple threads via std::promise and std::shared_future. According to the book "C++ Concurrency in Action" by Anthony Williams (pp. 85-86), the data is only correctly synchronized when each receiving thread uses its own copy of the std::shared_future object, as opposed to all threads accessing a single, global std::shared_future.

To illustrate, consider a thread that creates a bigdata object and passes a pointer to it to multiple threads that have read-only access. If data synchronization between the threads is not handled correctly, memory reordering may lead to undefined behavior (e.g. a worker_thread reading incomplete data).

This (incorrect?) implementation uses a single, global std::shared_future:

#include <future>
#include <thread>

struct bigdata { ... };

std::shared_future<bigdata *> global_sf;

void worker_thread()
{
    const bigdata *ptr = global_sf.get();
    ...  // ptr read-only access
}

int main()
{
    std::promise<bigdata *> pr;
    global_sf = pr.get_future().share();

    std::thread t1{worker_thread};
    std::thread t2{worker_thread};

    pr.set_value(new bigdata);
    ...
}

And in this (correct) implementation, each worker_thread gets its own copy of the std::shared_future:

void worker_thread(std::shared_future<bigdata *> sf)
{
    const bigdata *ptr = sf.get();
    ...
}

int main()
{
    std::promise<bigdata *> pr;
    auto sf = pr.get_future().share();

    std::thread t1{worker_thread, sf};
    std::thread t2{worker_thread, sf};

    pr.set_value(new bigdata);
    ...
}

I am wondering why the first version is incorrect.

If std::shared_future::get() were a non-const member function, this would make sense, since accessing a single std::shared_future from multiple threads would then be a data race in itself. But this member function is declared const, and the global_sf object is only read, so it should be safe to access concurrently from multiple threads. I am not claiming that the problem therefore does not exist, only that const on a standard library function guarantees that the call itself does not introduce a data race.
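To make the const point concrete, here is a minimal sketch that checks at compile time that get() can be called through a const reference. The alias sf_type and the helper read_through_const are hypothetical names introduced only for this illustration. It shows that the call compiles on a const object and nothing more; by itself it says nothing about which synchronization the standard guarantees.

```cpp
#include <future>
#include <type_traits>
#include <utility>

struct bigdata;  // a forward declaration suffices; only the pointer type is used

// Hypothetical alias for this sketch.
using sf_type = std::shared_future<bigdata *>;

// shared_future<R>::get() is a const member returning const R&,
// so the call is well-formed on a const object.
static_assert(
    std::is_same<decltype(std::declval<const sf_type &>().get()),
                 bigdata *const &>::value,
    "get() is callable on a const shared_future");

// Hypothetical helper: reads the result through a const reference only.
inline const bigdata *read_through_const(const sf_type &sf)
{
    return sf.get();
}
```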

My question is: why exactly is this only guaranteed to work correctly if each worker_thread receives its own copy of the std::shared_future?
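For reference, here is a self-contained sketch of the second (per-thread copy) variant that can actually be compiled, assuming C++14 for std::make_unique. The contents of bigdata, the sum_out parameter and the run_workers driver are assumptions made up for this example, since the real struct and the rest of main are elided above.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>
#include <numeric>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the elided bigdata struct.
struct bigdata {
    std::vector<long> values;
};

// Each worker receives its own copy of the shared_future by value.
void worker_thread(std::shared_future<bigdata *> sf, long *sum_out)
{
    const bigdata *ptr = sf.get();  // blocks until set_value() has been called
    assert(ptr != nullptr);
    // Read-only access: sum the values into this worker's private slot.
    *sum_out = std::accumulate(ptr->values.begin(), ptr->values.end(), 0L);
}

// Hypothetical driver: fills bigdata with 1..n and returns both workers' sums.
std::pair<long, long> run_workers(long n)
{
    std::promise<bigdata *> pr;
    auto sf = pr.get_future().share();

    long sum1 = 0;
    long sum2 = 0;
    std::thread t1{worker_thread, sf, &sum1};  // sf is copied into the thread
    std::thread t2{worker_thread, sf, &sum2};

    // Build the data completely before publishing the pointer.
    auto data = std::make_unique<bigdata>();
    data->values.resize(static_cast<std::size_t>(n));
    std::iota(data->values.begin(), data->values.end(), 1L);

    pr.set_value(data.get());

    // Join before data goes out of scope so the workers never see a dangling pointer.
    t1.join();
    t2.join();
    return {sum1, sum2};
}
```

Calling run_workers(100) from main should give 5050 in both slots, since each worker sums 1 through 100 after the promise has been fulfilled.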
