mardi 28 janvier 2020

is unordered_set thread safe if data race is irrelevant?

I would like to have threads add unique floats to a list, and first thought about approaching it by inserting into an unordered_map<float,int> where the values does not matter (which essentially is an unordered_set<float>). The threads could override each other with values 1 in the same key, ie. it doesn't matter if Thread 1 wrote value 1 first, or Thread 2, because in the end, it will just end up as 1. As such, data races should be irrelevant.

#include <iostream>
#include <string>
#include <unordered_set>
#include <thread>
#include <set>

std::unordered_set<float> mySet;

void someTask(int i) {
    mySet.insert(i % 3);
}

int main()
{
    std::thread threads[8];
    for (int i = 0; i < 8; i++) {  
     threads[i] = std::thread(someTask, i);
    }
    for (int i = 0; i < 8; i++) {  
     threads[i].join();
    }    

    for (auto const& mySetVal : mySet) {
     std::cout << mySetVal << std::endl;    // Oddly, mySet might have duplicates
    }    

    std::set<float> mySet_( mySet.begin(), mySet.end() ); // Make a new ordered set, and removing duplicates along the way

    for (auto const& mySetVal : mySet_) {
     std::cout << mySetVal << std::endl;   // Would this always produce the expected result of 0,1,2?
    }
}

However mySet despite being an unordered_set has duplicates, and I'm guessing this is due to a race condition while reading for the key, and ended up inserting twice? But when I replace mySet with a unordered_map<float,int>, there are still duplicates- I would've thought even if there is a race condition, and although we can't guarantee which thread will execute first, in the end, the threads can override one another without any harm.

So why are there duplicates?

And by removing the duplicates at the end, is this thread-safe, or would it produce the expected results of 0,1,2 consistently/reliably? If not, a thread-safe solution for the code above would be awesome!

Aucun commentaire:

Enregistrer un commentaire