I need to implement a huge hash table which supports multiple threads to insert and get at the same time. The keys are int and the second elements are vectors of object T.
class T {
//class definitions here
}
Currently the implementation is helped with tbb::concurrent_unordered_map. On the documentation it seems to allow insertion and traversal at the same time. However, running with multiple pthreads will lead to segmentation fault although sequential test is perfectly fine. Note that there is definitely no erase or pop operations, i.e. the hash table is only allowed to grow.
std::vector<T*> get(int key) {
//Note that the hash table hashTable is shared by multiple threads
tbb::concurrent_unordered_map<int, std::vector<T*>>::iterator it = hashTable.find(key);
if (it != hashTable.end())
return it->second;
}
void insert(int key, T *t) {
tbb::concurrent_unordered_map<int, std::vector<T*>>::iterator it = hashTable.find(key);
if (it != hashTable.end())
it->second.push_back(t);
else {
std::vector<T*> newTs;
newTs.push_back(t);
hashTable.insert(it, makepair(key, newTs));
}
}
To debug what happened, I first changed std::vector to tbb::concurrent_vector, and then modify the function get() and insert() as follows.
std::vector<T*> get_test(int key) {
std::vector<T*> test;
//Note that the hash table hashTable is shared by multiple threads
tbb::concurrent_unordered_map<int, tbb::concurrent_vector<T*>>::iterator it = hashTable.find(key);
if (it != hashTable.end())
test.insert(test.end(), it->second.begin(), it->second.end());
for (T* _t : test)
if (!_t)
printf("Bug happens here!\n"); //Segfault is originated here because a NULL is returned in the vector
return test;
}
void insert_test(int key, T *t) {
//Here t is guaranteed to be not NULL
if(!t)
std::terminate();
tbb::concurrent_unordered_map<int, tbb::concurrent_vector<T*>>::iterator it = hashTable.find(key);
if (it != hashTable.end())
it->second.push_back(t);
else {
std::vector<T*> newTs;
newTs.push_back(t);
hashTable.insert(it, makepair(key, newTs));
}
}
Now we can see the reason that the parallel program crashes is some NULL pointer is returned in get_test() function. Recall that in insert_test() function NULL is never inserted as the second elements.
The following are the questions to ask.
(1) I read from somewhere that concurrent insertion in tbb::concurrent_unordered_map may cause some of the insertion attempt failed and then the temp objects is destroyed. Is this the reason that NULL is observed in get_test() function?
(2) Can TBB allow insertion and traversal at the same time? This means while some threads are inserting, the others might call get() at the same time.
(3) Is there any better implementation, i.e. better concurrent hash table that support concurrent insert and read?
Thanks a lot......
Aucun commentaire:
Enregistrer un commentaire