lundi 26 octobre 2015

Threadsafe, ordered mapping/hash in C++?

What's the best way to implement a threadsafe ordered mapping/hash in C++? Aka, a fast-lookup data structure (aka, not a queue) that different threads can iterate across, occasionally inserting or deleting elements without interfering with the other threads' activity?

  • std::map is not threadsafe, and its operations are non-atomic - although only erasure invalidates iterators

  • wrapping every function in the whole map class doesn't fix the problem - you can have loose iterators out there pointing to a node that gets erased by another thread. It should either lock and prevent the deletion until the current thread is the only one referencing it, or use the UNIX-filesystem-style "dangling but still valid reference upon deletion" approach

  • tbb::concurrent_hash_map is designed to be threadsafe, but its iterators still invalidate upon deletion of their key. Even if the Value is a smart pointer the data won't get preserved, as the reference in the iterator gets lost.
  • One can use tbb:concurrent_hash_map by iterating through keys instead of iterators (one can look up a key as O(1) instead of O(log N) like std::map), but since it's a hash, it lacks order-dependent functions like upper_bound, lower_bound, etc. Critically, if a key is left dangling by an erase in another thread, there's no obvious way to tell the code to jump back to the nearest element to that key in the hash.
  • std::unordered_map has the possibility that one could try to jury-rig finding the nearest element if your key gets deleted via the bucket access functions. But std::unordered_map is not considered threadsafe (although it could potentially be wrapped to be threadsafe). tbb::concurrent_hash_map is threadsafe (subject to the constraints above) but it doesn't allow sufficient bucket access to do this.
  • std::map, accessed by keys instead of iterators, could find the next nearest element in the map if a key gets deleted by another thread. But every lookup would be O(log N) instead of O(1).
  • Also, as mentioned, std::map is non-atomic. It would have to be wrapped. Using keys instead of iterators also requires wrapping the whole class because one has to change all of the functions that normally take or return iterators to take or return keys instead.
  • Someone on one site that I read said that std::map doesn't work well in threaded environments regardless (compared to hashes) because threads don't play nicely with tree rebalancing, although I don't really know why that would be or whether it's even generally applicable/accurate.

What do you think would be best?

Aucun commentaire:

Enregistrer un commentaire