dimanche 20 février 2022

c++ concurrent hash map with thread-safe find insert count erase and iterator?

I am currently using std::unordered_map container to perform these operations find() insert() erase() count() and to iterate in this container. to increase the performance of my algorithm I need to use all of my CPU threads so I need to use a concurrent container that permit many threads to operate a thread-safe find() insert() erase() count() and iteration on common data.

my first attempt is using tbb::concurrent_unordered_map unfortunately this container does not provide safe_erase() function.

my second attempt is to use tbb::concurrent_hash_map as proposed in many SO posts post1 post2 post3 that provides thread-safe finding, inserting and erasing. But according to this Answer tbb::concurrent_hash_map does not permit safe iterator and they proposed some libraries that provide a collection of concurrent data structures and memory reclamation algorithms such as facebook folly, xenuim that give customizable concurrent_hash_map files, when I checked the source code I got lost how to use those header files.

Also I noticed that std::unordered_map member functions find, insert, end() return an iterator while tbb::concurent_hash_map member functions return a bool and it does not provide thread-safe iterator and find and insert use const_accessor and accessor

this is my class that i am trying to make run in parallel:

query.h

#define CHUNK 32

#include <array>
#include <iostream>
#include <unordered_map>
#include <vector>

class query {
  friend auto operator<<(std::ostream &, const query &) -> std::ostream &;
public:
  static size_t DT;
  query() = default;
  virtual ~query() = default;
  // Add a query point to d-tree.
  auto increase(const size_t &) -> query &;
  // Check whether a point is query point.
  auto checks(const size_t &) -> bool;
  // Remove a query point.
  auto remove(const size_t &) -> query &;
  // The number of query points.
  auto size() -> size_t;
private:
  static auto chunk_(size_t) -> size_t;
  size_t calculate_ = 0;
  std::vector<size_t> blank_;
  std::array<std::unordered_map<size_t, std::vector<size_t>>, CHUNK> container_;
};

query.cpp

#include "query.h"
size_t query::DT = 0;

auto operator<<(std::ostream &out, const query &query) -> std::ostream & {
  for (auto &&tree : query.container_) {
    for (auto &&s: tree) { out << s.first << std::endl; }
  }
  return out;
}

auto query::increase(const size_t &s) -> query & {
  auto &&it = container_[chunk_(s)].find(s);
  if (it != container_[chunk_(s)].end()) {return *this;}
  std::vector<size_t> v;
  container_[chunk_(s)].insert(std::make_pair(s, v));
  ++calculate_;
  return *this; }

auto query::checks(const size_t &p) -> bool {
    return container_[chunk_(p)].count(p);  }

auto query::remove(const size_t &s) -> query & {
  container_[chunk_(s)].erase(s);
  --calculate_;
  return *this; }

auto query::size() -> size_t { return calculate_;}

auto query::chunk_(size_t index) -> size_t {
  return (int) index % CHUNK;   }

things that I tried:

I am wondering if something like this would be possible:

...// add this to query.h
#include "tbb/concurrent_hash_map.h"
#include <tbb/parallel_for.h>
...
std::array<tbb::concurrent_hash_map<size_t, std::vector<size_t>>, CHUNK> container_;
... // change this in query.cpp
auto query::increase(const size_t &s) -> query & {
  tbb::concurrent_hash_map<size_t, std::vector<size_t>>::const_accessor cacc;
  tbb::concurrent_hash_map<size_t, std::vector<size_t>>::accessor acc;
  auto &&it = container_[chunk_(s)].find(cacc,s);
  if (it != container_[chunk_(s)].end()) {return *this;}
  std::vector<size_t> v;
  container_[chunk_(s)].insert(acc, std::make_pair(s, v));
  ++calculate_;
  return *this; }
...

// later these query functions will be used in other files inside for loops
...
tbb::parallel_for( tbb::blocked_range<size_t>(0,width),
                       [&](tbb::blocked_range<size_t> r)
             {
             for (size_t i=r.begin(); i<r.end(); ++i)
             {
             query.increase(window_[i]);
             }
              });    
...         

errors that I get:

in this line: if (it != container_[chunk_(s)].end()){...}

error: no operator "!=" matches these operands operand types are: bool != ...

I think because find() returns a bool, but end() returns an iterator.

notes that I found in SO answer:

it says that  you don't need a find() method for your task since insert() methods create the element if it does not exist.

What might be the problem? What's the best way to achieve this in my code? Thanks.

Aucun commentaire:

Enregistrer un commentaire