vendredi 14 février 2020

Trying to control multithreaded access to array using std::atomic

I'm trying to control multithreaded access to a vector of data, so threads will wait until their current position in it has been filled before trying to use it, or will fill it themselves if no-one else has yet. (But ensure no-one is waiting around if their position is already filled, or no-one has done it yet)

However, I am struggling to understand a good way to do this, especially involving std::atomic. I'm just not very familiar with C++ multithreading concepts aside from basic std::thread usage.

Here is a very rough example of the problem:

class myClass
{
  struct Data
  {
    int res1;
  };

  std::vector<Data*> myData;

  int foo(unsigned long position)
  {
    if (!myData[position])
      {
        bar(myData[position]);
      }

    // Do something with the data
    return 5 * myData[position]->res1;
  }

  void bar(Data* data)
  {
    data = new Data;

    // Do a whole bunch of calculations and so-on here

    data->res1 = 42;
  }
};

Now imagine if foo() is being called multi-threaded, and multiple threads may (or may not) have the same position at once. If that happens, there's a chance that a thread may (between when the Data was created and when bar() is finished, try to actually use the data.

So, what are the options?

1: Make a std::mutex for every position in myData. What if there are 10,000 elements in myData? That's 10,000 std::mutexes, not great.

2: Put a lock_guard around it like this:

std::mutex myMutex;
{
    const std::lock_guard<std::mutex> lock(myMutex);

    if (!myData[position])
      {
        bar(myData[position]);
      }
}

While this works, it also means if different threads are working in different positions, they wait needlessly, wasting all of the threading advantage.

3: Use a vector of chars and a spinlock as a poor man's mutex? Here's what that might look like:

static std::vector<char> positionInProgress;
static std::vector<char> positionComplete;

class myClass
{
  struct Data
  {
    int res1;
  };

  std::vector<Data*> myData;

  int foo(unsigned long position)
  {   
    if (positionInProgress[position])
      {
        while (positionInProgress[position])
          {
            ;  // do nothing, just wait until it is done
          }
      }
    else
      {
        if (!positionComplete[position])
          {
            // Fille the data and prevent anyone from using it until it is complete
            positionInProgress[position] = true;
            bar(myData[position]);
            positionInProgress[position] = false;

            positionComplete[position] = true;
          }
      }

    // Do something with the data
    return 5 * myData[position]->res1;
  }

  void bar(Data* data)
  {
    data = new Data;

    // Do a whole bunch of calculations and so-on here

    data->res1 = 42;
  }
};

This seems to work, but none of the test or set operations are atomic, so I have a feeling I'm just getting lucky.

4: What about std::atomic and std::atomic_flag? Well, there are a few problems.

  • std::atomic_flag doesn't have a way to test without setting in C++11...which makes this kind of difficult.
  • std::atomic is not movable or copy-constructable, so I cannot make a vector of them (I do not know the number of positions during construction of myClass)

Conclusion:

This is the simplest example that (likely) compiles I can think of that demonstrates my real problem. In reality, myData is a 2-dimensional vector implemented using a special hand-rolled solution, Data itself is a vector of pointers to more complex data types, the data isn't simply returned, etc. This is the best I could come up with.

Aucun commentaire:

Enregistrer un commentaire