samedi 17 avril 2021

The story in the C++11 saga of const -> mutable -> mutex

Summary: In C++98, or before threads, albeit incorrect in some contexts, it was common to write class with internal hidden state in this way:

struct Widget {
  int getValue() const{
    if (cacheValid) return cachedValue;
    else {
      cachedValue = expensiveQuery(); // write data mem
      cacheValid = true;                    // write data mem
      return cachedValue;
    }
}
...
  private:
  mutable int cachedValue;
  mutable bool cacheValid;
...
};

This question is to know how this evolved in modern C++, and specifically about the ... (rest of the implementation).

In C++11 Herb Sutter said that from now on const == thread safe. To me the logic is pretty solid and at the least, although not written in stone, this gives a good use of the language features (mutable/const/mutex) to express the new situation.

So, the new way to write the class would be:

struct Widget {
  int getValue() const{
    std::lock_guard<std::mutex> guard{m}; // lock mutex
    if (cacheValid) return cachedValue;
    else {
      cachedValue = expensiveQuery(); // write data mem
      cacheValid = true;                    // write data mem
      return cachedValue;
    }
  }                                      // unlock mutex
...
  private:
  mutable std::mutex m;
  mutable int cachedValue;
  mutable bool cacheValid;
...
};

One class or two classes?

During the presentation, I though this was the end of the story, solid logic, elegant solution, but actually it seem that this is just the tip of the iceberg:

For efficiency, users of the C++98 version, although recognizing that thread safety is necessary, could claim that locking a mutex is not necessary in certain contexts and that this is a pessimization. Sometimes, they could say that they know that they are not in multiple thread context and they shouldn't need to be using this version of the class.

One possibility is to maintain two versions of the class, Widget_nonmt (c++98) and Widget_mt (C++11 version with mutex). I have seen in some libraries, if I understand correctly, like in the logging library spdlog.

1. This seems to be an overkill, specially because you end up duplicating most of the code, it seems. Is it?

Non-const == unprotected == efficient when you need ?

so I though that some one that really know what he/she is doing can actually a more general version of the single class:

struct Widget {
private:
  int getValue_aux() const{
    if (cacheValid) return cachedValue;
    else {
      cachedValue = expensiveQuery(); // write data mem
      cacheValid = true;                    // write data mem
      return cachedValue;
    }
  }
public:
  int getValue() {
    return get_value_aux();
  }                                      // unlock mutex
  int getValue() const{
    std::lock_guard<std::mutex> guard{m}; // lock mutex
    get_value_aux();
  }                                      // unlock mutex
...
  private:
  mutable std::mutex m;
  mutable int cachedValue;
  mutable bool cacheValid;
...
};

So, a) a normal user can use a non-const instance of Widget to mark that it is not intended to be used in a thread safe context, and not pay for the mutex-locking b) A user-that-really-knows-what-he-is-doing can const_cast or wrap it as mutable if he knows that thread safety is not going to be an issue on a particular point in the code (he/she is still storing a mutex though).

2. Is this a reasonable generalization of Widget at all? Or is it flawed?

nonmt as and implementation

Of course I could be already off the rails here already, but if not I would like to know how to continue here.

One can have the best of both worlds by having two classes and the option to use const-non const members:

struct Widget_nonmt { // better naming?
  int getValue() const{
    if (cacheValid) return cachedValue;
    else {
      cachedValue = expensiveQuery(); // write data mem
      cacheValid = true;                    // write data mem
      return cachedValue;
    }
}
...
  private:
  mutable int cachedValue;
  mutable bool cacheValid;
...
};

struct Widget : private Widget_nonmt { // or struct Widget_mt
  int getValue() {
    return Widget_nonmt::getValue();
  }
  int getValue() const{
    std::lock_guard<std::mutex> guard{m}; // lock mutex
    return Widget_nonmt::getValue();
  }                                      // unlock mutex
...
  private:
  mutable std::mutex m; // now it can be though as protecting the Widget_nonmt part
...
};

3. Now there are two classes, and one also serves for an implementation of the other and it can in non-const context to be used without the mutex. Does it make sense? Is it a good option? is it used in real code?

The "rest" of the class, the "rest" of the story, e.g. copy

Ok, now I might be in uncharted territory and completely lost. How do I implement the rest of the class? That is, how this saga continues really.

Leaving aside conversions between Widget and Widget_nonmt, which can be a whole discussion in itself. How do I implement copy?

One option is not to implement copy and that's it. (=delete). But Widget can have bonafide state and that state should be copyable or one might want to make copies precisely to alleviate the usage of a single mutex and balance it with the expense of expensiveQuery.

(Widget is copyable without problems, one might or might not want to copy the cache depending if the Widget has real state for example, that doable and it is fine).

So lets assume we want to make Widget copyable: one cannot make it default-copyable because the mutex is not copyable so something needs to be implemented manually.

This is already complicated because one needs to protect the source instance other before copying and for that one seems to need a body for the function.

struct Widget : private Widget_nonmt { // or struct Widget_mt
  Widget(Widget const& other){
    std::lock_guard<std::mutex> guard{m}; // lock mutex
     Widget_nonmt::cachedValue = other.cachedValue;
     Widget_nonmt::cachedValid = other.cachedValid;
  }
  ...
};

This is getting very ugly. Maybe one needs to implement a lock() function that RAII-lock and return the unprotected Widget_nonmt, this is possible but you can see how the problem diverges.

5. Is this the right way to make the C++11 Widget copyable?

Copy constructor (thread-safe) --> Duplicate/Mutable-copy constructor?

Finally, we can have the grumpy user again who rightly says "now you are forcing me to use the mutex for copies even in context that I know they are unnecessary."

And this is where it gets crazy, continuing with this logic (same as for the member function) one might think of introducing a non-const copy constructor.

struct Widget : private Widget_nonmt { // or struct Widget_mt
  Widget(Widget& other) : Widget_nonmt(static_cast<Widget_nonmt&>(other)){}
  Widget(Widget const& other) ... // same as above
};

Note that this is not a move constructor, it is something more strange. It is not unusable, and certainly the STL containers for example will never use it when copying containers of Widget.

Logic took me to this ugly place. It seems like a tragic end to a story that started rather solidly. Or is this a problem that still waits for a more elegant and general solution.

6. Is this the real end of C++98 to C++11 transition saga of a class with protected internal state?

c++11