jeudi 30 août 2018

How to get pointer to string's data in a vector of strings?

I have a vector of strings as my primary data container. However, in order to interoperate with a C library, I need to be able to view these strings as character data pointers (i.e. const char*). This sounds simple enough, so I wrote a helper class like this:

class strvecAccessor {
  const std::vector<std::string>& names;

public:
  strvecAccessor(const std::vector<std::string>& a) : names(a) {}
  size_t size() const { 
    return names.size(); 
  }
  const char* item(size_t i) {
    auto name = names[i];
    return name.data();
  }
};

This accessor class is short-lived. It serves as a wrapper around an existing vector of strings, which is guaranteed to not be modified or go out-of-scope during the lifespan of this class. An example of how this class can be used is the following:

void test(strvecAccessor& arr) {
  for (size_t i = 0; i < arr.size(); ++i) {
    printf("%s\n", arr.item(i));
  }
}

But there is a bug in this code, which manifests itself only when I compile it in --coverage -O0 mode, and only on Unix machine (I compile with CLang 6.0.0 in C++11 compatibility mode). The bug is that the printed strings contain garbage.

I believe what happens is that the name variable in item() method is not a reference, but a copy of the i-th element of the array. It goes out-of-scope at the end of the item() function, at which point the pointer that was returned becomes dangling. Most of the time it is not noticeable since the pointer is used immediately, but in coverage mode it gets filled up with other data right after the call.

The problem disappears if I replace auto name = names[i]; with const std::string& name = names[i];. But I don't really understand why, and whether this actually solves the problem or just buries it deeper. So my question is: why the copy is being made in the original code; and how to protect myself against these kinds of errors in the future?

Aucun commentaire:

Enregistrer un commentaire