Friday, September 6, 2019

How to optimize memory allocations inside a C++11 function returning a vector without breaking client code

I have a C++11 function that returns a vector:

std::vector<int> foo(int i) {
  std::vector<int> ret;
  ret.push_back(20 + i);
  return ret;
}

This function is already published and is called from many places in client code. In some of those use cases foo is called at very high frequency, and there the cost of allocating a new vector on every call is prohibitive. I can't break the client code that serves the other use cases.

The objective is to modify foo so that it can reuse an already allocated vector when one is available. This must be done in a way that does not break existing client code.

The first thing I could think of was:

Option 1: having two different foo functions:

void foo(int i, std::vector<int> &ret) {
  ret.clear();
  ret.push_back(20 + i);
}
std::vector<int> foo(int i) {
  std::vector<int> ret;
  foo(i, ret);
  return ret;
}

This works as expected, but the drawback is that there are now two functions instead of one. I was wondering whether there is a good way to achieve the same result without adding a new function.
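
For illustration, here is a minimal sketch of how Option 1 might be used in a high-frequency caller. The two foo overloads are the ones shown above; count_buffer_moves is a hypothetical helper that exists only to make the buffer reuse observable. It relies on clear() keeping the vector's capacity.

```cpp
#include <cstddef>
#include <vector>

// Option 1 from the question: the out-parameter overload does the work.
void foo(int i, std::vector<int> &ret) {
  ret.clear();            // drops the elements but keeps the capacity
  ret.push_back(20 + i);
}

// The original signature stays available as a thin wrapper,
// so existing callers compile unchanged.
std::vector<int> foo(int i) {
  std::vector<int> ret;
  foo(i, ret);
  return ret;
}

// Hypothetical hot loop (not part of the original code): counts how many
// times the buffer's storage address changes after the first call.
std::size_t count_buffer_moves(int iterations) {
  std::vector<int> buffer;
  foo(0, buffer);                       // the first call allocates
  const int *storage = buffer.data();
  std::size_t moves = 0;
  for (int i = 1; i < iterations; ++i) {
    foo(i, buffer);                     // later calls reuse the capacity
    if (buffer.data() != storage) {
      ++moves;
      storage = buffer.data();
    }
  }
  return moves;
}
```

Callers that do not care about allocations keep writing foo(i); only the hot path has to own a buffer and pass it in.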

I could think of (somewhat bad) ways to get the expected optimization without adding a new function:

Option 2: using a static thread_local temporary

std::vector<int> &&foo(int i) {
  static thread_local std::vector<int> ret;
  ret.clear();
  ret.push_back(20 + i);
  return std::move(ret);
}

This meets the requirements, but the usage is surprising: the user has to write foo(i).swap(vector) to benefit from the optimization. Also, the static variable adds some measurable overhead.
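
To make the two calling styles concrete, here is a rough sketch built on the Option 2 function above. legacy_call and fast_call are hypothetical names used only for this illustration. The existing style move-constructs from the thread-local vector and takes its storage away, while the swap style exchanges storage with the caller's buffer so that both sides keep a usable capacity.

```cpp
#include <utility>
#include <vector>

// Option 2 from the question: returns an rvalue reference to a
// thread-local vector that is refilled on every call.
std::vector<int> &&foo(int i) {
  static thread_local std::vector<int> ret;
  ret.clear();
  ret.push_back(20 + i);
  return std::move(ret);
}

// Existing client style (hypothetical wrapper): v is move-constructed
// from the thread-local vector, so that vector gives up its storage.
int legacy_call(int i) {
  std::vector<int> v = foo(i);
  return v.front();
}

// Optimized style (hypothetical wrapper): the caller's buffer and the
// thread-local vector exchange their storage, so neither is freed.
int fast_call(int i, std::vector<int> &buffer) {
  foo(i).swap(buffer);
  return buffer.front();
}
```

Note that any reference kept from foo(i) refers to the thread-local vector, so its contents change on the next call from the same thread.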

Option 3: using an rvalue reference parameter with default value

std::vector<int> &&foo(int i, std::vector<int> &&ret = std::vector<int>()) {
  ret.clear();
  ret.push_back(20 + i);
  return std::move(ret);
}

Faster than option 2 above, but the usage is atrocious: the user now has to write foo(i, std::move(vector)). Worse, the user can now write vector = foo(i, std::move(vector)), which move-assigns vector from itself because the returned reference refers to vector, with disastrous consequences.
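
The aliasing behind that problem can be shown with a small sketch based on the Option 3 function above. result_aliases_argument and reuse_in_place are hypothetical helpers added only for this illustration. The first checks that the returned reference is the caller's own vector, and the second shows the call shape that reuses the buffer without assigning the result back.

```cpp
#include <utility>
#include <vector>

// Option 3 from the question: the caller may pass a vector to reuse,
// otherwise a temporary is created by the default argument.
std::vector<int> &&foo(int i, std::vector<int> &&ret = std::vector<int>()) {
  ret.clear();
  ret.push_back(20 + i);
  return std::move(ret);
}

// Hypothetical check: the reference returned by foo refers to the very
// vector that was passed in, so assigning it back is a self-move.
bool result_aliases_argument() {
  std::vector<int> v;
  std::vector<int> &&r = foo(1, std::move(v));
  return &r == &v;
}

// Hypothetical reuse pattern: the result is written into buffer in
// place, and the return value is deliberately ignored.
int reuse_in_place(std::vector<int> &buffer, int i) {
  foo(i, std::move(buffer));  // nothing is moved out of buffer here
  return buffer.front();
}
```

Existing callers such as std::vector<int> v = foo(i); still work, because the default-argument temporary lives until the end of the full expression in which foo is called.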

My specific questions are:

  • What are the good alternatives that I missed?
  • Is there a way to achieve the same result as option 2 but without the overhead of the static local variable?
  • Is there a way to achieve the same result as option 3 but that ensures that the user can't write vector = foo(i, std::move(vector))?
