Wednesday, January 25, 2017

C++ String Concatenation Optimizations

Looking at a piece of code like this (comments added):

std::string some_var;
std::string some_func(); // both are defined, but definition is irrelevant
...
return "some text " + some_var + "c" + some_func(); // intentionally "c" not 'c'

I was wondering in which cases operator + of std::string has to make a copy, and what actually gets copied. A quick look at cppreference was only partially helpful, as it lists 12(!) different overloads. In part, I am asking to confirm my understanding of that page:

  • Case 1) makes a copy of lhs, then appends a copy of rhs to the end of that copy.
  • In C++98, cases 2) - 5) construct a temporary string from the char/const char* argument, and then proceed as in case 1).
  • In C++11, cases 2) - 5) construct a temporary string from the char/const char* argument, and then proceed as in case 6) or 7).
  • In C++11, cases 6) - 12) mutate the r-value argument with insert/append; if a char/const char* argument was provided, no temporary is necessary thanks to the overloads of insert/append. In all cases an r-value is returned to facilitate further chaining. No copies are made (except for copying the characters of the appended/inserted argument into place). The existing contents of the string may need to be moved (for example, shifted on insert or relocated if the buffer has to grow).

A chain like the example above should thus result in: 2) -> 6) -> 11) -> 8), with no copies of any lhs being made, but just modifications to the buffer of the r-value resulting from the first operation (creation of the temp-string).
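To make the chain concrete, here is a minimal sketch that splits the one-line expression into the individual operator + calls I believe it resolves to. The definitions of some_var and some_func are hypothetical stand-ins (the real ones are irrelevant here), and the overload numbers in the comments follow the numbering used in this post. Whether a given step reuses the existing buffer or has to reallocate depends on the standard library implementation and on the capacity at that point.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-ins for the declarations in the question.
std::string some_var = "var ";
std::string some_func() { return " func"; }

// The expression exactly as written in the question.
std::string chained() {
    return "some text " + some_var + "c" + some_func(); // intentionally "c" not 'c'
}

// The same expression, one operator + call per statement.
std::string stepwise() {
    // const char* + const std::string&  -> case 2), produces a temporary
    std::string t1 = "some text " + some_var;
    // std::string&& + const char*       -> case 11), appends to the r-value
    std::string t2 = std::move(t1) + "c";
    // std::string&& + std::string&&     -> case 8), both operands are r-values
    std::string t3 = std::move(t2) + some_func();
    return t3;
}
```

Both functions build the same string; the stepwise version only names the temporaries that the one-liner leaves anonymous.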

Therefore this seems to be as efficient as operator +=, once operator + uses at least one r-value argument. Is this correct, and is there still any point in using operator += over operator + in C++11 and later, unless both arguments are l-values?
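The case where the difference should still show up is accumulation into a named (l-value) string. The sketch below contrasts three ways of writing it; the function names are made up for illustration, and the comments describe which overload is selected, not a measured cost.

```cpp
#include <string>
#include <utility>
#include <vector>

// l-value + l-value: case 1) builds a brand-new string on every iteration.
std::string join_copying(const std::vector<std::string>& parts) {
    std::string acc;
    for (const std::string& p : parts) {
        acc = acc + p;
    }
    return acc;
}

// r-value + l-value: selects the std::string&& overload, which may append in place.
std::string join_moving(const std::vector<std::string>& parts) {
    std::string acc;
    for (const std::string& p : parts) {
        acc = std::move(acc) + p;
    }
    return acc;
}

// operator +=: appends to acc directly, no temporary result object involved.
std::string join_appending(const std::vector<std::string>& parts) {
    std::string acc;
    for (const std::string& p : parts) {
        acc += p;
    }
    return acc;
}
```

All three return the same result; they differ only in which overload does the work, which is exactly the distinction the question is about.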

What optimizations can the compiler make?
