Monday, 5 October 2015

Why, when using g++ 4.8.4, does this code result in a map with a single element?

I've spent the last year and a half porting an older Win32/MFC project to Linux and have finally hit something I don't fully understand. At first I thought it might be due to the introduction of C++11 move semantics, but I'm not sure that's the issue. With g++ 4.8.4 and the -std=c++11 flag, the following code:

#include <map>
#include <string>
#include <iostream>
#include <iomanip>
#include <cstring>

const char* foo[] = { "biz", "baz", "bar", "foo", "yin" };
const int sizes[] = { 3, 3, 3, 3, 3 };

typedef std::map <std::string, int> simpleMap_t;
typedef std::pair<std::string, int> simplePair_t;

int main()
{
    simpleMap_t map;
    std::string key;
    for (int i = 0; i<5; i++)
    {
        key.resize(sizes[i]);
        memcpy(const_cast<char *>(key.data()), foo[i], sizes[i]);
        simplePair_t pair = std::make_pair(key, 0);
        std::cout << "key: \""         << key        << "\" - " << static_cast<const void*>(key.data())
                  << " pair.first: \"" << pair.first << "\" - " << static_cast<const void*>(pair.first.data())
                  << std::endl;
        map.insert(map.end(), pair);
    }

    std::cout << "map size =  " << map.size() << std::endl;
    return 0;
}

produces this output:

key: "biz" - 0x1dec028 pair.first: "biz" - 0x1dec028
key: "baz" - 0x1dec028 pair.first: "baz" - 0x1dec028
key: "bar" - 0x1dec028 pair.first: "bar" - 0x1dec028
key: "foo" - 0x1dec028 pair.first: "foo" - 0x1dec028
key: "yin" - 0x1dec028 pair.first: "yin" - 0x1dec028
map size =  1

The same code compiled with Visual Studio 2013 produces this instead:

key: "biz" - 0039FE14 pair.first: "biz" - 0039FDE0
key: "baz" - 0039FE14 pair.first: "baz" - 0039FDE0
key: "bar" - 0039FE14 pair.first: "bar" - 0039FDE0
key: "foo" - 0039FE14 pair.first: "foo" - 0039FDE0
key: "yin" - 0039FE14 pair.first: "yin" - 0039FDE0
map size =  5

Interestingly, the code "works" under g++ when the size of the string changes on each iteration. Replacing:

const char* foo[] = { "biz", "baz", "bar", "foo", "yin" };
const int sizes[] = { 3, 3, 3, 3, 3 };

with:

const char* foo[] = { "bizbiz", "baz", "barbar", "foo", "yinyin" };
const int sizes[] = { 6, 3, 6, 3, 6 };

will produce:

key: "bizbiz" - 0xc54028 pair.first: "bizbiz" - 0xc54028
key: "baz" - 0xc54098 pair.first: "baz" - 0xc54098
key: "barbar" - 0xc54108 pair.first: "barbar" - 0xc54108
key: "foo" - 0xc54178 pair.first: "foo" - 0xc54178
key: "yinyin" - 0xc541e8 pair.first: "yinyin" - 0xc541e8
map size =  5

My understanding of move semantics is incomplete, but I'm left wondering whether that's what is at play here. Is ownership of the std::string's internal buffer being given away when the std::pair is made? Or is it something else, such as an optimization in std::string::resize() that skips allocating a new character buffer when it should allocate one?
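One way to test the buffer-sharing idea is to stop writing through const_cast<char *>(key.data()) and only change the key through std::string's own interface. C++11 does not allow a program to modify the array returned by data(). libstdc++ in GCC 4.8 also uses a reference-counted (copy-on-write) std::string, which would fit the identical addresses in the g++ output above. The following is a minimal sketch, assuming that sharing is the cause. buildMap, kWords and kSizes are names made up for illustration and are not part of the original project.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Same data as in the question, under hypothetical names.
const char* const kWords[] = { "biz", "baz", "bar", "foo", "yin" };
const std::size_t kSizes[] = { 3, 3, 3, 3, 3 };

typedef std::map<std::string, int> simpleMap_t;

// Hypothetical helper (not in the original code): builds the map while
// only writing to the key through std::string's public interface.
simpleMap_t buildMap()
{
    simpleMap_t map;
    std::string key;
    for (std::size_t i = 0; i < 5; ++i)
    {
        // assign() replaces the contents through the public API, so an
        // implementation that shares buffers between copies gets the
        // chance to detach the key's buffer before writing to it.
        key.assign(kWords[i], kSizes[i]);
        map.insert(map.end(), std::make_pair(key, 0));
    }
    return map;
}
```

Calling buildMap() from main and printing size() should give the same result on both compilers if the const_cast write is what triggers the difference. If the map still ends up with one element under g++, the cause lies elsewhere.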
