Tuesday, March 10, 2015

std::unordered_map allocation, insert and deallocation time

I have approximately 4 million values in a file which I want to store in a container for performing computations.


The key of each value consists of two unsigned integers. The value is a struct containing four doubles.


The values will not change after being loaded.



#include <cstddef>
#include <unordered_map>
#include <utility>

using std::pair;

typedef pair<unsigned int, unsigned int> aa;
struct MyRecord { double a1; double a2; double a3; double a4; };

class MyRecordHash {
public:
    size_t operator()(const aa &k) const { return k.first * 10000 + k.second; }
};

struct MyRecordEquals {
    bool operator()(const aa &lhs, const aa &rhs) const
    {
        return (lhs.first == rhs.first) && (lhs.second == rhs.second);
    }
};

std::unordered_map<aa, MyRecord, MyRecordHash, MyRecordEquals> MyRecords;


I use MyRecords.reserve(number_of_records) prior to inserting the records.


Problem A: Although I call reserve before I start inserting the data, the memory allocated is not sufficient, and more and more memory keeps being allocated as the data is inserted. Shouldn't reserve allocate all the required memory up front? For example, for 4 million records, reserve allocates 38.9 MB, and then the inserts allocate an additional 256.5 MB.


Problem B: The insert process is rather slow. I checked the load factor, and it never exceeds 0.5. Are there any suggestions for anything else to check? I use MyRecords.insert for insertion.


Problem C: After I complete my calculations I call MyRecords.clear(). Instead of freeing the contents "instantly", it removes records one by one (approximately 3 MB/second). If I don't call clear() and just let the map be destroyed, I get the same behavior. Is this normal? I checked all the previous Stack Overflow questions, and the only suggestion I found was that it might be related to debugging. I compiled with the -O3 option, but it didn't change anything.


I am using the MinGW-w64 compiler, version 4.9.1.


Thank you all for reading this and for your suggestions.

