I am converting vectors of i32 and float16 values into a vector of u8 using the following code:
template <typename T>
void convertToU8(const T& inValue, std::vector<uint8_t>& outValues) {
    const auto inputU8 = reinterpret_cast<const uint8_t*>(&inValue);
    for (size_t i = 0; i < sizeof(inValue); ++i)
        outValues.push_back(*(inputU8 + i));
}
vector<uint8_t> u8ValuesBuf;
u8ValuesBuf.reserve(estimatedSize);
for (auto i : i32Buffer) {
    convertToU8<int32_t>(i, u8ValuesBuf);
}
for (auto i : float16Buffer) {
    convertToU8<float16>(i, u8ValuesBuf);
}
From what I can see at runtime, this part of the project takes 30000+ ms on Windows. The same code on Linux is much faster and completes everything in under 3100 ms. There is no conditional compilation that could add extra time on Windows.
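For context, the timing is just a stopwatch around the conversion loops, roughly like this (simplified sketch; runConversion is a placeholder standing in for the two loops above):

#include <chrono>
#include <iostream>

void runConversion();  // placeholder for the two conversion loops shown above

void timeConversion() {
    const auto start = std::chrono::steady_clock::now();
    runConversion();
    const auto end = std::chrono::steady_clock::now();
    const auto ms =
        std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    std::cout << "conversion took " << ms << " ms\n";
}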
To isolate the issue, I replaced the for loops with dummy vectors of u8, like below:
vector<uint8_t> u8ValuesBuf;
u8ValuesBuf.reserve(estimatedSize);

vector<uint8_t> temp32(i32Buffer.size() * 4, 0);
copy(temp32.begin(), temp32.end(), back_inserter(u8ValuesBuf));
//for (auto i : i32Buffer) {
//    convertToU8<int32_t>(i, u8ValuesBuf);
//}

vector<uint8_t> temp16(float16Buffer.size() * 2, 0);
copy(temp16.begin(), temp16.end(), back_inserter(u8ValuesBuf));
//for (auto i : float16Buffer) {
//    convertToU8<float16>(i, u8ValuesBuf);
//}
And the time was cut down to 3100 ms on Windows too! What could possibly be happening with reinterpret_cast?
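For completeness, one variant I am considering (not yet benchmarked, the name is mine) that appends all the bytes of a value with a single insert call instead of byte-by-byte push_back:

#include <cstdint>
#include <vector>

// Sketch only: produces the same bytes as convertToU8 above, but with one
// insert call per value instead of sizeof(T) push_back calls.
template <typename T>
void convertToU8Bulk(const T& inValue, std::vector<uint8_t>& outValues) {
    const auto* bytes = reinterpret_cast<const uint8_t*>(&inValue);
    outValues.insert(outValues.end(), bytes, bytes + sizeof(inValue));
}

Even so, I would still like to understand why the original push_back version is so much slower on Windows than on Linux.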
Any help is much appreciated!
PS: Using Visual Studio 2019 on Windows