I want to measure the duration of an operation the following way:
t1 = GetCurrentTime()
// do the operation
t2 = GetCurrentTime()
return TransformToSeconds(t2 - t1)
How can I do that in C++11? I want the measurement to be as fast as possible, i.e. GetCurrentTime()
should be cheap to call, and the resolution of the clock should be as fine as possible.
I did some reading and some measurements, and I am very confused.
- I have read parts of chrono on cppreference.com. Note that the documentation of high_resolution_clock discourages using it, and suggests using steady_clock for duration measurement.
- I have read parts of Acquiring high-resolution time stamps, a Microsoft documentation page.
- I have read parts of clock_gettime(3), a Linux man page.
- I have read this StackOverflow answer.
- I have compared a custom time measurement implementation based on the above documents, std::chrono::system_clock, std::chrono::steady_clock and std::chrono::high_resolution_clock.
On my Windows machine (cl.exe 19.16.27035) I was able to measure consistent results with this program:
#include <chrono>
#include <cstdint>
#include <stdio.h>
#include <type_traits> // std::is_same
#include <Windows.h>

// The arithmetic below assumes QuadPart is a 64-bit signed integer.
static_assert(std::is_same<decltype(LARGE_INTEGER::QuadPart), std::int64_t>::value,
              "LARGE_INTEGER::QuadPart is expected to be std::int64_t");

constexpr unsigned Repeat = 3000000;

// Tick frequency of the performance counter, queried once at startup.
const std::int64_t WindowsTicksPerSec = [] {
    LARGE_INTEGER ticksPerSec;
    QueryPerformanceFrequency(&ticksPerSec);
    return ticksPerSec.QuadPart;
}();

std::int64_t GetWindowsNow()
{
    LARGE_INTEGER ticks;
    QueryPerformanceCounter(&ticks);
    return ticks.QuadPart; // number of "ticks"
}

// Average time between two back-to-back QueryPerformanceCounter reads, in seconds.
double TestWindowsClock()
{
    double durationSeconds = 0.0;
    for (unsigned i = 0; i < Repeat; i++) {
        const std::int64_t t1 = GetWindowsNow();
        const std::int64_t t2 = GetWindowsNow();
        durationSeconds += double(t2 - t1) / WindowsTicksPerSec;
    }
    return durationSeconds / Repeat;
}

// Average time between two back-to-back Clock::now() reads, in seconds.
template <class Clock>
double TestSTLClock()
{
    double durationSeconds = 0.0;
    for (unsigned i = 0; i < Repeat; i++) {
        const typename Clock::time_point t1 = Clock::now();
        const typename Clock::time_point t2 = Clock::now();
        durationSeconds += std::chrono::duration<double>(t2 - t1).count();
    }
    return durationSeconds / Repeat;
}

// Prints the duration in nanoseconds followed by a bar of '=' characters.
void PrintMeasurements(const char* label, double durationSeconds)
{
    printf("%-21s: %7.3f ns ", label, durationSeconds * 1000000000);
    for (unsigned i = 0; i < durationSeconds * 1000000000; i++)
        printf("=");
    printf("\n");
}

int main()
{
    PrintMeasurements("Windows clock", TestWindowsClock());
    PrintMeasurements("system_clock", TestSTLClock<std::chrono::system_clock>());
    PrintMeasurements("steady_clock", TestSTLClock<std::chrono::steady_clock>());
    PrintMeasurements("high_resolution_clock", TestSTLClock<std::chrono::high_resolution_clock>());
    // In MSVC, high_resolution_clock is the same type as steady_clock.
    static_assert(std::is_same<std::chrono::steady_clock, std::chrono::high_resolution_clock>::value,
                  "high_resolution_clock is expected to be steady_clock");
}
It prints out the following results (it's more or less the same in each execution):
Windows clock        :  19.795 ns ====================
system_clock         :  30.168 ns ===============================
steady_clock         :  51.390 ns ====================================================
high_resolution_clock:  52.166 ns =====================================================
This contradicts both the common-sense answer (use high_resolution_clock) and the cppreference.com recommendation (use steady_clock). As we can see:
- One is able to give a custom implementation that is faster than any standard solution.
- high_resolution_clock is the worst of all. As a side note, it is the same as steady_clock in MSVC.
If I want to measure the duration of an operation in a portable way, the picture is even more complicated, because different methods will be the best with different compilers. To compare results on a Linux machine, use this program on Godbolt. Note that Godbolt is very unreliable for this: each execution gives significantly different results. This one on Wandbox is more stable. Curiously enough, turning on optimizations gives worse results.