Tuesday, 17 May 2022

Execution speed of code with `function` object as compared to using template functions

I know that std::function is implemented with the type erasure idiom. Type erasure is a handy technique, but its drawback is that the wrapper has to keep its own copy of the underlying callable object, which in general lives on the heap.

Hence creating or copying a function object involves allocations, and as a consequence it should be slower than simply passing functions as template type parameters, right?
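The allocation assumption can be probed directly. The following is a minimal sketch, not part of the original test: it replaces the global operator new with a counting version and reports how many heap allocations happen while a callable is wrapped in a std::function. The names g_allocations, half, allocationsToWrap, allocationsForFunctionPointer and allocationsForLargeCapture are hypothetical. Whether a given callable is stored inline or on the heap is up to the standard library; the standard only encourages implementations to avoid dynamic allocation for small callables such as function pointers.

```cpp
#include <array>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>
#include <utility>

// Hypothetical counter, incremented by the replaced global operator new below.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// A plain function, wrapped below through a function pointer.
double half(double x) { return x / 2; }

// Number of heap allocations observed while wrapping `callable`
// in a std::function (assumption: single-threaded use).
template <class Callable>
std::size_t allocationsToWrap(Callable callable) {
    const std::size_t before = g_allocations;
    std::function<double(double)> f(std::move(callable));
    const std::size_t after = g_allocations;
    (void)f(1.0);  // use the wrapper so it is not trivially dead code
    return after - before;
}

// Small callable: a function pointer.
std::size_t allocationsForFunctionPointer() {
    return allocationsToWrap(&half);
}

// Large callable: a lambda that captures 64 doubles by value.
std::size_t allocationsForLargeCapture() {
    std::array<double, 64> big{};
    big[0] = 2.0;
    return allocationsToWrap([big](double x) { return x * big[0]; });
}
```

Printing the two counts from a main function shows how the standard library in use treats each case. A count of zero for the function pointer would mean that wrapping it did not touch the heap at all, which is what the small-object optimization in common implementations is meant to achieve, but neither number is guaranteed.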

To check this assumption I ran a test function that accumulates n = cycles consecutive integers and then divides the sum by n. First, coded as a template:

#include <iostream>
#include <functional>
#include <chrono>
using std::cout;
using std::function;
using std::chrono::system_clock;
using std::chrono::duration_cast;
using std::chrono::milliseconds;

double computeMean(const double start, const int cycles) {
    double tmp(start);
    for (int i = 0; i < cycles; ++i) {
        tmp += i;
    }
    return tmp / cycles;
}

template<class T>
double operate(const double a, const int b, T myFunc) {
    return myFunc(a, b);
}

and the main.cpp:

int main()
{
    double init(1), result;
    int increments(1E9);
    // start clock
    system_clock::time_point t1 = system_clock::now();

    result = operate(init, increments, computeMean);
    // stop clock
    system_clock::time_point t2 = system_clock::now();

    cout << "Input: " << init << ", " << increments << ", Output: " << result << '\n';
    cout << "Time elapsed: " << duration_cast<milliseconds>(t2 - t1).count() << " ms\n";
    return 0;
}

I ran this a hundred times and got a mean time of 10024.9 ms.
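For reference, repeated timing and the summary statistics could be collected along these lines. This is a sketch under assumptions, not the harness actually used for the numbers in this post: Stats, summarize and timeRuns are hypothetical names, the runs happen inside one process, std::chrono::steady_clock is swapped in for system_clock because it is monotonic, and the standard deviation is the population form (dividing by the number of samples).

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of a set of timings, in milliseconds.
struct Stats {
    double mean;
    double stddev;
};

// Mean and population standard deviation of `samples`.
Stats summarize(const std::vector<double>& samples) {
    if (samples.empty()) return {0.0, 0.0};
    double sum = 0.0;
    for (double s : samples) sum += s;
    const double mean = sum / samples.size();
    double squares = 0.0;
    for (double s : samples) squares += (s - mean) * (s - mean);
    return {mean, std::sqrt(squares / samples.size())};
}

// Calls `work` `runs` times and returns each duration in milliseconds.
template <class Work>
std::vector<double> timeRuns(std::size_t runs, Work work) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    std::vector<double> ms;
    ms.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto t1 = clock::now();
        work();
        const auto t2 = clock::now();
        ms.push_back(
            std::chrono::duration<double, std::milli>(t2 - t1).count());
    }
    return ms;
}
```

A call such as summarize(timeRuns(100, [&] { result = operate(init, increments, computeMean); })) would then yield the mean and the spread in one go.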

Then I introduce the function object in the main, plus a template specialization for operate so I can recycle the code above:

// as above, just add the template specialization
template<>
double operate(const double a, const int b, function<double (const double, const int)> myFunc) {
    cout << "nontemplate called\n";
    return myFunc(a, b);
}

// and inside the main
int main()
{
    //...
    // start clock
    system_clock::time_point t1 = system_clock::now();

    // new lines
    function<double (const double, const int)> computeMean =
        [&](const double init, const int increments) {
            double tmp(init);
            for (int i = 0; i < increments; ++i) {
                tmp += i;
            }
            return tmp / increments;
        };
    // rest as before
    // ...
}

I expected the function version to be slower, but the average is about the same, actually even slightly faster: 9820.3 ms. I checked the standard deviations and they are about the same, 1233.77 against 1234.96.

What sense can be made of this? I would have expected the second version with the function object to be slower than the template version.
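One possible reading, offered as an interpretation rather than something the measurements prove: each timed run builds the function object once and calls it once, so whatever type erasure costs is paid a single time and then dwarfed by the 1E9-iteration loop inside the callable. A test that is sensitive to the wrapper would invoke the callable on every iteration instead. The sketch below shows that variant; addStep, accumulateTemplate and accumulateErased are hypothetical names that do not appear in the test above.

```cpp
#include <functional>

// Hypothetical per-iteration step: adds the loop index to the accumulator.
inline double addStep(double acc, int i) { return acc + i; }

// Template version: the callable's type is known at compile time,
// so the compiler has the option of inlining the call.
template <class F>
double accumulateTemplate(double start, int cycles, F step) {
    double tmp = start;
    for (int i = 0; i < cycles; ++i) {
        tmp = step(tmp, i);
    }
    return tmp / cycles;
}

// std::function version: every iteration goes through the type-erased call.
double accumulateErased(double start, int cycles,
                        const std::function<double(double, int)>& step) {
    double tmp = start;
    for (int i = 0; i < cycles; ++i) {
        tmp = step(tmp, i);
    }
    return tmp / cycles;
}
```

Both functions compute the same mean as computeMean. Passing a lambda such as [](double acc, int i) { return acc + i; } to accumulateTemplate gives the compiler the best chance to inline, whereas passing addStep by name deduces a function pointer type. How large the gap between the two versions turns out to be depends on the compiler and the optimization level, so it has to be measured.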

Here the whole test can be run on GDB.
