Friday, April 1, 2022

Standard C++ Parallelism using nvc++ is slow

I am not sure what I am doing wrong, but std::sort seems to be much slower when built with nvc++ -stdpar than with g++. Other std algorithms fare better, but they are never faster than the multithreaded CPU version.

Below is the code, where TIMEIT is just a macro that takes a timestamp before and after the statement and stores the elapsed time.

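#include <algorithm>
#include <chrono>
#include <execution>
#include <iostream>
#include <random>
#include <vector>

// TIMEIT(d, stmt) is my own macro (definition not shown): it runs stmt and stores the elapsed time in d.
using Duration = std::chrono::duration<double, std::milli>;
std::random_device rd;
std::uniform_real_distribution<> dist(1,1000);

int main(){
    int n=1<<21;
    std::vector<float> v(n);

    // Fill the vector with random values in [1, 1000).
    std::generate(v.begin(),v.end(),[&](){ return dist(rd);});
    Duration d;
    // Time the parallel sort; d is measured in milliseconds.
    TIMEIT(d,
        std::sort(std::execution::par,v.begin(),v.end()));
    std::cout<<d.count()<<"\n";
}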
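
The post does not show TIMEIT itself, so the following is only a sketch of what such a macro might look like, not the author's actual definition. The macro body, the helper time_sort_ms, and the fixed seed are all assumptions made for illustration. The execution policy is left out so the sketch builds without TBB or -stdpar. To reproduce the measurement from the post, pass std::execution::par as the first argument to std::sort and include <execution>.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

using Duration = std::chrono::duration<double, std::milli>;

// Hypothetical stand-in for the TIMEIT macro (the original is not shown).
// Runs the given statement once and stores the elapsed wall-clock time in d.
// The variadic form lets the statement contain top-level commas.
#define TIMEIT(d, ...)                                           \
    do {                                                         \
        auto timeit_start = std::chrono::steady_clock::now();    \
        __VA_ARGS__;                                             \
        (d) = std::chrono::steady_clock::now() - timeit_start;   \
    } while (0)

// Hypothetical helper: sorts n random floats and returns the time in ms.
// Uses the sequential std::sort; this is an assumption made for portability.
double time_sort_ms(std::size_t n) {
    std::mt19937 gen(42);  // fixed seed so repeated runs sort the same data
    std::uniform_real_distribution<float> dist(1.0f, 1000.0f);
    std::vector<float> v(n);
    std::generate(v.begin(), v.end(), [&] { return dist(gen); });

    Duration d{};
    TIMEIT(d, std::sort(v.begin(), v.end()));
    assert(std::is_sorted(v.begin(), v.end()));
    return d.count();
}
```

Calling time_sort_ms(1 << 21) from main and printing the result would give a number comparable to the ones reported here. std::chrono::steady_clock is used because it is monotonic, which makes it a reasonable choice for interval timing.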

The results (in milliseconds) are:

nvc++ -stdpar std.cu: 217.164

g++ std.cpp -ltbb: 69.9

Note that nvc++ issues a warning about sequential execution for compute capability (cc) below 70, but when I build without -stdpar the time is 756 ms, so I am guessing that it is parallelizing. If not, I am not sure how to force it to parallelize.

nvc++ 22.3 on WSL2 on Windows 11

GPU: GeForce GTX 1660 Ti with Max-Q Design, 6 GB RAM

CPU: Intel Core i7-1065G7, 32 GB RAM
