Saturday, 1 October 2016

Speedup comparison between BLAS and OpenBLAS in Armadillo

Does anybody know what speedup is achieved by using OpenBLAS instead of the reference BLAS library in Armadillo?

Here is my result. I'm multiplying a vector by a matrix in Armadillo:

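    #include <iostream>
    #include <armadillo>

    using namespace std;
    using namespace arma;

    int main(int argc, char** argv) {
        const uword num_examples = 100000;

        // Build the indices 0..num_examples-1 and shuffle them.
        vec randperm(num_examples, fill::zeros);
        for (uword i = 0; i < num_examples; i++)
            randperm(i) = i;
        vec randnum = shuffle(randperm);

        // Sparse 100000 x 127 inputs (density 0.8) and dense 3 x 127 weights.
        sp_mat X = sprandu<sp_mat>(num_examples, 127, 0.8);
        mat W(3, 127, fill::randn);

        wall_clock timer;
        double t;

        // Start the timer (tic() placement assumed: just before the product).
        timer.tic();
        // Multiply one shuffled row of X by W^T and take the index of the largest score.
        uword pred_class = (X.row(randnum(10)) * W.t()).index_max();
        t = timer.toc();
        cout << "Elapsed time is:" << t << endl;

        return 0;
    }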

Here is the result when linking with -lopenblas:

    Elapsed time is:0.0374926

The result when linking with just -larmadillo is:

    Elapsed time is:0.084193
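Taking the two timings at face value, the speedup is just the ratio of the baseline time to the OpenBLAS time, which works out to roughly 2.2x. A minimal sketch of that arithmetic follows; the `speedup` helper is a hypothetical name for illustration, not part of Armadillo or OpenBLAS.

```cpp
#include <cassert>

// Hypothetical helper: how many times faster the optimized run is
// compared with the baseline run (both given in seconds).
double speedup(double baseline_seconds, double optimized_seconds) {
    assert(optimized_seconds > 0.0);  // avoid dividing by zero
    return baseline_seconds / optimized_seconds;
}
```

Calling `speedup(0.084193, 0.0374926)` with the two timings reported here gives a value a little under 2.25.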

My machine: Ubuntu 14.04 running on 4 physical cores (8 logical cores with hyperthreading enabled).
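A single timing of one small product can be dominated by measurement noise, so it may be worth averaging over many repetitions before comparing the two libraries. The sketch below shows one way to do that with `std::chrono` from the standard library instead of Armadillo's `wall_clock`, so that it compiles on its own. `mean_seconds` is a hypothetical helper name, and the number of repetitions is left to the caller.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: run `work` `repeats` times and return the
// mean wall-clock time per run, in seconds.
template <typename Work>
double mean_seconds(Work&& work, std::size_t repeats) {
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    for (std::size_t i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / static_cast<double>(repeats);
}
```

In the program from the question, the callable passed to `mean_seconds` would wrap the `X.row(...) * W.t()` product. The result should be stored or printed afterwards so the compiler cannot optimize the work away.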
