I am never sure which possibility I should choose to parallelize nested for loops.
For example, I have the following two versions of the same loop nest:
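// Version 1: parallelize the outer loop only
#pragma omp parallel for schedule(static)
for(int i=0; i<bSize; i++)
    for(int n=0; n<N; n++) o[n + i*N] = b[n];   // loop index renamed to i so it no longer shadows the array b

// Version 2: collapse both loops into a single iteration space
#pragma omp parallel for collapse(2) schedule(static)
for(int i=0; i<bSize; i++)
    for(int n=0; n<N; n++) o[n + i*N] = b[n];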
In the first snippet I use parallel for (with schedule(static) because of the first-touch policy). In some codes I have seen people mostly use the collapse clause to parallelize nested for loops; in other codes it is never used, and the nested for loops are instead parallelized with a simple parallel for. Is this more a habit, or is there a difference between the two versions? Is there a reason some people never use collapse(n)?
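One way to see what the two versions do differently is to record which thread writes each element. The following is only a minimal sketch under some assumptions: the helpers thread_id and distinct_owners and the owner array are hypothetical names introduced for illustration, and the loop index is called i so that it does not clash with the array b. Without OpenMP the pragmas are ignored and every element is reported as written by thread 0.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: id of the calling thread, 0 when built without OpenMP. */
static int thread_id(void) {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

/* Version 1: only the bSize outer iterations are divided among the threads. */
static void copy_outer(float *o, const float *b, int *owner, int bSize, int N) {
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < bSize; i++)
        for (int n = 0; n < N; n++) {
            o[n + i * N] = b[n];
            owner[n + i * N] = thread_id();   /* remember who wrote this element */
        }
}

/* Version 2: both loops form one iteration space of bSize*N iterations. */
static void copy_collapsed(float *o, const float *b, int *owner, int bSize, int N) {
    #pragma omp parallel for collapse(2) schedule(static)
    for (int i = 0; i < bSize; i++)
        for (int n = 0; n < N; n++) {
            o[n + i * N] = b[n];
            owner[n + i * N] = thread_id();
        }
}

/* Hypothetical helper: number of different threads that wrote at least one element. */
static int distinct_owners(const int *owner, int len) {
    assert(len > 0);
    int max_id = 0;
    for (int k = 0; k < len; k++)
        if (owner[k] > max_id) max_id = owner[k];
    char *seen = (char *)calloc((size_t)max_id + 1, 1);
    assert(seen != NULL);
    int count = 0;
    for (int k = 0; k < len; k++)
        if (!seen[owner[k]]) { seen[owner[k]] = 1; count++; }
    free(seen);
    return count;
}
```

Built with OpenMP enabled (for example with -fopenmp on GCC or Clang) and called with a small bSize such as 2, version 1 can keep at most bSize threads busy, while version 2 can spread the bSize*N iterations over more threads. The two versions also assign elements of o to threads differently, so if first touch matters, it probably makes sense to initialize o with the same loop form that later works on it.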