mardi 1 septembre 2015

Knapsack algorithm, How to get better performance?

Openmp outperforms the serial code by factor x2, but I would like to have a better performance if it is possible.

Here is the serial code in c++:

for (int k = 0; k < numelem[i]; k++)
    {

      sumK = sumK - weight[k];
      int cmax = 0;
      cmax = max(capacity - sumK, weight[k]);

      for (int c = capacity; c >= cmax; c--)
        {

          if (f[c] < f[c - weight[k]] + value[k])
          {
            f[c] = f[c - weight[k]] + value[k];
            M[capacity * k + c] = 1;
          }
        }
    }

For the openmp version, I use two f0,f1 arrays which are swapped at each iteration. This helps me to prevent the race condition, but I suppose that false sharing is still present (not sure). Other my supposition is that, the conditional statements inside pragma for slow down the execution.

#pragma omp parallel
    {
        for (int k = 0; k < numelem[i]; k++) {

            sumK = sumK - weight[k];
            int cmax = 0;
            cmax = max(capacity - sumK, weight[k]);
            int c = capacity;

            if (k % 2 == 0) {

#pragma omp for
                for (c = capacity; c >= cmax; c--) {

                    //FALSE SHARING???

                    if (f0[c] < f0[c - weight[k]] + value[k]) {
                        f1[c] = f0[c - weight[k]] + value[k];
                        M[capacity * k + c] = 1;
                    } else {
                        f1[c] = f0[c];
                    }
                }
            } 

            else {

#pragma omp for
                for (c = capacity; c >= cmax; c--) {

                    //FALSE SHARING???

                    if (f1[c] < f1[c - weight[k]] + value[k]) {
                        f0[c] = f1[c - weight[k]] + value[k];
                        M[capacity * k + c] = 1;
                    } else {
                        f0[c] = f1[c];
                    }
                }

            }

        }
    }   

Here you can find the full code for serial c++ and openmp c++

This work is based on this article:

Aucun commentaire:

Enregistrer un commentaire