Monday, January 25, 2016

OpenACC present clause update data

I am trying to apply OpenACC optimizations to a many-body simulation. Currently, I am running into the memory error below:

call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
call to cuMemFreeHost returned error 700: Illegal address during kernel execution
srun: error: jrc0017: task 0: Exited with exit code 1

I am using the compiler flags -acc -Minfo=accel -ta=tesla -fast -std=c++11, and I don't want to use -ta=tesla:managed because I want to manage device memory myself.

#pragma acc kernels present(sim.part.rx, sim.part.ry, sim.part.rz, sim.part.vx, sim.part.vy, sim.part.vz) 
{
  for(int idx = 0; idx < sim.num; ++idx) {     // Loop over target particle
    float
      prx = sim.part.rx[idx],                   // my position
      pry = sim.part.ry[idx],
      prz = sim.part.rz[idx];
    float Fx = 0.f, Fy = 0.f, Fz = 0.f;          // Force
    #pragma acc loop 
    for(int jdx = 0; jdx < sim.num; ++jdx) {   // Loop over interaction partners
      if(idx != jdx) {                          // No self-force
        const float dx = prx - sim.part.rx[jdx]; // Distance to partner
        const float dy = pry - sim.part.ry[jdx];
        const float dz = prz - sim.part.rz[jdx];
        const float h  = 1.f/sqrt(dx*dx + dy*dy + dz*dz + eps);
        const float h3 = h*h*h;
        Fx += dx*h3;                            // Sum up force
        Fy += dy*h3;
        Fz += dz*h3;
      }
    }
    sim.part.vx[idx] += sim.mass*dt*Fx;         // update velocity
    sim.part.vy[idx] += sim.mass*dt*Fy;
    sim.part.vz[idx] += sim.mass*dt*Fz;
  }
}

If I remove the lines below,

sim.part.vx[idx] += sim.mass*dt*Fx;         // update velocity
sim.part.vy[idx] += sim.mass*dt*Fy;
sim.part.vz[idx] += sim.mass*dt*Fz;

my code runs without problems, but the memory error comes back as soon as I put them back in. It seems that sim.part.vx is being written on the device, but the compiler does not know about that update, which leads to the memory error.
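
To make that suspicion concrete, here is a minimal sketch of what manual data management could look like when the arrays hang off a struct. It rests on several assumptions: the definitions of Simulation and Particles are hypothetical (the real ones are not shown above), the helper names allocate, to_device, update_velocities, from_device and release are invented for illustration, and it has not been checked against the compiler version used here. The idea is that the struct itself is copied to the device before its member arrays, so that the pointers dereferenced inside the kernel are device pointers rather than host pointers. Without -acc the pragmas are ignored and the code runs serially on the host.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical layout: the real definitions of sim and sim.part are not shown in the post.
struct Particles {
  float *rx, *ry, *rz;   // positions
  float *vx, *vy, *vz;   // velocities
};

struct Simulation {
  int num;               // number of particles
  float mass;
  Particles part;
};

// Allocate zero-initialized host arrays for n particles.
void allocate(Simulation &sim, int n, float mass) {
  assert(n > 0);
  sim.num = n;
  sim.mass = mass;
  sim.part.rx = new float[n]();
  sim.part.ry = new float[n]();
  sim.part.rz = new float[n]();
  sim.part.vx = new float[n]();
  sim.part.vy = new float[n]();
  sim.part.vz = new float[n]();
}

// Create the device copies: the struct first, then the arrays its pointers refer to.
// Assumption: the runtime attaches the device arrays to the device copy of sim.
void to_device(Simulation &sim) {
  #pragma acc enter data copyin(sim)
  #pragma acc enter data copyin(sim.part.rx[0:sim.num], sim.part.ry[0:sim.num], sim.part.rz[0:sim.num])
  #pragma acc enter data copyin(sim.part.vx[0:sim.num], sim.part.vy[0:sim.num], sim.part.vz[0:sim.num])
}

// Same force loop as in the question, with an explicit reduction on the inner loop.
void update_velocities(Simulation &sim, float dt, float eps) {
  const int n = sim.num;
  #pragma acc parallel loop present(sim)
  for (int idx = 0; idx < n; ++idx) {
    const float prx = sim.part.rx[idx];
    const float pry = sim.part.ry[idx];
    const float prz = sim.part.rz[idx];
    float Fx = 0.f, Fy = 0.f, Fz = 0.f;
    #pragma acc loop reduction(+:Fx,Fy,Fz)
    for (int jdx = 0; jdx < n; ++jdx) {
      if (idx != jdx) {                          // No self-force
        const float dx = prx - sim.part.rx[jdx];
        const float dy = pry - sim.part.ry[jdx];
        const float dz = prz - sim.part.rz[jdx];
        const float h  = 1.f / std::sqrt(dx*dx + dy*dy + dz*dz + eps);
        const float h3 = h*h*h;
        Fx += dx*h3;
        Fy += dy*h3;
        Fz += dz*h3;
      }
    }
    sim.part.vx[idx] += sim.mass*dt*Fx;
    sim.part.vy[idx] += sim.mass*dt*Fy;
    sim.part.vz[idx] += sim.mass*dt*Fz;
  }
}

// Bring the updated velocities back to the host, then drop the device copies
// (arrays first, struct last).
void from_device(Simulation &sim) {
  #pragma acc update self(sim.part.vx[0:sim.num], sim.part.vy[0:sim.num], sim.part.vz[0:sim.num])
  #pragma acc exit data delete(sim.part.rx[0:sim.num], sim.part.ry[0:sim.num], sim.part.rz[0:sim.num])
  #pragma acc exit data delete(sim.part.vx[0:sim.num], sim.part.vy[0:sim.num], sim.part.vz[0:sim.num])
  #pragma acc exit data delete(sim)
}

// Free the host arrays.
void release(Simulation &sim) {
  delete[] sim.part.rx; delete[] sim.part.ry; delete[] sim.part.rz;
  delete[] sim.part.vx; delete[] sim.part.vy; delete[] sim.part.vz;
}
```

The intended call order would be allocate, fill in the positions, to_device, update_velocities (once per time step), from_device, release. Whether this matches how sim is actually set up in my code is exactly what I am unsure about.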

Does anyone know how to fix this problem?
