I have developed a program in C++11 and I want to improve its performance.
The simplified (incomplete) example below shows the structure of the program.
//main.cpp
#include <vector>
#include "a.h"

int main()
{
    const int num_objects = 10000; // "10K" instances in this example
    const int long_time = 1000;    // placeholder number of time steps

    // Create the instances (default-constructed).
    std::vector<a> a_container(num_objects);

    for (int time = 1; time < long_time; time++)
    {
        // I already use OpenMP here
        for (int i = 0; i < num_objects; i++)
        {
            a_container[i].dosomething();
        }
        for (int i = 0; i < num_objects; i++)
        {
            a_container[i].update();
        }
    }
    return 0;
}
//a.cpp
//a.h
#include "b.h"
class a
{
public:
    int dosomething();
    void update(); // called from main.cpp; return type assumed
private:
    int d;
    b b_obj;
};
//b.cpp
//b.h
class b
{
public:
    int dosomething();
private:
    int c;
    double d;
};
To speed up the program, I want to combine MPI and OpenMP, mainly for the loops over the instances (there could be anywhere from 1 million to 1 billion of them).
Classes a and b both contain complex member variables (standard and other containers, etc.) and member functions.
With OpenMP, I can use all the cores/threads of a single HPC node. To use MPI, however, I need to distribute the instances across many nodes.
I haven't found a good solution for this yet. Please provide some suggestions. Thanks.
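To illustrate what distributing the instances could mean, here is a minimal sketch of a block decomposition of the index range. It is only an assumption about how the split might be done, not something the program does today: index_range and block_range are hypothetical names, and in a real MPI program rank and num_ranks would come from MPI_Comm_rank and MPI_Comm_size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Half-open index range [begin, end) owned by one MPI rank.
struct index_range
{
    std::int64_t begin;
    std::int64_t end;
};

// Split `total` instances into `num_ranks` contiguous blocks whose sizes
// differ by at most one; the first (total % num_ranks) ranks get one extra.
// Hypothetical helper, not part of MPI or of the original program.
inline index_range block_range(std::int64_t total, int rank, int num_ranks)
{
    assert(num_ranks > 0 && rank >= 0 && rank < num_ranks);
    const std::int64_t base  = total / num_ranks;
    const std::int64_t extra = total % num_ranks;
    const std::int64_t begin = rank * base + std::min<std::int64_t>(rank, extra);
    const std::int64_t size  = base + (rank < extra ? 1 : 0);
    return index_range{begin, begin + size};
}
```

Each rank would then construct only the instances in its own range and run the existing OpenMP loops over them. Whether ranks need to exchange data between dosomething() and update() depends on how the instances interact, which the example above does not show.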