samedi 21 mars 2020

MPI_Recv derived data type message missing

Hi I am writing a c++ program, in which I want MPI to communicate by a derived data type. But the receiver does not receive the full information that the sender sends out.

Here is how I build my derived data type:

// dg_derived_datatype.cpp

#include <mpi.h>
#include "dg_derived_datatype.h"

namespace Hash{

    MPI_Datatype Face_type;
};

void Construct_data_type(){

    MPI_Face_type();

}

void MPI_Face_type(){

    int num = 3;

    // Number of elements in each block (array of integers)
    int elem_blocklength[num]{2, 1, 5};

    // Byte displacement of each block (array of integers).
    MPI_Aint array_of_offsets[num];
    MPI_Aint intex, charex;
    MPI_Aint lb;
    MPI_Type_get_extent(MPI_INT, &lb, &intex);
    MPI_Type_get_extent(MPI_CHAR, &lb, &charex);

    array_of_offsets[0] = (MPI_Aint) 0;
    array_of_offsets[1] = array_of_offsets[0] + intex * 2;
    array_of_offsets[2] = array_of_offsets[1] + charex;

    MPI_Datatype array_of_types[num]{MPI_INT, MPI_CHAR, MPI_INT};

    // create and MPI datatype
    MPI_Type_create_struct(num, elem_blocklength, array_of_offsets, array_of_types, &Hash::Face_type);  
    MPI_Type_commit(&Hash::Face_type);

}

void Free_type(){

    MPI_Type_free(&Hash::Face_type);    

}

Here I derive my data type Hash::Face_type and commit it. The Hash::Face_type is used to transfer my struct (face_pack, 2 int + 1 char + 5 int) vector.

// dg_derived_datatype.h

#ifndef DG_DERIVED_DATA_TYPE_H
#define DG_DERIVED_DATA_TYPE_H

#include <mpi.h>

struct face_pack{

    int owners_key; 

    int facei; 

    char face_type;

    int hlevel;

    int porderx;

    int pordery; 

    int key;

    int rank;

};

namespace Hash{

    extern MPI_Datatype Face_type;
};

void Construct_data_type();

void Free_type();

#endif

Then in my main program I do

// dg_main.cpp

#include <iostream>
#include <mpi.h>
#include "dg_derived_datatype.h"
#include <vector>

void Recv_face(int source, int tag, std::vector<face_pack>& recv_face);

int main(){
// Initialize MPI. 
// some code here.
// I create a vector of struct: std::vector<face_pack> face_info,
// to store the info I want to let proccesors communicate. 

Construct_data_type(); // construct my derived data type

MPI_Request request_pre1, request_pre2, request_next1, request_next2;

// send
if(num_next > 0){ // If fullfilled the current processor send info to the next processor (myrank + 1)

std::vector<face_pack> face_info;
// some code to construct face_info

// source my_rank, destination my_rank + 1
MPI_Isend(&face_info[0], num_n, Hash::Face_type, mpi::rank + 1, mpi::rank + 1, MPI_COMM_WORLD, &request_next2);

}

// recv
if(some critira){ // recv from the former processor (my_rank - 1)

std::vector<face_pack> recv_face;

Recv_face(mpi::rank - 1, mpi::rank, recv_face); // recv info from former processor

}
if(num_next > 0){

        MPI_Status status;
        MPI_Wait(&request_next2, &status);

}

Free_type();

// finialize MPI
}

void Recv_face(int source, int tag, std::vector<face_pack>& recv_face){

    MPI_Status status1, status2;

    MPI_Probe(source, tag, MPI_COMM_WORLD, &status1);

    int count;
    MPI_Get_count(&status1, Hash::Face_type, &count);

    recv_face = std::vector<face_pack>(count);

    MPI_Recv(&recv_face[0], count, Hash::Face_type, source, tag, MPI_COMM_WORLD, &status2);
}


The problem is that the receiver sometimes receives an incomplete info.

For example, I print out the face_info before it is sent out:

// rank 2
owners_key3658 facei 0 face_type M neighbour 192 n_rank 0
owners_key3658 facei 1 face_type L neighbour 66070 n_rank 1
owners_key3658 facei 1 face_type L neighbour 76640 n_rank 1
owners_key3658 facei 2 face_type M neighbour 2631 n_rank 0
owners_key3658 facei 3 face_type L neighbour 4953 n_rank 1
...
owners_key49144 facei 1 face_type M neighbour 844354 n_rank 2
owners_key49144 facei 1 face_type M neighbour 913280 n_rank 2
owners_key49144 facei 2 face_type L neighbour 41619 n_rank 1
owners_key49144 facei 3 face_type M neighbour 57633 n_rank 2

Which is correct.

But in the receiver side, I print out the message it received:

owners_key3658 facei 0 face_type M neighbour 192 n_rank 0
owners_key3658 facei 1 face_type L neighbour 66070 n_rank 1
owners_key3658 facei 1 face_type L neighbour 76640 n_rank 1
owners_key3658 facei 2 face_type M neighbour 2631 n_rank 0
owners_key3658 facei 3 face_type L neighbour 4953 n_rank 1

... // at the beginning it's fine, however, at the end it messed up

owners_key242560 facei 2 face_type ! neighbour 2 n_rank 2
owners_key217474 facei 2 face_type ! neighbour 2 n_rank 2
owners_key17394 facei 2 face_type ! neighbour 2 n_rank 2
owners_key216815 facei 2 face_type ! neighbour 2 n_rank 2

Surely, it lost the face_type info, which is a char. And as far as I know, the std::vector warrants a contiguous memory here. So I am not sure which part of my derived mpi datatype is wrong. The message passing sometimes works sometimes not.

Any suggestion? Thanks a lot.

Aucun commentaire:

Enregistrer un commentaire