Monday, 28 August 2017

Valgrind showing memory leaks in OpenMPI ORTE and OPAL but nothing in the main program

I am debugging a very large program on Ubuntu 16.04 LTS that uses OpenMPI, and Valgrind reports memory leaks. According to the Valgrind output, the leaks originate in main (main.c:13), which is not part of my program, and in ORTE and OPAL, which are subpackages of OpenMPI.
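
To make the "not part of the program" observation checkable, here is a minimal Python sketch that splits a Memcheck log into loss records and flags any record whose stack contains a frame outside the launcher, OpenMPI, or system libraries. The marker list and the function names (`parse_loss_records`, `touches_application`) are assumptions chosen to match the paths in this particular log, and unresolved `???` frames are treated as unknown rather than as application code.

```python
import re

# Assumption: these substrings identify launcher, runtime, and system frames
# in this particular log; adjust them for another installation.
RUNTIME_MARKERS = (
    "libopen-rte", "libopen-pal", "opal_libevent2022", "orterun.c",
    "(main.c:13)", "libc-", "libpthread-", "vg_replace_malloc.c",
)

# Matches e.g. "==1== 31 (24 direct, 7 indirect) bytes in 1 blocks are definitely lost ..."
HEADER_RE = re.compile(
    r"^==\d+==\s+([\d,]+) (?:\([^)]*\) )?bytes in [\d,]+ blocks are (\w+) lost"
)
# Matches e.g. "==1==    by 0x4040CE: orterun (orterun.c:818)"
FRAME_RE = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (.+)$")


def parse_loss_records(log_text):
    """Return a list of dicts: {'bytes': int, 'kind': str, 'frames': [str]}."""
    records, current = [], None
    for line in log_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = {
                "bytes": int(header.group(1).replace(",", "")),
                "kind": header.group(2),
                "frames": [],
            }
            records.append(current)
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            current["frames"].append(frame.group(1))
        elif not frame:
            current = None  # any non-frame line closes the current record
    return records


def touches_application(record):
    """True if at least one resolved frame carries none of the runtime markers."""
    return any(
        frame != "???" and not any(marker in frame for marker in RUNTIME_MARKERS)
        for frame in record["frames"]
    )
```

Calling `parse_loss_records(open("valgrind.log").read())` (the file name is a placeholder) and filtering with `touches_application` gives the records, if any, that involve code other than the launcher and runtime.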

My question is whether this is a bug in my program or in OpenMPI itself, and whether it occurs only for certain MPI functions or MPI datatypes.
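
One possible way to narrow this down, offered as a sketch rather than a confirmed fix: the `Command:` line in the Valgrind log shows that Valgrind was started on `mpiexec` itself, which suggests the report describes the launcher process (hence `orterun.c` and `main.c:13`) rather than the 18 application ranks. The Python helper below builds the alternative command line, in which `mpiexec` starts one Valgrind instance per rank. The function name `per_rank_valgrind_command` and the log file pattern are assumptions for illustration; `%p` is Valgrind's placeholder for the process ID in `--log-file`.

```python
def per_rank_valgrind_command(nprocs, valgrind_flags, app_argv,
                              log_pattern="valgrind.%p.log"):
    """Build an argv list in which mpiexec launches one Valgrind per rank.

    log_pattern uses Valgrind's %p placeholder (the process ID), so each
    rank writes its report to a separate file.
    """
    return (
        ["mpiexec", "-n", str(nprocs)]
        + ["valgrind", *valgrind_flags, f"--log-file={log_pattern}"]
        + list(app_argv)
    )


# Flags and program path are taken from the question; the program's own
# arguments are left out here for brevity.
cmd = per_rank_valgrind_command(
    18,
    ["--tool=memcheck", "--leak-check=full", "--track-origins=yes"],
    ["dist/UM_Release/GNU_MPI-Linux/_nb_test_shortest"],
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)`. Any leak record in the per-rank logs whose stack includes frames from the application would then point at the program rather than at the launcher.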

Below is the Valgrind output. The program was compiled with the C++ compiler flags -g -O0 -fno-inline, and Valgrind was run with the flags --tool=memcheck --leak-check=yes --track-origins=yes --leak-check=full.

==42361== Memcheck, a memory error detector
==42361== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==42361== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==42361== Command: mpiexec -n 18 -output-filename test.txt dist/UM_Release/GNU_MPI-Linux/_nb_test_shortest -L -l 1 -r 0.01 -R 0.1 -s 1 -z 0 -B 0.5:30:0.1 -b 100 -i 300 -m 0.1 ../_DATA/Chicago
==42361== Parent PID: 42313
==42361== 
==42361== Conditional jump or move depends on uninitialised value(s)
==42361==    at 0x4C3E552: orte_get_attribute (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4C6ADF7: bind_in_place.isra.0 (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4C6C0D9: orte_rmaps_base_compute_bindings (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4C67581: orte_rmaps_base_map_job (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EFCC10: opal_libevent2022_event_base_loop (event.c:1370)
==42361==    by 0x404B30: orterun (orterun.c:1071)
==42361==    by 0x403455: main (main.c:13)
==42361==  Uninitialised value was created by a stack allocation
==42361==    at 0x4C6ACE0: bind_in_place.isra.0 (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361== 
==42361== HEAP SUMMARY:
==42361==     in use at exit: 418,015 bytes in 2,008 blocks
==42361==   total heap usage: 46,504 allocs, 44,496 frees, 33,545,201 bytes allocated
==42361== 
==42361== 4 bytes in 1 blocks are definitely lost in loss record 5 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x3D66481021: strdup (in /lib64/libc-2.12.so)
==42361==    by 0x4EE6516: opal_basename (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x79C99B9: ???
==42361==    by 0x79C9AD4: ???
==42361==    by 0x5DACEBE: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 8 bytes in 1 blocks are definitely lost in loss record 22 of 352
==42361==    at 0x4A057BB: calloc (vg_replace_malloc.c:593)
==42361==    by 0xA01C5E6: ???
==42361==    by 0xA055D40: ???
==42361==    by 0xA018EA1: ???
==42361==    by 0x4C45DF0: pmix_server_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x5DAD3DD: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 8 bytes in 1 blocks are definitely lost in loss record 23 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0xA05E5EA: ???
==42361==    by 0xA05626B: ???
==42361==    by 0xA018EA1: ???
==42361==    by 0x4C45DF0: pmix_server_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x5DAD3DD: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 11 bytes in 1 blocks are definitely lost in loss record 25 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x3D66481021: strdup (in /lib64/libc-2.12.so)
==42361==    by 0x5DAC9B7: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 13 bytes in 1 blocks are definitely lost in loss record 29 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x3D66481021: strdup (in /lib64/libc-2.12.so)
==42361==    by 0x79C91D9: ???
==42361==    by 0x79C9335: ???
==42361==    by 0x4EDCA96: mca_base_select (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4C5AF99: orte_plm_base_select (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x5DAC89B: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 14 bytes in 1 blocks are definitely lost in loss record 30 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4EEEC1A: opal_os_path (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EF0ADD: opal_path_access (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EF0E6C: opal_path_findv (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x79C999A: ???
==42361==    by 0x79C9AD4: ???
==42361==    by 0x5DACEBE: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 31 (24 direct, 7 indirect) bytes in 1 blocks are definitely lost in loss record 76 of 352
==42361==    at 0x4A06C20: realloc (vg_replace_malloc.c:662)
==42361==    by 0x4EE5932: opal_argv_append_nosize (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x79C92CA: ???
==42361==    by 0x79C9335: ???
==42361==    by 0x4EDCA96: mca_base_select (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4C5AF99: orte_plm_base_select (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x5DAC89B: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 39 bytes in 1 blocks are definitely lost in loss record 124 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x3D6646F41F: vasprintf (in /lib64/libc-2.12.so)
==42361==    by 0x3D6644F307: asprintf (in /lib64/libc-2.12.so)
==42361==    by 0x4EE687D: set_dest (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EE7532: opal_cmd_line_parse (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4038FA: orterun (orterun.c:604)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 39 bytes in 1 blocks are definitely lost in loss record 125 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x3D6646F41F: vasprintf (in /lib64/libc-2.12.so)
==42361==    by 0x3D6644F307: asprintf (in /lib64/libc-2.12.so)
==42361==    by 0x4EE687D: set_dest (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EE7532: opal_cmd_line_parse (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x405384: create_app (orterun.c:1395)
==42361==    by 0x40507D: parse_locals (orterun.c:1241)
==42361==    by 0x404220: orterun (orterun.c:866)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 42 (24 direct, 18 indirect) bytes in 1 blocks are definitely lost in loss record 140 of 352
==42361==    at 0x4A06C20: realloc (vg_replace_malloc.c:662)
==42361==    by 0x4EE5932: opal_argv_append_nosize (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x79C9A12: ???
==42361==    by 0x79C9AD4: ???
==42361==    by 0x5DACEBE: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 48 bytes in 1 blocks are definitely lost in loss record 172 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4EDC10B: mca_base_component_repository_open (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EDB36A: mca_base_component_find (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EE4F99: mca_base_framework_components_register (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EE53A7: mca_base_framework_register (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EE5410: mca_base_framework_open (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4F3FA81: patcher_query (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4F3F965: opal_memory_base_open (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EE5484: mca_base_framework_open (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EBEC9E: opal_init (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x403BAB: orterun (orterun.c:678)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 48 bytes in 1 blocks are definitely lost in loss record 173 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x87DE912: ???
==42361==    by 0xBC84858: ???
==42361==    by 0x4C53618: orte_iof_base_select (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x5DAD444: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 71 bytes in 1 blocks are definitely lost in loss record 190 of 352
==42361==    at 0x4A06C20: realloc (vg_replace_malloc.c:662)
==42361==    by 0x3D6646F3D2: vasprintf (in /lib64/libc-2.12.so)
==42361==    by 0x3D6644F307: asprintf (in /lib64/libc-2.12.so)
==42361==    by 0x4C5A4A8: orte_oob_base_get_addr (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DE7BD: ???
==42361==    by 0x5DACD0C: ???
==42361==    by 0x4C2396E: orte_init (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4040CE: orterun (orterun.c:818)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 195 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0x4C23637: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 196 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0x4C463D5: pmix_server_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x5DAC165: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 197 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0x4C463E0: pmix_server_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x5DAC165: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 198 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0x4C2EB93: orte_data_server_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x5DAC176: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 199 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0xB87DCFC: ???
==42361==    by 0x4C4C2A5: orte_dfs_base_close (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EE5568: mca_base_framework_close (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x5DAC18E: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 200 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0xB87DD07: ???
==42361==    by 0x4C4C2A5: orte_dfs_base_close (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EE5568: mca_base_framework_close (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x5DAC18E: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 201 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0xB6789A4: ???
==42361==    by 0x4C4FBB5: orte_filem_base_close (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EE5568: mca_base_framework_close (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x5DAC19A: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 202 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0xB678B49: ???
==42361==    by 0x4C4FBB5: orte_filem_base_close (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EE5568: mca_base_framework_close (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x5DAC19A: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 203 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0xBC84C10: ???
==42361==    by 0x4C52E15: orte_iof_base_close (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EE5568: mca_base_framework_close (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x5DAC1C4: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 204 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0x89E219B: ???
==42361==    by 0x4C5152E: orte_grpcomm_base_close (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EE5568: mca_base_framework_close (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x5DAC200: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 205 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0x4C5BCD6: orte_plm_base_comm_stop (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x79CA191: ???
==42361==    by 0x4C5AED5: orte_plm_base_close (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EE5568: mca_base_framework_close (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x5DAC218: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 80 bytes in 1 blocks are possibly lost in loss record 206 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C6C5ED: prq_cons (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x87DED18: ???
==42361==    by 0x4C5BD06: orte_plm_base_comm_stop (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x79CA191: ???
==42361==    by 0x4C5AED5: orte_plm_base_close (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EE5568: mca_base_framework_close (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x5DAC218: ???
==42361==    by 0x4C23641: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 96 (80 direct, 16 indirect) bytes in 2 blocks are definitely lost in loss record 211 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C2694F: orte_dt_copy_sig (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4C525E3: orte_grpcomm_API_allgather (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4C46D09: pmix_server_fencenb_fn (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0xA01B675: ???
==42361==    by 0xA05A331: ???
==42361==    by 0xA054C4F: ???
==42361==    by 0xA05556A: ???
==42361==    by 0xA043D21: ???
==42361==    by 0x4EFCC10: opal_libevent2022_event_base_loop (event.c:1370)
==42361==    by 0xA04197C: ???
==42361==    by 0x3D66C07AA0: start_thread (in /lib64/libpthread-2.12.so)
==42361== 
==42361== 96 (80 direct, 16 indirect) bytes in 2 blocks are definitely lost in loss record 212 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C2B709: orte_dt_unpack_sig (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EC8191: opal_dss_unpack_buffer (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x4EC8FDD: opal_dss_unpack (in /usr/local/lib/libopen-pal.so.20.2.0)
==42361==    by 0x89E201B: ???
==42361==    by 0x4C6D490: orte_rml_base_process_msg (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x4EFCC10: opal_libevent2022_event_base_loop (event.c:1370)
==42361==    by 0x404B30: orterun (orterun.c:1071)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 168 bytes in 1 blocks are possibly lost in loss record 246 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x87DECDF: ???
==42361==    by 0x4C23637: orte_finalize (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0x404B98: orterun (orterun.c:1089)
==42361==    by 0x403455: main (main.c:13)
==42361== 
==42361== 14,464 (128 direct, 14,336 indirect) bytes in 2 blocks are definitely lost in loss record 346 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0x4C46C7B: pmix_server_fencenb_fn (in /usr/local/lib/libopen-rte.so.20.1.0)
==42361==    by 0xA01B675: ???
==42361==    by 0xA05A331: ???
==42361==    by 0xA054C4F: ???
==42361==    by 0xA05556A: ???
==42361==    by 0xA043D21: ???
==42361==    by 0x4EFCC10: opal_libevent2022_event_base_loop (event.c:1370)
==42361==    by 0xA04197C: ???
==42361==    by 0x3D66C07AA0: start_thread (in /lib64/libpthread-2.12.so)
==42361==    by 0xAC746FF: ???
==42361== 
==42361== 122,881 (176 direct, 122,705 indirect) bytes in 2 blocks are definitely lost in loss record 351 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0xA054BDF: ???
==42361==    by 0xA05556A: ???
==42361==    by 0xA043D21: ???
==42361==    by 0x4EFCC10: opal_libevent2022_event_base_loop (event.c:1370)
==42361==    by 0xA04197C: ???
==42361==    by 0x3D66C07AA0: start_thread (in /lib64/libpthread-2.12.so)
==42361==    by 0xAC746FF: ???
==42361== 
==42361== 136,119 bytes in 306 blocks are definitely lost in loss record 352 of 352
==42361==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==42361==    by 0xA01EA87: ???
==42361==    by 0xA0424C8: ???
==42361==    by 0xA05EAE2: ???
==42361==    by 0xA05F55C: ???
==42361==    by 0xA054D18: ???
==42361==    by 0xA05556A: ???
==42361==    by 0xA043D21: ???
==42361==    by 0x4EFCC10: opal_libevent2022_event_base_loop (event.c:1370)
==42361==    by 0xA04197C: ???
==42361==    by 0x3D66C07AA0: start_thread (in /lib64/libpthread-2.12.so)
==42361==    by 0xAC746FF: ???
==42361== 
==42361== LEAK SUMMARY:
==42361==    definitely lost: 145,701 bytes in 413 blocks
==42361==    indirectly lost: 169,449 bytes in 568 blocks
==42361==      possibly lost: 2,976 bytes in 24 blocks
==42361==    still reachable: 99,889 bytes in 1,003 blocks
==42361==         suppressed: 0 bytes in 0 blocks
==42361== Reachable blocks (those to which a pointer was found) are not shown.
==42361== To see them, rerun with: --leak-check=full --show-reachable=yes
==42361== 
==42361== For counts of detected and suppressed errors, rerun with: -v
==42361== ERROR SUMMARY: 53 errors from 53 contexts (suppressed: 6 from 6)
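
As a sanity check on how the two summaries in this log relate, the leak categories should add up to the "in use at exit" figure, and allocations minus frees should equal the number of blocks still in use. A short Python sketch with the numbers copied from the log (the variable names are only for illustration):

```python
# Figures copied from the HEAP SUMMARY and LEAK SUMMARY of the log.
in_use_bytes, in_use_blocks = 418_015, 2_008
allocs, frees = 46_504, 44_496

leak_summary = {  # category: (bytes, blocks)
    "definitely lost": (145_701, 413),
    "indirectly lost": (169_449, 568),
    "possibly lost": (2_976, 24),
    "still reachable": (99_889, 1_003),
    "suppressed": (0, 0),
}

total_bytes = sum(nbytes for nbytes, _ in leak_summary.values())
total_blocks = sum(nblocks for _, nblocks in leak_summary.values())

print(total_bytes == in_use_bytes)      # do the categories cover every byte?
print(total_blocks == in_use_blocks)    # ... and every block?
print(allocs - frees == in_use_blocks)  # do unfreed allocations match too?
```

All three comparisons hold for this log, so the report is internally consistent and the loss records listed are the full picture for the process Valgrind observed.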

And here is the OpenMPI build information:

                 Package: Open MPI buildd@lgw01-07 Distribution
                Open MPI: 2.0.2
  Open MPI repo revision: v2.0.1-348-ge291d0e
   Open MPI release date: Jan 31, 2017
                Open RTE: 2.0.2
  Open RTE repo revision: v2.0.1-348-ge291d0e
   Open RTE release date: Jan 31, 2017
                    OPAL: 2.0.2
      OPAL repo revision: v2.0.1-348-ge291d0e
       OPAL release date: Jan 31, 2017
                 MPI API: 3.1.0
            Ident string: 2.0.2
                  Prefix: /usr
 Configured architecture: x86_64-pc-linux-gnu
          Configure host: lgw01-07
           Configured by: buildd
           Configured on: Fri Feb 10 11:13:16 UTC 2017
          Configure host: lgw01-07
                Built by: buildd
                Built on: Fri Feb 10 11:20:28 UTC 2017
              Built host: lgw01-07
              C bindings: yes
            C++ bindings: yes
             Fort mpif.h: yes (all)
            Fort use mpi: yes (full: ignore TKR)
       Fort use mpi size: deprecated-ompi-info-value
        Fort use mpi_f08: yes
 Fort mpi_f08 compliance: The mpi_f08 module is available, but due to limitations in the gfortran compiler, does not support the following: array subsections, direct passthru (where possible) to underlying Open MPI's C functionality
  Fort mpi_f08 subarrays: no
           Java bindings: yes
  Wrapper compiler rpath: disabled
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
  C compiler family name: GNU
      C compiler version: 6.3.0
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
           Fort compiler: gfortran
       Fort compiler abs: /usr/bin/gfortran
         Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)

Thank you.
