Saturday, 6 October 2018

libtensorflow_cc.so initialised a second time causes segfault

I am developing TensorFlow plugins for a third-party application, through a framework called OpenFX.

When I initialise one plugin, everything is great.

When I initialise a second plugin, I get the crash below:

0   libsystem_kernel.dylib          0x00007fffd2498d42 __pthread_kill + 10
1   libsystem_pthread.dylib         0x00007fffd2586457 pthread_kill + 90
2   libsystem_c.dylib               0x00007fffd23fe420 abort + 129
3   libtensorflow_framework.so      0x000000013c011c80 tensorflow::internal::LogMessageFatal::~LogMessageFatal() + 32
4   libtensorflow_framework.so      0x000000013c011c90 tensorflow::internal::LogMessageFatal::~LogMessageFatal() + 16
5   libtensorflow_framework.so      0x000000013bff7118 tensorflow::monitoring::CollectionRegistry::Register(tensorflow::monitoring::AbstractMetricDef const*, std::__1::function<void (tensorflow::monitoring::MetricCollectorGetter)> const&) + 728
6   libtensorflow_cc.so             0x0000000133fcb668 tensorflow::monitoring::Counter<2>::Counter(tensorflow::monitoring::MetricDef<(tensorflow::monitoring::MetricKind)1, long long, 2> const&) + 152
7   libtensorflow_cc.so             0x0000000133fc6e17 tensorflow::monitoring::Counter<2>* tensorflow::monitoring::Counter<2>::New<char const (&) [46], char const (&) [58], char const (&) [11], char const (&) [7]>(char const (&&&) [46], char const (&&&) [58], char const (&&&) [11], char const (&&&) [7]) + 103
8   libtensorflow_cc.so             0x000000013e3d64aa _GLOBAL__sub_I_loader.cc + 42
9   dyld                            0x000000010e435a1b ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 385
10  dyld                            0x000000010e435c1e ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
11  dyld                            0x000000010e4314aa ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 338
12  dyld                            0x000000010e431441 ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 233
13  dyld                            0x000000010e430524 ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 138
14  dyld                            0x000000010e4305b9 ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 75
15  dyld                            0x000000010e4257cd dyld::runInitializers(ImageLoader*) + 87
16  dyld                            0x000000010e42d3ec dlopen + 556
17  libdyld.dylib                   0x00007fffd2367832 dlopen + 59
18  libnuke-10.5.7.dylib            0x00000001069dfe98 OFX::Binary::ref() + 56
19  libnuke-10.5.7.dylib            0x0000000106a11e96 OFX::Host::PluginHandle::PluginHandle(OFX::Host::Plugin*, OFX::Host::Host*) + 54
20  libnuke-10.5.7.dylib            0x00000001064e2a84 nofx_plugin_load(char const*) + 660
21  libnuke-10.5.7.dylib            0x0000000106808352 scriptTcl(int, char const**) + 3986
22  ???                             0x00007fff5fbfd130 0 + 140734799794480

libtensorflow_cc.so and libtensorflow_framework.so are each linked with separate rpaths for convenient deployment.

I cannot find a source file matching _GLOBAL__sub_I_loader.cc (frame 8), so I cannot look at the cause of the problem.

I am currently on macOS but I deploy to Linux and Windows also.

The headers in use are:

#include <tensorflow/core/platform/init_main.h>
#include <tensorflow/core/public/session.h>
#include <tensorflow/core/framework/tensor_shape.h>

By the looks of things, it isn't even stepping through my functions, just loading the shared library.
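My reading of frames 5 to 8 is that a file-scope initialiser in libtensorflow_cc.so creates a Counter, which registers itself with a process-wide registry in libtensorflow_framework.so, and that the registration fails fatally the second time. The following is only a minimal sketch of that suspected failure mode. The Registry class, the counter name and both helper functions are hypothetical stand-ins, not TensorFlow's API, and the sketch returns false on a duplicate where the real code appears to abort.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a process-wide metric registry.
// This is NOT tensorflow::monitoring::CollectionRegistry; it only
// mimics the rule "a metric name may be registered once per process".
class Registry {
 public:
  static Registry& Default() {
    static Registry instance;  // a single instance shared by every caller
    return instance;
  }

  // Returns true if the name was new, false if it was already registered.
  // (Assumption: the real registry treats the second case as fatal.)
  bool Register(const std::string& name) {
    return names_.insert(name).second;
  }

 private:
  std::set<std::string> names_;
};

// Stands in for the file-scope initialiser that runs when one copy of
// the library is loaded. The counter name is made up for illustration.
bool RunStaticInitialiser() {
  return Registry::Default().Register("/example/load_attempt_count");
}

// Simulates loading `plugins` plugins that each bring their own copy of
// the initialiser but share one registry. Returns how many loads failed.
int CountFailedLoads(int plugins) {
  assert(plugins >= 0);
  int failed = 0;
  for (int i = 0; i < plugins; ++i) {
    if (!RunStaticInitialiser()) {
      ++failed;
    }
  }
  return failed;
}
```

If this matches what is happening, calling CountFailedLoads(2) reports one failure: the first plugin registers the name, and the second one collides with it. That would be consistent with each plugin loading its own copy of libtensorflow_cc.so while both share a single libtensorflow_framework.so, but I have not confirmed this.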

Is there something I can do to put the init into a scope so that the Counters don't have the same default names the second time it loads?

See: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/lib/monitoring/collection_registry_test.cc

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/lib/monitoring/collection_registry.cc#L77

I am currently using tensorflow::Env::Default()

What other value could this be?

Sam
