Actually this will be now almost two weeks I've been figuring out this problem, and I finally at least know possible cause. What happens is, that a call to socket::async_connect
blocks the thread and seemingly acts as synchronous. The call looks like bellow and happens in :
void Layer104::connect(const std::string& ip, const std::string& port)
{
try
{
tcp::resolver resolver(io);
tcp::resolver::query query(ip.c_str(), port.c_str());
tcp::resolver::iterator it = resolver.resolve(query);
LOGMTRTTIDEBUG("StartProtocol()"<<endl);
activeSocket->async_connect(*it, boost::bind(&LayerXXX::connectHandler, getPtr(), boost::asio::placeholders::error));
LOGMTRTTIDEBUG("StartProtocol() started using activeSocket->async_connect."<<endl);
}
catch (exception&)
{
LOGMTRTTIERR("ERR_CONNECTION_FAILED"<<endl)
handleError(ERR_CONNECTION_FAILED);
}
}
As you can see I have two debug messages there to see if the function returned. 90% the function returns immediatelly and I get this in the log:
Date Thread name Thread id Method name Message
D 2015-11-02 10:35:39:787 [Client 192.168.8.23:1242 #6576] MyProtocol::LayerXXX::connect StartProtocol()
D 2015-11-02 10:35:39:788 [Client 192.168.8.23:1242 #6576] MyProtocol::LayerXXX::connect StartProtocol() started using activeSocket->async_connect.
But sometimes, and this only happens when network connection is very bad, I just see this, last message from that thread at all:
Date Thread name ID Method name Message
D 2015-11-02 10:35:39:787 [Client #6576] MyProtocol::LayerXXX::connect StartProtocol()
And since then the thread is stuck here:
The thread I'm talking about is using the boost::asio::io_service
event loop and also the connect
method is called from this event loop. This is how the thread is started:
void MyClient::threadMain() {
LOGMTRTTIDEBUG("Starting boost::asio::io_service main loop."<<endl);
// IP and port is added to name after connecting succesfully
ThreadNameMap::setName( "Client" );
io_.run();
LOGMTRTTIDEBUG("boost::asio::io_service main OVER! Thread DEAD!"<<endl);
}
The relevant stacktrace when the thread is stuck:
ntdll.dll!NtWaitForSingleObject() + 0xa bytes
mswsock.dll!__GSHandlerCheck_SEH() + 0x2c95 bytes
mswsock.dll!__GSHandlerCheck_SEH() + 0x5e0c bytes
ws2_32.dll!WSAAccept() + 0xd4 bytes
ws2_32.dll!accept() + 0x15 bytes
myapp64d.exe!boost::asio::detail::socket_ops::call_accept<int>(int * __formal, unsigned __int64 s, sockaddr * addr, unsigned __int64 * addrlen) Line 96 + 0x32 bytes C++
myapp64d.exe!boost::asio::detail::socket_ops::accept(unsigned __int64 s, sockaddr * addr, unsigned __int64 * addrlen, boost::system::error_code & ec) Line 114 + 0x19 bytes C++
myapp64d.exe!boost::asio::detail::socket_select_interrupter::open_descriptors() Line 90 + 0x1c bytes C++
myapp64d.exe!boost::asio::detail::socket_select_interrupter::socket_select_interrupter() Line 42 C++
myapp64d.exe!boost::asio::detail::select_reactor::select_reactor(boost::asio::io_service & io_service) Line 48 + 0x86 bytes C++
myapp64d.exe!boost::asio::detail::service_registry::create<boost::asio::detail::select_reactor>(boost::asio::io_service & owner) Line 81 + 0x26 bytes C++
myapp64d.exe!boost::asio::detail::service_registry::do_use_service(const boost::asio::io_service::service::key & key, boost::asio::io_service::service * (boost::asio::io_service &)* factory) Line 123 + 0x13 bytes C++
myapp64d.exe!boost::asio::detail::service_registry::use_service<boost::asio::detail::select_reactor>() Line 49 C++
myapp64d.exe!boost::asio::use_service<boost::asio::detail::select_reactor>(boost::asio::io_service & ios) Line 34 C++
myapp64d.exe!boost::asio::detail::win_iocp_socket_service_base::get_reactor() Line 620 + 0xd bytes C++
myapp64d.exe!boost::asio::detail::win_iocp_socket_service_base::start_connect_op(boost::asio::detail::win_iocp_socket_service_base::base_implementation_type & impl, boost::asio::detail::reactor_op * op, const sockaddr * addr, unsigned __int64 addrlen) Line 550 + 0xd bytes C++
myapp64d.exe!boost::asio::detail::win_iocp_socket_service<boost::asio::ip::tcp>::async_connect<boost::_bi::bind_t<void,boost::_mfi::mf1<void,MyProtocol::LayerXXX,boost::system::error_code const & __ptr64>,boost::_bi::list2<boost::_bi::value<MyProtocol::LayerXXX * __ptr64>,boost::arg<1> > > >(boost::asio::detail::win_iocp_socket_service<boost::asio::ip::tcp>::implementation_type & impl, const boost::asio::ip::basic_endpoint<boost::asio::ip::tcp> & peer_endpoint, boost::_bi::bind_t<void,boost::_mfi::mf1<void,MyProtocol::LayerXXX,boost::system::error_code const &>,boost::_bi::list2<boost::_bi::value<MyProtocol::LayerXXX *>,boost::arg<1> > > & handler) Line 515 C++
myapp64d.exe!boost::asio::stream_socket_service<boost::asio::ip::tcp>::async_connect<boost::_bi::bind_t<void,boost::_mfi::mf1<void,MyProtocol::LayerXXX,boost::system::error_code const & __ptr64>,boost::_bi::list2<boost::_bi::value<MyProtocol::LayerXXX * __ptr64>,boost::arg<1> > > >(boost::asio::detail::win_iocp_socket_service<boost::asio::ip::tcp>::implementation_type & impl, const boost::asio::ip::basic_endpoint<boost::asio::ip::tcp> & peer_endpoint, const boost::_bi::bind_t<void,boost::_mfi::mf1<void,MyProtocol::LayerXXX,boost::system::error_code const &>,boost::_bi::list2<boost::_bi::value<MyProtocol::LayerXXX *>,boost::arg<1> > > & handler) Line 234 C++
myapp64d.exe!boost::asio::basic_socket<boost::asio::ip::tcp,boost::asio::stream_socket_service<boost::asio::ip::tcp> >::async_connect<boost::_bi::bind_t<void,boost::_mfi::mf1<void,MyProtocol::LayerXXX,boost::system::error_code const & __ptr64>,boost::_bi::list2<boost::_bi::value<MyProtocol::LayerXXX * __ptr64>,boost::arg<1> > > >(const boost::asio::ip::basic_endpoint<boost::asio::ip::tcp> & peer_endpoint, const boost::_bi::bind_t<void,boost::_mfi::mf1<void,MyProtocol::LayerXXX,boost::system::error_code const &>,boost::_bi::list2<boost::_bi::value<MyProtocol::LayerXXX *>,boost::arg<1> > > & handler) Line 779 C++
myapp64d.exe!MyProtocol::LayerXXX::connect(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & ip, const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & port) Line 225 C++ ntdll.dll!NtWaitForSingleObject() + 0xa bytes
What I know so far:
- This only happens when network is dropping packets, so it must be timing related. I simulated this using clumsy 0.2 (recommended).
boost::asio::ip::tcp::socket::async_connect
is called fromboost::asio::io_service
event loop, but it works most of the time- According to bug reports we've got, this thread doesn't go un-stuck in any reasonable amount of time. So it's not like some timeout... and anyway, the connect should be asynchronous.
I really need some help already, it's been too long without any progress on this issue.
Aucun commentaire:
Enregistrer un commentaire