vendredi 1 mai 2015

Cancelling callbacks in Boost ASIO

I've been trying to switch my code from using one io_service per network connection to using shared ones, and I'm seeing some very odd behaviour on server sockets (client ones seem to work OK).

In order to try to work out what's going on I've re-started building up a simple example that will allow me to check my assumptions about everything that ought to happen. The first problem I've hit is that io_service::run doesn't exit when there are no handlers left, and as far as I can tell the handlers aren't removed from the work queue.

I have one thread that does an async_accept followed by an async_read. There is a separate client thread (which has its own io_service). The client thread's io_service is never run, and the server's one is run in yet another thread.

I'm using a condition variable to wait in the server thread for the read to complete (which will never happen as the client never writes). This times out just fine and then I call socket.cancel(). I would expect this to remove the read handler and run to exit as the work queue is now empty.

I do see the read handler get called (with a cancel error), but run never exits. When I tie the socket lifetime to the handler lifetime (by lambda capturing a shared_ptr to the socket) the memory isn't freed either.

The server is set up like this:

std::mutex mutex;
std::unique_lock<std::mutex> lock(mutex);
std::condition_variable signal;

boost::asio::io_service server_service;
boost::asio::ip::tcp::acceptor listener(server_service);
std::mutex read_mutex;
std::unique_lock<std::mutex> read_lock(read_mutex);
std::condition_variable read_done;
std::thread server([&]() {
    std::unique_lock<std::mutex> lock(mutex);
    listener.open(boost::asio::ip::tcp::v4());
    listener.set_option(boost::asio::socket_base::enable_connection_aborted(true));
    listener.bind(boost::asio::ip::tcp::endpoint(boost::asio::ip::tcp::v4(), 4567));
    listener.listen();

    std::shared_ptr<connection> server_cnx(new connection(server_service));
    listener.async_accept(server_cnx->socket,
        [&, server_cnx](const boost::system::error_code& error) {
            log_thread() << "Server got a connection " << error << std::endl;
            boost::asio::async_read_until(server_cnx->socket, server_cnx->buffer, '\n',
                [&, server_cnx](const boost::system::error_code& error, std::size_t bytes) {
                    log_thread() << "Got " << bytes << ", " << error << std::endl;
                    std::unique_lock<std::mutex> lock(read_mutex);
                    lock.unlock();
                    read_done.notify_one();
                });
        });
    lock.unlock();
    signal.notify_one();
    if ( read_done.wait_for(read_lock, std::chrono::seconds(1)) == std::cv_status::timeout ) {
        log_thread() << "Server read timed out -- cancelling socket jobs" << std::endl;
        server_cnx->socket.cancel();
        server_cnx->socket.close();
    } else {
        log_thread() << "Server data read" << std::endl;
    }
    log_thread() << "Exiting server thread" << std::endl;
});
signal.wait(lock);
log_thread() << "Server set up" << std::endl;

The io_service thread is set up like this:

std::thread server_io([&]() {
    log_thread() << "About to service server IO requests" << std::endl;
    try {
        server_service.run();
    } catch ( ... ) {
        log_thread() << "Exception caught" << std::endl;
    }
    log_thread() << "**** Service jobs all run" << std::endl;
    signal.notify_one();
});

The output is as follows:

10.0002 139992957945728 Server set up
10.0005 139992957945728 Client set up
10.0006 139992848398080 About to service server IO requests
10.0006 139992848398080 Server got a connection system:0
11.0003 139992934819584 Server read timed out -- cancelling socket jobs
11.0004 139992934819584 Exiting server thread
11.0004 139992848398080 Got 0, system:125
20.0006 139992957945728 IO thread timed out servicing requests -- stopping it
^^^ This should not happen because the server service should have run out of work
20.0006 139992957945728 Waiting for things to close....
22.0008 139992957945728 Wait over, exiting

(Columns are time + 10s, thread ID, log message)

At the 11 second mark you can see that the async_read_until is called. This is the last handler in the server's io_service and yet run doesn't exit.

Even after the time out waiting for run to exit fires and the waiting thread does io_service::stop(), still run doesn't exit (there's another 2 second wait there).

The full code is on github

Aucun commentaire:

Enregistrer un commentaire