dimanche 13 août 2017

c++ 11 rtlocks.cpp critical_section::unlock() crash - Access Violation

I'm building an MFC C++11 Applications with multiple threads. I'm experiencing an Access Violation crash when a call is being made to pNextNode->Unblock() in the code below. pNextNode is not null but is pointing to what seems like freed or corrupted memory. The following code is copied from rtlocks.cpp and the trigger for the crash is a call for some std::unique_lock::unlock(). The crash is pretty rare, it happens once every 10-15 reproduce tries so my guess is it might be a racing problem. I'm not sure about the next fact but it seems that a condition_variable wait is always involved somehow in the crash but the condvar is using a different lock with a different mutex in a different thread so I'm not sure if that the cause for the crash.

I'm not pasting any code of mine because there's a lot of it involved so I'd like to keep my question as simple as possible. My question is in what conditions can a situation/crash like this happen where the pNextNode pointer is corrupted.

_CRTIMP void critical_section::unlock()
{
    LockQueueNode * pCurrentNode = reinterpret_cast<LockQueueNode *>(_M_pHead);

    _ASSERT_EXPR(pCurrentNode != nullptr, L"Lock not being held");
    _ASSERT_EXPR(pCurrentNode->m_pContext == SchedulerBase::SafeFastCurrentContext(), L"Lock being held by different context");

    // Reset the context on the active node to ensure that it is possible to detect the error case
    // where the same context tries to enter the lock twice. (enjoy the fence below)
    reinterpret_cast<LockQueueNode *>(&_M_activeNode)->m_pContext = nullptr;

    LockQueueNode * pNextNode = pCurrentNode->m_pNextNode;

    // This assignment must be put before ICEP(_M_pTail) because 
    // 1. It must be visible before ICEP(_M_pTail), and
    // 2. It must be visible before unblock.
    _M_pHead = pNextNode;

    // If we reach the end of the queue of waiters, we need to handle a potential race with new incoming waiters.
    if (pNextNode == nullptr && reinterpret_cast<LockQueueNode *>(InterlockedCompareExchangePointer(&_M_pTail, nullptr, pCurrentNode)) != pCurrentNode)
    {
        // If someone is adding a context then wait until next node pointer is populated.
        pNextNode = pCurrentNode->WaitForNextNode();
        // DO NOT try to combine this assignment with the one above by moving it out of and after the if statement. Moving it to after the if statement
        // could lead to situations where the set of _M_pHead is not fenced.
        _M_pHead = pNextNode;
    }

    // It's no longer safe to touch pNextNode after it gets unblocked
    while (pNextNode != nullptr && !pNextNode->Unblock()) // Crash here
    {
        // The unblock could only have failed because the block is a timed block and was woken up by the timer.
        pCurrentNode = pNextNode;
        pNextNode = pCurrentNode->m_pNextNode;

        // This assignment must be put before ICEP(_M_pTail) below
        _M_pHead = pNextNode;

        // If we reach the tail end of the waiters queue, we need to handle a potential race due to a new waiter being added to the tail.
        if (pNextNode == nullptr && reinterpret_cast<LockQueueNode *>(InterlockedCompareExchangePointer(&_M_pTail, nullptr, pCurrentNode)) != pCurrentNode)
        {
            // If someone is adding a context then wait until next node pointer is populated.
            pNextNode = pCurrentNode->WaitForNextNode();
            _M_pHead = pNextNode;
        }

        // Since this was a timer node that was released by the timer firing we need to release our reference on it so it can be deleted.
        pCurrentNode->DerefTimerNode();
    }
}

Aucun commentaire:

Enregistrer un commentaire