dimanche 6 mai 2018

Multithread image-loading crash with SDL2

good people.

For the past couple of weeks I've been dealing with an annoying multithreading problem which I'm really struggling to resolve. I will be very happy and proud of the community if you help me address the issue.

  • Error code: double free or corruption (faststop)

  • Explanation of the problem and behaviour of the program: When I execute the program on a single thread - there is no problem, but when when I allow multithreading - my program crashes. For slower machines it could crash once in 50 runs, but with faster machines (bigger and faster number of cores) - it crashesh like 25 of 50 times. (unpreddicted behaviour)

  • Source code: At the bottom of the post you can find all the detailed 1:1 source code that causes the crash.

Useful info: From what I've read about multithread resource loading - only the thread that created the SDL_Renderer can create valid SDL_Texture's with that renderer. The other threads can only "help" by loading the image from harddrive with IMG_Load() since it's only CPU work. That's exatcly what I am doing.

  • Miltithread approach:

1) I construct thread safe queue with absolute image paths on the harddrive from the main thread (still on single core at this moment);

2) I spawn worker threads and pass them the above thread safe queue, which they start to consume.

3) The worker threads execute IMG_Load() (Loading images /or SDL_Surfaces/ with the CPU from the harddrive) and store the loaded SDL_Surfaces into another thread safe queue. 4) The main thread is waiting with a condition variable on the second thread safe queue and start to consume it and construct SDL_Textures out of the popped SDL_Surfaces;

5) Only the main thread calls Texture::freeSurface() on a surface once, when the SDL_Texture has been successfully constructed (GPU work).

Where it crashes: In Texture::freeSurface() method at the SDL_FreeSurface() call.

void Texture::freeSurface(SDL_Surface *& surface)
{
    if(surface && surface->refcount) //sanity check
    {
        SDL_FreeSurface(surface);
        surface = nullptr;
    }
}

Important note: When the crash occures - in the Texture::freeSurface() function - the value of "surface" is not nullptr and the value of surface->refcount is 0. SDL2 updates the value of refcount when a surface is freed, so again this points to a multithreading issue.

Important note 2: Happened to me only once but I did get a SEGFAULT on the destructor of a ThreadSafeQueue.

The header file for "Texture.h" is not included, because I ran out of space in the fill form, but it's trivial - all functions are static.

Thank you for your time.

/*
 * ThreadSafeQueue.hpp
 *  @brief: thread-safe multiple producer, multiple consumer queue
 */

#ifndef THREADSAFEQUEUE_HPP_
#define THREADSAFEQUEUE_HPP_

//C++ system headers
#include <mutex>
#include <condition_variable>
#include <queue>
#include <utility>
#include <atomic>

template<typename T>
class ThreadSafeQueue
{
    public:
        ThreadSafeQueue() : _isKilled(false) {}

        //copy and move constructors and assignment operators are deleted

        virtual ~ThreadSafeQueue() = default;

        /** @brief used to push new elements to the queue
         *
         *         NOTE: push() method moves the pushed data,
         *                                              rather than copying it.
         *
         *  @param const T & - reference to the data that is being pushed
         * */
        void push(const T & newData)
        {
            //lock the queue
            std::unique_lock<std::mutex> lock(_mutex);

            //push the data to queue by moving it
            _data.push(std::move(newData));

            /** Manually unlock the queue since we want to notify any threads
              * that are waiting for data
              * */
            lock.unlock();

            //notify one blocked thread
            _condVar.notify_one();
        }

        /** @brief used to push new elements to the queue
         *
         *         NOTE: use pushWithCopy() method only when you have a
         *               good reason not to have your original resource moved.
         *               pushWithCopy() method copies the pushed data,
         *                                              rather than moving it.
         *
         *  @param const T & - reference to the data that is being pushed
         * */
        void pushWithCopy(const T & newData)
        {
            //lock the queue
            std::unique_lock<std::mutex> lock(_mutex);

            //push the data to queue by copying it
            _data.push(newData);

            /** Manually unlock the mutex since we want to notify any threads
              * that are waiting for data
              * */
            lock.unlock();

            //notify one blocked thread
            _condVar.notify_one();
        }

        /** @brief used to acquire data from the queue.
         *
         *         NOTE: if queue is empty - the thread is put to sleep with
         *               a condition variable and when new data is being pushed
         *               into the queue - the thread is woken up again
         *
         *  @param T & - reference to the value that is being popped
         *                                                       from the queue
         * */
        void waitAndPop(T & value)
        {
            //lock the queue
            std::unique_lock<std::mutex> lock(_mutex);

            /** Condition variables can be subject to spurious wake-ups,
             *  so it is important to check the actual condition
             *  being waited for when the call to wait returns
             * */
            while (_data.empty() && !_isKilled)
            {
                _condVar.wait(lock);
            }

            //user has requested shutdown
            if(_isKilled)
            {
                return;
            }

            //acquire the data from the queue
            value = std::move(_data.front());

            //pop the 'moved' queue node
            _data.pop();
        }

        /** @brief used to try and acquire data from the queue.
         *
         *         NOTE: if queue is not empty - the data from the queue is
         *               being moved to the T & value function argument
         *
         *  @param bool - pop was successful or not
         * */
        bool tryPop(T & value)
        {
            //lock the queue
            std::lock_guard<std::mutex> lock(_mutex);

            //if queue is empty - try_pop() fails -> return false status
            if (_data.empty())
            {
                return false;
            }

            //acquire the data from the queue
            value = std::move(_data.front());

            //pop the 'moved' queue node
            _data.pop();

            //try_pop() succeeded -> return true status
            return true;
        }

        /** @brief used check the empty status of the queue.
         *
         *  @param bool - is queue empty or not
         * */
        inline bool isEmpty() const
        {
            //lock the queue
            std::lock_guard<std::mutex> lock(_mutex);

            //return empty status
            return _data.empty();
        }

        /** @brief used acquire the size of the queue.
         *
         *  @param uint64_t - the size of the queue container
         * */
        inline uint64_t size() const
        {
            //lock the queue
            std::lock_guard<std::mutex> lock(_mutex);

            //return actual queue size
            return _data.size();
        }

        /** @brief used initiate shutdown of the queue
         *         (usually invoking of this method by the developer should be
         *          followed by joining thread producers and thread consumers)
         * */
        inline void shutdown()
        {
            //lock the queue
            std::lock_guard<std::mutex> lock(_mutex);

            _isKilled = true;
        }

    private:
        //the actual queue holding the data
        std::queue<T>           _data;

        //the queue mutex used for locking on thread-shared resources
        mutable std::mutex      _mutex;

        /** Conditional variable used for waking threads that are currently
          * waiting for data to be pushed to the queue (starvation)
          * so they can process it
          * */
        std::condition_variable _condVar;

        /** An atomic flag used to signal waiting thread that user requested
         *  queue shutdown with ::shutdown() method (and will probably
         *  join the threads) afterwards
         *  */
        std::atomic<bool>       _isKilled;
};


#endif /* THREADSAFEQUEUE_HPP_ */

====================

/*
 * Texture.cpp
 */

//Corresponding header
#include "Texture.h"

//Other libraries headers
#include <SDL.h>
#include <SDL_image.h>

//Own components headers
#include "Log.h"

SDL_Renderer * Texture::_renderer = nullptr;


int32_t Texture::loadSurfaceFromFile(const char *   path,
                                     SDL_Surface *& outTexture)
{
    int32_t err = EXIT_SUCCESS;

    //memory leak check
    if(nullptr != outTexture)
    {
        LOGERR("Warning non-nullptr detected! Will no create Surface. "
                                                      "Memory leak prevented!");

        err = EXIT_FAILURE;
    }
    else
    {
        //Load image at specified path
        outTexture = IMG_Load(path);
        if(nullptr == outTexture)
        {
            LOGERR("Unable to load image %s! SDL_image Error: %s",
                                                         path, IMG_GetError());

            err = EXIT_FAILURE;
        }
    }

    return err;
}

int32_t Texture::loadTextureFromSurface(SDL_Surface *& surface,
                                        SDL_Texture *& outTexture)
{
    int32_t err = EXIT_SUCCESS;

    if(nullptr == surface)
    {
       LOGERR("Nullptr surface detected. Unable to loadFromSurface()");

       err = EXIT_FAILURE;
    }

    if(EXIT_SUCCESS == err)
    {
        //Check for memory leaks and get rid of preexisting texture
        if(outTexture)
        {
            LOGERR("Warning, trying to dynamically load a new texture before"
                                               "calling delete on the old one");
            freeTexture(outTexture);
        }

        //Create texture from surface pixels
        outTexture = SDL_CreateTextureFromSurface(_renderer, surface);

        if(nullptr == outTexture)
        {
            LOGERR("Unable to create texture! SDL Error: %s", SDL_GetError());

            err = EXIT_FAILURE;
        }

        //Get rid of old loaded surface
        freeSurface(surface);
    }

    return err;
}

void Texture::freeSurface(SDL_Surface *& surface)
{
    if(surface && surface->refcount) //sanity check
    {
        SDL_FreeSurface(surface);
        surface = nullptr;
    }
}

void Texture::freeTexture(SDL_Texture *& texture)
{
    if(texture) //sanity check
    {
        SDL_DestroyTexture(texture);
        texture = nullptr;
    }
}

void Texture::setRenderer(SDL_Renderer * renderer)
{
    _renderer = renderer;
}

====================================

/*
 * ResourceContainer.h
 */

#ifndef RESOURCECONTAINER_H_
#define RESOURCECONTAINER_H_

//C++ system headers
#include <cstdint>
#include <vector>
#include <unordered_map>

//Own components headers
#include "CommonResourceStructs.h"

//Forward declarations
//Since ThreadSafeQueue is a very heavy include -> use forward declaration to it
template <typename T> class ThreadSafeQueue;

struct SDL_Surface;
struct SDL_Texture;


class ResourceContainer
{
    public:
        ResourceContainer();

        virtual ~ResourceContainer() = default;

        /** @brief used to initialise the Resource container
         *
         *  @param const uint64_t - number of widgets to be loaded
         *
         *  @return int32_t       - error code
         * */
        int32_t init(const uint64_t widgetsCount);

        /** @brief used to deinitialize
         *                          (free memory occupied by Resource container)
         * */
        void deinit();

        /** @brief used to load all stored resources from the _rsrsDataMap
         *       > as SDL_Textute * in the _rsrsMap (for Hardware Renderer);
         *
         *  @param const bool - is multithread init allowed
         * */
        void loadAllStoredResources(const bool isMultithreadInitAllowed);

    private:
        /** @brief used to load a single resource at program start up
         *
         *  @param const ResourceData & - populated structure with
         *                                                 Widget specific data
         *  @param SDL_Surface *&       - created SDL_Surface
         *
         *  @returns int32_t            - error code
         * */
        int32_t loadSurface(const ResourceData & rsrcData,
                            SDL_Surface *&       outSurface);

        /** @brief used to internally load all stored resources
         *                                          only using the main thread
         * */
        void loadAllStoredResourcesSingleCore();

        /** @brief used to internally load all stored resources
         *     only using all possible hardware threads (if such are supported)
         *
         *  @param const uint32_t - number of worker threads to spawn
         *                                        (main thread is not included)
         * */
        void loadAllStoredResourcesMultiCore(const uint32_t workerThreadsNum);

        //_rsrcMap holds all Images
        std::unordered_map<uint64_t, SDL_Texture *> _rsrcMap;

        //_rsrcDataMap holds resource specific information for every Image
        std::unordered_map<uint64_t, ResourceData>  _rsrcDataMap;

        /** A copy of the resourceData's (used for
         *                                     multithread loading of resources)
         *  */
        ThreadSafeQueue<ResourceData> *             _resDataThreadQueue;

        /** Holds all loaded SDL_Surface's during initiliazation (used for
         *                                     multithread loading of resources)
         *
         *  std::pair first : unique rsrcId for currently processed ResourceData
         *  std::pair second: SDL_Surface * that will be created from the
         *                                              processed ResourceData
         *  */
        ThreadSafeQueue<std::pair<uint64_t, SDL_Surface *>> *
                                                    _loadedSurfacesThreadQueue;
};

#endif /* RESOURCECONTAINER_H_ */

==============================

/*
 * ResourceContainer.cpp
 */

//Corresponding header
#include "ResourceContainer.h"

//C++ system headers
#include <cstdlib>
#include <thread>

//Other libraries headers
#include "Texture.h"

//Own components headers
#include "ThreadSafeQueue.hpp"

#include "Log.h"

/** @brief used to load SDL_Surface's from file system async
 *
 *  @param resQueue              - the resource Data queue (input)
 *  @param outSurfQueue          - the loaded surfaces queue (output)
 *  @param threadsLeftToComplete - number of threads still working on the async
 *                                 surface load from the file system
 *  */
static void loadSurfacesFromFileSystemAsync(
           ThreadSafeQueue<ResourceData> * resQueue,
           ThreadSafeQueue<std::pair< uint64_t, SDL_Surface *>> * outSurfQueue,
           std::atomic<int32_t> & threadsLeftToComplete)
{
    ResourceData resData;

    SDL_Surface * surface = nullptr;

    while(resQueue->tryPop(resData))
    {
        if(EXIT_SUCCESS !=
             Texture::loadSurfaceFromFile(resData.header.path.c_str(), surface))
        {
            LOGERR("Warning, error in loadSurfaceFromFile() for file %s. "
                            "Terminating other resourceLoading",
                                                resData.header.path.c_str());
            return;
        }

        //push the newly generated SDL_Surface to the ThreadSafe Surface Queue
        outSurfQueue->push(std::make_pair(resData.header.hashValue, surface));

        //reset the variable so it can be reused
        surface = nullptr;
    }

    //decrement number of worker threads to be complete
    --threadsLeftToComplete;

    if(0 == threadsLeftToComplete.load())
    {
        //all worker threads are done -> shutdown the resourceQueue
        resQueue->shutdown();
    }
}


ResourceContainer::ResourceContainer() : _resDataThreadQueue(nullptr),
                                         _loadedSurfacesThreadQueue(nullptr)
{
}

int32_t ResourceContainer::init(const uint64_t widgetsCount)
{
    int32_t err = EXIT_SUCCESS;

    _rsrcDataMap.reserve(widgetsCount);

    _rsrcMap.reserve(widgetsCount);

    _resDataThreadQueue = new ThreadSafeQueue<ResourceData>;

    if(nullptr == _resDataThreadQueue)
    {
        LOGERR("Error, bad alloc for ThreadSafeQueue<ResourceData>");

        err = EXIT_FAILURE;
    }

    if(EXIT_SUCCESS == err)
    {
        _loadedSurfacesThreadQueue =
                        new ThreadSafeQueue<std::pair<uint64_t, SDL_Surface *>>;

        if(nullptr == _loadedSurfacesThreadQueue)
        {
            LOGERR("Error, bad alloc for ThreadSafeQueue<SDL_Surface *>");

            err = EXIT_FAILURE;
        }
    }

    return err;
}

void ResourceContainer::deinit()
{
    //free Image/Sprite Textures
    for(_rsrcMapIt it = _rsrcMap.begin(); it != _rsrcMap.end(); ++it)
    {
        Texture::freeTexture(it->second);
    }

    //clear rsrcMap unordered_map and shrink size
    _rsrcMap.clear();


    //clear rsrcDataMap unordered_map and shrink size
    _rsrcDataMap.clear();


    //release memory for resourceData thread safe queue
    if(_resDataThreadQueue) //sanity check
    {
        delete _resDataThreadQueue;
        _resDataThreadQueue = nullptr;
    }

    //release memory for loaded SDL_Surface's thread safe queue
    if(_loadedSurfacesThreadQueue) //sanity check
    {
        delete _loadedSurfacesThreadQueue;
        _loadedSurfacesThreadQueue = nullptr;
    }
}

void ResourceContainer::loadAllStoredResources(
                                            const bool isMultithreadInitAllowed)
{
    if(!isMultithreadInitAllowed)
    {
        LOGG("Starting Single Core resource loading ");

        loadAllStoredResourcesSingleCore();

        return;
    }

    /** Multi Threading strategy:
     *
     *  N - hardware supported number of cores.
     *
     *  Only the Hardware Renderer can perform GPU operations. With this said:
     *      > spawn N - 1 worker threads that will perform  only the
     *        CPU intensive work (reading files from disk,creating
     *        SDL_Surface's from them and storing those SDL_Surface's into a
     *        ThreadSafeQueue);
     *
     *      > use the main thread to pop generated SDL_Surface's from the
     *        ThreadSafeQueue that the worker threads are populating and start
     *        to create SDL_Texture's (GPU Accelerated Textures) out of those
     *        SDL_Surface's;
     * */

    /** Hardware_concurrency may return 0 if its not supported
      * if this happens -> run the resourceLoad in single core
      * */
    const uint32_t HARDWARE_THREAD_NUM = std::thread::hardware_concurrency();

    /** If threads are <= 1 no need to spawn second thread,
     *  because there will be a performance
     *  lost from the constant threads context switches
     * */
    if(HARDWARE_THREAD_NUM > 1) //multi core case
    {
        /* Generate THREAD_NUM - 1 worker CPU threads and leave the main
         * thread to operate ot GPU + CPU operations
         * */
        const uint32_t WORKER_THREAD_NUM = HARDWARE_THREAD_NUM - 1;

        LOGG("Starting Multi Core resource loading on %u threads",
                                                        HARDWARE_THREAD_NUM);

        loadAllStoredResourcesMultiCore(WORKER_THREAD_NUM);
    }
    else //single core case
    {
        LOGR("Multi Threading is not supported on this hardware. ");
        LOGG("Starting Single Core resource loading ");

        loadAllStoredResourcesSingleCore();
    }
}

int32_t ResourceContainer::loadSurface(const ResourceData & rsrcData,
                                       SDL_Surface *&       outSurface)
{
    int32_t err = EXIT_SUCCESS;

    if(EXIT_SUCCESS !=
            Texture::loadSurfaceFromFile(rsrcData.header.path.c_str(),
                                         outSurface))
    {
        LOGERR("Error in loadSurfaceFromFile() for rsrcId: %#16lX",
                                                rsrcData.header.hashValue);

        err = EXIT_FAILURE;
    }

    return err;
}

int32_t ResourceContainer::loadTextureFromSurface(SDL_Surface *  surface,
                                                  SDL_Texture *& outTexture)
{
    int32_t err = EXIT_SUCCESS;

    if(EXIT_SUCCESS != Texture::loadTextureFromSurface(surface, outTexture))
    {
        LOGERR("Error in Texture::loadTextureFromSurface()");

        err = EXIT_FAILURE;
    }

    return err;
}

void ResourceContainer::loadAllStoredResourcesSingleCore()
{
    //NOTE: some of the thread safe mechanism such as ThreadSafeQueue as reused.
    //the overhead is minimal and the source will be reused.

    ResourceData resData;

    SDL_Surface * newSurface = nullptr;

    while(_resDataThreadQueue->tryPop(resData))
    {
        if(EXIT_SUCCESS !=
                    Texture::loadSurfaceFromFile(resData.header.path.c_str(),
                                                 newSurface))
        {
            LOGERR("Warning, error in loadSurfaceFromFile() for file %s. "
                            "Terminating other resourceLoading",
                                                resData.header.path.c_str());

            return;
        }

        //push the newly generated SDL_Surface to the ThreadSafe Surface Queue
        _loadedSurfacesThreadQueue->push(
                                    std::make_pair(resData.header.hashValue,
                                                   newSurface));

        //reset the variable so it can be reused
        newSurface = nullptr;
    }

    //temporary variables used for _loadedSurfacesThreadQueue::pop operation
    std::pair<uint64_t, SDL_Surface *> currResSurface(0, nullptr);

    //on main thread return
    SDL_Texture * newTexture = nullptr;

    //generate GPU textures from the stored SDL_Surface's
    while(_loadedSurfacesThreadQueue->tryPop(currResSurface))
    {
        if(EXIT_SUCCESS != 
                  Texture::loadTextureFromSurface(currResSurface.second,
                                                  newTexture))
        {
            LOGERR("Error in Texture::loadTextureFromSurface() for rsrcId: "
                                          "%#16lX", currResSurface.first);

            return;
        }

        //store the generates SDL_Texture into the rsrcMap
        _rsrcMap[currResSurface.first] = newTexture;

        //reset the variable so it can be reused
        newTexture = nullptr;
    }
}

void ResourceContainer::loadAllStoredResourcesMultiCore(
                                                const uint32_t workerThreadsNum)
{
    //create a vector pool for worker threads
    std::vector<std::thread> workerThreadPool(workerThreadsNum);

    //used to indicate to the main thread for all worker threads job finished
    std::atomic<int32_t> threadsLeftToComplete(workerThreadsNum);

    //temporary variables used for _loadedSurfacesThreadQueue::pop operation
    std::pair<uint64_t, SDL_Surface *> currResSurface(0, nullptr);

    for(uint32_t i = 0; i < workerThreadsNum; ++i)
    {
        //spawn the worker threads
        workerThreadPool[i] =
                std::thread(loadSurfacesFromFileSystemAsync,
                            _resDataThreadQueue,
                            _loadedSurfacesThreadQueue,
                            std::ref(threadsLeftToComplete));
    }

    //temporary variables used for calculations
    SDL_Texture * newTexture = nullptr;

    //start work on the main thread
    //while waiting for the worker threads to be done
    while(0 != threadsLeftToComplete.load())
    {
        /** Block main thread and wait resources to be pushed
          * into the _loadedSurfacesThreadQueue
          * */
        _loadedSurfacesThreadQueue->waitAndPop(currResSurface);

        if(EXIT_SUCCESS !=
                Texture::loadTextureFromSurface(currResSurface.second,
                                                newTexture))
        {
            LOGERR("Error in Texture::loadTextureFromSurface() for rsrcId: "
                                          "%#16lX", currResSurface.first);

            return;
        }
        else
        {
            //emplace GPU Texture to the rsrcMap
            _rsrcMap[currResSurface.first] = newTexture;

            //reset the variable so it can be reused
            newTexture = nullptr;
        }
    }

    //all worker threads are done with their work - join them
    for(uint32_t i = 0; i < workerThreadsNum; ++i)
    {
        workerThreadPool[i].join();
    }

    //finish rest of the work with main thread (if such is left)
    while(_loadedSurfacesThreadQueue->tryPop(currResSurface))
    {
        if(EXIT_SUCCESS !=
                Texture::loadTextureFromSurface(currResSurface.second,
                                                newTexture))
        {
            LOGERR("Error in Texture::loadTextureFromSurface() for rsrcId: "
                                           "%#16lX", currResSurface.first);

            return;
        }
        else
        {
            //emplace GPU Texture to the rsrcMap
            _rsrcMap[currResSurface.first] = newTexture;

            //reset the variable so it can be reused
            newTexture = nullptr;
        }
    }
}

Aucun commentaire:

Enregistrer un commentaire