dimanche 10 février 2019

C++11 Lock-free sequence number generator safe?

The goal is to implement a sequence number generator in modern C++. The context is in a concurrent environment.

Requirement #1 The class must be singleton (common for all threads)

Requirement #2 The type used for the numbers is 64-bit integer.

Requirement #3 The caller can request more than one numbers

Requirement #4 This class will cache a sequence of numbers before being able serve the calls. Because it caches a sequence, it must also store the upper bound -> the maximum number to be able to return.

Requirement #5 Last but not least, at startup (constructor) and when there are no available numbers to give ( n_requested > n_avalaible ), the singleton class must query the database to get a new sequence. This load from DB, updates both seq_n_ and max_seq_n_.

A brief draft for its interface is the following:

class singleton_sequence_manager {

public:

    static singleton_sequence_manager& instance() {
        static singleton_sequence_manager s;
        return s;
    }

    std::vector<int64_t> get_sequence(int64_t n_requested);

private:
    singleton_sequence_manager(); //Constructor
    void get_new_db_sequence(); //Gets a new sequence from DB

    int64_t seq_n_;
    int64_t max_seq_n_;
}

Example just to clarify the use case. Suppose that at startup, DB sets seq_n_ to 1000 and max_seq_n_ to 1050:

get_sequence.(20); //Gets [1000, 1019]
get_sequence.(20); //Gets [1020, 1039]
get_sequence.(5); //Gets [1040, 1044]
get_sequence.(10); //In order to serve this call, a new sequence must be load from DB

Obviously, an implementation using locks and std::mutex is quite simple.

What I am interested into is implementing a lock-free version using std::atomic and atomic operations.

My first attempt is the following one:

int64_t seq_n_;
int64_t max_seq_n_;

were changed to:

std::atomic<int64_t> seq_n_;
std::atomic<int64_t> max_seq_n_;

Get a new sequence from DB just sets the new values in the atomic variables:

void singleton_sequence_manager::get_new_db_sequence() {
    //Sync call is made to DB
    //Let's just ignore unhappy paths for simplicity
    seq_n_.store( start_of_seq_got_from_db );
    max_seq_n_.store( end_of_seq_got_from_db );
    //At this point, the class can start returning numbers in [seq_n_ : max_seq_n_]
}

And now the get_sequence function using atomic compare and swap technique:

std::vector<int64_t> singleton_sequence_manager::get_sequence(int64_t n_requested) {

    bool succeeded{false};
    int64_t current_seq{};
    int64_t next_seq{};

    do {

        current_seq = seq_n_.load();
        do {
            next_seq += current_seq + n_requested + 1;
        }
        while( !seq_n_.compare_exchange_weak( current_seq, next_seq ) );
        //After the CAS, the caller gets the sequence [current_seq:next_seq-1]

        //Check if sequence is in the cached bound.
        if( max_seq_n_.load() > next_seq - 1 )
            succeeded = true;
        else //Needs to load new sequence from DB, and re-calculate again
            get_new_db_sequence();

    }        
    while( !succeeded );

    //Building the response        
    std::vector<int64_t> res{};
    res.resize(n_requested);
    for(int64_t n = current_seq ; n < next_seq ; n++)
        res.push_back(n);

    return res;
}

Thoughts:

  • I am really concerned for the lock-free version. Is the implementation safe ? If we ignore the DB load part, obviously yes. The problem arises (in my head at least) when the class has to load a new sequence from the DB. Is the update from DB safe ? Two atomic stores ?

  • My second attempt was to combine both seq_n_ and max_seq_n_ into a struct called sequence and use a single atomic variable std::atomic but the compiler failed. Because the size of the struct sequence is greater than 64-bit.

  • Is it possible to somehow protect the DB part by using an atomic flag for marking if the sequence is ready yet: flag set to false while waiting the db load to finish and to update both atomic variables. Therefore, get_sequence must be updated in order to wait for flag to bet set to true. (Use of spin lock ?)

Aucun commentaire:

Enregistrer un commentaire