Wednesday, 28 March 2018

Implementing an async algorithm using Intel TBB parallelization

I'm trying to implement an async algorithm, which could be represented as follows:

#include <tbb/parallel_for.h>

// A naive data-reading class that returns values from an internal index.
struct DataLoader {
    DataLoader() : index(0) {}

    // Returns 0, 1, ..., 200 and then wraps back to 0.
    int read() {
        if (index > 200) { index = 0; }
        return index++;
    }

    int index;
};

int main() {
    DataLoader d;
    int rez[100] = {0};  // stores the final result

    tbb::parallel_for(0, 1000, [&](int i) {
        // 1) `d` is accessed by multiple threads, deliberately without a mutex
        int x = d.read();

        // 2) ditto: `rez` is updated without a mutex
        rez[i % 100] += x;
    });
}

I know that parallel_for in the main function above introduces race conditions, but I don't care whether d or rez stays in sync across threads; having race conditions is totally fine for my purposes.

All I care about is making it run fast enough, so the question is: is my code doing this right?
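
For comparison, here is a minimal sketch of a data-race-free variant of the same access pattern. In standard C++, unsynchronized concurrent writes to a plain int are a data race and therefore undefined behavior, so "no mutex" does not have to mean "no synchronization": relaxed atomics are one lock-free option. The names AtomicDataLoader and run_accumulation are hypothetical (they are not part of the code above), and std::thread stands in for tbb::parallel_for so that the sketch depends only on the standard library. Whether this is fast enough for a given workload is an assumption that would need measuring.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical lock-free counterpart of DataLoader: the index is atomic.
struct AtomicDataLoader {
    std::atomic<unsigned> index{0};

    // Returns values in [0, 200], cycling like the original read().
    int read() {
        // fetch_add is one atomic read-modify-write, so every caller gets a
        // distinct ticket; relaxed ordering is enough for a plain counter.
        unsigned ticket = index.fetch_add(1, std::memory_order_relaxed);
        return static_cast<int>(ticket % 201);
    }
};

// Hypothetical driver: runs `total` iterations on `workers` threads and
// returns the 100 accumulated sums.
std::array<int, 100> run_accumulation(int total, int workers) {
    assert(workers > 0 && total >= 0);

    AtomicDataLoader d;
    std::array<std::atomic<int>, 100> rez;
    for (auto& slot : rez) slot.store(0, std::memory_order_relaxed);

    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&d, &rez, total, workers, w] {
            // Strided split: worker w handles i = w, w + workers, ...
            for (int i = w; i < total; i += workers) {
                int x = d.read();
                // Atomic add: concurrent updates to a slot are never lost.
                rez[i % 100].fetch_add(x, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : pool) t.join();  // join makes all updates visible here

    std::array<int, 100> out{};
    for (int k = 0; k < 100; ++k) out[k] = rez[k].load();
    return out;
}
```

Calling run_accumulation(1000, 4) mirrors the loop bounds of the original main. Which value lands in which slot still depends on thread scheduling, but no update is lost, so the sum over all slots is the same on every run. The same two fetch_add calls should work unchanged inside a tbb::parallel_for body.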
