Monday, 25 June 2018

Data parallelism via multithreading: calling a utility in each thread to process a file

I'm currently working on a project where I receive a request with a list of files to be processed (say the program is QueueHandler). At present I process the files sequentially by calling another utility program (say FileProcessor), which contains the actual algorithm. QueueHandler calls FileProcessor in a loop using system().

I'm planning to use multithreading so that multiple files can be processed simultaneously; the files are independent of one another. Once FileProcessor has finished a file, QueueHandler checks whether the output was generated successfully, and then sends back a combined report.
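The success check described above could also look at the utility's exit status, not only at the output file. The following is a minimal sketch, assuming a POSIX system (where system() returns a wait status that the macros from <sys/wait.h> can decode). The helper name run_command is hypothetical and not part of the original programs.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <sys/wait.h>   // WIFEXITED / WEXITSTATUS (POSIX only, an assumption)

// Hypothetical helper: runs a shell command and returns the child's exit
// code, or -1 if the command could not be started or did not exit normally
// (for example, if it was killed by a signal).
int run_command(const std::string& command)
{
    assert(!command.empty());

    // std::system blocks the calling thread until the command finishes.
    int status = std::system(command.c_str());
    if (status == -1)
    {
        return -1;                    // the child process could not be created
    }
    if (WIFEXITED(status))
    {
        return WEXITSTATUS(status);   // normal exit: report the exit code
    }
    return -1;                        // abnormal termination
}
```

With a helper like this, QueueHandler could treat a file as processed only when the exit code is 0 and the output file exists, provided FileProcessor returns a non-zero code on failure.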

For illustration I've written a sample program that mimics the behaviour of my actual program: a utility that writes a given number of integers to a file (FileProcessor, aka myPrinterUtility), and another program that calls this utility with different arguments (QueueHandler, aka main.cpp).

Question 1 - Is there a better way to implement this? I would like to keep QueueHandler and FileProcessor as separate programs.

Question 2 - std::thread::hardware_concurrency() returns 4 on my machine, but I might need to process up to 12 files at a time (the current upper cap). How will it affect the speed if I start 12 threads simultaneously versus 4 threads at a time? Which one is better?
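One way to compare the two options in Question 2 is to make the number of worker threads a parameter and time both settings on the real workload. The sketch below is only an illustration under stated assumptions: run_bounded and default_worker_count are hypothetical names, and the job callback is assumed not to throw. It runs a fixed number of jobs on at most max_workers threads, with each worker pulling the next job index from an atomic counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs job(0) .. job(job_count - 1) on at most
// max_workers threads and returns one success flag (1 or 0) per job.
// Assumes job does not throw.
std::vector<char> run_bounded(std::size_t job_count,
                              unsigned max_workers,
                              const std::function<bool(std::size_t)>& job)
{
    assert(max_workers > 0);

    // vector<char> rather than vector<bool>: vector<bool> packs bits, so
    // writing different elements from different threads would not be safe.
    std::vector<char> ok(job_count, 0);
    std::atomic<std::size_t> next{0};

    auto worker = [&]() {
        for (;;)
        {
            std::size_t i = next.fetch_add(1);   // claim the next job index
            if (i >= job_count)
            {
                return;                          // no jobs left
            }
            ok[i] = job(i) ? 1 : 0;
        }
    };

    // Never start more threads than there are jobs.
    unsigned n = static_cast<unsigned>(
        std::min<std::size_t>(max_workers, job_count));
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < n; ++t)
    {
        pool.emplace_back(worker);
    }
    for (auto& th : pool)
    {
        th.join();
    }
    return ok;
}

// hardware_concurrency() is only a hint and may return 0 when the value
// cannot be determined, so fall back to a single worker in that case.
unsigned default_worker_count()
{
    unsigned hc = std::thread::hardware_concurrency();
    return hc == 0 ? 1 : hc;
}
```

In the sample program, the job callback could wrap a call such as CallUtil for the i-th file. Because each thread spends its time blocked in system() while the child process does the work, whether 12 workers beat 4 likely depends on whether FileProcessor is CPU-bound or I/O-bound; measuring both is the safest way to decide.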

Please let me know if any other details are required. I'm new to multithreading, so any suggestions will be helpful.

main.cpp

#include <cstdlib>
#include <fstream>
#include <future>
#include <iostream>
#include <string>
#include <thread>
#include <vector>
using namespace std;



// Returns true if the file could be opened for reading.
bool is_file_exist(const char *fileName)
{
    std::ifstream infile(fileName);
    return infile.good();
}


bool CallUtil(int firstNumber,int totalNumbersToPrint,int fileName)
{
    // Build the command line for the utility program
    string opFile="/home/saumitra/sampleOP/output-"+std::to_string(fileName)+".txt";
    string utilString="/home/saumitra/myPrinterUtility "+std::to_string(firstNumber)+ " " +std::to_string(totalNumbersToPrint)+" "+opFile;

    // system() blocks this thread until the utility exits; a non-zero
    // status means the command failed or could not be started
    int status=system(utilString.c_str());
    return status==0 && is_file_exist(opFile.c_str());
}

int main()
{
    vector<std::future<bool> > myFutureVector;
    for(int first=100,total=1000,i=0;i<5;i++,first=first+100,total=total+5000)
    {
        myFutureVector.push_back(std::async(launch::async,CallUtil,first,total,i));
    }

    for(vector<std::future<bool> >::iterator it=myFutureVector.begin();it!=myFutureVector.end();it++)
    {
        cout<<"final status : "<<it->get()<<endl;
    }
    return 0;
}

myPrinterUtility.cpp

#include<fstream>
#include<string>

using namespace std;

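int main(int argc,char* argv[])
{
    // Expected arguments: <firstNumber> <totalNumbers> <outputFile>
    if(argc!=4)
    {
        return 1;
    }

    string firstNum=argv[1];
    string total=argv[2];
    string filename=argv[3];

    // ios::app appends, so re-running adds to an existing output file
    ofstream filelist;
    filelist.open(filename.c_str(),ios::app);
    if(!filelist)
    {
        return 1;
    }

    int firstNumInt=stoi(firstNum);
    int totalInt=stoi(total);

    for(int i=0;i<totalInt;i++)
    {
        // '\n' avoids the flush that endl would force on every line
        filelist<<i+firstNumInt<<'\n';
    }
    filelist.close();
    return 0;
}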
