TL;DR
Can you increase the priority of threads?
Are there optimizations that allow better thread balancing given a limited number of cores (say, 4)?
I'm new to threaded programming and wanted to understand it a little better. I did just enough research to write a simple program with threads, then used what I learned to write a reasonably slow algorithm so I could compare a multithreaded run against a single-threaded one.
I expected the results to show that the single thread is faster, since that is what I had been told. For my code they did: the single-threaded pass completed in roughly 60% of the time taken by the 4-thread pass. This was relatively consistent across many load sizes.
But I noticed a few things. During the multithreaded execution I was watching my core load distribution and saw a single core taking the bulk of the load.
This led me to believe that threaded programs have some behavior I would want to turn off to get the best result.
Instead, I tried the opposite idea. I added a std::this_thread::sleep_for(std::chrono::nanoseconds(1)); because I remembered from experience that adding a sleep to a forever loop makes it more system friendly. I was astonished by the results.
My original idea, that threads spread the load and must therefore be faster, now appeared to hold, by a factor close to the thread count I was using. (Not quite: the 4-thread run showed almost no change in performance, while the single-threaded run became 4-6x slower than before.)
This raises the question: do threads come with a built-in willingness to be interrupted, and can it be disabled at the cost of potentially losing all cores to them? Or are there other techniques that provide better performance? From my perspective, the only reason the threads are slower is a lower execution priority compared to a single uninterrupted thread, at least in my particular example.
Code I used to test this behavior:
#include <chrono>
#include <ctime>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

std::thread t1, t2, t3;
std::string s0, s1, s2, s3; // one result slot per call, so no two threads write the same string

// Removes a[k] by shifting the tail one place to the left,
// then refills the last slot with a[size - 2] + 2.
void shiftout(int* a, unsigned size, unsigned k)
{
    while (++k < size)
    {
        a[k - 1] = a[k];
    }
    a[size - 1] = a[size - 2] + 2;
}

// title is unused; pram selects which global string receives the result.
void function(std::string title, int pram, unsigned ARRAY_SIZE)
{
    std::string result;
    std::vector<int> a(ARRAY_SIZE); // replaces the non-standard variable-length array
    unsigned i, j;
    for (i = 0; i < ARRAY_SIZE; ++i)
    {
        a[i] = 2 * i + 1;
    }
    a[0] = 2;
    i = 5;
    shiftout(a.data(), ARRAY_SIZE, i--);
    while (++i < ARRAY_SIZE)
    {
        j = 0;
        // Drop a[i] as soon as an earlier entry divides it.
        do if (a[i] % a[j] == 0)
        {
            // std::cout<<i<<","<<j<<" "<<a[i]<<"/"<<a[j]<<std::endl;
            shiftout(a.data(), ARRAY_SIZE, i);
            break;
        }
        while (++j < i);
        // std::this_thread::sleep_for(std::chrono::nanoseconds(1));
    }
    for (i = 0; i < ARRAY_SIZE; ++i)
    {
        result += std::to_string(a[i]);
        result += ' ';
    }
    switch (pram)
    {
        case 0: s0 = result; break;
        case 1: s1 = result; break;
        case 2: s2 = result; break;
        case 3: s3 = result; break;
    }
}

int main()
{
    std::clock_t time;
    unsigned s = 0x100;
    std::ios::sync_with_stdio(false);

    // Single-threaded pass: four calls back to back.
    // Note: std::clock reports processor time used by the process,
    // not elapsed wall-clock time.
    time = std::clock();
    function("Thread 0", 0, s);
    function("Thread 0", 1, s);
    function("Thread 0", 2, s);
    function("Thread 0", 3, s);
    std::cout
        // <<s0<<" "<<s1<<" "<<s2<<" "<<s3<<" "<<std::endl
        << std::endl << "1 thread completion in "
        << (std::clock() - time) / (double)CLOCKS_PER_SEC << std::endl;

    // Four-thread pass: three workers plus the main thread.
    // The timer starts before the workers are launched so their startup is included.
    time = std::clock();
    t1 = std::thread(function, "Thread 1", 1, s);
    t2 = std::thread(function, "Thread 2", 2, s);
    t3 = std::thread(function, "Thread 3", 3, s);
    function("Thread 0", 0, s);
    t1.join();
    t2.join();
    t3.join();
    std::cout
        // <<s0<<" "<<s1<<" "<<s2<<" "<<s3<<" "<<std::endl
        << std::endl << "4 thread completion in "
        << (std::clock() - time) / (double)CLOCKS_PER_SEC << std::endl;
    return 0;
}