Monday, March 19, 2018

Moving computations to a different thread, with constraints: even slower?

There's an old C++11 program, designed about 10 years ago, that does CPU-intensive work. It's still maintained, but I can neither modify its sources nor access them.

It's designed as a main loop that updates a massive world data array (imagine a big, clunky A* running). It's all sequential, with no multithreading whatsoever. When the processing is finished, the frame is rendered, and the loop moves on to the next frame.

This program has a plugin system that lets me hook my own C++11 code at one specific point in the frame computation. From there, I can read the data as I see fit.

I'm planning on writing new code. To give you some rough context: it would be C++11 code for Windows 10, meant to run on any 64-bit processor with more than two cores, manufactured in 2018. I do not care about performance drops on older systems. I only want to take advantage of multithreading for future systems.

This new code would do the following (I'm not detailing how or why; the percentages below correspond to what I have analyzed as feasible):

  • Completely shut down a fraction (let's say 15%) of the original program's computation and memory reads.
  • As a replacement, do two things:
    • Do its own (limited) reading of the original memory space. That would reintroduce about 2% of load in the form of reads from that now-shared memory space. These reads are done concurrently with the old program. They only read memory and involve no heavy calculations.
    • The remaining 13% is now done in a separate thread created by the new code. That thread is the only one to access its own data, so there is no concurrent access to it, and the OS will most likely schedule it on a different core, since the original code completely saturates its own core.
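The split described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the real plugin API: `OffloadWorker`, `submit`, `wait_idle` and `heavy` are hypothetical names, the world data is reduced to a plain `int` array, and the heavy work is a stand-in sum of squares. It also assumes the copy happens inside the hook, at a moment when the host is not writing to the array; reading plain data while another thread writes it would be a data race in C++11.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical worker: owns a private copy of the data and its own thread.
class OffloadWorker {
public:
    OffloadWorker()
        : has_job_(false), busy_(false), stop_(false), result_(0),
          thread_(&OffloadWorker::run, this) {}

    ~OffloadWorker() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stop_ = true;
        }
        cv_.notify_all();
        thread_.join();
    }

    // Called from the plugin hook, on the host's thread (the cheap "2%" part).
    // Copies the slice we need and returns; an unprocessed older snapshot
    // is simply overwritten (latest wins).
    void submit(const int* world, std::size_t count) {
        assert(world != nullptr || count == 0);
        std::vector<int> snapshot(world, world + count);  // copy outside the lock
        {
            std::lock_guard<std::mutex> lock(m_);
            inbox_.swap(snapshot);
            has_job_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until the worker has nothing pending, then returns the last result.
    long long wait_idle() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !has_job_ && !busy_; });
        return result_;
    }

private:
    // Stand-in for the offloaded "13%" computation (assumption: sum of squares).
    static long long heavy(const std::vector<int>& data) {
        long long sum = 0;
        for (std::size_t i = 0; i < data.size(); ++i) {
            sum += static_cast<long long>(data[i]) * data[i];
        }
        return sum;
    }

    void run() {
        std::vector<int> local;  // private to the worker thread
        for (;;) {
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return has_job_ || stop_; });
                if (stop_) return;
                local.swap(inbox_);
                has_job_ = false;
                busy_ = true;
            }
            const long long sum = heavy(local);  // no lock held while computing
            {
                std::lock_guard<std::mutex> lock(m_);
                result_ = sum;
                busy_ = false;
            }
            cv_.notify_all();
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::vector<int> inbox_;
    bool has_job_;
    bool busy_;
    bool stop_;
    long long result_;
    std::thread thread_;  // declared last so it starts after the other members
};
```

In this sketch the hook would call `submit(world, count)` once per frame and pick up `wait_idle()` (or a non-blocking variant) on a later frame. The host thread only pays for the copy; the lock is held just long enough to swap buffers, so the worker's computation never blocks the main loop.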

Would you say that this could help the original program scale up, or do you think that the 2% of memory reads from the shared memory space would create slowdowns that cancel out the benefit of moving the other 13% to a separate memory space handled by a separate core?

The important part is that the goal is not to make the current load faster, but to increase the load, so that the program can be (moderately) scaled up for a few more years before it's put to sleep.
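As a rough, back-of-the-envelope bound on that headroom, assume (none of this is established above) that the per-frame cost T of the original program scales linearly with load, that the 2% of reads adds no extra contention, and that the worker thread always finishes within the frame. The main thread's per-frame cost and the largest load factor s_max that fits in the original frame time would then be:

\[
T_{\text{main}} = (1 - 0.15 + 0.02)\,T = 0.87\,T,
\qquad
s_{\max} = \frac{T}{T_{\text{main}}} = \frac{1}{0.87} \approx 1.15
\]

Under those assumptions the load could grow by about 15% at best; any cache or memory-bus contention caused by the shared reads would reduce that figure.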

Do you have implementation advice?
