Friday, March 25, 2022

Strange race condition when calling pthread_getschedparam from within a C++11 std::thread's thread-function?

I have some (not very portable) code that uses the pthread library's pthread_getschedparam() and pthread_setschedparam() functions to set the priority of a C++11 thread, at least on platforms where C++11 threads are implemented on top of pthreads.

I call these two pthread functions from within the C++11 thread's entry function itself.

However, I've noticed a strange, race-condition-like problem: sometimes pthread_getschedparam() returns error code ESRCH/3 (aka "No such process"). If I then repeat the call in a loop, it eventually succeeds.

This seems very odd to me, since I'm calling these functions from within the very thread that the functions occasionally think doesn't exist.

Below is a toy example program that reproduces the fault for me (under both macOS 12.3 and Ubuntu 20):

#include <thread>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

// Thread entry point: receives a pointer to the std::thread object that runs it
static void InternalThreadEntryFunc(std::thread * threadPtr)
{
   int schedPolicy;
   sched_param param;

   int ret = 1;
   int errors = 0;

   // Keep retrying until the call succeeds, counting the failures along the way
   while(ret != 0)
   {
      ret = pthread_getschedparam(threadPtr->native_handle(), &schedPolicy, &param);
      if (ret != 0) {errors++; printf("ret=%i [%s]\n", ret, strerror(ret));}
   }
   if (errors) printf("errors=%i\n", errors);
}

int main(int, char **)
{
   while(1)
   {
      std::thread thread;

      thread = std::thread(InternalThreadEntryFunc, &thread);
      thread.join();
   }

   return 0;
}

The output I get from the above program looks like this (with more output being printed periodically, every few seconds):

$ ./a.out 
ret=3 [No such process]
errors=1
ret=3 [No such process]
ret=3 [No such process]
errors=2
ret=3 [No such process]
ret=3 [No such process]
ret=3 [No such process]
ret=3 [No such process]
ret=3 [No such process]
errors=5
ret=3 [No such process]
errors=1
[...]

Does anyone know what is causing these errors, or what I might do to resolve them elegantly? (Simply retrying the call until it works is the non-elegant solution, of course.)
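One plausible explanation, offered as a sketch rather than a confirmed diagnosis: in main(), `thread = std::thread(...)` starts the new thread as soon as the temporary is constructed, and only afterwards move-assigns it into `thread`. The entry function can therefore call threadPtr->native_handle() while `thread` is still default-constructed, and that unsynchronized read is a data race in its own right. If that is what is happening, asking pthreads for the calling thread's own handle via pthread_self() avoids touching the std::thread object at all. GetOwnSchedParam() and CountSchedParamErrors() below are hypothetical names introduced for illustration; they are not part of the program above.

```cpp
#include <cassert>
#include <pthread.h>
#include <sched.h>
#include <thread>

// Hypothetical helper (not in the original program): queries the calling
// thread's scheduling parameters through pthread_self(), so it never reads
// the std::thread object that main() may still be assigning to.
static int GetOwnSchedParam(int & schedPolicy, sched_param & param)
{
   return pthread_getschedparam(pthread_self(), &schedPolicy, &param);
}

// Hypothetical driver: runs the query on `iterations` short-lived threads,
// one after another, and returns how many of the calls failed.
static int CountSchedParamErrors(int iterations)
{
   assert(iterations >= 0);

   int errors = 0;
   for (int i = 0; i < iterations; i++)
   {
      int ret = -1;
      std::thread t([&ret]() {
         int schedPolicy;
         sched_param param;
         ret = GetOwnSchedParam(schedPolicy, param);
      });
      t.join();   // join() synchronizes with the thread, so reading ret here is safe
      if (ret != 0) errors++;
   }
   return errors;
}
```

Calling CountSchedParamErrors(1000) from main() mirrors the loop in the toy program. Under the assumption above, it should report no ESRCH failures, because the handle now comes from the running thread itself rather than from an object owned by another thread.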
