c++11: How do I declare a function whose argument is a closure in a CUDA device function (C++11)?

Friday, 29 March 2019

I'm working in CUDA with C++11 (I don't think CUDA supports later C++ versions yet). I have a closure object that is passed to the function Process(), which calls the closure once per iteration.

I understand that most std:: functionality is not available in CUDA device code. For example, when I try to use std::function< float(uint32_t) >, I get this error:

error: calling a host function("std::function ::function< ::, void, void> ") from a global function("_NV_ANON_NAMESPACE::LargeKernel") is not allowed

What can I replace lookupFunc with so that this compiles without std::function being available? I was able to work around this by creating a function template to deduce the type of the lambda function.

This code works and shows the workaround I've employed:

//using lookupFunc = std::function< float(uint32_t) >;

template< typename Lambda > // Workaround: a function template deduces the lambda's type
__device__
void Process(float       * const outData,
             const  int32_t      locationX,
             const Lambda /* lookupFunc */ lambda)
{
    float answer = 0.f;

    for( int32_t offset = 0 ; offset < 1024 ; ++offset )
    {
        const float value = lambda( offset );

        answer += value;
    }

    outData[ locationX ] = answer;
}

__global__
void LargeKernel(const float * const inData,
                 float       * const outData)
{
    constexpr uint32_t cellStride = 1;
    const     int32_t  locationX  = threadIdx.x + blockDim.x * blockIdx.x;
    const auto lambda
        = [locationX, inData, cellStride](const int32_t offset)
          {
              return inData[ locationX + offset + cellStride ];
          };

    Process( outData, locationX, lambda );
}

I also tried:

using lookupFunc = float(* const)(uint32_t);

But that gives this error:

error: no suitable conversion function from "const lambda ->float" to "float (*)(uint32_t)" exists
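
For context on this error: in C++11 only a lambda with no captures converts to a function pointer, and the lambda above captures locationX, inData and cellStride. One way to keep a function pointer is to pass the captured state explicitly alongside it. The following is a minimal sketch of that idea, not a tested CUDA solution. LookupContext, LookupInput and the DEVICE macro are hypothetical names that do not appear in the original code, and the macro expands to nothing outside nvcc so the sketch can be checked as plain C++11 on the host.

```cpp
#include <cstdint>

// Hypothetical helper macro: CUDA qualifiers under nvcc, nothing otherwise.
#ifdef __CUDACC__
#define DEVICE __host__ __device__
#else
#define DEVICE
#endif

// Hypothetical bundle of the values the lambda captured.
struct LookupContext
{
    const float * inData;
    int32_t       locationX;
    uint32_t      cellStride;
};

// Function pointer type that receives the captured state as an argument.
using lookupFunc = float (*)(const LookupContext &, int32_t);

// Ordinary function that does what the lambda body did.
DEVICE
float LookupInput(const LookupContext & context, const int32_t offset)
{
    return context.inData[ context.locationX + offset + context.cellStride ];
}

// Non-template Process(): the callable is a plain function pointer.
DEVICE
void Process(float       * const outData,
             const int32_t       locationX,
             const lookupFunc    lookup,
             const LookupContext & context)
{
    float answer = 0.f;

    for( int32_t offset = 0 ; offset < 1024 ; ++offset )
    {
        answer += lookup( context, offset );
    }

    outData[ locationX ] = answer;
}
```

In the kernel, the call would become something like Process( outData, locationX, &LookupInput, LookupContext{ inData, locationX, cellStride } ). An indirect call through a pointer may be harder for the compiler to inline than the template version, so this trades some potential performance for a concrete parameter type.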

How can I declare the type of the third argument to Process() without using a template?
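
One non-template possibility is to name the closure type by hand: a small struct whose members mirror the lambda's captures and whose operator() holds the lambda's body. Process() can then take that struct as an ordinary parameter. This is only a sketch under assumptions. InputLookup and the DEVICE macro are hypothetical names introduced for illustration, and the macro expands to nothing outside nvcc so the code also compiles as plain C++11 on the host.

```cpp
#include <cstdint>

// Hypothetical helper macro: CUDA qualifiers under nvcc, nothing otherwise.
#ifdef __CUDACC__
#define DEVICE __host__ __device__
#else
#define DEVICE
#endif

// Hand-written stand-in for the lambda's closure type (hypothetical name).
// Each member corresponds to one of the lambda's captures.
struct InputLookup
{
    const float * inData;
    int32_t       locationX;
    uint32_t      cellStride;

    // Same body as the original lambda.
    DEVICE float operator()(const int32_t offset) const
    {
        return inData[ locationX + offset + cellStride ];
    }
};

// Non-template Process(): the third parameter has a concrete, named type.
DEVICE
void Process(float       * const outData,
             const int32_t       locationX,
             const InputLookup   lookup)
{
    float answer = 0.f;

    for( int32_t offset = 0 ; offset < 1024 ; ++offset )
    {
        answer += lookup( offset );
    }

    outData[ locationX ] = answer;
}
```

Inside LargeKernel, the lambda would be replaced by something like InputLookup{ inData, locationX, cellStride }. The drawback is that Process() now accepts only this one callable type, whereas the template version accepts any callable with a matching signature.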
