Tuesday, March 17, 2020

How do I pass constexpr values to CUDA device-side functions taking const references?

Consider the following code:

template <typename T> __host__ __device__ int foo1(const T& x);
template <typename T> __host__ __device__ int foo2(T x);

These two functions correspond to two common ways of passing "in" parameters (as opposed to "out" or "in/out" parameters). The second is simpler, since no references or addresses are involved; the first avoids copying more complex types, so it is often preferred.
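To make the copy-versus-reference trade-off concrete, here is a minimal host-side sketch. The `Tracked` type, the function bodies, and the two `count_copies_*` helpers are hypothetical and exist only for illustration; the macro guard is an assumption that lets the same file build with a plain C++ compiler.

```cpp
// Outside nvcc the CUDA execution-space qualifiers do not exist,
// so define them away (assumption made for this sketch).
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical type that counts how many times it is copy-constructed.
struct Tracked {
    int value;
    int* copies;  // points at a counter owned by the caller
    __host__ __device__ Tracked(int v, int* counter)
        : value(v), copies(counter) {}
    __host__ __device__ Tracked(const Tracked& other)
        : value(other.value), copies(other.copies) { ++*copies; }
};

// Hypothetical bodies; the declarations above do not define them.
template <typename T> __host__ __device__ int foo1(const T& x) { return x.value; }
template <typename T> __host__ __device__ int foo2(T x) { return x.value; }

// Passing by const reference: no copy constructor call.
int count_copies_by_reference() {
    int copies = 0;
    Tracked t(7, &copies);
    foo1(t);
    return copies;
}

// Passing by value: the lvalue argument is copied into the parameter.
int count_copies_by_value() {
    int copies = 0;
    Tracked t(7, &copies);
    foo2(t);
    return copies;
}
```

With these definitions, the by-reference call leaves the counter at zero, while the by-value call increments it once.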

My problem is with passing constexpr values to the first kind of function (foo1). On the host side there is no problem: constexpr variables have addresses, and the compiler does something reasonable.

But the same is not true on the device side. If we compile:

constexpr const int c { 123 };

__host__   int bar() { return foo1(c); }
__device__ int baz() { return foo1(c); }

The first function will compile fine, but the second one will fail to compile (GodBolt).
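For anyone who wants to reproduce this, a self-contained version might look like the sketch below. The body of `foo1`, the macro guard for non-CUDA compilers, and the `DEMONSTRATE_DEVICE_FAILURE` switch are assumptions added for illustration and are not part of the code above.

```cpp
// Outside nvcc the CUDA execution-space qualifiers do not exist,
// so define them away (assumption made for this sketch).
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical body; the question only declares foo1.
template <typename T>
__host__ __device__ int foo1(const T& x) { return static_cast<int>(x); }

constexpr const int c { 123 };

// Host side: binding c to a const reference is accepted.
__host__ int bar() { return foo1(c); }

// Device side: this is the function reported above as failing to compile
// under nvcc. It is kept behind a hypothetical switch so that the rest of
// the file still builds.
#ifdef DEMONSTRATE_DEVICE_FAILURE
__device__ int baz() { return foo1(c); }
#endif
```

Defining `DEMONSTRATE_DEVICE_FAILURE` when compiling with nvcc should bring the failing function back in.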

I can't provide both overloads, since the compiler often (perhaps always) won't be able to decide between them. And I don't want to pass by value only, either because I want to avoid copying large T's or because some formal constraint requires me to provide foo1().

What can I do, then?

I should also mention that I want to be able to write the same code on both the device and the host side.
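One direction that might be worth trying, offered as a hedged sketch and not as a definitive answer, is to bind the reference to a temporary copy made at the call site. This assumes that device code may read the value of a const integral variable initialized with a constant expression, even though it may not take its address or bind a reference to it. The name `qux` and the body of `foo1` are hypothetical.

```cpp
// Outside nvcc the CUDA execution-space qualifiers do not exist,
// so define them away (assumption made for this sketch).
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical body; the question only declares foo1.
template <typename T>
__host__ __device__ int foo1(const T& x) { return static_cast<int>(x); }

constexpr const int c { 123 };

// int{c} reads the value of c and creates a temporary. The const reference
// parameter binds to that temporary, not to the namespace-scope variable,
// so the same call is written identically for host and device.
__host__ __device__ int qux() { return foo1(int{c}); }
```

The obvious cost is that this copies the value at the call site, so it is only a plausible fit for small built-in types and does not address the large-T case.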
