c++11: NVIDIA nvcc compilation flag for constexpr depth and IEEE 754 exponent computation

Tuesday, 19 February 2019

Consider the following code, which computes the biased exponent of a double-precision floating-point number (as stored in the IEEE 754 format) as a constant expression.

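    // Absolute value usable in constant expressions.
    template <typename T>
    constexpr T abs_CE(const T x) { return x >= 0 ? x : -x; }

    // Unbiased exponent of a positive x: one recursive call per factor of 2.
    constexpr unsigned long long int __double_exponent_CE_(const double x) {
        return x == 0 ? 0
             : x >= 2. ? __double_exponent_CE_(x / 2.) + 1
             : x < 1 ? __double_exponent_CE_(x * 2.) - 1
             : 0;
    }

    // Biased exponent as stored in the IEEE 754 double format (bias 1023).
    constexpr unsigned long long int __double_exponent_CE(const double x) {
        return (x == 0) ? 0 : (__double_exponent_CE_(abs_CE(x)) + 1023);
    }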

With default compilation flags, GCC fails to evaluate this code as a constant expression for certain inputs, such as std::numeric_limits<double>::max(). The evaluation exceeds the maximum recursion depth for constant expressions (512 by default): std::numeric_limits<double>::max() requires 1024 recursive calls.

If the flag -fconstexpr-depth=2048 is added, the code compiles and evaluates to a constant expression that can be passed as a template parameter.
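To illustrate the depth limit in isolation, here is a minimal sketch. The names halvings, exponent_tag and DEEP_RECURSION_EXAMPLE are hypothetical stand-ins, not part of the code above. The first assertion needs only 11 calls, so it should compile with GCC's default settings. The guarded one needs 1024 calls and is expected to compile only when the depth limit is raised.

```cpp
#include <limits>

// Hypothetical stand-in for the recursive helper: one call per factor of 2.
constexpr unsigned long long halvings(double x) {
    return x >= 2.0 ? halvings(x / 2.0) + 1 : 0;
}

// Hypothetical template taking the result as a non-type template parameter.
template <unsigned long long E>
struct exponent_tag {
    static constexpr unsigned long long value = E;
};

// Shallow recursion (11 calls): within the default depth of 512.
static_assert(exponent_tag<halvings(1024.0) + 1023>::value == 1033,
              "biased exponent of 2^10");

// Deep recursion (1024 calls): enable only when building with, for example,
//   g++ -std=c++11 -fconstexpr-depth=2048 -DDEEP_RECURSION_EXAMPLE example.cpp
#ifdef DEEP_RECURSION_EXAMPLE
static_assert(
    exponent_tag<halvings(std::numeric_limits<double>::max()) + 1023>::value == 2046,
    "biased exponent of the largest finite double");
#endif
```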

Under nvcc, however, the code still fails to compile with the flag -Xcompiler -fconstexpr-depth=2048 (specifically, it crashes when nvcc issues the cicc command). Is there any way to change the depth limit in nvcc? I have not found any flag for it in the nvcc options.

In case there is no equivalent flag in nvcc, does anybody know another way to compute the exponent of a double at compile time with fewer than 512 recursive calls?
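One possible direction, offered as a sketch and not as a confirmed answer, is to scale by large powers of two first (2^512, then 2^256, and so on down to 2^1) instead of by 2 each time. This keeps the recursion depth at roughly a few dozen calls. All function names below are hypothetical. The sketch assumes finite inputs and, unlike the code above, returns 0 for subnormal values, since that is what the IEEE 754 format stores for them. It has not been tried with nvcc.

```cpp
#include <limits>

constexpr double square(double x) { return x * x; }

// 2^n for n >= 0 by repeated squaring (recursion depth about log2(n)).
constexpr double pow2(int n) {
    return n == 0 ? 1.0 : square(pow2(n / 2)) * (n % 2 ? 2.0 : 1.0);
}

// For x >= 1: returns e >= 0 such that x / 2^e lies in [1, 2).
// The step goes 512, 256, ..., 1, so only about ten calls are needed.
constexpr int log2_ge1(double x, int step) {
    return step == 0 ? 0
         : x >= pow2(step) ? step + log2_ge1(x / pow2(step), step / 2)
         : log2_ge1(x, step / 2);
}

// For normal 0 < x < 1: returns k >= 1 such that x * 2^k lies in [1, 2).
constexpr int neg_log2_lt1(double x, int step) {
    return step == 0 ? 1
         : x * pow2(step) < 1.0 ? step + neg_log2_lt1(x * pow2(step), step / 2)
         : neg_log2_lt1(x, step / 2);
}

// Biased exponent of a non-negative value; zero and subnormals map to 0.
constexpr unsigned long long biased_exponent_abs(double a) {
    return a < std::numeric_limits<double>::min() ? 0
         : a >= 1.0 ? 1023 + log2_ge1(a, 512)
         : 1023 - neg_log2_lt1(a, 512);
}

// Biased IEEE 754 exponent of a finite double (assumption: no NaN or infinity).
constexpr unsigned long long biased_exponent(double x) {
    return biased_exponent_abs(x >= 0 ? x : -x);
}

static_assert(biased_exponent(1.0) == 1023, "exponent of 1.0");
static_assert(biased_exponent(std::numeric_limits<double>::max()) == 2046,
              "exponent of the largest finite double");
```

Multiplying or dividing a normal double by a power of two is exact as long as the result stays normal, which holds at every step here, so the scaling itself should not introduce rounding error.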
