Monday, 30 April 2018

Casting with AVX intrinsics

There are two ways of casting between register types with AVX2:

__m256i b = ...set register...
auto c = (__m256d)b; // version 1
auto d = _mm256_castsi256_pd(b); // version 2

I assume that both of these give the same result. Intel's official manual says that version 2 has zero runtime latency. Can I use version 1 under the same zero-latency assumption? In addition, can I assume that casting from any register type to any other with the version 2 intrinsics is also zero latency?
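To make the first assumption concrete, here is a minimal sketch that checks whether the two versions leave the 256 bits unchanged. It says nothing about latency; it only compares the raw bytes before and after each cast. The helper name `casts_agree` is hypothetical, and the code assumes GCC or Clang: the C-style cast between vector types is a GNU extension, and the `target("avx")` attribute is used so the function compiles without passing `-mavx`.

```cpp
#include <immintrin.h>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the question): returns true when both
// casts preserve the bit pattern of the source register.
// Assumes GCC or Clang; the attribute enables AVX for this function only.
__attribute__((target("avx")))
bool casts_agree(std::int64_t bits) {
    // Fill all four 64-bit lanes with the same bit pattern.
    __m256i b = _mm256_set1_epi64x(bits);

    __m256d c = (__m256d)b;               // version 1: C-style vector cast (GNU extension)
    __m256d d = _mm256_castsi256_pd(b);   // version 2: Intel cast intrinsic

    // Store all three registers to memory and compare the raw bytes.
    // memcmp is used instead of == so that NaN bit patterns compare correctly.
    std::int64_t b_out[4];
    double c_out[4];
    double d_out[4];
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(b_out), b);
    _mm256_storeu_pd(c_out, c);
    _mm256_storeu_pd(d_out, d);

    return std::memcmp(b_out, c_out, sizeof b_out) == 0 &&
           std::memcmp(b_out, d_out, sizeof b_out) == 0;
}
```

Calling `casts_agree(0x3FF0000000000000LL)` passes the bit pattern of the double 1.0 through both casts. Whether either version costs an instruction is best confirmed by inspecting the generated assembly for the compiler in use.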
