Tuesday, January 23, 2018

Is __restrict__ required in the implementation signature of a function?

Say there is some simple adding function

// c[i] = a[i] + b[i] for i in [0, n)
void add(const float * __restrict__ a,
         const float * __restrict__ b,
         float * __restrict__ c,
         int n);

in some header test.hpp, which is then implemented in test.cpp. As far as my (extremely mediocre) disassembly analysis skills can tell (naive inspection and diff), __restrict__ has no effect on the generated code unless it also appears in the signature of the implementation (.cpp).

Does __restrict__ need to be in both the declaration AND the definition (implementation) signature? If so, why?
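For what it's worth, GCC's documentation on restricted pointers says that __restrict__, like other outermost parameter qualifiers, is ignored when a definition is matched against its prototype. Below is a minimal single-file sketch of that, assuming GCC or Clang. The names add_demo and add_demo_check are hypothetical and not part of the project further down.

```cpp
// Declaration as it might appear in a header: no restrict qualifier.
void add_demo(const float *a, const float *b, float *c, int n);

// Definition with __restrict__ on every pointer. GCC and Clang accept this as
// the definition of the function declared above rather than as a new overload,
// because the qualifier applies to the parameter itself (like a top-level const).
void add_demo(const float * __restrict__ a,
              const float * __restrict__ b,
              float * __restrict__ c,
              int n) {
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

// Hypothetical helper: runs add_demo on small, non-overlapping arrays
// and returns the last element of the result.
float add_demo_check() {
    const float a[4] = {0.f, 1.f, 2.f, 3.f};
    const float b[4] = {0.f, 1.f, 2.f, 3.f};
    float c[4] = {0.f, 0.f, 0.f, 0.f};
    add_demo(a, b, c, 4);
    return c[3];
}
```

If that reading is right, the qualifiers on the definition are the ones the compiler sees while generating code for the loop, which would be consistent with the disassembly changing only when the .cpp signature changes.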

I believe the add function is an appropriate test case for this, but I'm not skilled enough to understand what is different and why it matters. I am researching that aspect right now to hopefully refine the question. Right now I just see fewer instructions, but I need to find where and why they matter, and I will update this section.
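One way to see why the generated code can differ: without the qualifier, a call where c overlaps a is legal and must produce the element-by-element result, so the compiler cannot freely batch the loads and stores. A small sketch follows; add_plain and overlap_demo are hypothetical names used only for illustration.

```cpp
#include <vector>

// Same loop as add, but with no restrict qualifiers, so overlapping
// arguments are allowed and must behave as if executed one element at a time.
void add_plain(const float *a, const float *b, float *c, int n) {
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

// Calls add_plain with c aliased to a + 1: every store to c[i] changes the
// value that the next iteration loads from a[i + 1].
std::vector<float> overlap_demo() {
    std::vector<float> a(5, 1.0f);        // {1, 1, 1, 1, 1}
    const std::vector<float> b(4, 1.0f);  // {1, 1, 1, 1}
    add_plain(a.data(), b.data(), a.data() + 1, 4);
    return a;                             // {1, 2, 3, 4, 5}
}
```

With the restrict qualifiers in place, the same overlapping call would have undefined behavior. That is what allows the compiler to drop runtime overlap checks or vectorize unconditionally, and it may account for the shorter disassembly.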


I believe the code below is straightforward; it is included to make something easy to compile. If you set USE_RESTRICT in test.cpp to 0, the disassembled results are different. The macro in the header (TEST_NO_RESTRICT) is there for additional permutations, but should remain 0 for this question, since setting it to 1 makes RESTRICT expand to nothing everywhere.

test.hpp

#pragma once

// set to 1 to check the differences
#define TEST_NO_RESTRICT 0

#if TEST_NO_RESTRICT
    #define RESTRICT /* does nothing */
#else
    #if defined(__GNUC__) || defined(__clang__)
        #define RESTRICT __restrict__
    #elif defined(_MSC_VER)
        #define RESTRICT __restrict
    #else
        #define RESTRICT /* does nothing */
    #endif
#endif // TEST_NO_RESTRICT

// c[i] = a[i] + b[i] for i in [0, n)
void add(const float * RESTRICT a, const float * RESTRICT b, float * RESTRICT c, int n);

test.cpp

#include "test.hpp"

// set to 0 to undo restrict
#define USE_RESTRICT 1

#if USE_RESTRICT
    #define IMPL_RESTRICT RESTRICT
#else
    #define IMPL_RESTRICT /* does nothing */
#endif

// c[i] = a[i] + b[i] for i in [0, n)
void add(const float * IMPL_RESTRICT a,
         const float * IMPL_RESTRICT b,
         float * IMPL_RESTRICT c,
         int n) {
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

main.cpp

#include "test.hpp"
#include <cstdlib>   // malloc, free
#include <iostream>  // std::cout, std::endl
#include <string>    // std::string

int main(void) {
    int n = 4;
    float *a = (float *)malloc(n * sizeof(float));
    float *b = (float *)malloc(n * sizeof(float));
    float *c = (float *)malloc(n * sizeof(float));

    for (int i = 0; i < n; ++i) {
        a[i] = i;
        b[i] = i;
    }

    add(a, b, c, n);

    auto print = [n](const std::string &desc, const float *arr) {
        std::cout << desc << std::endl;
        for (int i = 0; i < n; ++i)
            std::cout << " " << arr[i];
        std::cout << std::endl;
    };

    print("A: ", a);
    print("B: ", b);
    print("C: ", c);

    free(a);
    free(b);
    free(c);

    return 0;
}

To build with CMake (really we just need -std=c++11 and -O3):

CMakeLists.txt

cmake_minimum_required(VERSION 3.1.3 FATAL_ERROR)
project("restrict_test")

# C++11 required for this project
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)

set(CMAKE_CXX_FLAGS "-O3 ${CMAKE_CXX_FLAGS}")

add_executable(restrict-test test.hpp test.cpp main.cpp)
