Sunday, November 10, 2019

How to tell clang to optimize a single function at a certain level

I have the following code for drawing a circle into a given buffer:

void drawcircle(int x, int y, int radius, void* buffer, int width, int height)
{
    // Compare squared distances so there is no sqrt per pixel.
    int radiusSq = radius * radius;
    for (int yy = -radius; yy <= radius; ++yy)
    {
        for (int xx = -radius; xx <= radius; ++xx)
        {
            // "+ radius" slightly relaxes the test so the edge is not pinched.
            if (xx * xx + yy * yy < radiusSq + radius)
            {
                // Clip against the buffer bounds.
                if (x + xx >= 0 && y + yy >= 0 && x + xx <= width && y + yy <= height)
                {
                    // set_pixel is defined elsewhere and writes into `buffer`.
                    set_pixel(x + xx, y + yy);
                }
            }
        }
    }
}

In Xcode, the default optimization flags are -O0 for the Debug configuration and -Os for Release.
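Outside Xcode, those two configurations correspond roughly to invocations like these (file and output names here are just placeholders):

clang++ -O0 -g circle.cpp -o circle_debug      # Debug: no optimization, debug info
clang++ -Os circle.cpp -o circle_release       # Release: optimize for size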

When I run the code above compiled with clang at -Os or -O1, the function takes 6 s to draw 100,000 circles of radius 100.

If I enable -O2 instead, it runs in 40 ns, and in 29 ns with -O3. This is a massive difference.
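For reference, the numbers come from a loop along these lines (a rough sketch; the buffer size, the clock, and the output are assumptions on my part, but the shape of the benchmark is the same):

#include <chrono>
#include <cstdio>
#include <vector>

// Defined above; set_pixel (used inside it) writes into the buffer passed here.
void drawcircle(int x, int y, int radius, void* buffer, int width, int height);

int main()
{
    const int width = 1024, height = 1024;          // hypothetical target size
    std::vector<unsigned char> pixels(width * height, 0);

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 100000; ++i)                // 100,000 circles of radius 100
        drawcircle(width / 2, height / 2, 100, pixels.data(), width, height);
    auto end = std::chrono::steady_clock::now();

    long long ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    std::printf("drew 100000 circles in %lld ms\n", ms);
    return 0;
}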

For this function only, I want to enable -O2 or -O3. I came across the following syntax:

[[gnu::optimize("O2")]], [[gnu::optimize("O3")]], [[clang::optnone]], and __attribute__((optimize("O3"))) mentioned in various places.
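Applied to the function, the attempts look roughly like this (a sketch; only the attribute line changes, the body is the one shown above):

// GCC-style attribute in C++11 spelling: clang accepts the syntax,
// but it has no effect on the generated code for me.
[[gnu::optimize("O2")]]
void drawcircle(int x, int y, int radius, void* buffer, int width, int height);

// Same attribute in GNU __attribute__ form: also has no effect.
__attribute__((optimize("O3")))
void drawcircle(int x, int y, int radius, void* buffer, int width, int height);

// The only spelling that is not ignored for me, but it turns optimization
// off for this function rather than raising the level.
[[clang::optnone]]
void drawcircle(int x, int y, int radius, void* buffer, int width, int height);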

If I specify anything other than optnone, the optimize attribute is ignored (-O0 is ignored as well).

Any ideas on how I can tell clang to optimize only this function at a specified level?
