I have the following code for drawing a circle into a given buffer:
void drawcircle(int x, int y, int radius, void* buffer, int width, int height)
{
    int radiusSq = radius * radius;
    // Scan the bounding box of the circle and fill every pixel inside it.
    for (int yy = -radius; yy <= radius; ++yy)
    {
        for (int xx = -radius; xx <= radius; ++xx)
        {
            // The extra "+ radius" slightly enlarges the accepted disc so the
            // edge pixels fill in more evenly.
            if (xx * xx + yy * yy < radiusSq + radius)
            {
                // Clip to the buffer: valid coordinates are 0..width-1 / 0..height-1.
                if (x + xx >= 0 && y + yy >= 0 && x + xx < width && y + yy < height)
                {
                    set_pixel(x + xx, y + yy);
                }
            }
        }
    }
}
In Xcode, the default optimization flags are `-O0` for Debug and `-Os` for Release. When I run the code above under clang with `-Os` or `-O1`, the function takes 6 s to draw 100,000 circles of radius 100. With `-O2` it runs in 40 ns, and with `-O3` in 29 ns. This is a massive difference.
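
For context, this is roughly the kind of harness behind numbers like these; it is a minimal sketch, and the framebuffer size, the `set_pixel` stub, and the circle placement are assumptions for illustration, not my exact benchmark (the `drawcircle` from the snippet above is assumed to be compiled in the same file):

#include <chrono>
#include <cstdio>
#include <vector>

static const int kWidth  = 1024;   // hypothetical framebuffer size
static const int kHeight = 1024;
static std::vector<unsigned char> framebuffer(static_cast<size_t>(kWidth) * kHeight);

// The question does not show set_pixel, so this stub just writes one byte
// into the global framebuffer.
void set_pixel(int x, int y)
{
    framebuffer[static_cast<size_t>(y) * kWidth + x] = 0xFF;
}

// drawcircle() is the function from the snippet above (it calls set_pixel).
void drawcircle(int x, int y, int radius, void* buffer, int width, int height);

int main()
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 100000; ++i)
    {
        drawcircle(512, 512, 100, framebuffer.data(), kWidth, kHeight);
    }
    const auto stop = std::chrono::steady_clock::now();
    const long long ms =
        std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
    std::printf("100000 circles: %lld ms\n", ms);
    return 0;
}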
For this function only, I want to enable `-O2` or `-O3`. I came across the following syntaxes somewhere: `[[gnu::optimize(O2)]]`, `[[gnu::optimize("O3")]]`, `[[clang::optimize(optnone)]]`, and `__attribute__((optimize(O3)))`. If I specify anything other than `optnone`, the `optimize` attribute is ignored (`O0` is ignored as well).
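
For reference, here is a compile-only sketch of how I understand these spellings; the function names are hypothetical placeholders, and I am assuming the level-specific `optimize` attribute is a GCC extension, while clang documents only `optnone` and the `#pragma clang optimize off`/`on` pair:

// Spelling reference only; the names below are hypothetical and the bodies are empty.

// GCC extension: request an optimization level for a single function.
// This is the form that clang parses but appears to ignore, as described above.
__attribute__((optimize("O3")))
void drawcircle_attr(int x, int y, int radius, void* buffer, int width, int height) {}

// The same GCC extension written with the C++11 attribute syntax.
[[gnu::optimize("O2")]]
void drawcircle_gnu(int x, int y, int radius, void* buffer, int width, int height) {}

// Clang's documented per-function attribute: disable optimization entirely.
[[clang::optnone]]
void drawcircle_optnone(int x, int y, int radius, void* buffer, int width, int height) {}

// Clang also documents a pragma pair that disables optimization for every
// function defined between them.
#pragma clang optimize off
void drawcircle_region(int x, int y, int radius, void* buffer, int width, int height) {}
#pragma clang optimize on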
Any ideas how I can tell the compiler to optimize only this function at a specified level?