I'm having a problem with g++ (4.9.2) optimization: it produces faulty code, and I can't work out why. By faulty, I mean the program's output is fundamentally different between optimized (-O1, -O2 or -O3) and non-optimized (-O0) builds, and the optimized output is the wrong one.
I have a class similar to <bitset>, where information is stored at the bit level. It can be instantiated with any number of bits, but has a template specialization for Bits with <= 8 bits.
#include <iostream>
using namespace std;

// generalized class Bits, uses array of specialized, 1-byte Bits
template <unsigned int bits=8, bool _=(bits>8)>
class Bits {
    Bits<8,false> reg[(bits+7)>>3];
public:
    void set(int pos) { reg[pos>>3].set(pos%8); };
    void clr(int pos) { reg[pos>>3].clr(pos%8); };
    bool get(int pos) { reg[pos>>3].get(pos%8); };
};

// specialized, 1-byte Bits (flag stored in a char)
template <unsigned int bits> class Bits<bits,false> {
    char reg;
public:
    Bits() : reg(0) {};
    Bits(int r) : reg(r) {};
    void set(int pos) { reg |= mark(pos); };
    void clr(int pos) { reg &= ~mark(pos); };
    bool get(int pos) { return (reg & mark(pos)); };
    static int mark(int pos) { return ( 1 << pos ); };
};

int main() {
    Bits<16> b;
    Bits<8> c;
    b.set(1);
    c.set(1);
    cout << b.get(1) << endl;
    cout << c.get(1) << endl;
    return 0;
};
The test is simple: set a bit, then print that bit's state to stdout. This is done with a 16-bit Bits object (the generalized template) and an 8-bit Bits object (the specialized template). The expected answer is true (printed as 1) for both objects. When I compile with no optimization (i.e. g++-4.9 -O0 main.cpp), this is exactly what I get. The output of ./a.out is:
1
1
But when I compile with -O1 optimization (i.e. g++-4.9 -O1 main.cpp), the result is different and partially wrong:
0
1
Specifically, Bits<8> tests correctly at both optimization levels (-O0 and -O3), but Bits<16> tests correctly only with -O0 and not with -O1.
At every optimization level (-O1, -O2, and -O3), the optimizer removes all of the Bits member functions and simply jumps to the final results, calculated at compile time. Obviously the optimizer is making some error, but I don't know what the root cause is. Does anyone know what I should be looking for to debug the problem?
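One thing worth checking before blaming the optimizer: the generalized Bits::get above has a bool return type but no return statement. Flowing off the end of a non-void function is undefined behavior in C++, so the compiler may return anything, and the result can legitimately differ between -O0 and -O1. Compiling with -Wall should flag this (GCC's -Wreturn-type warning). The following is only a sketch of how the class might look with that one fix applied. The forward declaration, the parameter name wide, the unsigned char storage, the const qualifiers, and the checkBits helper are my own assumptions rather than part of the original code.

```cpp
#include <cassert>

// Forward declaration so the 1-byte specialization can be defined first.
template <unsigned int bits = 8, bool wide = (bits > 8)>
class Bits;

// Specialized, 1-byte Bits (flags stored in a single unsigned char).
template <unsigned int bits>
class Bits<bits, false> {
    unsigned char reg;
public:
    Bits() : reg(0) {}
    void set(int pos) { reg |= mark(pos); }
    void clr(int pos) { reg &= ~mark(pos); }
    bool get(int pos) const { return (reg & mark(pos)) != 0; }
    static int mark(int pos) { return 1 << pos; }
};

// Generalized Bits, built from an array of 1-byte Bits.
template <unsigned int bits, bool wide>
class Bits {
    Bits<8, false> reg[(bits + 7) >> 3];
public:
    void set(int pos) { reg[pos >> 3].set(pos % 8); }
    void clr(int pos) { reg[pos >> 3].clr(pos % 8); }
    // The original version computed this value but never returned it.
    bool get(int pos) const { return reg[pos >> 3].get(pos % 8); }
};

// Hypothetical helper: repeats the test from main() as assertions.
inline void checkBits() {
    Bits<16> b;
    Bits<8> c;
    b.set(1);
    c.set(1);
    assert(b.get(1));
    assert(c.get(1));
}
```

If the missing return is indeed the cause, both objects should report the bit as set at every optimization level once get returns its value.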