mercredi 12 juillet 2017

How to succinctly, portably, and thoroughly seed the mt19937 PRNG in C++?

I seem to see many answers in which someone suggests using <random> to generate random numbers, usually along with code like this:

std::random_device rd;  
std::mt19937 gen(rd());
std::uniform_int_distribution<> dis(0, 5);
dis(gen);

Usually this replaces some kind of "unholy abomination" such as:

srand(time(NULL));
rand()%6;

We might criticize the old way by arguing that time(NULL) provides low entropy, time(NULL) is predictable, and the end result is non-uniform.

But all of that is true of the new way: it just has a shinier veneer.

  • rd() returns a single unsigned int. This has at least 16 bits and probably 32. That's not enough to seed MT's 19937 bits of state.

  • Using std::mt19937 gen(rd());gen() (seeding with 32 bits and looking at the first output) doesn't give a good output distribution. 7 and 13 can never be the first output. Two seeds produce 0. Twelve seeds produce 1226181350. (Link)

  • std::random_device can be, and sometimes is, implemented as a simple PRNG with a fixed seed. It might therefore produce the same sequence on every run. (Link) This is even worse than time(NULL).

Worse yet, it is very easy to copy and paste the foregoing code snippets, despite the problems they contain. Some solutions to the this require acquiring largish libraries which may not be suitable to everyone.

In light of this, my question is How can one succinctly, portably, and thoroughly seed the mt19937 PRNG in C++?

Given the issues above, a good answer:

  • Must fully seed the mt19937/mt19937_64.
  • Cannot rely solely on std::random_device or time(NULL) as a source of entropy.
  • Should not rely on Boost or other libaries.
  • Should fit in a small number of lines such that it would look nice copy-pasted into an answer.

Thoughts

  • My current thought is that outputs from std::random_device can be mashed up (perhaps via XOR) with time(NULL), values derived from address space randomization, and a hard-coded constant (which could be set during distribution) to get a best-effort shot at entropy.

  • std::random_device::entropy() does not give a good indication of what std::random_device might or might not do.

Aucun commentaire:

Enregistrer un commentaire