C++ - Random sample from a large population runs into infinite loop
I want to draw n samples from a relatively large population without replacement. So I draw random numbers, keep track of my previous choices, and resample whenever I draw a number twice:
    boost::mt19937 generator;
    boost::uniform_int<> distribution(0, 1669 - 1);
    boost::variate_generator<boost::mt19937, boost::uniform_int<> >
        gen(generator, distribution);

    int n = 100;
    std::vector<int> idxs;
    while(static_cast<int>(idxs.size()) < n)
    {
        // random samples
        std::generate_n(std::back_inserter(idxs), n - idxs.size(), gen);
        // remove duplicates
        // keep everything that's not a duplicate to save time
        std::sort(idxs.begin(), idxs.end());
        std::vector<int>::iterator it = std::unique(idxs.begin(), idxs.end());
        idxs.resize(std::distance(idxs.begin(), it));
    }
Unfortunately, I run into an infinite loop with the constants used above.
I added some output (which shows that it keeps picking the same number) and stop after 10 tries to show the problem:
    boost::mt19937 generator;
    boost::uniform_int<> distribution(0, 1669 - 1);
    boost::variate_generator<boost::mt19937, boost::uniform_int<> >
        gen(generator, distribution);

    int n = 100;
    int repeat = 0;
    std::vector<int> idxs;
    while(static_cast<int>(idxs.size()) < n)
    {
        if(repeat++ > 10) break;
        cout << "repeat " << repeat << ", " << idxs.size() << " elements" << endl;
        std::generate_n(std::back_inserter(idxs), n - idxs.size(), gen);
        cout << "last " << idxs.back() << endl;
        std::sort(idxs.begin(), idxs.end());
        std::vector<int>::iterator it = std::unique(idxs.begin(), idxs.end());
        idxs.resize(std::distance(idxs.begin(), it));
    }
The code prints

    repeat 1, 0 elements
    last 1347
    repeat 2, 99 elements
    last 1359
    repeat 3, 99 elements
    last 1359
and so on, and it seems to loop forever if I don't kill the program. This shouldn't happen, right? Am I just unlucky? Or am I doing something wrong?
Short solution, thanks @jxh! Using a reference helps:

    boost::variate_generator<boost::mt19937&, boost::uniform_int<> >
        gen(generator, distribution);
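The problem is that generate_n takes its generator argument by value, so it works on a copy of the gen you created. Because this variate_generator stores its boost::mt19937 by value, the copy carries its own engine. So, at the end of the call to generate_n, the state of gen is unchanged. Thus, each time you re-loop, you generate the same sequence again. In the output above, every pass after the first draws a single number, 1359, which is the first number of that sequence and is already in idxs, so the vector stays at 99 elements.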
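The copy behaviour can be reproduced without Boost. The following is a minimal sketch, not the Boost implementation: Gen is a hypothetical stand-in for boost::variate_generator, and std::mt19937 with std::uniform_int_distribution from the standard <random> header replace the Boost types. The helper two_batches is likewise made up for illustration. It calls std::generate_n twice with the same generator object and returns both batches.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for boost::variate_generator<Engine, Dist>.
// Engine may be std::mt19937 (engine stored by value) or
// std::mt19937& (engine stored by reference).
template <class Engine>
struct Gen {
    Engine engine;
    std::uniform_int_distribution<int> dist;
    int operator()() { return dist(engine); }
};

// Fills two vectors of `count` values each, passing the same `gen`
// object to std::generate_n both times.
template <class Engine>
std::pair<std::vector<int>, std::vector<int> >
two_batches(std::mt19937& engine, int count) {
    assert(count >= 0);
    Gen<Engine> gen{engine, std::uniform_int_distribution<int>(0, 1669 - 1)};
    std::vector<int> first, second;
    // std::generate_n takes the generator by value, so each call
    // operates on a copy of gen.
    std::generate_n(std::back_inserter(first), count, gen);
    std::generate_n(std::back_inserter(second), count, gen);
    return std::make_pair(first, second);
}
```

With two_batches<std::mt19937> the engine lives inside each copy, so gen never advances and both batches come out identical. With two_batches<std::mt19937&> every copy refers to the caller's engine, so the second batch continues where the first one stopped.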
One way to fix this is to use a reference to the random number generator object in your variate_generator:*

    boost::variate_generator<boost::mt19937&, boost::uniform_int<> >
        gen(generator, distribution);

* Due to my limited experience with Boost, my original suggestion was rather clumsy. I have adopted the solution implemented by the asker in his answer.
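For readers without Boost, the same idea can be sketched with the standard <random> header. This is an illustration under assumptions, not code from the question or the answer: the function name sample_without_replacement and its parameters are hypothetical, and a lambda that captures the engine by reference plays the role of the variate_generator with a reference engine.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical helper: returns n distinct indices from [0, population),
// in sorted order, using the resample-on-duplicate loop from the question.
std::vector<int> sample_without_replacement(std::mt19937& engine,
                                            int population, int n) {
    assert(0 <= n && n <= population);  // otherwise the loop cannot finish
    std::vector<int> idxs;
    if (n == 0) return idxs;

    std::uniform_int_distribution<int> dist(0, population - 1);
    // The lambda holds references, so the copy that std::generate_n
    // makes still advances the caller's engine.
    auto gen = [&]() { return dist(engine); };

    while (static_cast<int>(idxs.size()) < n) {
        // Draw as many numbers as are still missing.
        std::generate_n(std::back_inserter(idxs),
                        n - static_cast<int>(idxs.size()), gen);
        // Sort and drop duplicates; the loop refills what was dropped.
        std::sort(idxs.begin(), idxs.end());
        idxs.erase(std::unique(idxs.begin(), idxs.end()), idxs.end());
    }
    return idxs;
}
```

Calling sample_without_replacement(engine, 1669, 100) with a std::mt19937 engine corresponds to the constants used in the question. Because the engine state advances on every pass, duplicates are replaced by fresh draws and the loop terminates.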