randomizeLinesViaSort

Shuffle all input lines by assigning random weights and sorting.

randomizeLinesViaSort reads in all input lines and writes them out in random order. The algorithm works by assigning a random value to each line and sorting. Both weighted and unweighted shuffling are supported.
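
The sort-based approach described above can be sketched as follows. This is a minimal illustration, not the routine's actual implementation. The helper name `shuffleViaSort`, its parameters, and the weighted key formula (a uniform draw raised to the power `1 / weight`, as in the Efraimidis-Spirakis weighted sampling scheme) are assumptions made for the example.

```d
import std.algorithm : sort;
import std.math : pow;
import std.random : Random, uniform01;
import std.typecons : Tuple;

/// Hypothetical helper (not part of the documented API).
/// Returns the lines ordered by descending random key.
/// If `weights` is empty, the shuffle is unweighted.
string[] shuffleViaSort(string[] lines, double[] weights, ref Random rng)
{
    alias Entry = Tuple!(double, "key", string, "line");
    auto entries = new Entry[](lines.length);

    foreach (i, line; lines)
    {
        immutable double u = uniform01(rng);   // uniform draw in [0, 1)
        // Assumed weighted formula: key = u ^ (1 / weight).
        // Larger weights push keys toward 1, so those lines tend to sort first.
        immutable double key =
            weights.length == 0 ? u : pow(u, 1.0 / weights[i]);
        entries[i] = Entry(key, line);
    }

    // Sorting on the random keys produces the random output order.
    entries.sort!((a, b) => a.key > b.key);

    auto result = new string[](lines.length);
    foreach (i, ref e; entries)
        result[i] = e.line;
    return result;
}
```

Calling `shuffleViaSort(lines, null, rng)` gives an unweighted shuffle, while passing one weight per line gives a weighted one. Every line and its key are held in memory until the sort completes, which is why the whole input must fit in memory.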

Notes:

  • For unweighted shuffling, randomizeLinesViaShuffle is faster and should be used unless compatibility mode is needed.
  • This routine is significantly faster than heap-based reservoir sampling in the case where the entire file is being read.
  • Input data must be read entirely into memory. Disk-oriented techniques are needed when the data is too large for available memory. One option is to generate a random value for each line, e.g. with --gen-random-inorder, and sort with a disk-backed sort program such as GNU sort.

void randomizeLinesViaSort(Flag!"isWeighted" isWeighted, OutputRange)
    (, auto ref OutputRange outputStream)
if (isOutputRange!(OutputRange, char))

Meta