Shuffle all input lines by assigning random weights and sorting.
randomizeLinesViaSort reads in all input lines and writes them out in random order.
The algorithm works by assigning a random value to each line and sorting. Both
weighted and unweighted shuffling are supported.
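The sort-based approach described above can be sketched as follows. This is a minimal illustration, not the routine's actual code. The function name, the `weights` parameter, and the `rng` parameter are hypothetical. The weighted key `u ** (1 / w)` follows the Efraimidis-Spirakis scheme, and it is an assumption that randomizeLinesViaSort uses the same formula.

```python
import random


def shuffle_lines_via_sort(lines, weights=None, rng=None):
    """Return the lines in random order by sorting on a per-line random key.

    `weights` (hypothetical parameter) is an optional sequence of positive
    numbers, one per line. Larger weights tend to sort earlier.
    """
    rng = rng or random.Random()
    lines = list(lines)

    if weights is None:
        # Unweighted: every line gets an independent uniform key in [0, 1).
        keyed = [(rng.random(), line) for line in lines]
    else:
        weights = list(weights)
        if len(weights) != len(lines):
            raise ValueError("need exactly one weight per line")
        if any(w <= 0 for w in weights):
            raise ValueError("weights must be positive")
        # Weighted (assumed formula): key = u ** (1 / w), u uniform in [0, 1).
        keyed = [(rng.random() ** (1.0 / w), line)
                 for w, line in zip(weights, lines)]

    # Sort on the key only, largest key first, then drop the keys.
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [line for _, line in keyed]
```

Passing a seeded `random.Random` makes the output order repeatable, which is handy for testing. Note that every line and its key are held in memory at once, which is the limitation described in the notes.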
Notes:
For unweighted shuffling, randomizeLinesViaShuffle is faster and should be used
unless compatibility mode is needed.
This routine is significantly faster than heap-based reservoir sampling in the
case where the entire file is being read.
Input data must be read entirely into memory. Disk-oriented techniques are
needed when the data is too large for available memory. One option is to
generate a random value for each line, e.g. with --gen-random-inorder, and then
sort with a disk-backed sort program such as GNU sort.
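The disk-oriented option in the last note can be sketched as a streaming pass that prefixes each line with a random key and leaves the sorting to an external program. This is only an illustration for the unweighted case. The function name and the output layout (fixed-width key, tab, original line) are assumptions and are not meant to reproduce the output of --gen-random-inorder.

```python
import random
import sys


def tag_lines_with_random_keys(in_stream, out_stream, rng=None):
    """Write each input line prefixed by a random key and a tab.

    Lines are processed one at a time, so memory use does not grow with
    the input size. The key is printed with a fixed width so that plain
    lexicographic sorting orders rows the same way as numeric sorting.
    """
    rng = rng or random.Random()
    for line in in_stream:
        text = line.rstrip("\n")
        out_stream.write(f"{rng.random():.17f}\t{text}\n")


if __name__ == "__main__":
    # Read from stdin, write keyed lines to stdout.
    tag_lines_with_random_keys(sys.stdin, sys.stdout)
```

The keyed output can then be sorted on the first field by a disk-backed sort, after which the key column is removed to leave the shuffled lines.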