randomizeLinesViaSort

Randomize all the lines in files or standard input using assigned random weights and sorting.

All lines in files and/or standard input are read in and written out in random order. This algorithm assigns a random value to each line and sorts. This approach supports both weighted sampling and simple random sampling (unweighted).
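The assign-and-sort idea described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `randomizeViaSort` and its parameters are hypothetical names, and the weighted key `u^(1/w)` (the Efraimidis-Spirakis style key) is an assumption about how weights could be applied, with every weight required to be greater than zero.

```d
import std.algorithm : map, sort;
import std.array : array;
import std.math : pow;
import std.random : Random, uniform01, unpredictableSeed;
import std.stdio : writeln;
import std.typecons : Tuple;

/// Hypothetical helper (not part of the documented API): returns `lines`
/// in random order by pairing each line with a random key and sorting on
/// the key, largest first. If `weights` is empty the shuffle is unweighted;
/// otherwise `weights[i]` (assumed > 0) is the weight of `lines[i]`.
string[] randomizeViaSort(string[] lines, double[] weights, ref Random rng)
{
    alias Entry = Tuple!(double, "key", string, "line");

    Entry[] entries;
    entries.reserve(lines.length);
    foreach (i, line; lines)
    {
        immutable double u = uniform01(rng);   // uniform value in [0, 1)
        // Assumed weighting scheme: key = u^(1/w), so larger weights push
        // the key toward 1 and the line toward the front of the output.
        immutable double key =
            weights.length == 0 ? u : pow(u, 1.0 / weights[i]);
        entries ~= Entry(key, line);
    }

    // Sort descending by key; the resulting line order is the randomization.
    entries.sort!((a, b) => a.key > b.key);
    return entries.map!(e => e.line).array;
}

void main()
{
    auto rng = Random(unpredictableSeed);
    auto lines = ["alpha", "beta", "gamma", "delta"];

    // Unweighted: pass an empty weights array.
    foreach (line; randomizeViaSort(lines, null, rng))
        writeln(line);
}
```

Because every line and its key are held in memory before sorting, this sketch shares the memory limit that applies to the documented function.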

This is significantly faster than heap-based reservoir sampling in the case where the entire file is being read. See also randomizeLinesViaShuffle for the unweighted case, as it is a little faster, at the cost of not supporting random value printing or compatibility mode.

Input data size is limited by available memory. Disk-oriented techniques are needed for larger data sets. One example is generating random values line by line (à la --gen-random-inorder) and then sorting with a disk-backed sort program such as GNU sort.
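A rough sketch of the line-by-line approach follows. It is an illustration only: `writeWithRandomKeys` is a hypothetical helper, and the tab-separated "random value, then line" output layout is an assumption chosen so that an external sort can order on the first field. Only one line is held in memory at a time.

```d
import std.format : formattedWrite;
import std.random : Random, uniform01, unpredictableSeed;
import std.range : isOutputRange;
import std.stdio : stdin, stdout;

/// Hypothetical helper (not part of the documented API): writes each input
/// line prefixed by a freshly generated random value and a tab, preserving
/// input order. No sorting happens here.
void writeWithRandomKeys(InputLines, OutputRange)
    (InputLines lines, auto ref OutputRange output, ref Random rng)
if (isOutputRange!(OutputRange, char))
{
    foreach (line; lines)
    {
        // 17 fractional digits keep distinct double keys distinct in text.
        formattedWrite(output, "%.17f\t%s\n", uniform01(rng), line);
    }
}

void main()
{
    auto rng = Random(unpredictableSeed);
    writeWithRandomKeys(stdin.byLine, stdout.lockingTextWriter, rng);
}
```

The output could then be passed to a disk-backed sort keyed on the first field, with the key column dropped afterwards to leave the lines in random order.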

void randomizeLinesViaSort(Flag!"isWeighted" isWeighted, OutputRange)
    (, auto ref OutputRange outputStream)
if (isOutputRange!(OutputRange, char))

Meta