auto helpTextVerbose =
q"EOS
Synopsis: tsv-join --filter-file file [options] [file...]

tsv-join matches input lines (the 'data stream') against lines from a
'filter' file. The match is based on exact comparison of one or more
'key' fields. Fields are TAB delimited by default. Input lines are read
from files or standard input. Matching lines are written to standard
output, along with any additional fields from the filter file that have
been specified. For example:

   tsv-join --filter-file filter.tsv --key-fields 1 --append-fields 5,6 data.tsv

This reads filter.tsv, creating a hash table keyed on field 1. Lines from
data.tsv are read one at a time. If field 1 is found in the hash table,
the line is written to standard output with fields 5 and 6 from the filter
file appended. In database parlance this is a "hash semi join". Note the
asymmetric relationship: records in the filter file should be unique, but
lines in the data stream (data.tsv) can repeat.

Field names can be used instead of field numbers if the files have header
lines. The following command is similar to the previous example, except
it uses field names:

   tsv-join -H -f filter.tsv -k ID --append-fields Date,Time data.tsv

tsv-join can also work as a simple filter based on the whole line. This is
the default behavior. Example:

   tsv-join -f filter.tsv data.tsv

This outputs all lines from data.tsv found in filter.tsv.

Multiple fields can be specified as keys and append fields. Field numbers
start at one; zero represents the whole line. Fields are comma separated
and ranges can be used. Example:

   tsv-join -f filter.tsv -k 1,2 --append-fields 3-7 data.tsv

The --e|exclude option can be used to exclude matched lines rather than
keep them.

The joins supported are similar to the "stream-static" joins available in
Spark Structured Streaming and "KStream-KTable" joins in Kafka. The filter
file plays the same role as the Spark static dataset or Kafka KTable.

Options:
EOS";
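// The hash semi join described in the help text above can be sketched as
// follows. This is a hypothetical minimal emulation in Python for
// illustration only, not the actual D implementation; field indices are
// 1-based to match tsv-join's numbering.

```python
def hash_semi_join(filter_lines, data_lines, key, append=()):
    """Keep data lines whose key field matches a filter line's key.

    key and append are 1-based field indices, matching tsv-join's
    numbering. Matched lines get the requested filter fields appended.
    """
    # Build a hash table from the filter file, keyed on the key field.
    # Entries map the key to the fields selected for appending.
    table = {}
    for line in filter_lines:
        fields = line.split("\t")
        table[fields[key - 1]] = [fields[i - 1] for i in append]

    # Stream the data lines; emit only those whose key is in the table,
    # with the selected filter fields appended.
    out = []
    for line in data_lines:
        fields = line.split("\t")
        extra = table.get(fields[key - 1])
        if extra is not None:
            out.append("\t".join(fields + extra))
    return out


filter_lines = ["a\tx\t1", "b\ty\t2"]
data_lines = ["a\tfoo", "c\tbar", "a\tbaz"]
print(hash_semi_join(filter_lines, data_lines, key=1, append=(3,)))
# → ['a\tfoo\t1', 'a\tbaz\t1']
```

// Note the asymmetry the help text calls out: the dict keeps one entry
// per filter key, while data lines with the same key may repeat and each
// matching occurrence is emitted.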