The allowFieldNumZero flag is used as a template parameter controlling whether zero is a valid field. It is used by parseFieldList, parseNumericFieldList, and makeFieldListOptionHandler.
The consumeEntireFieldListString flag is used as a template parameter indicating whether the entire field-list string should be consumed. It is used by parseNumericFieldList.
The convertToZeroBasedIndex flag is used as a template parameter controlling whether field numbers are converted to zero-based indices. It is used by parseFieldList, parseNumericFieldList, and makeFieldListOptionHandler.
OptionHandlerDelegate is the signature of the delegate returned by makeFieldListOptionHandler.
findFieldGroups creates a range that iterates over the 'field-groups' in a 'field-list'. (Private function.)
isMixedNumericNamedFieldGroup determines if a field group is a range where one element is a field number and the other element is a named field (not a number).
isNumericFieldGroup determines if a field-group is a valid numeric field-group. (Private function.)
isNumericFieldGroupWithHyphenFirstOrLast determines if a field-group is a field number with a leading or trailing hyphen. (Private function.)
makeFieldListOptionHandler creates a std.getopt option handler for processing field-lists entered on the command line. A field-list is as defined by parseNumericFieldList.
namedFieldGroupToRegex generates regular expressions for matching fields in a named field-group to field names in a header line. (Private function.)
namedFieldRegexMatches returns an input range iterating over all the fields (strings) in an input range that match a regular expression. (Private function.)
parseFieldList returns a range iterating over the field numbers in a field-list.
parseNumericFieldGroup parses a single number or number range. E.g. '5' or '5-8'. (Private function.)
parseNumericFieldList lazily generates a range of field numbers from a 'numeric field-list' string.
fieldListHelpText is text intended for display to end users to describe the field-list syntax.
Utilities for parsing "field-lists" entered on the command line.
Field-lists
A "field-list" is entered on the command line to specify a set of fields for a command option. A field-list is a comma-separated list of individual fields and "field-ranges". Fields are identified either by field number or by field names found in the header line of the input data. A field-range is a pair of fields separated by a hyphen and includes both the listed fields and all the fields in between.
Field-lists are parsed into an ordered sequence of one-based field numbers. Repeated fields are allowed. Some examples of numeric fields with the tsv-select tool:
Fields specified by name must match a name in the header line of the input data. Glob-style wildcards are supported using the asterisk (*) character. When wildcards are used with a single field, all matching fields in the header are used. When used in a field range, both field names must match a single header field.
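The wildcard rules above can be illustrated with a minimal Python sketch. It is not the library's implementation: the function names and the header used in the usage note are hypothetical, backslash escapes are ignored for brevity, and accepting a range whose second field precedes the first is an assumption.

```python
import re

def name_to_regex(pattern):
    """Translate a field-name pattern with '*' wildcards into a compiled regex."""
    # Escape every literal piece, then let each '*' match any run of characters.
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile(".*".join(pieces))

def matching_field_numbers(pattern, header):
    """Return the one-based numbers of all header fields matching the pattern."""
    regex = name_to_regex(pattern)
    return [num for num, name in enumerate(header, start=1) if regex.fullmatch(name)]

def named_field_range(first, last, header):
    """Expand a named field range; each end must match exactly one header field."""
    starts = matching_field_numbers(first, header)
    ends = matching_field_numbers(last, header)
    if len(starts) != 1 or len(ends) != 1:
        raise ValueError("each end of a field range must match a single header field")
    start, end = starts[0], ends[0]
    step = 1 if end >= start else -1  # assumption: reverse ranges are accepted
    return list(range(start, end + step, step))
```

With a hypothetical header such as `["run", "elapsed_time", "user_time", "system_time", "max_memory"]`, the pattern `*_time` used as a single field selects fields 2, 3, and 4, while using it as one end of a range is rejected because it matches more than one header field.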
Consider a file data.tsv containing timing information:
The header fields are:
Some examples using named fields for this file. (Note: -H turns on header processing):
Field numbers and field names can both be used in the same field-list, except when specifying a field range:
A backslash is used to escape special characters occurring in field names. Characters that must be escaped when specifying field names are: asterisk (*), comma (,), colon (:), space ( ), hyphen (-), and backslash (\). A backslash is also used to escape numbers that should be treated as field names rather than field numbers. Consider a file with the following header fields:
These fields can be used in named field commands as follows:
Field-lists are combined with other content in some command line options. The colon and space characters are both terminator characters for field-lists. Some examples:
Field-list support routines identify the termination of the field-list. They do not do any processing of content occurring after the field-list.
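As a rough illustration of the termination and escaping rules, the Python sketch below finds the end of a field-list at the first unescaped colon or space and removes backslash escapes from a field name. Both function names are hypothetical, and leaving a trailing lone backslash in place is an assumption of the sketch, not documented behavior.

```python
def split_field_list(text):
    """Split text into (field_list, rest) at the first unescaped ':' or ' '."""
    i = 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # a backslash protects the character that follows it
            continue
        if text[i] in ": ":
            break  # unescaped terminator: the field-list ends here
        i += 1
    i = min(i, len(text))  # guard against a trailing lone backslash
    # The terminator itself is left in `rest`; the caller decides what it means.
    return text[:i], text[i:]

def unescape_field_name(name):
    """Remove backslash escapes, e.g. turn 'run\\:id' into 'run:id'."""
    chars = iter(name)
    result = []
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "\\")  # assumption: keep a trailing lone backslash
        result.append(ch)
    return "".join(result)
```

Content after the terminator is returned untouched, mirroring the statement that the support routines do not process anything beyond the field-list.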
Numeric field-lists
The original field-lists used in tsv-utils were numeric only. This is still the format used when a header line is not available. They are a strict subset of the field-list syntax described above. Due to this history there are support routines that only support numeric field-lists. They are used by tools supporting only numeric field-lists. They are also used by the more general field-list processing routines in this file when a named field or field range can be reduced to a numeric field-group.
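To make the numeric format concrete, here is a minimal Python sketch of lazy numeric field-list expansion. It is an illustration under stated assumptions, not the D implementation: the function name and the `zero_based` parameter are hypothetical stand-ins (the latter loosely modeled on the convertToZeroBasedIndex flag), field number zero is always rejected, and ranges written high-to-low are assumed to expand in descending order.

```python
import re

# One field-group: a number, or two numbers separated by a hyphen.
_FIELD_GROUP = re.compile(r"(\d+)(?:-(\d+))?", re.ASCII)

def parse_numeric_field_list(field_list, zero_based=False):
    """Lazily yield field numbers from a numeric field-list such as '7,3-5'."""
    for group in field_list.split(","):
        match = _FIELD_GROUP.fullmatch(group)
        if match is None:
            raise ValueError(f"invalid field-group: {group!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) is not None else start
        if start == 0 or end == 0:
            # Assumption of this sketch: zero is never a valid field number.
            raise ValueError(f"field numbers are one-based: {group!r}")
        step = 1 if end >= start else -1  # assumption: reverse ranges count down
        for number in range(start, end + step, step):
            yield number - 1 if zero_based else number
```

Because the function is a generator, errors in a malformed group surface only when iteration reaches that group. For example, `list(parse_numeric_field_list("7,3-5"))` evaluates to `[7, 3, 4, 5]`.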
Field-list utilities
The following functions provide the APIs for field-list processing:
The following private functions handle key parts of the implementation: