/**
Utilities for parsing "field-lists" entered on the command line.

# Field-lists

A "field-list" is entered on the command line to specify a set of fields for a
command option. A field-list is a comma-separated list of individual fields and
"field-ranges". Fields are identified either by field number or by field names found
in the header line of the input data. A field-range is a pair of fields separated
by a hyphen and includes both the listed fields and all the fields in between.

$(NOTE Note: Internally, the comma-separated entries in a field-list are called
field-groups.)

Field-lists are parsed into an ordered sequence of one-based field numbers. Repeating
fields are allowed. Some examples of numeric fields with the `tsv-select` tool:

$(CONSOLE
    $ tsv-select -f 3         # Field 3
    $ tsv-select -f 3-5       # Fields 3,4,5
    $ tsv-select -f 7,3-5     # Fields 7,3,4,5
    $ tsv-select -f 3,5-3,5   # Fields 3,5,4,3,5
)

Fields specified by name must match a name in the header line of the input data.
Glob-style wildcards are supported using the asterisk (`*`) character. When
wildcards are used with a single field, all matching fields in the header are used.
When used in a field range, each field name must match exactly one header field.

Consider a file `data.tsv` containing timing information:

$(CONSOLE
    $ tsv-pretty data.tsv
    run  elapsed_time  user_time  system_time  max_memory
      1          57.5       52.0          5.5        1420
      2          52.0       49.0          3.0        1270
      3          55.5       51.0          4.5        1410
)

The header fields are:

```
    1   run
    2   elapsed_time
    3   user_time
    4   system_time
    5   max_memory
```

Some examples using named fields for this file.
(Note: `-H` turns on header processing):

$(CONSOLE
    $ tsv-select data.tsv -H -f user_time           # Field 3
    $ tsv-select data.tsv -H -f run,user_time       # Fields 1,3
    $ tsv-select data.tsv -H -f run-user_time       # Fields 1,2,3
    $ tsv-select data.tsv -H -f '*_memory'          # Field 5
    $ tsv-select data.tsv -H -f '*_time'            # Fields 2,3,4
    $ tsv-select data.tsv -H -f '*_time,*_memory'   # Fields 2,3,4,5
    $ tsv-select data.tsv -H -f '*_memory,*_time'   # Fields 5,2,3,4
    $ tsv-select data.tsv -H -f 'run-*_time'        # Invalid range. '*_time' matches 3 fields
)

Field numbers and field names can both be used in the same field-list, except
when specifying a field range:

$(CONSOLE
    $ tsv-select data.tsv -H -f 1,user_time         # Fields 1,3
    $ tsv-select data.tsv -H -f 1-user_time         # Invalid range
)

A backslash is used to escape special characters occurring in field names. Characters
that must be escaped when specifying field names are: asterisk (`*`), comma (`,`),
colon (`:`), space (` `), hyphen (`-`), and backslash (`\`). A backslash is also used
to escape numbers that should be treated as field names rather than field numbers.
Consider a file with the following header fields:
```
    1   test id
    2   run:id
    3   time-stamp
    4   001
    5   100
```

These fields can be used in named field commands as follows:

$(CONSOLE
    $ tsv-select file.tsv -H -f 'test\ id'          # Field 1
    $ tsv-select file.tsv -H -f 'run\:id'           # Field 2
    $ tsv-select file.tsv -H -f 'time\-stamp'       # Field 3
    $ tsv-select file.tsv -H -f '\001'              # Field 4
    $ tsv-select file.tsv -H -f '\100'              # Field 5
    $ tsv-select file.tsv -H -f '\001,\100'         # Fields 4,5
)

$(NOTE Note: The use of single quotes on the command line is necessary to avoid shell
interpretation of the backslash character.)

Field-lists are combined with other content in some command line options. The colon
and space characters are both terminator characters for field-lists.
Some examples:

$(CONSOLE
    $ tsv-filter -H --lt 3:100                        # Field 3 < 100
    $ tsv-filter -H --lt elapsed_time:100             # 'elapsed_time' field < 100
    $ tsv-summarize -H --quantile '*_time:0.25,0.75'  # 1st and 3rd quartiles for time fields
)

Field-list support routines identify the termination of the field-list. They do not
do any processing of content occurring after the field-list.

# Numeric field-lists

The original field-lists used in tsv-utils were numeric only. This is still the
format used when a header line is not available. They are a strict subset of the
field-list syntax described above. Due to this history there are support routines
that only support numeric field-lists. They are used by tools supporting only numeric
field-lists. They are also used by the more general field-list processing routines in
this file when a named field or field range can be reduced to a numeric field-group.

# Field-list utilities

The following functions provide the APIs for field-list processing:

$(LIST
    * [parseFieldList] - The main routine for parsing a field-list entered on the
      command line. It returns a range iterating over the field numbers represented
      by the field-list. It handles both numeric and named field-lists and works with
      or without header lines. The range has a special member function that tracks how
      much of the original input range has been consumed.

    * [parseNumericFieldList] - This is a top-level routine for processing numeric
      field-lists entered on the command line. It was the original routine used by
      tsv-utils tools when only numeric field-lists were supported. It is still
      used in cases where only numeric field-lists are supported.

    * [makeFieldListOptionHandler] - Returns a delegate that can be passed to
      std.getopt for parsing numeric field-lists. It was part of the original code
      supporting numeric field-lists.
      Note that delegates passed to std.getopt do
      not have access to the header line of the input file, so the technique can
      only be used for numeric field-lists.

    * [fieldListHelpText] - A global variable containing help text describing the
      field-list syntax that can be shown to end users.
)

The following private functions handle key parts of the implementation:

$(LIST
    * [findFieldGroups] - Range that iterates over the "field-groups" in a
      "field-list".

    * [isNumericFieldGroup] - Determines if a field-group is a valid numeric
      field-group.

    * [isNumericFieldGroupWithHyphenFirstOrLast] - Determines if a field-group is a
      valid numeric field-group, except for having a leading or trailing hyphen.
      This test is used to provide better error messages. A field-group that does not
      pass either [isNumericFieldGroup] or [isNumericFieldGroupWithHyphenFirstOrLast]
      is processed as a named field-group.

    * [isMixedNumericNamedFieldGroup] - Determines if a field-group is a range where
      one element is a field number and the other element is a named field (not a
      number). This is used for error handling.

    * [namedFieldGroupToRegex] - Generates regexes for matching field names in a
      field-group to field names in the header line. One regex is generated for a
      single field, two are generated for a range. Wildcards and escape characters
      are translated into the correct regex format.

    * [namedFieldRegexMatches] - Returns an input range iterating over all the
      fields (strings) in a range matching a regular expression. It is used in
      conjunction with [namedFieldGroupToRegex] to find the fields in a header line
      matching a regular expression and map them to field numbers.

    * [parseNumericFieldGroup] - A helper function that parses a numeric field-group
      (a string) and returns a range that iterates over all the field numbers
      in the field-group.
      A numeric field-group is either a single number or a
      range. E.g. `5` or `5-8`. This routine was part of the original code
      supporting only numeric field-lists.
)
*/

module tsv_utils.common.fieldlist;

import std.exception : enforce;
import std.format : format;
import std.range;
import std.regex;
import std.stdio;
import std.traits : isIntegral, isNarrowString, isUnsigned, ReturnType, Unqual;
import std.typecons : tuple, Tuple;

/**
    fieldListHelpText is text intended for display to end users to describe the
    field-list syntax.
*/
immutable fieldListHelpText = q"EOS
tsv-utils Field Syntax

Most tsv-utils tools operate on fields specified on the command line. All
tools use the same syntax to identify fields. tsv-select is used in this
document for examples, but the syntax shown applies to all tools.

Fields can be identified either by a one-upped field number or by field
name. Field names require the first line of input data to be a header with
field names. Header line processing is enabled by the '--H|header' option.

Some command options only accept a single field, but many operate on lists
of fields. Here are some examples (using tsv-select):

  $ tsv-select -f 1,2 file.tsv            # Selection using field numbers
  $ tsv-select -f 5-9 file.txt            # Selection using a range
  $ tsv-select -H -f RecordID file.txt    # Selection using a field name
  $ tsv-select -H -f Date,Time,3,5-7,9    # Mix of names, numbers, ranges

Wildcards: Named fields support a simple 'glob' style wildcarding scheme.
The asterisk character ('*') can be used to match any sequence of
characters, including no characters. This is similar to how '*' can be
used to match file names on the Unix command line. All fields with
matching names are selected, so wildcards are a convenient way to select
a set of related fields.
Quotes should be placed around command line
arguments containing wildcards to avoid interpretation by the shell.

Examples - Consider a file 'data.tsv' containing timing information:

  $ tsv-pretty data.tsv
  run  elapsed_time  user_time  system_time  max_memory
    1          57.5       52.0          5.5        1420
    2          52.0       49.0          3.0        1270
    3          55.5       51.0          4.5        1410

Some examples selecting fields from this file:

  $ tsv-select data.tsv -H -f 3              # Field 3 (user_time)
  $ tsv-select data.tsv -H -f user_time      # Field 3
  $ tsv-select data.tsv -H -f run,user_time  # Fields 1,3
  $ tsv-select data.tsv -H -f '*_memory'     # Field 5
  $ tsv-select data.tsv -H -f '*_time'       # Fields 2,3,4
  $ tsv-select data.tsv -H -f 1-3            # Fields 1,2,3
  $ tsv-select data.tsv -H -f run-user_time  # Fields 1,2,3 (range with names)

Special characters: There are several special characters that need to be
escaped when specifying field names. Escaping is done by preceding the
special character with a backslash. Characters requiring escapes are:
asterisk (`*`), comma (`,`), colon (`:`), space (` `), hyphen (`-`), and
backslash (`\`). A field name that contains only digits also needs to be
backslash escaped; this indicates it should be treated as a field name
and not a field number. A backslash can be used to escape any character,
so it's not necessary to remember the list. Use an escape when not sure.

Examples - Consider a file with five fields named as follows:

  1   test id
  2   run:id
  3   time-stamp
  4   001
  5   100

Some examples specifying these fields by name:

  $ tsv-select file.tsv -H -f 'test\ id'          # Field 1
  $ tsv-select file.tsv -H -f '\test\ id'         # Field 1
  $ tsv-select file.tsv -H -f 'run\:id'           # Field 2
  $ tsv-select file.tsv -H -f 'time\-stamp'       # Field 3
  $ tsv-select file.tsv -H -f '\001'              # Field 4
  $ tsv-select file.tsv -H -f '\100'              # Field 5
  $ tsv-select file.tsv -H -f '\001,\100'         # Fields 4,5
EOS";

/**
   The `convertToZeroBasedIndex` flag is used as a template parameter controlling
   whether field numbers are converted to zero-based indices. It is used by
   [parseFieldList], [parseNumericFieldList], and [makeFieldListOptionHandler].
*/
alias ConvertToZeroBasedIndex = Flag!"convertToZeroBasedIndex";

/**
   The `allowFieldNumZero` flag is used as a template parameter controlling
   whether zero is a valid field. It is used by [parseFieldList],
   [parseNumericFieldList], and [makeFieldListOptionHandler].
*/
alias AllowFieldNumZero = Flag!"allowFieldNumZero";

/**
   The `consumeEntireFieldListString` flag is used as a template parameter
   indicating whether the entire field-list string should be consumed. It is
   used by [parseNumericFieldList].
*/
alias ConsumeEntireFieldListString = Flag!"consumeEntireFieldListString";

/**
`parseFieldList` returns a range iterating over the field numbers in a field-list.

`parseFieldList` is the main routine for parsing field-lists entered on the command
line. It handles both numeric and named field-lists. The elements of the returned
range are a sequence of one-based field numbers corresponding to the fields specified
in the field-list string.

An error is thrown if the field-list string is malformed.
The error text is
intended for display to the user invoking the tsv-utils tool from the command
line.

Named field-lists require an array of field names from the header line. Named
fields are allowed only if a header line is available. Using a named field-list
without a header line generates an error message referencing the headerCmdArg
string as a hint to the end user.

Several optional modes of operation are available:

$(LIST
    * Conversion to zero-based indexes (`convertToZero` template parameter) - Returns
      the field numbers as zero-based array indices rather than one-based field numbers.

    * Allow zero as a field number (`allowZero` template parameter) - This allows zero
      to be used as a field number. This is typically used to allow the user to
      specify the entire line rather than an individual field. Use a signed result
      type if also using `convertToZero`, as zero will be returned as (-1).

    * Consuming the entire field list string (`consumeEntire` template parameter) - By
      default, an error is thrown if the entire field-list string is not consumed.
      This is the most common behavior. Turning this off (the `No` option) will
      terminate processing without error when a valid field-list termination character
      is found. The `parseFieldList.consumed` member function can be used to see where
      in the input string processing terminated.
)

The optional `cmdOptionString` and `headerCmdArg` arguments are used to generate better
error messages. `cmdOptionString` should be the command line arguments string passed to
`std.getopt`, e.g. `"f|field"`. This is added to the error message. Callers already
adding the option name to the error message should pass the empty string.

The `headerCmdArg` argument should be the option for turning on header line processing.
This is standard for tsv-utils tools (`--H|header`), so most tsv-utils tools will use
the default value.

`parseFieldList` returns a reference range. This is so the `consumed` member function
remains valid when using the range with facilities that would copy a value-based
range.
*/
auto parseFieldList(T = size_t,
                    ConvertToZeroBasedIndex convertToZero = No.convertToZeroBasedIndex,
                    AllowFieldNumZero allowZero = No.allowFieldNumZero,
                    ConsumeEntireFieldListString consumeEntire = Yes.consumeEntireFieldListString)
(string fieldList, bool hasHeader = false, string[] headerFields = [],
 string cmdOptionString = "", string headerCmdArg = "H|header")
if (isIntegral!T && (!allowZero || !convertToZero || !isUnsigned!T))
{
    final class Result
    {
        private string _fieldList;
        private bool _hasHeader;
        private string[] _headerFields;
        private string _cmdOptionMsgPart;
        private string _headerCmdArg;
        private ReturnType!(findFieldGroups!string) _fieldGroupRange;
        private bool _isFrontNumericRange;
        private ReturnType!(parseNumericFieldGroup!(T, convertToZero, allowZero)) _numericFieldRange;
        private ReturnType!(namedFieldRegexMatches!(T, convertToZero, string[])) _namedFieldMatches;
        private size_t _consumed;

        this(string fieldList, bool hasHeader, string[] headerFields,
             string cmdOptionString, string headerCmdArg)
        {
            _fieldList = fieldList;
            _hasHeader = hasHeader;
            _headerFields = headerFields.dup;
            if (!cmdOptionString.empty) _cmdOptionMsgPart = "[--" ~ cmdOptionString ~ "] ";
            if (!headerCmdArg.empty) _headerCmdArg = "--" ~ headerCmdArg;
            _fieldGroupRange = findFieldGroups(fieldList);

            /* _namedFieldMatches must be initialized in the constructor because it
             * is a nested struct.
             */
            _namedFieldMatches = namedFieldRegexMatches!(T, convertToZero)(["X"], ctRegex!`^No Match$`);

            try
            {
                consumeNextFieldGroup();
                enforce(!empty, format("Empty field list: '%s'.", _fieldList));
            }
            catch (Exception e)
            {
                throw new Exception(_cmdOptionMsgPart ~ e.msg);
            }

            assert(_consumed <= _fieldList.length);
        }

        private void consumeNextFieldGroup()
        {
            if (!_fieldGroupRange.empty)
            {
                auto fieldGroup = _fieldGroupRange.front.value;
                _consumed = _fieldGroupRange.front.consumed;
                _fieldGroupRange.popFront;

                enforce(!fieldGroup.isNumericFieldGroupWithHyphenFirstOrLast,
                        format("Incomplete ranges are not supported: '%s'.",
                               fieldGroup));

                if (fieldGroup.isNumericFieldGroup)
                {
                    _isFrontNumericRange = true;
                    _numericFieldRange =
                        parseNumericFieldGroup!(T, convertToZero, allowZero)(fieldGroup);
                }
                else
                {
                    enforce(_hasHeader,
                            format("Non-numeric field group: '%s'. Use '%s' when using named field groups.",
                                   fieldGroup, _headerCmdArg));

                    enforce(!fieldGroup.isMixedNumericNamedFieldGroup,
                            format("Ranges with both numeric and named components are not supported: '%s'.",
                                   fieldGroup));

                    auto fieldGroupRegex = namedFieldGroupToRegex(fieldGroup);

                    if (!fieldGroupRegex[1].empty)
                    {
                        /* A range formed by a pair of field names. Find the field
                         * numbers and generate the string form of the numeric
                         * field-group. Pass this to parseNumericFieldGroup.
                         */
                        auto f0 = namedFieldRegexMatches(_headerFields, fieldGroupRegex[0]).array;
                        auto f1 = namedFieldRegexMatches(_headerFields, fieldGroupRegex[1]).array;

                        string hintMsg = "Not specifying a range? Backslash escape any hyphens in the field name.";

                        enforce(f0.length > 0,
                                format("First field in range not found in file header. Range: '%s'.\n%s",
                                       fieldGroup, hintMsg));
                        enforce(f1.length > 0,
                                format("Second field in range not found in file header. Range: '%s'.\n%s",
                                       fieldGroup, hintMsg));
                        enforce(f0.length == 1,
                                format("First field in range matches multiple header fields. Range: '%s'.\n%s",
                                       fieldGroup, hintMsg));
                        enforce(f1.length == 1,
                                format("Second field in range matches multiple header fields. Range: '%s'.\n%s",
                                       fieldGroup, hintMsg));

                        _isFrontNumericRange = true;
                        auto fieldGroupAsNumericRange = format("%d-%d", f0[0][0], f1[0][0]);
                        _numericFieldRange =
                            parseNumericFieldGroup!(T, convertToZero, allowZero)(fieldGroupAsNumericRange);
                    }
                    else
                    {
                        /* The message must go through format(); passing fieldGroup as a
                         * third argument to enforce would set the `file` parameter instead.
                         */
                        enforce(!fieldGroupRegex[0].empty,
                                format("Empty field list entry: '%s'.", fieldGroup));

                        _isFrontNumericRange = false;
                        _namedFieldMatches =
                            namedFieldRegexMatches!(T, convertToZero)(_headerFields, fieldGroupRegex[0]);

                        enforce(!_namedFieldMatches.empty,
                                format("Field not found in file header: '%s'.", fieldGroup));
                    }
                }
            }
        }

        bool empty() @safe
        {
            return _fieldGroupRange.empty &&
                (_isFrontNumericRange ? _numericFieldRange.empty : _namedFieldMatches.empty);
        }

        @property T front() @safe
        {
            assert(!empty, "Attempting to fetch the front of an empty field list.");
            return _isFrontNumericRange ? _numericFieldRange.front : _namedFieldMatches.front[0];
        }

        void popFront() @safe
        {

            /* TODO: Move these definitions to a common location in the file. */
            enum char SPACE = ' ';
            enum char COLON = ':';

            assert(!empty, "Attempting to popFront an empty field-list.");

            try
            {
                if (_isFrontNumericRange) _numericFieldRange.popFront;
                else _namedFieldMatches.popFront;

                if (_isFrontNumericRange ?
                    _numericFieldRange.empty : _namedFieldMatches.empty)
                {
                    consumeNextFieldGroup();
                }

                assert(_consumed <= _fieldList.length);

                if (empty)
                {
                    static if (consumeEntire)
                    {
                        enforce(_consumed == _fieldList.length,
                                format("Invalid field list: '%s'.", _fieldList));
                    }
                    else
                    {
                        enforce((_consumed == _fieldList.length ||
                                 _fieldList[_consumed] == SPACE ||
                                 _fieldList[_consumed] == COLON),
                                format("Invalid field list: '%s'.", _fieldList));
                    }
                }
            }
            catch (Exception e)
            {
                throw new Exception(_cmdOptionMsgPart ~ e.msg);
            }
        }

        size_t consumed() const nothrow pure @safe
        {
            return _consumed;
        }
    }

    return new Result(fieldList, hasHeader, headerFields, cmdOptionString, headerCmdArg);
}

/// Basic cases showing how `parseFieldList` works
@safe unittest
{
    import std.algorithm : each, equal;

    string[] emptyHeader = [];

    // Numeric field-lists, with no header line.
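    /* Illustrative addition (not part of the original tests): the numeric
     * `tsv-select` examples from the module documentation, restated as
     * assertions. They show that ranges expand inclusively, may run in
     * descending order, and that repeated fields are preserved in order.
     */
    assert(`3-5`.parseFieldList.equal([3, 4, 5]));
    assert(`7,3-5`.parseFieldList.equal([7, 3, 4, 5]));
    assert(`3,5-3,5`.parseFieldList.equal([3, 5, 4, 3, 5]));
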
    assert(`5`.parseFieldList
           .equal([5]));

    assert(`10`.parseFieldList(false, emptyHeader)
           .equal([10]));

    assert(`1-3,17`.parseFieldList(false, emptyHeader)
           .equal([1, 2, 3, 17]));

    // General field lists, when a header line is available
    assert(`5,1-3`.parseFieldList(true, [`f1`, `f2`, `f3`, `f4`, `f5`])
           .equal([5, 1, 2, 3]));

    assert(`f1`.parseFieldList(true, [`f1`, `f2`, `f3`])
           .equal([1]));

    assert(`f3`.parseFieldList(true, [`f1`, `f2`, `f3`])
           .equal([3]));

    assert(`f1-f3`.parseFieldList(true, [`f1`, `f2`, `f3`])
           .equal([1, 2, 3]));

    assert(`f3-f1`.parseFieldList(true, [`f1`, `f2`, `f3`])
           .equal([3, 2, 1]));

    assert(`f*`.parseFieldList(true, [`f1`, `f2`, `f3`])
           .equal([1, 2, 3]));

    assert(`B*`.parseFieldList(true, [`A1`, `A2`, `B1`, `B2`])
           .equal([3, 4]));

    assert(`*2`.parseFieldList(true, [`A1`, `A2`, `B1`, `B2`])
           .equal([2, 4]));

    assert(`1-2,f4`.parseFieldList(true, [`f1`, `f2`, `f3`, `f4`, `f5`])
           .equal([1, 2, 4]));

    /* The next few examples are closer to the code that would really be
     * used during command line arg processing.
     */
    {
        string getoptOption = "f|fields";
        bool hasHeader = true;
        auto headerFields = [`A1`, `A2`, `B1`, `B2`];
        auto fieldListCmdArg = `B*,A1`;
        auto fieldNumbers = fieldListCmdArg.parseFieldList(hasHeader, headerFields, getoptOption);
        assert(fieldNumbers.equal([3, 4, 1]));
        assert(fieldNumbers.consumed == fieldListCmdArg.length);
    }
    {
        /* Supplementary options after the field-list.
         */
        string getoptOption = "f|fields";
        bool hasHeader = false;
        string[] headerFields;
        auto fieldListCmdArg = `3,4:option`;
        auto fieldNumbers =
            fieldListCmdArg.parseFieldList!(size_t, No.convertToZeroBasedIndex,
                                            No.allowFieldNumZero, No.consumeEntireFieldListString)
            (hasHeader, headerFields, getoptOption);
        assert(fieldNumbers.equal([3, 4]));
        assert(fieldNumbers.consumed == 3);
        assert(fieldListCmdArg[fieldNumbers.consumed .. $] == `:option`);
    }
    {
        /* Supplementary options after the field-list. */
        string getoptOption = "f|fields";
        bool hasHeader = true;
        auto headerFields = [`A1`, `A2`, `B1`, `B2`];
        auto fieldListCmdArg = `B*:option`;
        auto fieldNumbers =
            fieldListCmdArg.parseFieldList!(size_t, No.convertToZeroBasedIndex,
                                            No.allowFieldNumZero, No.consumeEntireFieldListString)
            (hasHeader, headerFields, getoptOption);
        assert(fieldNumbers.equal([3, 4]));
        assert(fieldNumbers.consumed == 2);
        assert(fieldListCmdArg[fieldNumbers.consumed .. $] == `:option`);
    }
    {
        /* Supplementary options after the field-list. */
        string getoptOption = "f|fields";
        bool hasHeader = true;
        auto headerFields = [`A1`, `A2`, `B1`, `B2`];
        auto fieldListCmdArg = `B* option`;
        auto fieldNumbers =
            fieldListCmdArg.parseFieldList!(size_t, No.convertToZeroBasedIndex,
                                            No.allowFieldNumZero, No.consumeEntireFieldListString)
            (hasHeader, headerFields, getoptOption);
        assert(fieldNumbers.equal([3, 4]));
        assert(fieldNumbers.consumed == 2);
        assert(fieldListCmdArg[fieldNumbers.consumed .. $] == ` option`);
    }
    {
        /* Mixed numeric and named fields.
         */
        string getoptOption = "f|fields";
        bool hasHeader = true;
        auto headerFields = [`A1`, `A2`, `B1`, `B2`];
        auto fieldListCmdArg = `B2,1`;
        auto fieldNumbers =
            fieldListCmdArg.parseFieldList!(size_t, No.convertToZeroBasedIndex,
                                            No.allowFieldNumZero, No.consumeEntireFieldListString)
            (hasHeader, headerFields, getoptOption);
        assert(fieldNumbers.equal([4, 1]));
        assert(fieldNumbers.consumed == fieldListCmdArg.length);
    }
}

// parseFieldList - Empty and erroneous field list tests
@safe unittest
{
    import std.exception : assertThrown, assertNotThrown;

    assertThrown(``.parseFieldList);
    assertThrown(`,`.parseFieldList);
    assertThrown(`:`.parseFieldList);
    assertThrown(` `.parseFieldList);
    assertThrown(`\`.parseFieldList);
    assertThrown(`,x`.parseFieldList);
    assertThrown(`:option`.parseFieldList);
    assertThrown(` option`.parseFieldList);
    assertThrown(`:1-3`.parseFieldList);

    {
        string getoptOption = "f|fields";
        string cmdHeaderOption = "header";
        bool hasHeader = true;
        auto headerFields = [`A1`, `A2`, `B1`, `B2`];
        auto fieldListCmdArg = `XYZ`;
        size_t[] fieldNumbers;
        bool wasCaught = false;
        try fieldNumbers = fieldListCmdArg.parseFieldList(hasHeader, headerFields, getoptOption).array;
        catch (Exception e)
        {
            wasCaught = true;
            assert(e.msg == "[--f|fields] Field not found in file header: 'XYZ'.");
        }
        finally assert(wasCaught);
    }
    {
        string getoptOption = "f|fields";
        bool hasHeader = false;   // hasHeader=false triggers this error.
        auto headerFields = [`A1`, `A2`, `B1`, `B2`];
        auto fieldListCmdArg = `A1`;
        size_t[] fieldNumbers;
        bool wasCaught = false;

        try fieldNumbers = fieldListCmdArg.parseFieldList(hasHeader, headerFields, getoptOption).array;
        catch (Exception e)
        {
            wasCaught = true;
            assert(e.msg == "[--f|fields] Non-numeric field group: 'A1'. Use '--H|header' when using named field groups.");
        }
        finally assert(wasCaught);

        string cmdHeaderOption = "ZETA";

        // Reset the flag so the second check cannot pass on the first exception alone.
        wasCaught = false;

        try fieldNumbers = fieldListCmdArg.parseFieldList(hasHeader, headerFields, getoptOption, cmdHeaderOption).array;
        catch (Exception e)
        {
            wasCaught = true;
            assert(e.msg == "[--f|fields] Non-numeric field group: 'A1'. Use '--ZETA' when using named field groups.");
        }
        finally assert(wasCaught);
    }
    {
        bool hasHeader = true;
        auto headerFields = [`A1`, `A2`, `B1`, `B2`];

        assertThrown(`XYZ`.parseFieldList(hasHeader, headerFields));
        assertThrown(`XYZ-B1`.parseFieldList(hasHeader, headerFields));
        assertThrown(`B1-XYZ`.parseFieldList(hasHeader, headerFields));
        assertThrown(`A*-B1`.parseFieldList(hasHeader, headerFields));
        assertThrown(`B1-A*`.parseFieldList(hasHeader, headerFields));
        assertThrown(`B1-`.parseFieldList(hasHeader, headerFields));
        assertThrown(`-A1`.parseFieldList(hasHeader, headerFields));
        assertThrown(`A1-3`.parseFieldList(hasHeader, headerFields));
        assertThrown(`1-A3`.parseFieldList(hasHeader, headerFields));
    }

}

// parseFieldList - Named field groups
@safe unittest
{
    import std.algorithm : each, equal;

    bool hasHeader = true;
    auto singleFieldHeader = [`a`];

    assert(`a`.parseFieldList(hasHeader, singleFieldHeader)
           .equal([1]));

    assert(`a*`.parseFieldList(hasHeader, singleFieldHeader)
           .equal([1]));

    assert(`*a`.parseFieldList(hasHeader, singleFieldHeader)
           .equal([1]));

    assert(`*a*`.parseFieldList(hasHeader, singleFieldHeader)
           .equal([1]));

    assert(`*`.parseFieldList(hasHeader, singleFieldHeader)
           .equal([1]));

    auto twoFieldHeader = [`f1`, `f2`];

    assert(`f1`.parseFieldList(hasHeader, twoFieldHeader)
           .equal([1]));

    assert(`f2`.parseFieldList(hasHeader, twoFieldHeader)
           .equal([2]));

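    /* Illustrative addition (not part of the original tests): the backslash-escape
     * examples from the module documentation, restated as assertions. The header
     * `escapeHeader` is a hypothetical name for the five-field header used in that
     * documentation. The expected values are taken from the documented examples
     * and assume the escaping behaves as described there: a backslash escapes
     * space, colon, and hyphen, and marks all-digit names as field names rather
     * than field numbers.
     */
    {
        auto escapeHeader = [`test id`, `run:id`, `time-stamp`, `001`, `100`];

        assert(`test\ id`.parseFieldList(hasHeader, escapeHeader).equal([1]));
        assert(`run\:id`.parseFieldList(hasHeader, escapeHeader).equal([2]));
        assert(`time\-stamp`.parseFieldList(hasHeader, escapeHeader).equal([3]));
        assert(`\001`.parseFieldList(hasHeader, escapeHeader).equal([4]));
        assert(`\100`.parseFieldList(hasHeader, escapeHeader).equal([5]));
        assert(`\001,\100`.parseFieldList(hasHeader, escapeHeader).equal([4, 5]));
    }
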
    assert(`f1,f2`.parseFieldList(hasHeader, twoFieldHeader)
           .equal([1, 2]));

    assert(`f2,f1`.parseFieldList(hasHeader, twoFieldHeader)
           .equal([2, 1]));

    assert(`f1-f2`.parseFieldList(hasHeader, twoFieldHeader)
           .equal([1, 2]));

    assert(`f2-f1`.parseFieldList(hasHeader, twoFieldHeader)
           .equal([2, 1]));

    assert(`*`.parseFieldList(hasHeader, twoFieldHeader)
           .equal([1, 2]));

    auto multiFieldHeader = [`f1`, `f2`, `x`, `01`, `02`, `3`, `snow storm`, `雪风暴`, `Tempête de neige`, `x`];

    assert(`*`.parseFieldList(hasHeader, multiFieldHeader)
           .equal([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]));

    assert(`*2`.parseFieldList(hasHeader, multiFieldHeader)
           .equal([2, 5]));

    assert(`snow*`.parseFieldList(hasHeader, multiFieldHeader)
           .equal([7]));

    assert(`snow\ storm`.parseFieldList(hasHeader, multiFieldHeader)
           .equal([7]));

    assert(`雪风暴`.parseFieldList(hasHeader, multiFieldHeader)
           .equal([8]));

    assert(`雪风*`.parseFieldList(hasHeader, multiFieldHeader)
           .equal([8]));

    assert(`*风*`.parseFieldList(hasHeader, multiFieldHeader)
           .equal([8]));

    assert(`Tempête\ de\ neige`.parseFieldList(hasHeader, multiFieldHeader)
           .equal([9]));

    assert(`x`.parseFieldList(hasHeader, multiFieldHeader)
           .equal([3, 10]));

    /* Convert to zero - A subset of the above tests.
     */
    assert(`a`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, singleFieldHeader)
           .equal([0]));

    assert(`a*`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, singleFieldHeader)
           .equal([0]));

    assert(`f1`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, twoFieldHeader)
           .equal([0]));

    assert(`f2`.parseFieldList!(long, Yes.convertToZeroBasedIndex)(hasHeader, twoFieldHeader)
           .equal([1]));

    assert(`f2,f1`.parseFieldList!(int, Yes.convertToZeroBasedIndex)(hasHeader, twoFieldHeader)
           .equal([1, 0]));

    assert(`f2-f1`.parseFieldList!(uint, Yes.convertToZeroBasedIndex)(hasHeader, twoFieldHeader)
           .equal([1, 0]));

    assert(`*`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, multiFieldHeader)
           .equal([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]));

    assert(`*2`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, multiFieldHeader)
           .equal([1, 4]));

    assert(`snow*`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, multiFieldHeader)
           .equal([6]));

    assert(`snow\ storm`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, multiFieldHeader)
           .equal([6]));

    assert(`雪风暴`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, multiFieldHeader)
           .equal([7]));

    assert(`雪风*`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, multiFieldHeader)
           .equal([7]));

    assert(`x`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex)(hasHeader, multiFieldHeader)
           .equal([2, 9]));

    /* Allow zero tests.
     */
    assert(`0,f1`.parseFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)
           (hasHeader, twoFieldHeader)
           .equal([-1, 0]));

    assert(`f2,0`.parseFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)
           (hasHeader, twoFieldHeader)
           .equal([1, -1]));

    assert(`f2,f1,0`.parseFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)
           (hasHeader, twoFieldHeader)
           .equal([2, 1, 0]));

    assert(`0,f2-f1`.parseFieldList!(uint, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)
           (hasHeader, twoFieldHeader)
           .equal([0, 2, 1]));

    assert(`*,0`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)
           (hasHeader, multiFieldHeader)
           .equal([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0]));

    assert(`0,snow\ storm`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)
           (hasHeader, multiFieldHeader)
           .equal([0, 7]));
}

// parseFieldList - The same tests as used for parseNumericFieldGroup
@safe unittest
{
    import std.algorithm : each, equal;
    import std.exception : assertThrown, assertNotThrown;

    /* Basic tests.
*/ 862 assert(`1`.parseFieldList.equal([1])); 863 assert(`1,2`.parseFieldList.equal([1, 2])); 864 assert(`1,2,3`.parseFieldList.equal([1, 2, 3])); 865 assert(`1-2`.parseFieldList.equal([1, 2])); 866 assert(`1-2,6-4`.parseFieldList.equal([1, 2, 6, 5, 4])); 867 assert(`1-2,1,1-2,2,2-1`.parseFieldList.equal([1, 2, 1, 1, 2, 2, 2, 1])); 868 assert(`1-2,5`.parseFieldList!size_t.equal([1, 2, 5])); 869 870 /* Signed Int tests */ 871 assert(`1`.parseFieldList!int.equal([1])); 872 assert(`1,2,3`.parseFieldList!int.equal([1, 2, 3])); 873 assert(`1-2`.parseFieldList!int.equal([1, 2])); 874 assert(`1-2,6-4`.parseFieldList!int.equal([1, 2, 6, 5, 4])); 875 assert(`1-2,5`.parseFieldList!int.equal([1, 2, 5])); 876 877 /* Convert to zero tests */ 878 assert(`1`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0])); 879 assert(`1,2,3`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0, 1, 2])); 880 assert(`1-2`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0, 1])); 881 assert(`1-2,6-4`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0, 1, 5, 4, 3])); 882 assert(`1-2,5`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0, 1, 4])); 883 884 assert(`1`.parseFieldList!(long, Yes.convertToZeroBasedIndex).equal([0])); 885 assert(`1,2,3`.parseFieldList!(long, Yes.convertToZeroBasedIndex).equal([0, 1, 2])); 886 assert(`1-2`.parseFieldList!(long, Yes.convertToZeroBasedIndex).equal([0, 1])); 887 assert(`1-2,6-4`.parseFieldList!(long, Yes.convertToZeroBasedIndex).equal([0, 1, 5, 4, 3])); 888 assert(`1-2,5`.parseFieldList!(long, Yes.convertToZeroBasedIndex).equal([0, 1, 4])); 889 890 /* Allow zero tests. 
*/ 891 assert(`0`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0])); 892 assert(`1,0,3`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([1, 0, 3])); 893 assert(`1-2,5`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([1, 2, 5])); 894 assert(`0`.parseFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0])); 895 assert(`1,0,3`.parseFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([1, 0, 3])); 896 assert(`1-2,5`.parseFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([1, 2, 5])); 897 assert(`0`.parseFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([-1])); 898 assert(`1,0,3`.parseFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0, -1, 2])); 899 assert(`1-2,5`.parseFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0, 1, 4])); 900 901 /* Error cases. */ 902 assertThrown(``.parseFieldList.each); 903 assertThrown(` `.parseFieldList.each); 904 assertThrown(`,`.parseFieldList.each); 905 assertThrown(`5 6`.parseFieldList.each); 906 assertThrown(`,7`.parseFieldList.each); 907 assertThrown(`8,`.parseFieldList.each); 908 assertThrown(`8,9,`.parseFieldList.each); 909 assertThrown(`10,,11`.parseFieldList.each); 910 assertThrown(``.parseFieldList!(long, Yes.convertToZeroBasedIndex).each); 911 assertThrown(`1,2-3,`.parseFieldList!(long, Yes.convertToZeroBasedIndex).each); 912 assertThrown(`2-,4`.parseFieldList!(long, Yes.convertToZeroBasedIndex).each); 913 assertThrown(`1,2,3,,4`.parseFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).each); 914 assertThrown(`,7`.parseFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).each); 915 assertThrown(`8,`.parseFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).each); 916 assertThrown(`10,0,,11`.parseFieldList!(long, Yes.convertToZeroBasedIndex, 
Yes.allowFieldNumZero).each); 917 assertThrown(`8,9,`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).each); 918 assertThrown(`0`.parseFieldList.each); 919 assertThrown(`1,0,3`.parseFieldList.each); 920 assertThrown(`0`.parseFieldList!(int, Yes.convertToZeroBasedIndex, No.allowFieldNumZero).each); 921 assertThrown(`1,0,3`.parseFieldList!(int, Yes.convertToZeroBasedIndex, No.allowFieldNumZero).each); 922 assertThrown(`0-2,6-0`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).each); 923 assertThrown(`0-2,6-0`.parseFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).each); 924 assertThrown(`0-2,6-0`.parseFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).each); 925 } 926 927 // parseFieldList - Subset of tests used for parseNumericFieldGroup, but allowing non-consumed characters. 928 @safe unittest 929 { 930 import std.algorithm : each, equal; 931 import std.exception : assertThrown, assertNotThrown; 932 933 /* Basic tests. 
*/ 934 assert(`1`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 935 .equal([1])); 936 assert(`1,2`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 937 .equal([1, 2])); 938 assert(`1,2,3`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 939 .equal([1, 2, 3])); 940 assert(`1-2`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 941 .equal([1, 2])); 942 assert(`1-2,6-4`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 943 .equal([1, 2, 6, 5, 4])); 944 assert(`1-2,1,1-2,2,2-1`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 945 .equal([1, 2, 1, 1, 2, 2, 2, 1])); 946 assert(`1-2,5`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 947 .equal([1, 2, 5])); 948 949 /* Signed Int tests. 
*/ 950 assert(`1`.parseFieldList!(int, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 951 .equal([1])); 952 assert(`1,2,3`.parseFieldList!(int, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 953 .equal([1, 2, 3])); 954 assert(`1-2`.parseFieldList!(int, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 955 .equal([1, 2])); 956 assert(`1-2,6-4`.parseFieldList!(int, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 957 .equal([1, 2, 6, 5, 4])); 958 assert(`1-2,5`.parseFieldList!(int, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 959 .equal([1, 2, 5])); 960 961 /* Convert to zero tests */ 962 assert(`1`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 963 .equal([0])); 964 assert(`1,2,3`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 965 .equal([0, 1, 2])); 966 assert(`1-2`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 967 .equal([0, 1])); 968 assert(`1-2,6-4`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 969 .equal([0, 1, 5, 4, 3])); 970 assert(`1-2,5`.parseFieldList!(size_t, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString) 971 .equal([0, 1, 4])); 972 973 /* Allow zero tests. 
*/ 974 assert(`0`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString) 975 .equal([0])); 976 assert(`1,0,3`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString) 977 .equal([1, 0, 3])); 978 assert(`1-2,5`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString) 979 .equal([1, 2, 5])); 980 assert(`0`.parseFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString) 981 .equal([-1])); 982 assert(`1,0,3`.parseFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString) 983 .equal([0, -1, 2])); 984 assert(`1-2,5`.parseFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString) 985 .equal([0, 1, 4])); 986 987 /* Error cases. */ 988 assertThrown(``.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 989 assertThrown(` `.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 990 assertThrown(`,`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 991 assertThrown(`,7`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 992 993 assertThrown(``.parseFieldList!(long, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 994 assertThrown(`2-,4`.parseFieldList!(long, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 995 assertThrown(`,7`.parseFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString).each); 996 997 assertThrown(`0`.parseFieldList!(int, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 
998 assertThrown(`1,0,3`.parseFieldList!(int, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 999 1000 assertThrown(`0`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 1001 assertThrown(`1,0,3`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each); 1002 1003 assertThrown(`0-2,6-0`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString).each); 1004 assertThrown(`0-2,6-0`.parseFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString).each); 1005 assertThrown(`0-2,6-0`.parseFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString).each); 1006 1007 /* Allowed termination without consuming entire string. */ 1008 { 1009 auto x = `5:abc`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString); 1010 assert(x.equal([5])); 1011 assert(x.consumed == 1); 1012 } 1013 1014 { 1015 auto x = `1-3,6-10:abc`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString); 1016 assert(x.equal([1, 2, 3, 6, 7, 8, 9, 10])); 1017 assert(x.consumed == 8); 1018 } 1019 1020 { 1021 auto x = `1-3,6-10 xyz`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString); 1022 assert(x.equal([1, 2, 3, 6, 7, 8, 9, 10])); 1023 assert(x.consumed == 8); 1024 } 1025 1026 { 1027 auto x = `5 6`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString); 1028 assert(x.equal([5])); 1029 assert(x.consumed == 1); 1030 } 1031 1032 /* Invalid termination when not consuming the entire string. 
 */
    assertThrown(`8,`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each);
    assertThrown(`8,9,`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each);
    assertThrown(`10,,11`.parseFieldList!(size_t, No.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each);
    assertThrown(`1,2-3,`.parseFieldList!(long, Yes.convertToZeroBasedIndex, No.allowFieldNumZero, No.consumeEntireFieldListString).each);
    assertThrown(`1,2,3,,4`.parseFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString).each);
    assertThrown(`8,`.parseFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString).each);
    assertThrown(`10,0,,11`.parseFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString).each);
    assertThrown(`8,9,`.parseFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero, No.consumeEntireFieldListString).each);
}

/**
`findFieldGroups` creates a range that iterates over the 'field-groups' in a 'field-list'.
(Private function.)

Input is typically a string or character array. The range becomes empty when the end
of input is reached or an unescaped field-list terminator character is found.

A 'field-list' is a comma separated list of 'field-groups'. A 'field-group' is a
single numeric or named field, or a hyphen-separated pair of numeric or named fields.
For example:

```
1,3,4-7          # 3 numeric field-groups
field_a,field_b  # 2 named fields
```

Each element in the range is represented by a tuple of two values:

$(LIST
    * consumed - The total index positions consumed by the range so far
    * value - A slice containing the text of the field-group.
)

The field-group slice does not contain the separator character, but the separator is
included in the total consumed. The field-group tuples from the previous examples:

```
Input: 1,3,4-7
   tuple(1, "1")
   tuple(3, "3")
   tuple(7, "4-7")

Input: field_a,field_b
   tuple(7, "field_a")
   tuple(15, "field_b")
```

The details of field-groups are not material to this routine. It is only concerned
with finding the boundaries between field-groups and the termination boundary for the
field-list. This is relatively straightforward. The main parsing concern is the use
of the escape character when delimiter characters are included in field names.

Field-groups are separated by a single comma (','). A field-list is terminated by a
colon (':') or space (' ') character. Comma, colon, and space characters can be
included in a field-group by preceding them with a backslash. A backslash not
intended as an escape character must also be backslash escaped.

A field-list is also terminated if an unescaped trailing backslash or a pair of
consecutive commas is encountered. This is normally an error, but handling of these
cases is left to the caller.

Additional characters need to be backslash escaped inside field-groups, the asterisk
('*') and hyphen ('-') characters in particular. However, this routine only needs to
be aware of characters that affect field-list and field-group boundaries, which are
the set listed above.

Backslash escape sequences are recognized but not removed from field-groups.

Field and record delimiter characters (usually TAB and newline) are not handled by
this routine. They cannot be used in field names as there is no way to represent them
in the header line. However, it is not necessary for this routine to check for them.
These checks occur naturally when processing header lines.
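
The boundary rules above can be sketched as a small eager function. This is only an
illustration, not the lazy range implemented below: the name `splitFieldGroups` is
hypothetical, it accepts only `string` input, and it returns an array rather than a
range. The `consumed` value is the absolute index just past each field-group.

```d
import std.typecons : Tuple, tuple;

// Hypothetical helper (not part of this module). Splits a field-list into
// (consumed, value) pairs, stopping at the first field-list terminator.
Tuple!(size_t, string)[] splitFieldGroups(string s)
{
    Tuple!(size_t, string)[] groups;
    size_t pos = 0;

    while (pos < s.length)
    {
        size_t start = pos;

        if (groups.length > 0)
        {
            // Only a single comma continues the list. A colon, space, or
            // trailing backslash ends it.
            if (s[pos] != ',') break;
            start = pos + 1;
        }

        size_t end = start;
        while (end < s.length)
        {
            immutable c = s[end];
            if (c == ',' || c == ':' || c == ' ') break;
            if (c == '\\')
            {
                if (end + 1 == s.length) break;  // Nothing left to escape.
                end += 2;                        // Keep the escape sequence intact.
            }
            else
            {
                end += 1;
            }
        }

        if (end == start) break;  // Empty group: leading comma or consecutive commas.

        groups ~= tuple(end, s[start .. end]);
        pos = end;
    }

    return groups;
}

unittest
{
    auto g = splitFieldGroups(`1,3,4-7`);
    assert(g.length == 3);
    assert(g[2][0] == 7 && g[2][1] == "4-7");

    // Parsing stops at the colon. The text after it belongs to the caller.
    assert(splitFieldGroups(`1-3,4:5-7`).length == 2);
}
```

Returning the absolute end index keeps the sketch short. The range below reaches the
same totals by accumulating the length of each step.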
1105 1106 $(ALWAYS_DOCUMENT) 1107 */ 1108 private auto findFieldGroups(Range)(Range r) 1109 if (isInputRange!Range && 1110 (is(Unqual!(ElementEncodingType!Range) == char) || is(Unqual!(ElementEncodingType!Range) == ubyte)) && 1111 (isNarrowString!Range || (isRandomAccessRange!Range && 1112 hasSlicing!Range && 1113 hasLength!Range)) 1114 ) 1115 { 1116 static struct Result 1117 { 1118 private alias R = Unqual!Range; 1119 private alias Char = ElementType!R; 1120 private alias ResultType = Tuple!(size_t, "consumed", R, "value"); 1121 1122 private R _input; 1123 private R _front; 1124 private size_t _consumed; 1125 1126 this(Range data) nothrow pure @safe 1127 { 1128 auto fieldGroup = nextFieldGroup!true(data); 1129 assert(fieldGroup.start == 0); 1130 1131 _front = data[0 .. fieldGroup.end]; 1132 _consumed = fieldGroup.end; 1133 _input = data[fieldGroup.end .. $]; 1134 1135 // writefln("[this] data: '%s', _front: '%s', _input: '%s', _frontEnd: %d", data, _front, _input, _frontEnd); 1136 } 1137 1138 bool empty() const nothrow pure @safe 1139 { 1140 return _front.empty; 1141 } 1142 1143 ResultType front() const nothrow pure @safe 1144 { 1145 assert(!empty, "Attempt to take the front of an empty findFieldGroups."); 1146 1147 return ResultType(_consumed, _front); 1148 } 1149 1150 void popFront() nothrow pure @safe 1151 { 1152 assert(!empty, "Attempt to popFront an empty findFieldGroups."); 1153 1154 auto fieldGroup = nextFieldGroup!false(_input); 1155 1156 // writefln("[popFront] _input: '%s', next start: %d, next end: %d", _input, fieldGroup.start, fieldGroup.end); 1157 1158 _front = _input[fieldGroup.start .. fieldGroup.end]; 1159 _consumed += fieldGroup.end; 1160 _input = _input[fieldGroup.end .. $]; 1161 } 1162 1163 /* Finds the start and end indexes of the next field-group. 1164 * 1165 * The start and end indexes exclude delimiter characters (comma, space, colon). 
         */
        private auto nextFieldGroup(bool isFirst)(R r) const nothrow pure @safe
        {
            alias RetType = Tuple!(size_t, "start", size_t, "end");

            enum Char COMMA = ',';
            enum Char BACKSLASH = '\\';
            enum Char SPACE = ' ';
            enum Char COLON = ':';

            if (r.empty) return RetType(0, 0);

            size_t start = 0;

            /* After the first field-group, skip the comma separating the groups. */
            static if (!isFirst)
            {
                if (r[0] == COMMA) start = 1;
            }

            size_t end = start;

            while (end < r.length)
            {
                Char lookingAt = r[end];

                if (lookingAt == COMMA || lookingAt == SPACE || lookingAt == COLON) break;

                if (lookingAt == BACKSLASH)
                {
                    /* A trailing backslash has nothing to escape. Stop here. */
                    if (end + 1 == r.length) break;

                    /* Keep the backslash and the escaped character together. */
                    end += 2;
                }
                else
                {
                    end += 1;
                }
            }

            return RetType(start, end);
        }
    }

    return Result(r);
}

// findFieldGroups
@safe unittest
{
    import std.algorithm : equal;

    /* Note: backticks generate string literals without escapes. */

    /* Immediate termination. */
    assert(``.findFieldGroups.empty);
    assert(`,`.findFieldGroups.empty);
    assert(`:`.findFieldGroups.empty);
    assert(` `.findFieldGroups.empty);
    assert(`\`.findFieldGroups.empty);

    assert(`,1`.findFieldGroups.empty);
    assert(`:1`.findFieldGroups.empty);
    assert(` 1`.findFieldGroups.empty);

    /* Common cases.
*/ 1230 assert(equal(`1`.findFieldGroups, 1231 [tuple(1, `1`) 1232 ])); 1233 1234 assert(equal(`1,2`.findFieldGroups, 1235 [tuple(1, `1`), 1236 tuple(3, `2`) 1237 ])); 1238 1239 assert(equal(`1,2,3`.findFieldGroups, 1240 [tuple(1, `1`), 1241 tuple(3, `2`), 1242 tuple(5, `3`) 1243 ])); 1244 1245 assert(equal(`1-3`.findFieldGroups, 1246 [tuple(3, `1-3`) 1247 ])); 1248 1249 assert(equal(`1-3,5,7-2`.findFieldGroups, 1250 [tuple(3, `1-3`), 1251 tuple(5, `5`), 1252 tuple(9, `7-2`) 1253 ])); 1254 1255 assert(equal(`field1`.findFieldGroups, 1256 [tuple(6, `field1`) 1257 ])); 1258 1259 assert(equal(`field1,field2`.findFieldGroups, 1260 [tuple(6, `field1`), 1261 tuple(13, `field2`) 1262 ])); 1263 1264 assert(equal(`field1-field5`.findFieldGroups, 1265 [tuple(13, `field1-field5`) 1266 ])); 1267 1268 assert(equal(`snow\ storm,雪风暴,Tempête\ de\ neige,Χιονοθύελλα,吹雪`.findFieldGroups, 1269 [tuple(11, `snow\ storm`), 1270 tuple(21, `雪风暴`), 1271 tuple(41, `Tempête\ de\ neige`), 1272 tuple(64, `Χιονοθύελλα`), 1273 tuple(71, `吹雪`) 1274 ])); 1275 1276 /* Escape sequences. */ 1277 assert(equal(`Field\ 1,Field\ 2,Field\ 5-Field\ 11`.findFieldGroups, 1278 [tuple(8, `Field\ 1`), 1279 tuple(17, `Field\ 2`), 1280 tuple(36, `Field\ 5-Field\ 11`) 1281 ])); 1282 1283 assert(equal(`Jun\ 03\-08,Jul\ 14\-23`.findFieldGroups, 1284 [tuple(11, `Jun\ 03\-08`), 1285 tuple(23, `Jul\ 14\-23`) 1286 ])); 1287 1288 assert(equal(`field\:1`.findFieldGroups, 1289 [tuple(8, `field\:1`) 1290 ])); 1291 1292 assert(equal(`\\,\,,\:,\ ,\a`.findFieldGroups, 1293 [tuple(2, `\\`), 1294 tuple(5, `\,`), 1295 tuple(8, `\:`), 1296 tuple(11, `\ `), 1297 tuple(14, `\a`) 1298 ])); 1299 1300 assert(equal(`\001,\a\b\c\ \ \-\d,fld\*1`.findFieldGroups, 1301 [tuple(4, `\001`), 1302 tuple(19, `\a\b\c\ \ \-\d`), 1303 tuple(26, `fld\*1`) 1304 ])); 1305 1306 /* field-list termination. 
 */
    assert(equal(`X:`.findFieldGroups,
                 [tuple(1, `X`)
                 ]));

    assert(equal(`X `.findFieldGroups,
                 [tuple(1, `X`)
                 ]));

    assert(equal(`X\`.findFieldGroups,
                 [tuple(1, `X`)
                 ]));

    assert(equal(`1-3:5-7`.findFieldGroups,
                 [tuple(3, `1-3`)
                 ]));

    assert(equal(`1-3,4:5-7`.findFieldGroups,
                 [tuple(3, `1-3`),
                  tuple(5, `4`)
                 ]));

    assert(equal(`abc,,def`.findFieldGroups,
                 [tuple(3, `abc`),
                 ]));

    assert(equal(`abc,,`.findFieldGroups,
                 [tuple(3, `abc`),
                 ]));

    assert(equal(`abc,`.findFieldGroups,
                 [tuple(3, `abc`),
                 ]));

    /* Leading, trailing, or solo hyphen. Captured for error handling. */
    assert(equal(`-1,1-,-`.findFieldGroups,
                 [tuple(2, `-1`),
                  tuple(5, `1-`),
                  tuple(7, `-`)
                 ]));
}

/**
`isNumericFieldGroup` determines if a field-group is a valid numeric field-group.
(Private function.)

A numeric field-group is a single non-negative integer or a pair of non-negative
integers separated by a hyphen.

Note that zero is valid by this definition, even though it is usually disallowed as a
field number, except when representing the entire line.
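
The definition can also be checked by hand, without a regular expression. The sketch
below is an illustration only. The name `looksNumericFieldGroup` is hypothetical and
the routine below does not use this approach.

```d
// Hypothetical helper (not part of this module). Accepts one run of ASCII
// digits, optionally followed by a hyphen and a second run of digits.
bool looksNumericFieldGroup(const char[] fieldGroup)
{
    size_t i = 0;

    // Advances past a run of digits and returns the length of the run.
    size_t digitRun()
    {
        immutable begin = i;
        while (i < fieldGroup.length && fieldGroup[i] >= '0' && fieldGroup[i] <= '9') ++i;
        return i - begin;
    }

    if (digitRun() == 0) return false;        // Must start with a digit.
    if (i == fieldGroup.length) return true;  // Single field number.
    if (fieldGroup[i] != '-') return false;   // Only a hyphen may follow.
    ++i;

    // The second number must be present and must end the field-group.
    return digitRun() > 0 && i == fieldGroup.length;
}

unittest
{
    assert(looksNumericFieldGroup("3-5"));
    assert(looksNumericFieldGroup("0-0"));   // Zero is valid at this level.
    assert(!looksNumericFieldGroup("1-"));
    assert(!looksNumericFieldGroup(`\1`));   // Escaped: a field name, not a number.
}
```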

$(ALWAYS_DOCUMENT)
*/
private bool isNumericFieldGroup(const char[] fieldGroup) @safe
{
    return cast(bool) fieldGroup.matchFirst(ctRegex!`^[0-9]+(-[0-9]+)?$`);
}

@safe unittest
{
    import std.conv : to;

    assert(!isNumericFieldGroup(``));
    assert(!isNumericFieldGroup(`-`));
    assert(!isNumericFieldGroup(`\1`));
    assert(!isNumericFieldGroup(`\01`));
    assert(!isNumericFieldGroup(`1-`));
    assert(!isNumericFieldGroup(`-1`));
    assert(!isNumericFieldGroup(`a`));
    assert(!isNumericFieldGroup(`a1`));
    assert(!isNumericFieldGroup(`1.1`));

    assert(isNumericFieldGroup(`1`));
    assert(isNumericFieldGroup(`0123456789`));
    assert(isNumericFieldGroup(`0-0`));
    assert(isNumericFieldGroup(`3-5`));
    assert(isNumericFieldGroup(`30-5`));
    assert(isNumericFieldGroup(`0123456789-0123456789`));

    assert(`0123456789-0123456789`.to!(char[]).isNumericFieldGroup);
}

/**
`isNumericFieldGroupWithHyphenFirstOrLast` determines if a field-group is a field
number with a leading or trailing hyphen. (Private function.)

This routine is used for better error handling. Currently, incomplete field ranges
are not supported. An incomplete range leaves off the first or last field, defaulting
to the start or end of the line. This syntax is available in `cut`, e.g.

$(CONSOLE
    $ cut -f 2-
)

In `cut`, this represents field 2 to the end of the line. This routine identifies
these forms so an error message specific to this case can be generated.
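
A caller that wants to tailor its message might first tell the two incomplete forms
apart. The sketch below is an assumption about how that could look, not code from
this module: `RangeForm` and `classifyIncompleteRange` are hypothetical names, and
the routine below only reports whether either form is present.

```d
import std.regex : matchFirst, regex;

// Hypothetical classification, for illustration only.
enum RangeForm { other, openStart, openEnd }

RangeForm classifyIncompleteRange(const char[] fieldGroup)
{
    // Leading hyphen followed by a field number, e.g. "-3".
    if (!fieldGroup.matchFirst(regex(`^-[0-9]+$`)).empty) return RangeForm.openStart;

    // Field number followed by a trailing hyphen, e.g. "2-".
    if (!fieldGroup.matchFirst(regex(`^[0-9]+-$`)).empty) return RangeForm.openEnd;

    return RangeForm.other;
}

unittest
{
    assert(classifyIncompleteRange("2-") == RangeForm.openEnd);
    assert(classifyIncompleteRange("-3") == RangeForm.openStart);

    // An escaped hyphen is part of a field name, not an incomplete range.
    assert(classifyIncompleteRange(`\-3`) == RangeForm.other);
}
```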

$(ALWAYS_DOCUMENT)
*/
private bool isNumericFieldGroupWithHyphenFirstOrLast(const char[] fieldGroup) @safe
{
    return cast(bool) fieldGroup.matchFirst(ctRegex!`^((\-[0-9]+)|([0-9]+\-))$`);
}

@safe unittest
{
    assert(!isNumericFieldGroupWithHyphenFirstOrLast(``));
    assert(!isNumericFieldGroupWithHyphenFirstOrLast(`-`));
    assert(!isNumericFieldGroupWithHyphenFirstOrLast(`1-2`));
    assert(!isNumericFieldGroupWithHyphenFirstOrLast(`-a`));
    assert(isNumericFieldGroupWithHyphenFirstOrLast(`-1`));
    assert(isNumericFieldGroupWithHyphenFirstOrLast(`-12`));
    assert(isNumericFieldGroupWithHyphenFirstOrLast(`1-`));
    assert(isNumericFieldGroupWithHyphenFirstOrLast(`12-`));
    assert(!isNumericFieldGroupWithHyphenFirstOrLast(`-1333-`));
    assert(!isNumericFieldGroupWithHyphenFirstOrLast(`\-1`));
    assert(!isNumericFieldGroupWithHyphenFirstOrLast(`\-12`));
    assert(!isNumericFieldGroupWithHyphenFirstOrLast(`1\-`));
    assert(!isNumericFieldGroupWithHyphenFirstOrLast(`12\-`));
}

/**
`isMixedNumericNamedFieldGroup` determines if a field group is a range where one
element is a field number and the other element is a named field (not a number).

This routine is used for better error handling. Currently, field ranges must be
either entirely numeric or entirely named. This is primarily to catch unintended
use of a mixed range on the command line.

$(ALWAYS_DOCUMENT)
*/
private bool isMixedNumericNamedFieldGroup(const char[] fieldGroup) @safe
{
    /* Pattern cases:
     * - Field group starts with a series of digits followed by a hyphen, followed
     *   by a sequence containing a non-digit character.
     *     ^([0-9]+\-.*[^0-9].*)$
     * - Field ends with an unescaped hyphen and a series of digits.
     *   Two start cases:
     *   - Non-digit, non-backslash immediately preceding the hyphen
     *       ^(.*[^0-9\\]\-[0-9]+)$
     *   - Digit immediately preceding the hyphen, non-digit earlier
     *       ^(.*[^0-9].*[0-9]\-[0-9]+)$
     *   These two combined:
     *     ^( ( (.*[^0-9\\]) | (.*[^0-9].*[0-9]) ) \-[0-9]+ )$
     *
     * All cases combined:
     *   ^( ([0-9]+\-.*[^0-9].*) | ( (.*[^0-9\\]) | (.*[^0-9].*[0-9]) ) \-[0-9]+)$
     */
    return cast(bool) fieldGroup.matchFirst(ctRegex!`^(([0-9]+\-.*[^0-9].*)|((.*[^0-9\\])|(.*[^0-9].*[0-9]))\-[0-9]+)$`);
}

@safe unittest
{
    assert(isMixedNumericNamedFieldGroup(`1-g`));
    assert(isMixedNumericNamedFieldGroup(`y-2`));
    assert(isMixedNumericNamedFieldGroup(`23-zy`));
    assert(isMixedNumericNamedFieldGroup(`pB-37`));

    assert(isMixedNumericNamedFieldGroup(`5x-0`));
    assert(isMixedNumericNamedFieldGroup(`x5-9`));
    assert(isMixedNumericNamedFieldGroup(`0-2m`));
    assert(isMixedNumericNamedFieldGroup(`9-m2`));
    assert(isMixedNumericNamedFieldGroup(`5x-37`));
    assert(isMixedNumericNamedFieldGroup(`x5-37`));
    assert(isMixedNumericNamedFieldGroup(`37-2m`));
    assert(isMixedNumericNamedFieldGroup(`37-m2`));

    assert(isMixedNumericNamedFieldGroup(`18-23t`));
    assert(isMixedNumericNamedFieldGroup(`x12-632`));
    assert(isMixedNumericNamedFieldGroup(`15-15.5`));

    assert(isMixedNumericNamedFieldGroup(`1-g\-h`));
    assert(isMixedNumericNamedFieldGroup(`z\-y-2`));
    assert(isMixedNumericNamedFieldGroup(`23-zy\-st`));
    assert(isMixedNumericNamedFieldGroup(`ts\-pB-37`));

    assert(!isMixedNumericNamedFieldGroup(`a-c`));
    assert(!isMixedNumericNamedFieldGroup(`1-3`));
    assert(!isMixedNumericNamedFieldGroup(`\1-g`));
    assert(!isMixedNumericNamedFieldGroup(`-g`));
    assert(!isMixedNumericNamedFieldGroup(`h-`));
    assert(!isMixedNumericNamedFieldGroup(`-`));
    assert(!isMixedNumericNamedFieldGroup(``));
    assert(!isMixedNumericNamedFieldGroup(`\2-\3`));
    assert(!isMixedNumericNamedFieldGroup(`\10-\20`));
    assert(!isMixedNumericNamedFieldGroup(`x`));
    assert(!isMixedNumericNamedFieldGroup(`xyz`));
    assert(!isMixedNumericNamedFieldGroup(`0`));
    assert(!isMixedNumericNamedFieldGroup(`9`));

    assert(!isMixedNumericNamedFieldGroup(`1\-g`));
    assert(!isMixedNumericNamedFieldGroup(`y\-2`));
    assert(!isMixedNumericNamedFieldGroup(`23\-zy`));
    assert(!isMixedNumericNamedFieldGroup(`pB\-37`));
    assert(!isMixedNumericNamedFieldGroup(`18\-23t`));
    assert(!isMixedNumericNamedFieldGroup(`x12\-632`));

    assert(!isMixedNumericNamedFieldGroup(`5x\-0`));
    assert(!isMixedNumericNamedFieldGroup(`x5\-9`));
    assert(!isMixedNumericNamedFieldGroup(`0\-2m`));
    assert(!isMixedNumericNamedFieldGroup(`9\-m2`));
    assert(!isMixedNumericNamedFieldGroup(`5x\-37`));
    assert(!isMixedNumericNamedFieldGroup(`x5\-37`));
    assert(!isMixedNumericNamedFieldGroup(`37\-2m`));
    assert(!isMixedNumericNamedFieldGroup(`37\-m2`));

    assert(!isMixedNumericNamedFieldGroup(`1\-g\-h`));
    assert(!isMixedNumericNamedFieldGroup(`z\-y\-2`));
    assert(!isMixedNumericNamedFieldGroup(`23\-zy\-st`));
    assert(!isMixedNumericNamedFieldGroup(`ts\-pB\-37`));

    assert(!isMixedNumericNamedFieldGroup(`\-g`));
    assert(!isMixedNumericNamedFieldGroup(`h\-`));
    assert(!isMixedNumericNamedFieldGroup(`i\-j`));
    assert(!isMixedNumericNamedFieldGroup(`\-2`));
    assert(!isMixedNumericNamedFieldGroup(`2\-`));
    assert(!isMixedNumericNamedFieldGroup(`2\-3`));
    assert(!isMixedNumericNamedFieldGroup(`\2\-\3`));
}

/**
`namedFieldGroupToRegex` generates regular expressions for matching fields in a named
field-group to field names in a header line. (Private function.)

One regex is generated for a single field, two are generated for a range.
These are returned as a tuple with a pair of regex instances. The first regex is used
for single field entries and the first entry of a range. The second regex is filled
with the second entry of a range and is empty otherwise. (Test with `empty`.)

This routine converts all field-list escape and wildcard syntax into the necessary
regular expression syntax. Backslash escaped characters are converted to their plain
characters and asterisk wildcarding (glob style) is converted to regex syntax.

Regular expressions include beginning and end of string markers. This is intended for
matching field names after they have been extracted from the header line.

Most field-group syntax errors requiring end-user error messages should be detected
elsewhere in field-list processing. The exception is field names with a non-escaped
leading or trailing hyphen. A user-appropriate error message is thrown for this case.
Other erroneous inputs result in both regexes being set empty.

There is no detection of numeric field-groups. If a numeric field-group is passed in,
it will be treated as a named field-group and regular expressions will be generated.
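
The escape and wildcard conversion can be sketched for the single-name case as
follows. This is a simplified illustration under stated assumptions: the name
`globToRegex` is hypothetical, it works on code points rather than graphemes, and it
does not handle ranges or the hyphen errors described above.

```d
import std.array : appender;
import std.conv : to;
import std.regex : escaper, matchFirst, regex;

// Hypothetical helper (not part of this module). Converts one field name
// with optional backslash escapes and asterisk wildcards to an anchored regex.
auto globToRegex(string name)
{
    auto pattern = appender!(dchar[])();
    bool isEscaped = false;

    foreach (dchar c; name)
    {
        if (isEscaped)
        {
            // Backslash-escaped character: match it literally.
            pattern.put([c].escaper);
            isEscaped = false;
        }
        else if (c == '\\')
        {
            isEscaped = true;
        }
        else if (c == '*')
        {
            // Glob wildcard becomes "match anything".
            pattern.put(".*"d);
        }
        else
        {
            pattern.put([c].escaper);
        }
    }

    // Anchor both ends so the whole field name must match.
    return ("^"d ~ pattern.data ~ "$").to!string.regex;
}

unittest
{
    assert(!"user_time".matchFirst(globToRegex("*_time")).empty);
    assert("max_memory".matchFirst(globToRegex("*_time")).empty);

    // An escaped asterisk matches only a literal asterisk.
    assert(!"abc*".matchFirst(globToRegex(`abc\*`)).empty);
    assert("abcd".matchFirst(globToRegex(`abc\*`)).empty);
}
```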

$(ALWAYS_DOCUMENT)
*/
private auto namedFieldGroupToRegex(const char[] fieldGroup)
{
    import std.array : appender;
    import std.conv : to;
    import std.uni : byCodePoint, byGrapheme;

    import std.stdio;

    enum dchar BACKSLASH = '\\';
    enum dchar HYPHEN = '-';
    enum dchar ASTERISK = '*';

    /* Anchors the pattern so it must match an entire field name. */
    auto createRegex(const dchar[] basePattern)
    {
        return ("^"d ~ basePattern ~ "$").to!string.regex;
    }

    Regex!char field1Regex;
    Regex!char field2Regex;

    auto regexString = appender!(dchar[])();

    bool hyphenSeparatorFound = false;
    bool isEscaped = false;
    foreach (g; fieldGroup.byGrapheme)
    {
        if (isEscaped)
        {
            /* Preceded by a backslash: match the grapheme literally. */
            put(regexString, [g].byCodePoint.escaper);
            isEscaped = false;
        }
        else if (g.length == 1)
        {
            if (g[0] == HYPHEN)
            {
                /* Unescaped hyphen: the range separator. Only one is allowed and it
                 * cannot be the first character.
                 */
                enforce(!hyphenSeparatorFound && regexString.data.length != 0,
                        format("Hyphens in field names must be backslash escaped unless separating two field names: '%s'.",
                               fieldGroup));

                assert(field1Regex.empty);

                field1Regex = createRegex(regexString.data);
                hyphenSeparatorFound = true;
                regexString.clear;
            }
            else if (g[0] == BACKSLASH)
            {
                isEscaped = true;
            }
            else if (g[0] == ASTERISK)
            {
                /* Glob wildcard. */
                put(regexString, ".*"d);
            }
            else
            {
                put(regexString, [g].byCodePoint.escaper);
            }
        }
        else
        {
            /* Multi-code point grapheme: always a literal. */
            put(regexString, [g].byCodePoint.escaper);
        }
    }

    /* A range separator must be followed by a second field name. */
    enforce(!hyphenSeparatorFound || regexString.data.length != 0,
            format("Hyphens in field names must be backslash escaped unless separating two field names: '%s'.",
                   fieldGroup));

    if (!hyphenSeparatorFound)
    {
        if (regexString.data.length != 0) field1Regex = createRegex(regexString.data);
    }
    else field2Regex = createRegex(regexString.data);

    return tuple(field1Regex, field2Regex);
}

@safe unittest
{
    import std.algorithm : all, equal;
    import std.exception : assertThrown;

    /* Use when both regexes should be empty. */
    void testBothRegexEmpty(string test, Tuple!(Regex!char, Regex!char) regexPair)
    {
        assert(regexPair[0].empty, format("[namedFieldGroupToRegex: %s]", test));
        assert(regexPair[1].empty, format("[namedFieldGroupToRegex: %s]", test));
    }

    /* Use when there should only be one regex. */
    void testFirstRegexMatches(string test, Tuple!(Regex!char, Regex!char) regexPair,
                               string[] regex1Matches)
    {
        assert(!regexPair[0].empty, format("[namedFieldGroupToRegex: %s]", test));
        assert(regexPair[1].empty, format("[namedFieldGroupToRegex: %s]", test));

        assert(regex1Matches.all!(s => s.matchFirst(regexPair[0])),
               format("[namedFieldGroupToRegex: %s] regex: %s; strings: %s",
                      test, regexPair[0], regex1Matches));
    }

    /* Use when there should be two regexes with matches. */
    void testBothRegexMatches(string test, Tuple!(Regex!char, Regex!char) regexPair,
                              const (char[])[] regex1Matches, const (char[])[] regex2Matches)
    {
        assert(!regexPair[0].empty, format("[namedFieldGroupToRegex: %s]", test));
        assert(!regexPair[1].empty, format("[namedFieldGroupToRegex: %s]", test));

        assert(regex1Matches.all!(s => s.matchFirst(regexPair[0])),
               format("[namedFieldGroupToRegex: %s] regex1: %s; strings: %s",
                      test, regexPair[0], regex1Matches));

        assert(regex2Matches.all!(s => s.matchFirst(regexPair[1])),
               format("[namedFieldGroupToRegex: %s] regex2: %s; strings: %s",
                      test, regexPair[1], regex2Matches));
    }

    /* Invalid hyphen use. These are the only error cases. */
    assertThrown(`-`.namedFieldGroupToRegex);
    assertThrown(`a-`.namedFieldGroupToRegex);
    assertThrown(`-a`.namedFieldGroupToRegex);
    assertThrown(`a-b-`.namedFieldGroupToRegex);
    assertThrown(`a-b-c`.namedFieldGroupToRegex);

    /* Some special cases. These cases are caught elsewhere and errors signaled to the
     * user. namedFieldGroupToRegex should just send back empty.
     */
    testBothRegexEmpty(`test-empty-1`, ``.namedFieldGroupToRegex);
    testBothRegexEmpty(`test-empty-2`, `\`.namedFieldGroupToRegex);

    /* Single name cases. */
    testFirstRegexMatches(`test-single-1`, `a`.namedFieldGroupToRegex, [`a`]);
    testFirstRegexMatches(`test-single-2`, `\a`.namedFieldGroupToRegex, [`a`]);
    testFirstRegexMatches(`test-single-3`, `abc`.namedFieldGroupToRegex, [`abc`]);
    testFirstRegexMatches(`test-single-4`, `abc*`.namedFieldGroupToRegex, [`abc`, `abcd`, `abcde`]);
    testFirstRegexMatches(`test-single-5`, `*`.namedFieldGroupToRegex, [`a`, `ab`, `abc`, `abcd`, `abcde`, `*`]);
    testFirstRegexMatches(`test-single-6`, `abc\*`.namedFieldGroupToRegex, [`abc*`]);
    testFirstRegexMatches(`test-single-7`, `abc{}`.namedFieldGroupToRegex, [`abc{}`]);
    testFirstRegexMatches(`test-single-8`, `\002`.namedFieldGroupToRegex, [`002`]);
    testFirstRegexMatches(`test-single-9`, `\\002`.namedFieldGroupToRegex, [`\002`]);
    testFirstRegexMatches(`test-single-10`, `With A Space`.namedFieldGroupToRegex, [`With A Space`]);
    testFirstRegexMatches(`test-single-11`, `With\-A\-Hyphen`.namedFieldGroupToRegex, [`With-A-Hyphen`]);
    testFirstRegexMatches(`test-single-11b`, `\a\b\c\d\e\f\g`.namedFieldGroupToRegex, [`abcdefg`]);
    testFirstRegexMatches(`test-single-12`, `雪风暴`.namedFieldGroupToRegex, [`雪风暴`]);
    testFirstRegexMatches(`test-single-13`, `\雪风暴`.namedFieldGroupToRegex, [`雪风暴`]);
    testFirstRegexMatches(`test-single-14`, `\雪\风\暴`.namedFieldGroupToRegex, [`雪风暴`]);
testFirstRegexMatches(`test-single-15`, `雪*`.namedFieldGroupToRegex, [`雪`]); 1699 testFirstRegexMatches(`test-single-16`, `雪*`.namedFieldGroupToRegex, [`雪风`]); 1700 testFirstRegexMatches(`test-single-17`, `雪*`.namedFieldGroupToRegex, [`雪风暴`]); 1701 testFirstRegexMatches(`test-single-18`, `g̈각நிกำषिkʷक्षि`.namedFieldGroupToRegex, [`g̈각நிกำषिkʷक्षि`]); 1702 testFirstRegexMatches(`test-single-19`, `*g̈각நிกำषिkʷक्षि*`.namedFieldGroupToRegex, [`XYZg̈각நிกำषिkʷक्षिPQR`]); 1703 1704 testBothRegexMatches(`test-pair-1`, `a-b`.namedFieldGroupToRegex, [`a`], [`b`]); 1705 testBothRegexMatches(`test-pair-2`, `\a-\b`.namedFieldGroupToRegex, [`a`], [`b`]); 1706 testBothRegexMatches(`test-pair-3`, `a*-b*`.namedFieldGroupToRegex, [`a`, `ab`, `abc`], [`b`, `bc`, `bcd`]); 1707 testBothRegexMatches(`test-pair-4`, `abc-bcd`.namedFieldGroupToRegex, [`abc`], [`bcd`]); 1708 testBothRegexMatches(`test-pair-5`, `a\-f-r\-t`.namedFieldGroupToRegex, [`a-f`], [`r-t`]); 1709 testBothRegexMatches(`test-pair-6`, `雪风暴-吹雪`.namedFieldGroupToRegex, [`雪风暴`], [`吹雪`]); 1710 testBothRegexMatches(`test-pair-7`, `நிกำ각-aिg̈क्षिkʷ`.namedFieldGroupToRegex, [`நிกำ각`], [`aिg̈क्षिkʷ`]); 1711 } 1712 1713 /** 1714 `namedFieldRegexMatches` returns an input range iterating over all the fields (strings) 1715 in an input range that match a regular expression. (Private function.) 1716 1717 This routine is used in conjunction with `namedFieldGroupToRegex` to find the set of 1718 header line fields that match a field in a field-group expression. The input is a 1719 range where the individual elements are strings, e.g. an array of strings. 1720 1721 The elements of the returned range are a tuple where the first element is the 1722 one-based field number of the matching field and the second is the matched field 1723 name. A zero-based index is returned if `convertToZero` is Yes. 1724 1725 The regular expression must not be empty. 
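
A brief usage sketch (illustrative only; the header fields are borrowed from the
`data.tsv` example in the module documentation, and `equal` is assumed to be imported
from `std.algorithm`):

```
auto regexPair = `*_time`.namedFieldGroupToRegex;
auto matches = [`run`, `elapsed_time`, `user_time`].namedFieldRegexMatches(regexPair[0]);
assert(matches.equal([tuple(2UL, `elapsed_time`), tuple(3UL, `user_time`)]));
```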

$(ALWAYS_DOCUMENT)
*/
private auto namedFieldRegexMatches(T = size_t,
                                    ConvertToZeroBasedIndex convertToZero = No.convertToZeroBasedIndex,
                                    Range)
(Range headerFields, Regex!char fieldRegex)
if (isInputRange!Range && is(ElementEncodingType!Range == string))
{
    import std.algorithm : filter;

    assert(!fieldRegex.empty);

    static if (convertToZero) enum T indexOffset = 0;
    else enum T indexOffset = 1;

    return headerFields
        .enumerate!(T)(indexOffset)
        .filter!(x => x[1].matchFirst(fieldRegex));
}

/* namedFieldRegexMatches tests. Some additional testing of namedFieldGroupToRegex,
 * though all the regex edge cases occur in the namedFieldGroupToRegex tests.
 */
@safe unittest
{
    import std.algorithm : equal;
    import std.array : array;

    void testBothRegexMatches(T = size_t,
                              ConvertToZeroBasedIndex convertToZero = No.convertToZeroBasedIndex)
    (string test, string[] headerFields,
     Tuple!(Regex!char, Regex!char) regexPair,
     Tuple!(T, string)[] regex0Matches,
     Tuple!(T, string)[] regex1Matches)
    {
        if (regexPair[0].empty)
        {
            /* An empty first regex cannot produce matches for the first regex. */
            assert(regex0Matches.empty,
                   format("[namedFieldRegexMatches: %s] (empty regex[0], non-empty matches)", test));
        }
        else
        {
            assert(equal(headerFields.namedFieldRegexMatches!(T, convertToZero)(regexPair[0]),
                         regex0Matches),
                   format("[namedFieldRegexMatches: %s] regex[0] mismatch\nExpected: %s\nActual : %s",
                          test, regex0Matches, headerFields.namedFieldRegexMatches!(T, convertToZero)(regexPair[0]).array));
        }

        if (regexPair[1].empty)
        {
            assert(regex1Matches.empty,
                   format("[namedFieldRegexMatches: %s] (empty regex[1], non-empty matches)", test));
        }
        else
        {
            assert(equal(headerFields.namedFieldRegexMatches!(T, convertToZero)(regexPair[1]),
                         regex1Matches),
                   format("[namedFieldRegexMatches: %s] regex[1] mismatch\nExpected: %s\nActual : %s",
                          test, regex1Matches, headerFields.namedFieldRegexMatches!(T, convertToZero)(regexPair[1]).array));
        }
    }

    Tuple!(size_t, string)[] emptyRegexMatch;

    testBothRegexMatches(
        "test-1",
        [`a`, `b`, `c`],              // Header line
        `a`.namedFieldGroupToRegex,   // field-group
        [ tuple(1UL, `a`) ],          // regex-0 expected match
        emptyRegexMatch);             // regex-1 expected match

    testBothRegexMatches(
        "test-2",
        [`a`, `b`, `c`],
        `b`.namedFieldGroupToRegex,
        [ tuple(2UL, `b`) ],
        emptyRegexMatch);

    testBothRegexMatches(
        "test-3",
        [`a`, `b`, `c`],
        `c`.namedFieldGroupToRegex,
        [ tuple(3UL, `c`) ],
        emptyRegexMatch);

    testBothRegexMatches(
        "test-4",
        [`a`, `b`, `c`],
        `x`.namedFieldGroupToRegex,
        emptyRegexMatch,
        emptyRegexMatch);

    testBothRegexMatches(
        "test-5",
        [`a`],
        `a`.namedFieldGroupToRegex,
        [ tuple(1UL, `a`) ],
        emptyRegexMatch);

    testBothRegexMatches(
        "test-6",
        [`abc`, `def`, `ghi`],
        `abc`.namedFieldGroupToRegex,
        [ tuple(1UL, `abc`) ],
        emptyRegexMatch);

    testBothRegexMatches(
        "test-7",
        [`x_abc`, `y_def`, `x_ghi`],
        `x_*`.namedFieldGroupToRegex,
        [ tuple(1UL, `x_abc`), tuple(3UL, `x_ghi`),],
        emptyRegexMatch);

    testBothRegexMatches(
        "test-8",
        [`x_abc`, `y_def`, `x_ghi`],
        `*`.namedFieldGroupToRegex,
        [ tuple(1UL, `x_abc`), tuple(2UL, `y_def`), tuple(3UL, `x_ghi`),],
        emptyRegexMatch);

    testBothRegexMatches(
        "test-9",
        [`a`, `b`, `c`],
        `a-c`.namedFieldGroupToRegex,
        [ tuple(1UL, `a`),],
        [ tuple(3UL, `c`),]);

    testBothRegexMatches(
        "test-10",
        [`a`, `b`, `c`],
        `c-a`.namedFieldGroupToRegex,
        [ tuple(3UL, `c`),],
        [ tuple(1UL, `a`),]);

    testBothRegexMatches(
        "test-11",
        [`a`, `b`, `c`],
`c*-a*`.namedFieldGroupToRegex, 1865 [ tuple(3UL, `c`),], 1866 [ tuple(1UL, `a`),]); 1867 1868 testBothRegexMatches( 1869 "test-12", 1870 [`abc`, `abc-def`, `def`], 1871 `abc-def`.namedFieldGroupToRegex, 1872 [ tuple(1UL, `abc`) ], 1873 [ tuple(3UL, `def`) ]); 1874 1875 testBothRegexMatches( 1876 "test-13", 1877 [`abc`, `abc-def`, `def`], 1878 `abc\-def`.namedFieldGroupToRegex, 1879 [ tuple(2UL, `abc-def`) ], 1880 emptyRegexMatch); 1881 1882 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1883 ("test-101", 1884 [`a`, `b`, `c`], 1885 `a`.namedFieldGroupToRegex, 1886 [ tuple(0UL, `a`) ], 1887 emptyRegexMatch); 1888 1889 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1890 ("test-102", 1891 [`a`, `b`, `c`], 1892 `b`.namedFieldGroupToRegex, 1893 [ tuple(1UL, `b`) ], 1894 emptyRegexMatch); 1895 1896 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1897 ("test-103", 1898 [`a`, `b`, `c`], 1899 `c`.namedFieldGroupToRegex, 1900 [ tuple(2UL, `c`) ], 1901 emptyRegexMatch); 1902 1903 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1904 ("test-104", 1905 [`a`, `b`, `c`], 1906 `x`.namedFieldGroupToRegex, 1907 emptyRegexMatch, 1908 emptyRegexMatch); 1909 1910 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1911 ("test-105", 1912 [`a`], 1913 `a`.namedFieldGroupToRegex, 1914 [ tuple(0UL, `a`) ], 1915 emptyRegexMatch); 1916 1917 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1918 ("test-106", 1919 [`abc`, `def`, `ghi`], 1920 `abc`.namedFieldGroupToRegex, 1921 [ tuple(0UL, `abc`) ], 1922 emptyRegexMatch); 1923 1924 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1925 ("test-107", 1926 [`x_abc`, `y_def`, `x_ghi`], 1927 `x_*`.namedFieldGroupToRegex, 1928 [ tuple(0UL, `x_abc`), tuple(2UL, `x_ghi`),], 1929 emptyRegexMatch); 1930 1931 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1932 ("test-108", 1933 [`x_abc`, `y_def`, `x_ghi`], 1934 `*`.namedFieldGroupToRegex, 1935 [ tuple(0UL, `x_abc`), 
tuple(1UL, `y_def`), tuple(2UL, `x_ghi`),], 1936 emptyRegexMatch); 1937 1938 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1939 ("test-109", 1940 [`a`, `b`, `c`], 1941 `a-c`.namedFieldGroupToRegex, 1942 [ tuple(0UL, `a`),], 1943 [ tuple(2UL, `c`),]); 1944 1945 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1946 ("test-110", 1947 [`a`, `b`, `c`], 1948 `c-a`.namedFieldGroupToRegex, 1949 [ tuple(2UL, `c`),], 1950 [ tuple(0UL, `a`),]); 1951 1952 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1953 ("test-111", 1954 [`a`, `b`, `c`], 1955 `c*-a*`.namedFieldGroupToRegex, 1956 [ tuple(2UL, `c`),], 1957 [ tuple(0UL, `a`),]); 1958 1959 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1960 ("test-112", 1961 [`abc`, `abc-def`, `def`], 1962 `abc-def`.namedFieldGroupToRegex, 1963 [ tuple(0UL, `abc`) ], 1964 [ tuple(2UL, `def`) ]); 1965 1966 testBothRegexMatches!(size_t, Yes.convertToZeroBasedIndex) 1967 ("test-113", 1968 [`abc`, `abc-def`, `def`], 1969 `abc\-def`.namedFieldGroupToRegex, 1970 [ tuple(1UL, `abc-def`) ], 1971 emptyRegexMatch); 1972 1973 Tuple!(int, string)[] intEmptyRegexMatch; 1974 Tuple!(uint, string)[] uintEmptyRegexMatch; 1975 Tuple!(long, string)[] longEmptyRegexMatch; 1976 1977 testBothRegexMatches!(int, Yes.convertToZeroBasedIndex) 1978 ("test-201", 1979 [`a`, `b`, `c`], 1980 `a`.namedFieldGroupToRegex, 1981 [ tuple(0, `a`) ], 1982 intEmptyRegexMatch); 1983 1984 testBothRegexMatches!(long, Yes.convertToZeroBasedIndex) 1985 ("test-202", 1986 [`a`, `b`, `c`], 1987 `b`.namedFieldGroupToRegex, 1988 [ tuple(1L, `b`) ], 1989 longEmptyRegexMatch); 1990 1991 testBothRegexMatches!(uint, Yes.convertToZeroBasedIndex) 1992 ("test-203", 1993 [`a`, `b`, `c`], 1994 `c`.namedFieldGroupToRegex, 1995 [ tuple(2U, `c`) ], 1996 uintEmptyRegexMatch); 1997 1998 testBothRegexMatches!(uint, Yes.convertToZeroBasedIndex)( 1999 "test-204", 2000 [`a`, `b`, `c`], 2001 `x`.namedFieldGroupToRegex, 2002 uintEmptyRegexMatch, 2003 
        uintEmptyRegexMatch);

    testBothRegexMatches!(int)
        ("test-211",
         [`a`, `b`, `c`],
         `c*-a*`.namedFieldGroupToRegex,
         [ tuple(3, `c`),],
         [ tuple(1, `a`),]);

    testBothRegexMatches!(long)
        ("test-212",
         [`abc`, `abc-def`, `def`],
         `abc-def`.namedFieldGroupToRegex,
         [ tuple(1L, `abc`) ],
         [ tuple(3L, `def`) ]);

    testBothRegexMatches!(uint)
        ("test-213",
         [`abc`, `abc-def`, `def`],
         `abc\-def`.namedFieldGroupToRegex,
         [ tuple(2U, `abc-def`) ],
         uintEmptyRegexMatch);
}

/**
`parseNumericFieldGroup` parses a single number or number range. E.g. '5' or '5-8'.
(Private function.)

`parseNumericFieldGroup` returns a range that iterates over all the values in the
field-group. It has options supporting conversion of field numbers to zero-based
indices and the use of '0' (zero) as a field number.

This was part of the original code supporting numeric field-lists and is used by
both numeric and named field-list routines.

$(ALWAYS_DOCUMENT)
*/
private auto parseNumericFieldGroup(T = size_t,
                                    ConvertToZeroBasedIndex convertToZero = No.convertToZeroBasedIndex,
                                    AllowFieldNumZero allowZero = No.allowFieldNumZero)
(string fieldRange)
if (isIntegral!T && (!allowZero || !convertToZero || !isUnsigned!T))
{
    import std.algorithm : findSplit;
    import std.conv : to;
    import std.range : iota;
    import std.traits : Signed;

    /* Pick the largest compatible integral type for the IOTA range. This must be the
     * signed type if convertToZero is true, as a reverse order range may end at -1.
     */
    static if (convertToZero) alias S = Signed!T;
    else alias S = T;

    enforce(fieldRange.length != 0, "Empty field number.");

    auto rangeSplit = findSplit(fieldRange, "-");

    /* Make sure the range does not start or end with a dash. */
    enforce(rangeSplit[1].empty || (!rangeSplit[0].empty && !rangeSplit[2].empty),
            format("Incomplete ranges are not supported: '%s'.", fieldRange));

    S start = rangeSplit[0].to!S;
    S last = rangeSplit[1].empty ? start : rangeSplit[2].to!S;
    Signed!T increment = (start <= last) ? 1 : -1;

    static if (allowZero)
    {
        enforce(rangeSplit[1].empty || (start != 0 && last != 0),
                format("Zero cannot be used as part of a range: '%s'.", fieldRange));
    }

    static if (allowZero)
    {
        enforce(start >= 0 && last >= 0,
                format("Field numbers must be non-negative integers: '%d'.",
                       (start < 0) ? start : last));
    }
    else
    {
        enforce(start >= 1 && last >= 1,
                format("Field numbers must be greater than zero: '%d'.",
                       (start < 1) ? start : last));
    }

    static if (convertToZero)
    {
        start--;
        last--;
    }

    return iota(start, last + increment, increment);
}

// parseNumericFieldGroup.
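/* Illustrative sketch, not part of the original test suite. It uses only std.range.iota
 * and shows why parseNumericFieldGroup switches to a signed type when converting to
 * zero-based indices: for a reverse-order range ending at field 1, the exclusive end
 * passed to iota is -1. The variable names mirror the function body but are local to
 * this example.
 */
@safe unittest
{
    import std.algorithm : equal;
    import std.range : iota;

    /* Field-group "3-1" after conversion to zero-based indices. */
    long start = 2;
    long last = 0;
    long increment = (start <= last) ? 1 : -1;

    /* The exclusive end is last + increment, which is -1 here. */
    assert(last + increment == -1);
    assert(iota(start, last + increment, increment).equal([2, 1, 0]));
}

// parseNumericFieldGroup.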
2098 @safe unittest 2099 { 2100 import std.algorithm : equal; 2101 import std.exception : assertThrown, assertNotThrown; 2102 2103 /* Basic cases */ 2104 assert(parseNumericFieldGroup("1").equal([1])); 2105 assert("2".parseNumericFieldGroup.equal([2])); 2106 assert("3-4".parseNumericFieldGroup.equal([3, 4])); 2107 assert("3-5".parseNumericFieldGroup.equal([3, 4, 5])); 2108 assert("4-3".parseNumericFieldGroup.equal([4, 3])); 2109 assert("10-1".parseNumericFieldGroup.equal([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])); 2110 2111 /* Convert to zero-based indices */ 2112 assert(parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)("1").equal([0])); 2113 assert("2".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex).equal([1])); 2114 assert("3-4".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex).equal([2, 3])); 2115 assert("3-5".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex).equal([2, 3, 4])); 2116 assert("4-3".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex).equal([3, 2])); 2117 assert("10-1".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex).equal([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])); 2118 2119 /* Allow zero. */ 2120 assert("0".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0])); 2121 assert(parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)("1").equal([1])); 2122 assert("3-4".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([3, 4])); 2123 assert("10-1".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])); 2124 2125 /* Allow zero, convert to zero-based index. 
*/ 2126 assert("0".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([-1])); 2127 assert(parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)("1").equal([0])); 2128 assert("3-4".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([2, 3])); 2129 assert("10-1".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])); 2130 2131 /* Alternate integer types. */ 2132 assert("2".parseNumericFieldGroup!uint.equal([2])); 2133 assert("3-5".parseNumericFieldGroup!uint.equal([3, 4, 5])); 2134 assert("10-1".parseNumericFieldGroup!uint.equal([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])); 2135 assert("2".parseNumericFieldGroup!int.equal([2])); 2136 assert("3-5".parseNumericFieldGroup!int.equal([3, 4, 5])); 2137 assert("10-1".parseNumericFieldGroup!int.equal([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])); 2138 assert("2".parseNumericFieldGroup!ushort.equal([2])); 2139 assert("3-5".parseNumericFieldGroup!ushort.equal([3, 4, 5])); 2140 assert("10-1".parseNumericFieldGroup!ushort.equal([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])); 2141 assert("2".parseNumericFieldGroup!short.equal([2])); 2142 assert("3-5".parseNumericFieldGroup!short.equal([3, 4, 5])); 2143 assert("10-1".parseNumericFieldGroup!short.equal([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])); 2144 2145 assert("0".parseNumericFieldGroup!(long, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0])); 2146 assert("0".parseNumericFieldGroup!(uint, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0])); 2147 assert("0".parseNumericFieldGroup!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0])); 2148 assert("0".parseNumericFieldGroup!(ushort, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0])); 2149 assert("0".parseNumericFieldGroup!(short, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0])); 2150 assert("0".parseNumericFieldGroup!(int, 
Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([-1])); 2151 assert("0".parseNumericFieldGroup!(short, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([-1])); 2152 2153 /* Max field value cases. */ 2154 assert("65535".parseNumericFieldGroup!ushort.equal([65535])); // ushort max 2155 assert("65533-65535".parseNumericFieldGroup!ushort.equal([65533, 65534, 65535])); 2156 assert("32767".parseNumericFieldGroup!short.equal([32767])); // short max 2157 assert("32765-32767".parseNumericFieldGroup!short.equal([32765, 32766, 32767])); 2158 assert("32767".parseNumericFieldGroup!(short, Yes.convertToZeroBasedIndex).equal([32766])); 2159 2160 /* Error cases. */ 2161 assertThrown("".parseNumericFieldGroup); 2162 assertThrown(" ".parseNumericFieldGroup); 2163 assertThrown("-".parseNumericFieldGroup); 2164 assertThrown(" -".parseNumericFieldGroup); 2165 assertThrown("- ".parseNumericFieldGroup); 2166 assertThrown("1-".parseNumericFieldGroup); 2167 assertThrown("-2".parseNumericFieldGroup); 2168 assertThrown("-1".parseNumericFieldGroup); 2169 assertThrown("1.0".parseNumericFieldGroup); 2170 assertThrown("0".parseNumericFieldGroup); 2171 assertThrown("0-3".parseNumericFieldGroup); 2172 assertThrown("3-0".parseNumericFieldGroup); 2173 assertThrown("-2-4".parseNumericFieldGroup); 2174 assertThrown("2--4".parseNumericFieldGroup); 2175 assertThrown("2-".parseNumericFieldGroup); 2176 assertThrown("a".parseNumericFieldGroup); 2177 assertThrown("0x3".parseNumericFieldGroup); 2178 assertThrown("3U".parseNumericFieldGroup); 2179 assertThrown("1_000".parseNumericFieldGroup); 2180 assertThrown(".".parseNumericFieldGroup); 2181 2182 assertThrown("".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2183 assertThrown(" ".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2184 assertThrown("-".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2185 assertThrown("1-".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2186 
assertThrown("-2".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2187 assertThrown("-1".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2188 assertThrown("0".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2189 assertThrown("0-3".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2190 assertThrown("3-0".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2191 assertThrown("-2-4".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2192 assertThrown("2--4".parseNumericFieldGroup!(size_t, Yes.convertToZeroBasedIndex)); 2193 2194 assertThrown("".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2195 assertThrown(" ".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2196 assertThrown("-".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2197 assertThrown("1-".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2198 assertThrown("-2".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2199 assertThrown("-1".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2200 assertThrown("0-3".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2201 assertThrown("3-0".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2202 assertThrown("-2-4".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2203 assertThrown("2--4".parseNumericFieldGroup!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2204 2205 assertThrown("".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2206 assertThrown(" ".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2207 assertThrown("-".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, 
Yes.allowFieldNumZero)); 2208 assertThrown("1-".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2209 assertThrown("-2".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2210 assertThrown("-1".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2211 assertThrown("0-3".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2212 assertThrown("3-0".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2213 assertThrown("-2-4".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2214 assertThrown("2--4".parseNumericFieldGroup!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero)); 2215 2216 /* Value out of range cases. */ 2217 assertThrown("65536".parseNumericFieldGroup!ushort); // One more than ushort max. 2218 assertThrown("65535-65536".parseNumericFieldGroup!ushort); 2219 assertThrown("32768".parseNumericFieldGroup!short); // One more than short max. 2220 assertThrown("32765-32768".parseNumericFieldGroup!short); 2221 // Convert to zero limits signed range. 2222 assertThrown("32768".parseNumericFieldGroup!(ushort, Yes.convertToZeroBasedIndex)); 2223 assert("32767".parseNumericFieldGroup!(ushort, Yes.convertToZeroBasedIndex).equal([32766])); 2224 } 2225 2226 /** 2227 Numeric field-lists 2228 2229 Numeric field-lists are the original form of field-list supported by tsv-utils tools. 2230 They have largely been superseded by the more general field-list support provided by 2231 [parseFieldList], but the basic facilities for processing numeric field-lists are 2232 still available. 2233 2234 A numeric field-list is a string entered on the command line identifying one or more 2235 field numbers. They are used by the majority of the tsv-utils applications. There are 2236 two helper functions, [makeFieldListOptionHandler] and [parseNumericFieldList]. 
Most
applications will use [makeFieldListOptionHandler], which creates a delegate that can be
passed to `std.getopt` to process the command option. Actual processing of the option
text is done by [parseNumericFieldList]. It can be called directly when the text of the
option value contains more than just the field number.

Syntax and behavior:

A 'numeric field-list' is a list of numeric field numbers entered on the command line.
Fields are one-based integers representing locations in an input line, in the traditional
meaning of Unix command line tools. Fields can be entered as single numbers or a range.
Multiple entries are separated by commas. Some examples (with 'fields' as the command
line option):

```
   --fields 3                 # Single field
   --fields 4,1               # Two fields
   --fields 3-9               # A range, fields 3 to 9 inclusive
   --fields 1,2,7-34,11       # A mix of ranges and fields
   --fields 15-5,3-1          # Two ranges in reverse order.
```

Incomplete ranges are not supported, for example, '6-'. Zero is disallowed as a field
value by default, but can be enabled to support the notion of zero as representing the
entire line. However, zero cannot be part of a range. Field numbers are one-based by
default, but can be converted to zero-based. If conversion to zero-based is enabled,
field number zero must be disallowed or a signed integer type specified for the
returned range.

An error is thrown if an invalid field specification is encountered. Error text is
intended for display. Error conditions include:

$(LIST
    * Empty field-list
    * Empty value, e.g. two consecutive commas, a trailing comma, or a leading comma
    * String that does not parse as a valid integer
    * Negative integers, or zero if zero is disallowed
    * An incomplete range
    * Zero used as part of a range
)

No other behaviors are enforced. Repeated values are accepted. If zero is allowed,
other field numbers can be entered as well. Additional restrictions need to be
applied by the caller.

Notes:

$(LIST
    * The data type determines the max field number that can be entered. Enabling
      conversion to zero restricts to the signed version of the data type.
    * Use 'import std.typecons : Yes, No' to use the convertToZeroBasedIndex and
      allowFieldNumZero template parameters.
)
*/

/**
`OptionHandlerDelegate` is the signature of the delegate returned by
[makeFieldListOptionHandler].
*/
alias OptionHandlerDelegate = void delegate(string option, string value);

/**
`makeFieldListOptionHandler` creates a std.getopt option handler for processing field-lists
entered on the command line. A field-list is as defined by [parseNumericFieldList].
*/
OptionHandlerDelegate makeFieldListOptionHandler(
    T,
    ConvertToZeroBasedIndex convertToZero = No.convertToZeroBasedIndex,
    AllowFieldNumZero allowZero = No.allowFieldNumZero)
(ref T[] fieldsArray)
if (isIntegral!T && (!allowZero || !convertToZero || !isUnsigned!T))
{
    void fieldListOptionHandler(ref T[] fieldArray, string option, string value) pure @safe
    {
        import std.algorithm : each;
        try value.parseNumericFieldList!(T, convertToZero, allowZero).each!(x => fieldArray ~= x);
        catch (Exception exc)
        {
            exc.msg = format("[--%s] %s", option, exc.msg);
            throw exc;
        }
    }

    return (option, value) => fieldListOptionHandler(fieldsArray, option, value);
}

// makeFieldListOptionHandler.
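/* Illustrative sketch, not part of the original test suite: the delegate returned by
 * makeFieldListOptionHandler can also be invoked directly, which makes the option-name
 * prefix added to error messages easy to see. Only the "[--fields]" prefix is checked,
 * as the remainder of the message comes from the field-list parsing routines. Passing
 * "fields" as the option name is an assumption made for this example.
 */
unittest
{
    import std.algorithm : startsWith;
    import std.exception : collectExceptionMsg;

    size_t[] fields;
    auto handler = fields.makeFieldListOptionHandler;

    handler("fields", "1,3-5");
    assert(fields == [1, 3, 4, 5]);

    /* Zero is not a valid field number by default, so the handler throws. */
    assert(collectExceptionMsg(handler("fields", "0")).startsWith("[--fields]"));
}

// makeFieldListOptionHandler.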
unittest
{
    import std.exception : assertThrown, assertNotThrown;
    import std.getopt;

    {
        size_t[] fields;
        auto args = ["program", "--fields", "1", "--fields", "2,4,7-9,23-21"];
        getopt(args, "f|fields", fields.makeFieldListOptionHandler);
        assert(fields == [1, 2, 4, 7, 8, 9, 23, 22, 21]);
    }
    {
        size_t[] fields;
        auto args = ["program", "--fields", "1", "--fields", "2,4,7-9,23-21"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(size_t, Yes.convertToZeroBasedIndex));
        assert(fields == [0, 1, 3, 6, 7, 8, 22, 21, 20]);
    }
    {
        size_t[] fields;
        auto args = ["program", "-f", "0"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero));
        assert(fields == [0]);
    }
    {
        size_t[] fields;
        auto args = ["program", "-f", "0", "-f", "1,0", "-f", "0,1"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero));
        assert(fields == [0, 1, 0, 0, 1]);
    }
    {
        size_t[] ints;
        size_t[] fields;
        auto args = ["program", "--ints", "1,2,3", "--fields", "1", "--ints", "4,5,6", "--fields", "2,4,7-9,23-21"];
        std.getopt.arraySep = ",";
        getopt(args,
               "i|ints", "Built-in list of integers.", &ints,
               "f|fields", "Field-list style integers.", fields.makeFieldListOptionHandler);
        assert(ints == [1, 2, 3, 4, 5, 6]);
        assert(fields == [1, 2, 4, 7, 8, 9, 23, 22, 21]);
    }

    /* Basic cases involving unsigned types smaller than size_t. */
    {
        uint[] fields;
        auto args = ["program", "-f", "0", "-f", "1,0", "-f", "0,1", "-f", "55-58"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(uint, No.convertToZeroBasedIndex, Yes.allowFieldNumZero));
        assert(fields == [0, 1, 0, 0, 1, 55, 56, 57, 58]);
    }
    {
        ushort[] fields;
        auto args = ["program", "-f", "0", "-f", "1,0", "-f", "0,1", "-f", "55-58"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(ushort, No.convertToZeroBasedIndex, Yes.allowFieldNumZero));
        assert(fields == [0, 1, 0, 0, 1, 55, 56, 57, 58]);
    }

    /* Basic cases involving signed types. */
    {
        long[] fields;
        auto args = ["program", "--fields", "1", "--fields", "2,4,7-9,23-21"];
        getopt(args, "f|fields", fields.makeFieldListOptionHandler);
        assert(fields == [1, 2, 4, 7, 8, 9, 23, 22, 21]);
    }
    {
        long[] fields;
        auto args = ["program", "--fields", "1", "--fields", "2,4,7-9,23-21"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(long, Yes.convertToZeroBasedIndex));
        assert(fields == [0, 1, 3, 6, 7, 8, 22, 21, 20]);
    }
    {
        long[] fields;
        auto args = ["program", "-f", "0"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero));
        assert(fields == [-1]);
    }
    {
        int[] fields;
        auto args = ["program", "--fields", "1", "--fields", "2,4,7-9,23-21"];
        getopt(args, "f|fields", fields.makeFieldListOptionHandler);
        assert(fields == [1, 2, 4, 7, 8, 9, 23, 22, 21]);
    }
    {
        int[] fields;
        auto args = ["program", "--fields", "1", "--fields", "2,4,7-9,23-21"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(int, Yes.convertToZeroBasedIndex));
        assert(fields == [0, 1, 3, 6, 7, 8, 22, 21, 20]);
    }
    {
        int[] fields;
        auto args = ["program", "-f", "0"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero));
        assert(fields == [-1]);
    }
    {
        short[] fields;
        auto args = ["program", "--fields", "1", "--fields", "2,4,7-9,23-21"];
        getopt(args, "f|fields", fields.makeFieldListOptionHandler);
        assert(fields == [1, 2, 4, 7, 8, 9, 23, 22, 21]);
    }
    {
        short[] fields;
        auto args = ["program", "--fields", "1", "--fields", "2,4,7-9,23-21"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(short, Yes.convertToZeroBasedIndex));
        assert(fields == [0, 1, 3, 6, 7, 8, 22, 21, 20]);
    }
    {
        short[] fields;
        auto args = ["program", "-f", "0"];
        getopt(args,
               "f|fields", fields.makeFieldListOptionHandler!(short, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero));
        assert(fields == [-1]);
    }

    {
        /* Error cases. */
        size_t[] fields;
        auto args = ["program", "-f", "0"];
        assertThrown(getopt(args, "f|fields", fields.makeFieldListOptionHandler));

        args = ["program", "-f", "-1"];
        assertThrown(getopt(args, "f|fields", fields.makeFieldListOptionHandler));

        args = ["program", "-f", "--fields", "1"];
        assertThrown(getopt(args, "f|fields", fields.makeFieldListOptionHandler));

        args = ["program", "-f", "a"];
        assertThrown(getopt(args, "f|fields", fields.makeFieldListOptionHandler));

        args = ["program", "-f", "1.5"];
        assertThrown(getopt(args, "f|fields", fields.makeFieldListOptionHandler));

        args = ["program", "-f", "2-"];
        assertThrown(getopt(args, "f|fields", fields.makeFieldListOptionHandler));

        args = ["program", "-f", "3,5,-7"];
        assertThrown(getopt(args, "f|fields", fields.makeFieldListOptionHandler));

        args = ["program", "-f", "3,5,"];
        assertThrown(getopt(args, "f|fields", fields.makeFieldListOptionHandler));

        args = ["program", "-f", "-1"];
        assertThrown(getopt(args,
                            "f|fields", fields.makeFieldListOptionHandler!(
                                size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero)));
    }
}

/**
`parseNumericFieldList` lazily generates a range of field numbers from a
'numeric field-list' string.
*/
auto parseNumericFieldList(
    T = size_t,
    ConvertToZeroBasedIndex convertToZero = No.convertToZeroBasedIndex,
    AllowFieldNumZero allowZero = No.allowFieldNumZero)
(string fieldList, char delim = ',')
if (isIntegral!T && (!allowZero || !convertToZero || !isUnsigned!T))
{
    import std.algorithm : splitter;
    import std.conv : to;

    alias SplitFieldListRange = typeof(fieldList.splitter(delim));
    alias NumericFieldGroupParse
        = ReturnType!(parseNumericFieldGroup!(T, convertToZero, allowZero));

    static struct Result
    {
        private SplitFieldListRange _splitFieldList;
        private NumericFieldGroupParse _currFieldParse;

        this(string fieldList, char delim)
        {
            _splitFieldList = fieldList.splitter(delim);
            _currFieldParse =
                (_splitFieldList.empty ? "" : _splitFieldList.front)
                .parseNumericFieldGroup!(T, convertToZero, allowZero);

            if (!_splitFieldList.empty) _splitFieldList.popFront;
        }

        bool empty() pure nothrow @safe @nogc
        {
            return _currFieldParse.empty;
        }

        T front() pure @safe
        {
            import std.conv : to;

            assert(!empty, "Attempting to fetch the front of an empty numeric field-list.");
            assert(!_currFieldParse.empty, "Internal error. Call to front with an empty _currFieldParse.");

            return _currFieldParse.front.to!T;
        }

        void popFront() pure @safe
        {
            assert(!empty, "Attempting to popFront an empty field-list.");

            _currFieldParse.popFront;
            if (_currFieldParse.empty && !_splitFieldList.empty)
            {
                _currFieldParse = _splitFieldList.front.parseNumericFieldGroup!(
                    T, convertToZero, allowZero);
                _splitFieldList.popFront;
            }
        }
    }

    return Result(fieldList, delim);
}

// parseNumericFieldList.
@safe unittest
{
    import std.algorithm : each, equal;
    import std.exception : assertThrown, assertNotThrown;

    /* Basic tests. */
    assert("1".parseNumericFieldList.equal([1]));
    assert("1,2".parseNumericFieldList.equal([1, 2]));
    assert("1,2,3".parseNumericFieldList.equal([1, 2, 3]));
    assert("1-2".parseNumericFieldList.equal([1, 2]));
    assert("1-2,6-4".parseNumericFieldList.equal([1, 2, 6, 5, 4]));
    assert("1-2,1,1-2,2,2-1".parseNumericFieldList.equal([1, 2, 1, 1, 2, 2, 2, 1]));
    assert("1-2,5".parseNumericFieldList!size_t.equal([1, 2, 5]));

    /* Signed Int tests */
    assert("1".parseNumericFieldList!int.equal([1]));
    assert("1,2,3".parseNumericFieldList!int.equal([1, 2, 3]));
    assert("1-2".parseNumericFieldList!int.equal([1, 2]));
    assert("1-2,6-4".parseNumericFieldList!int.equal([1, 2, 6, 5, 4]));
    assert("1-2,5".parseNumericFieldList!int.equal([1, 2, 5]));

    /* Convert to zero tests */
    assert("1".parseNumericFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0]));
    assert("1,2,3".parseNumericFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0, 1, 2]));
    assert("1-2".parseNumericFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0, 1]));
    assert("1-2,6-4".parseNumericFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0, 1, 5, 4, 3]));
    assert("1-2,5".parseNumericFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([0, 1, 4]));

    assert("1".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex).equal([0]));
    assert("1,2,3".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex).equal([0, 1, 2]));
    assert("1-2".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex).equal([0, 1]));
    assert("1-2,6-4".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex).equal([0, 1, 5, 4, 3]));
    assert("1-2,5".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex).equal([0, 1, 4]));

    /* Allow zero tests. */
    assert("0".parseNumericFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0]));
    assert("1,0,3".parseNumericFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([1, 0, 3]));
    assert("1-2,5".parseNumericFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([1, 2, 5]));
    assert("0".parseNumericFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0]));
    assert("1,0,3".parseNumericFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([1, 0, 3]));
    assert("1-2,5".parseNumericFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([1, 2, 5]));
    assert("0".parseNumericFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([-1]));
    assert("1,0,3".parseNumericFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0, -1, 2]));
    assert("1-2,5".parseNumericFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).equal([0, 1, 4]));

    /* Error cases. */
    assertThrown("".parseNumericFieldList.each);
    assertThrown(" ".parseNumericFieldList.each);
    assertThrown(",".parseNumericFieldList.each);
    assertThrown("5 6".parseNumericFieldList.each);
    assertThrown(",7".parseNumericFieldList.each);
    assertThrown("8,".parseNumericFieldList.each);
    assertThrown("8,9,".parseNumericFieldList.each);
    assertThrown("10,,11".parseNumericFieldList.each);
    assertThrown("".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex).each);
    assertThrown("1,2-3,".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex).each);
    assertThrown("2-,4".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex).each);
    assertThrown("1,2,3,,4".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).each);
    assertThrown(",7".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).each);
    assertThrown("8,".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).each);
    assertThrown("10,0,,11".parseNumericFieldList!(long, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).each);
    assertThrown("8,9,".parseNumericFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).each);

    assertThrown("0".parseNumericFieldList.each);
    assertThrown("1,0,3".parseNumericFieldList.each);
    assertThrown("0".parseNumericFieldList!(int, Yes.convertToZeroBasedIndex, No.allowFieldNumZero).each);
    assertThrown("1,0,3".parseNumericFieldList!(int, Yes.convertToZeroBasedIndex, No.allowFieldNumZero).each);
    assertThrown("0-2,6-0".parseNumericFieldList!(size_t, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).each);
    assertThrown("0-2,6-0".parseNumericFieldList!(int, No.convertToZeroBasedIndex, Yes.allowFieldNumZero).each);
    assertThrown("0-2,6-0".parseNumericFieldList!(int, Yes.convertToZeroBasedIndex, Yes.allowFieldNumZero).each);
}
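
/* Illustrative usage sketch for parseNumericFieldList; an added example, not part of
 * the original test suite. It assumes, based on the Result struct above, that each
 * comma separated entry is parsed only when iteration reaches it. Under that
 * assumption the leading fields of a list can be read even when a later entry is
 * invalid, and the error surfaces only when the range is consumed to the end.
 */
@safe unittest
{
    import std.algorithm : each, equal;
    import std.exception : assertThrown;
    import std.range : take;

    /* A reversed field-range counts down. The optional flag shifts every
     * result to a zero-based index.
     */
    assert("3,7-5".parseNumericFieldList.equal([3, 7, 6, 5]));
    assert("3,7-5".parseNumericFieldList!(size_t, Yes.convertToZeroBasedIndex).equal([2, 6, 5, 4]));

    /* The trailing empty entry is invalid, but taking only the first two fields
     * never advances far enough to parse it.
     */
    assert("1,2-3,".parseNumericFieldList.take(2).equal([1, 2]));

    /* Consuming the whole range reaches the invalid entry and throws. */
    assertThrown("1,2-3,".parseNumericFieldList.each);
}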