CSV Interface v4.0.0

@ws-garcia ws-garcia released this 23 Dec 09:02
· 143 commits to master since this release
45151cb

Bugs fixed:

  • The GetRecord method was not able to discard unwanted fields.
  • The DumpToSheet method could not rename the attached sheet if the sheet did not already exist.

Improvements:

  • Insert and remove fields and records.
  • Data filtering.
  • Rearrange, merge, split fields.
  • Shift fields and records.
  • Dedupe CSV records.
  • The user can now sort data on multiple columns with the IntroSort, QuickSort, HeapSort and Merge algorithms via the Sort method, using an intuitive convention: pass -1 to request a descending sort on column 1. Data can also be sorted by fields (Microsoft calls this a left-to-right sort): the fields of every record are rearranged together, following the order of the record chosen as key, typically the header.
  • Optimized DumpToSheet and ExportToCSV methods.
  • The parser and writer accept Unix-style DSV files, in which a backslash escapes special characters; e.g., \, is read as a literal comma when , is the field delimiter.
  • I/O operations on UTF-8 CSV files, commonly found on web sites, are now supported via streams.
  • Refactoring: Added CSVSniffer class module.
  • Refactoring: Added CSVdialect class module (it manages the field delimiter, record delimiter and escape token, and is used by the CSVparserConfig module).
  • Refactoring: Added EscapeStyle enumeration.
  • Refactoring: Added escapeMode property.
  • Refactoring: Added SortingAlgorithms enumeration.
  • Refactoring: Added the utf8EncodedFile property.
  • The delimiter sniffer combines simple scoring based on field data types with robust statistical scoring that checks the uniformity of data across fields and records. A single row of data is enough to infer the dialect; however, if the CSV file has headers, the disambiguation rate increases by 298%. The sniffer now returns a CSVdialect object with the guessed delimiters.
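
The signed-column convention described for the Sort method can be illustrated outside VBA. The following is a minimal Python sketch, not the library's implementation: the function name `sort_records` is hypothetical, column indices are assumed to be 1-based, and Python's built-in stable sort stands in for the IntroSort, QuickSort, HeapSort and Merge algorithms.

```python
def sort_records(records, sort_keys):
    """Sort rows by several 1-based columns; a negative index means descending.

    Hypothetical helper for illustration only.
    """
    result = list(records)  # leave the caller's list untouched
    # Apply the keys from least to most significant. Python's sort is
    # stable, so the most significant key (applied last) decides first
    # and the earlier passes only break ties.
    for key in reversed(sort_keys):
        column = abs(key) - 1
        result.sort(key=lambda row: row[column], reverse=(key < 0))
    return result


rows = [
    ["pear", 3],
    ["apple", 3],
    ["fig", 7],
]
# Descending on column 2, then ascending on column 1 for ties.
sorted_rows = sort_records(rows, [-2, 1])
```

A single signed integer per column keeps both the column and the direction in one argument, which is why a list such as `[-2, 1]` is enough to describe a two-level sort.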
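
The Unix DSV escaping mentioned above (a backslash making the next character literal) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the parser shipped with this release: `split_dsv_record` and `join_dsv_record` are hypothetical names, the escape character is assumed to be a backslash, and escaped record delimiters (newlines) are not handled.

```python
def split_dsv_record(line, delimiter=",", escape="\\"):
    """Split one DSV record, treating escape+char as a literal char."""
    fields, current = [], []
    chars = iter(line)
    for ch in chars:
        if ch == escape:
            # Take the next character literally, whatever it is.
            current.append(next(chars, ""))
        elif ch == delimiter:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def join_dsv_record(fields, delimiter=",", escape="\\"):
    """Write one DSV record, escaping the escape character and the delimiter."""
    escaped = []
    for field in fields:
        # Escape the escape character first so it is not doubled twice.
        field = field.replace(escape, escape + escape)
        field = field.replace(delimiter, escape + delimiter)
        escaped.append(field)
    return delimiter.join(escaped)


record = join_dsv_record(["Doe, Jane", "C:\\temp"])
fields = split_dsv_record(record)
```

Unlike quoted CSV, this style needs no opening and closing quote tracking: the reader only has to look one character ahead.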
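
To give a feel for the uniformity idea behind delimiter sniffing, here is a deliberately simplified Python sketch. It is not the CSVSniffer algorithm: it ignores data-type scoring, quoting and escaping, and the function name `sniff_delimiter`, the candidate list and the scoring formula are all assumptions made for illustration.

```python
from collections import Counter


def sniff_delimiter(sample, candidates=(",", ";", "\t", "|")):
    """Guess a field delimiter from how uniformly it splits the rows."""
    lines = [line for line in sample.splitlines() if line]
    if not lines:
        return None
    best, best_score = None, 0.0
    for delim in candidates:
        counts = [line.count(delim) + 1 for line in lines]
        # Most common field count and how many rows share it.
        width, hits = Counter(counts).most_common(1)[0]
        if width < 2:
            continue  # this candidate does not split a typical row
        # Share of rows that agree, weighted by the number of fields.
        score = (hits / len(lines)) * width
        if score > best_score:
            best, best_score = delim, score
    return best


sample = "name;age;city\nAna;31;Lima\nBo;27;Oslo, Norway\n"
guess = sniff_delimiter(sample)
```

In this sample the comma appears in only one row, so it never yields a consistent field count, while the semicolon splits every row into three fields.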

Member changes

  • Renamed method: GuessDelimiters --> SniffDelimiters
  • Renamed method: CSVdatasetSplit --> CSVsubsetSplit
  • Renamed property: turnStreamRecDelimiterToLF --> multiEndOfLineCSV
  • Renamed property: rectangularResults --> uniformLengthRecords
  • Renamed enumeration: EscapeTokens --> QuoteTokens
  • Renamed class module: parserConfig --> CSVparserConfig
  • Renamed class module: ECPArrayList --> CSVArrayList
  • Renamed class module: ECPTextStream --> CSVTextStream

Deprecated members

  • unixEscapeMechanism property.

Documentation:

  • Added an extensive set of details for each module of the CSV interface class.