Releases: mscaudill/tabbed
Tabbed v1.2.1
Commit: 80b5cea
This release speeds up string, numeric, time, date and datetime cell conversions. Specifically, the parsing module now:
- reorders the conversion logic so float is tested first
- computes time, date, and datetime formats just once and stores them as global variables
- differentiates strings with both numeric and text components from dates more reliably
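The first two changes can be illustrated with a minimal sketch. This is not Tabbed's parsing module: the `convert` function and the `DATETIME_FORMATS` tuple are hypothetical names, and the format list is an assumption chosen for the example.

```python
from datetime import datetime

# Hypothetical format list, built once at import time instead of per cell.
DATETIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%m/%d/%Y")


def convert(cell: str):
    """Convert a cell to a float, a datetime, or leave it as a string."""
    # Try float first: numeric cells are assumed to be the common case.
    try:
        return float(cell)
    except ValueError:
        pass
    # Fall back to the precomputed datetime formats.
    for fmt in DATETIME_FORMATS:
        try:
            return datetime.strptime(cell, fmt)
        except ValueError:
            continue
    # Mixed text such as "abc12" matches nothing and stays a string.
    return cell
```

Testing the cheapest and most frequent conversion first means most cells in a numeric file never reach the slower datetime attempts.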
v1.2.0 Release
What's Changed
This release incorporates all suggestions from the JOSS review including:
- Redefining all types to be generic types
- Support for using a comma as the decimal separator
- Improvements to header and metadata detection for short files
- YY.MM.DD support in datetime parsing
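As a rough illustration of the comma-decimal and YY.MM.DD items, the sketch below uses only the Python standard library. The helpers `parse_decimal` and `parse_short_date` are hypothetical and do not reflect Tabbed's actual interface.

```python
from datetime import date, datetime


def parse_decimal(cell: str, decimal: str = ".") -> float:
    """Parse a float whose decimal mark may be a comma.

    Thousands separators are not handled in this sketch.
    """
    if decimal != ".":
        cell = cell.replace(decimal, ".")
    return float(cell)


def parse_short_date(cell: str) -> date:
    """Parse a two-digit-year date written as YY.MM.DD."""
    return datetime.strptime(cell, "%y.%m.%d").date()
```

For example, `parse_decimal("3,14", decimal=",")` yields the float 3.14, and `parse_short_date("24.03.15")` yields 15 March 2024.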
Tabbed v1.1.1
What's Changed
This release contains the following bug fix:
- The Sniffer's `type` method now reports a column as having consistent types if every type is one of {int, float, complex}, since these types nest (int within float, float within complex). The Sniffer therefore treats all of these as a single Numeric type. This change improves detection of type changes at the header row and introduces no backward compatibility conflicts. At some point Tabbed may define this Numeric type formally.
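The idea of treating the numeric types as one group can be sketched as follows. The `consistent` function and the `NUMERIC` tuple are hypothetical and are not the Sniffer's real implementation.

```python
# Hypothetical grouping of the types the release notes call "Numeric".
NUMERIC = (int, float, complex)


def consistent(values) -> bool:
    """Return True if all values share a type, counting numerics as one type."""
    kinds = {"numeric" if isinstance(v, NUMERIC) else type(v) for v in values}
    return len(kinds) <= 1
```

Under this rule a column holding `1`, `2.5` and `3+0j` is consistent, while a column holding `1` and `"a"` is not.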
Tabbed 1.1.0
What's Changed
This release contains the following improvements:
- The Sniffing.types method now supports excluding lines that have missing values via an `exclusion` parameter. The default is to ignore lines with any of `['', ' ', '-', 'nan', 'NAN', 'NaN']` for metadata, header and type detection.
- Reader now accepts `poll` and `exclusion` parameters to allow clients to more finely control which rows of a sniffed sample should be used for header, metadata and type detection.
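The effect of the exclusion list can be sketched independently of Tabbed. The `usable_rows` function below is a hypothetical helper; only the default list of excluded values is taken from the release notes.

```python
# Default missing-value markers quoted in the release notes.
EXCLUDE = ['', ' ', '-', 'nan', 'NAN', 'NaN']


def usable_rows(rows, exclude=EXCLUDE):
    """Keep only rows in which no cell equals an excluded value."""
    excluded = set(exclude)
    return [row for row in rows if not excluded.intersection(row)]


rows = [["1", "2"], ["3", "-"], ["nan", "4"], ["5", "6"]]
clean = usable_rows(rows)  # rows containing "-" or "nan" are dropped
```

Dropping such rows before type detection keeps a placeholder like `'-'` from being mistaken for a string-typed cell in an otherwise numeric column.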
This release also makes the following bug fixes:
- For text files with a single column, the delimiter was reported as the empty string or as the delimiter used in the metadata section (if present). When the Sniffer makes a dialect, it now checks for this case and assigns the carriage return '\r' as the delimiter.
- Locating the header and metadata by type differences was buggy because line length differences and type differences were mixed in the same function, `sniffing._type_difference`. This has been clarified by creating a new `_length_difference` protected method of the Sniffer class. This method is used exclusively to locate metadata rows by looking for differences in length compared to the data section rows.
- Type difference detection of the header and metadata sections was flawed because `float`, `int`, and `complex` numbers were seen as different types. For example, if a column of the file contained both `int` and `float` values, a type difference was detected and the header (or metadata) was erroneously assigned. Tabbed now ignores these numeric-to-numeric type differences when determining the header and metadata sections.
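Length-based metadata detection can be sketched as below. This is an assumption about the general approach, not the body of `_length_difference`: the `length_outliers` helper is hypothetical and simply flags rows whose cell count differs from the most common cell count.

```python
from collections import Counter


def length_outliers(rows):
    """Return indices of rows whose length differs from the most common length."""
    if not rows:
        return []
    # The most frequent row length is assumed to belong to the data section.
    expected, _ = Counter(len(row) for row in rows).most_common(1)[0]
    return [i for i, row in enumerate(rows) if len(row) != expected]


sample = [
    ["Experiment A"],   # metadata line with a single cell
    ["date: 2024"],     # metadata line with a single cell
    ["x", "y", "z"],    # header
    ["1", "2", "3"],
    ["4", "5", "6"],
]
metadata_rows = length_outliers(sample)
```

Keeping this check separate from type comparison means a header row, which has the same length as the data rows, is left for the type-difference step to find.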
A JOSS manuscript for tabbed has been submitted.
Full Changelog: https://github.com/mscaudill/tabbed/commits/1.1.0
Tabbed v1.0.1
What's Changed
This is the initial release of tabbed. The four features of this delimited text file reader are:
- automatic sniffing of the metadata, header and data sections of irregularly structured files
- automatic type casting to `int`, `float`, `complex`, `time`, `date` and `datetime` instances
- conditional reading of rows with equality, membership, rich comparisons, regex and custom callable filters called tabs
- partial and iterative reading for large file support
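The last two features, conditional row filtering and iterative reading, can be approximated with the standard library alone. The sketch below does not use Tabbed's API: `read_rows` and its `predicates` and `chunksize` parameters are hypothetical names for illustration.

```python
import csv
import io
import re
from itertools import islice


def read_rows(stream, predicates, chunksize=2):
    """Yield lists of row dicts that pass every predicate, chunksize at a time."""
    reader = csv.DictReader(stream)
    # Lazily keep rows for which each named column satisfies its filter.
    kept = (
        row for row in reader
        if all(pred(row[name]) for name, pred in predicates.items())
    )
    # Pull rows from the generator in small batches so the file is never
    # loaded into memory all at once.
    while chunk := list(islice(kept, chunksize)):
        yield chunk


sample = io.StringIO("name,score\nann,3\nbob,9\nabe,7\ncal,8\n")
filters = {
    "name": re.compile(r"^(a|c)").match,  # regex filter
    "score": lambda s: int(s) > 5,        # rich comparison filter
}
for chunk in read_rows(sample, filters, chunksize=1):
    print(chunk)
```

Because both the filter and the chunking are generators, only `chunksize` matching rows are held in memory at a time.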
A JOSS manuscript for tabbed is underway.
Full Changelog: https://github.com/mscaudill/tabbed/commits/1.0.1