Extend support of CSV with CSV Dialect  #528

@nichtich

First, thanks for this great work! The CSV format can be troublesome because of its many dialects. If CSV support is going to be extended, I recommend using the CSV Dialect Specification (csvddf below) or a compatible subset of it. Currently, fq supports two CSV Dialect properties, with differing names and defaults. An alternative is the CSVW dialect description (csvw below; thanks xkcd #972!). Here is a comparison:

| fq | default | csvddf | default | csvw | default |
|---|---|---|---|---|---|
| comma | `,` | delimiter | `,` | delimiter | `,` |
| comment | `#` | commentChar | not set | commentPrefix | null or `#` (spec is ambiguous) |

Names can be adjusted with aliases; I prefer short names anyway. The remaining properties found in csvddf and csvw are:

| csvddf | default | csvw | default |
|---|---|---|---|
| quoteChar | `"` | quoteChar | `"` |
| skipInitialSpace | false | skipInitialSpace | false |
| header | true | header | true |
| (being discussed) | | headerRowCount | 1 if header set else 0 |
| lineTerminator | `\r\n` | lineTerminators | `["\r\n", "\n"]` |
| doubleQuote | true | | |
| | | doubleQuote | true |
| escapeChar | not set | | |
| nullSequence | not set | | |
| | | skipBlankRows | false |
| | | skipColumns | 0 |
| | | skipRows | 0 |
| | | encoding | utf-8 |
| | | trim | true |
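
To illustrate how such a dialect description could drive a parser, here is a minimal sketch in Python (not fq's implementation, which is written in Go). It maps a subset of the csvddf properties listed above onto the standard `csv` module. The names `CSVDDF_DEFAULTS` and `read_csv` are made up for this example, and comment handling is simplified: lines are dropped before parsing, so a quoted multi-line field whose continuation line starts with the comment character would be mishandled.

```python
import csv

# csvddf defaults as listed in the tables above (subset only).
CSVDDF_DEFAULTS = {
    "delimiter": ",",
    "quoteChar": '"',
    "doubleQuote": True,
    "escapeChar": None,       # "not set"
    "skipInitialSpace": False,
    "header": True,
    "commentChar": None,      # "not set"
}


def read_csv(text, dialect=None):
    """Parse CSV text using csvddf-style properties (hypothetical helper)."""
    opts = {**CSVDDF_DEFAULTS, **(dialect or {})}
    lines = text.splitlines(keepends=True)
    if opts["commentChar"]:
        # Simplification: drop comment lines before the CSV parser sees them.
        lines = [ln for ln in lines if not ln.startswith(opts["commentChar"])]
    reader = csv.reader(
        lines,
        delimiter=opts["delimiter"],
        quotechar=opts["quoteChar"],
        doublequote=opts["doubleQuote"],
        escapechar=opts["escapeChar"],
        skipinitialspace=opts["skipInitialSpace"],
    )
    rows = list(reader)
    if opts["header"] and rows:
        # First row supplies the field names for all following rows.
        names, *records = rows
        return [dict(zip(names, record)) for record in records]
    return rows
```

For example, `read_csv('a;b\n# note\n1;"x;y"\n', {"delimiter": ";", "commentChar": "#"})` yields one record with fields `a` and `b`. Properties such as `nullSequence` or `lineTerminator` are left out because Python's reader has no direct equivalent for them.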

The doubleQuote property differs in meaning between the two (csvw also uses it to set escapeChar). csvddf additionally has the property caseSensitiveHeader (default false), but there are discussions about removing it.
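
The aliasing idea and the doubleQuote difference can be sketched together in Python. This is only an illustration under assumptions: the function name `to_csvddf` and the `ALIASES` table are invented here, and the escapeChar rule reflects my reading of the CSVW spec, where doubleQuote set to false selects a backslash as the escape character.

```python
# Hypothetical alias table: fq and csvw names mapped to csvddf names.
ALIASES = {
    "comma": "delimiter",            # fq
    "comment": "commentChar",        # fq
    "commentPrefix": "commentChar",  # csvw
}


def to_csvddf(dialect, source="fq"):
    """Rename aliased properties to their csvddf names.

    For source="csvw", also make the implicit escape character explicit,
    assuming doubleQuote=false means a backslash escape in CSVW.
    """
    out = {ALIASES.get(name, name): value for name, value in dialect.items()}
    if source == "csvw" and out.get("doubleQuote") is False:
        out.setdefault("escapeChar", "\\")
    return out
```

With this, `to_csvddf({"comma": ";", "comment": "#"})` gives `{"delimiter": ";", "commentChar": "#"}`, so short names and spec names can coexist.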
