09_Working_with_CSV.md: 72 additions & 9 deletions
@@ -36,17 +36,49 @@ See that the elements have no name literal names but are only numbers.
But the CSV has a header, so we need to add the option `(hasHeader="true")` to `decode-csv` in the Flux.
-You can extract specified fields while converting to another tabular format by using the fix. This is quite handy for analysis of specific fields or to generate reports.
+You can extract specified fields while converting to another tabular format by using the Fix. This is quite handy for analyzing specific fields or generating reports. In the following example we only keep three columns (`"ISBN"`, `"Title"`, `"Author"`):
By default Metafacture's `decode-csv` expects that CSV fields are separated by a comma `,` and that strings are quoted with double quotes `"` or single quotes `'`. You can specify another separator character with the option `separator` and clean up special quote characters with the Fix:
[See the example in the Playground](https://metafacture.org/playground/?flux=%22https%3A//lib.ugent.be/download/librecat/data/goodreads.csv%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-csv%28hasHeader%3D%22true%22%29%0A%7C+fix%28transformationFile%29%0A%7C+encode-csv%28includeHeader%3D%22true%22%29%0A%7C+print%0A%3B&transformation=retain%28%22ISBN%22%2C%22Title%22%2C%22Author%22%29)
+By default Metafacture's `decode-csv` expects that CSV fields are separated by a comma `,` and that strings are quoted with double quotes `"` or single quotes `'`. You can specify another separator character with the option `separator` and clean up special quote characters with the Fix. (In contrast to Catmandu, quote characters cannot yet be manipulated by the decoder directly.)
+Flux:
-See:
+```text
+"12157;$The Journal of Headache and Pain$;2193-1801"
(Unlike in Catmandu, quote characters cannot be manipulated by the decoder directly.)
+[See the example in the Playground.](https://metafacture.org/playground/?flux=%2212157%3B%24The+Journal+of+Headache+and+Pain%24%3B2193-1801%22%0A%7C+read-string%0A%7C+as-lines%0A%7C+decode-csv%28separator%3D%22%3B%22%29%0A%7C+fix%28transformationFile%29%0A%7C+encode-csv%28separator%3D%22\t%22%2C+includeheader%3D%22true%22%29%0A%7C+print%3B&transformation=replace_all%28%22%3F%22%2C%22%5E\\%24%7C\\%24%24%22%2C%22%22%29)
In the example above, the `read-string` command reads a small CSV fragment for our test. The fragment uses `;` as the separator and `$` as the quotation character.
The string is then split into lines by `as-lines` and decoded as CSV with the separator `;`.
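
For readers who want to check the logic outside the Playground, the same steps can be sketched in Python with the standard `csv` module. This is only an analogue, not Metafacture code: the helper names `decode_line`, `strip_dollar_quotes` and `encode_tsv` are hypothetical, and the regular expression is the one passed to `replace_all` in the Playground link above.

```python
import csv
import io
import re


def decode_line(line, separator=";"):
    """Split one CSV line on `separator` without interpreting any quote character."""
    reader = csv.reader(io.StringIO(line), delimiter=separator, quoting=csv.QUOTE_NONE)
    return next(reader)


def strip_dollar_quotes(values):
    """Remove one leading and one trailing `$` from every value."""
    return [re.sub(r"^\$|\$$", "", value) for value in values]


def encode_tsv(values):
    """Write the values as a single tab-separated line."""
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerow(values)
    return out.getvalue()


fragment = "12157;$The Journal of Headache and Pain$;2193-1801"
fields = strip_dollar_quotes(decode_line(fragment))
print(encode_tsv(fields), end="")
```

The decoding step only splits on the separator; the `$` signs are removed in a separate step, which mirrors the split of work between `decode-csv` and the Fix described above.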
@@ -55,13 +87,44 @@ With a little fix you can
## Writing CSVs
-When exporting data a tabular format you can change the field names in the header or omit the header:
+When harvesting data in tabular format, you can also change the field names in the header or omit the header:
[See the example in the Playground.](https://metafacture.org/playground/?flux=%22https%3A//lib.ugent.be/download/librecat/data/goodreads.csv%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-csv%28hasheader%3D%22true%22%29%0A%7C+fix%28transformationFile%29%0A%7C+encode-csv%28includeHeader%3D%22true%22%29%0A%7C+print%3B&transformation=move_field%28%22ISBN%22%2C%22A%22%29%0Amove_field%28%22Title%22%2C%22B%22%29%0Amove_field%28%22Author%22%2C%22C%22%29%0A%0Aretain%28%22A%22%2C%22B%22%2C%22C%22%29)
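
As a rough cross-check of what the linked example does (rename `ISBN`, `Title` and `Author` to `A`, `B` and `C`, keep only those columns, and optionally write a header), here is a minimal Python sketch. It is not Metafacture code; the function `rename_and_retain`, the sample values and the extra `Rating` column are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: only the column names ISBN, Title and Author
# come from the example; the rows and the Rating column are invented.
SAMPLE = (
    "ISBN,Title,Author,Rating\n"
    "111,First Book,Ann Author,4\n"
    "222,Second Book,Bob Writer,5\n"
)

# Old header name -> new header name, in output column order.
RENAMES = {"ISBN": "A", "Title": "B", "Author": "C"}


def rename_and_retain(text, renames, include_header=True):
    """Keep only the columns listed in `renames` and write them under their new names."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(renames.values()), lineterminator="\n")
    if include_header:
        writer.writeheader()
    for row in reader:
        writer.writerow({new: row[old] for old, new in renames.items()})
    return out.getvalue()


print(rename_and_retain(SAMPLE, RENAMES), end="")
print(rename_and_retain(SAMPLE, RENAMES, include_header=False), end="")
```

The first call writes the renamed header line followed by the records; the second call writes the records only.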
You can transform the data to a TSV file without a header by using the separator `\t`, like this:
[See example in playground.](https://metafacture.org/playground/?flux=%22https%3A//lib.ugent.be/download/librecat/data/goodreads.csv%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-csv%28hasheader%3D%22true%22%29%0A%7C+fix%28transformationFile%29%0A%7C+encode-csv%28separator%3D%22\t%22%2C+noQuotes%3D%22true%22%29%0A%7C+print%3B&transformation=retain%28%22ISBN%22%2C%22Title%22%2C%22Author%22%29)
When you create a CSV by exporting complex/nested data structures to a tabular format, you must “flatten” the data structure. Also
you have to make sure that the order and number of elements is the same in every record; otherwise the header does not match the records.
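
To make the flattening requirement concrete, here is a small Python sketch (again an analogue, not Metafacture code). The functions `flatten` and `to_csv`, the dotted column names and the sample records are assumptions chosen for illustration. A fixed column list guarantees that every record has the same number and order of fields, so the header always matches.

```python
import csv
import io


def flatten(record, prefix=""):
    """Turn nested dicts into a flat dict with dotted keys; join lists into one string."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = "; ".join(str(item) for item in value)
        else:
            flat[name] = value
    return flat


def to_csv(records, columns):
    """Write all records with the same columns in the same order.

    Missing fields become empty cells; fields not listed in `columns` are dropped.
    """
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, restval="", extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(flatten(record))
    return out.getvalue()


# Hypothetical records with differing structure.
records = [
    {"title": "First Book", "author": {"name": "Ann Author"}, "subjects": ["a", "b"]},
    {"title": "Second Book", "author": {"name": "Bob Writer", "id": "42"}},
]
print(to_csv(records, ["title", "author.name", "subjects"]), end="")
```

The second record has no `subjects` and an extra `author.id`; the fixed column list fills the first with an empty cell and drops the second, so both rows line up with the header.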