## Introduction

When populating CouchDB databases, often the source of the data is initially a CSV or TSV file. *couchimport* is designed to assist you with importing flat data into CouchDB efficiently.

It can be used as the command-line utilities `couchimport` and `couchexport`, or the underlying functions can be used programmatically:

* simply pipe the data file to *couchimport* on the command line.
* handles tab or comma-separated data.
* uses Node.js's streams for memory efficiency.
* plug in a custom function to add your own changes before the data is written.
* writes the data in bulk for speed.
* can also read huge JSON files using a streaming JSON parser.
* allows multiple HTTP writes to happen at once using the `--parallelism` option.
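
The core idea — turning each delimited row into a JSON document — can be sketched as follows. This is a simplified illustration with hypothetical helper names (`parseLine`, `rowToDoc`), not couchimport's actual source:

```javascript
// Split one line of delimited text into fields.
const parseLine = (line, delimiter) => line.split(delimiter)

// Build a document object by zipping a header row with a data row.
const rowToDoc = (headers, row) =>
  Object.fromEntries(headers.map((h, i) => [h, row[i]]))

const headers = parseLine('product_id\tname', '\t')
const doc = rowToDoc(headers, parseLine('1\tapple', '\t'))
console.log(doc) // { product_id: '1', name: 'apple' }
```

In the real tool this transformation happens inside a Node.js stream pipeline, so the whole file never needs to be held in memory.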

This example downloads public crime data, unzips and imports it:
```sh
cat crime_incidents_2013_CSV.csv | couchimport
```

In the above example we use [ccurl](https://github.com/glynnbird/ccurl), a command-line utility that uses the same environment variables as *couchimport*.

## Output

The following output is visible on the console when *couchimport* runs:

The configuration, whether default or overridden by environment variables or command-line arguments, is shown. This is followed by a line of output for each block of 500 documents written, plus a cumulative total.
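
The batching into blocks of 500 documents described above can be sketched like this. The `batch` helper is hypothetical, for illustration only:

```javascript
// Split an array of documents into fixed-size blocks for bulk writing.
const batch = (docs, size) => {
  const blocks = []
  for (let i = 0; i < docs.length; i += size) {
    blocks.push(docs.slice(i, i + size))
  }
  return blocks
}

// 1200 docs become blocks of 500, 500 and 200.
const docs = Array.from({ length: 1200 }, (_, i) => ({ _id: String(i) }))
const blocks = batch(docs, 500)
console.log(blocks.map(b => b.length)) // [ 500, 500, 200 ]
```

Each block would then be written with a single bulk HTTP request, which is far faster than one request per document.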
If you want to see a preview of the JSON that would be created from your CSV/TSV files, add `--preview true` to your command line:
```sh
> cat text.txt | couchimport --preview true
Detected a TAB column delimiter
{ product_id: '1',
  ...
```

As well as showing a JSON preview, preview mode also attempts to detect the column delimiter.
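
One simple way such delimiter detection can work is to count candidate delimiters in the first line and pick the most frequent. This is an illustrative sketch, not couchimport's actual detection logic:

```javascript
// Guess the column delimiter of a file from its first line:
// whichever candidate character appears most often wins.
const detectDelimiter = (firstLine) => {
  const candidates = ['\t', ',']
  const counts = candidates.map(d => firstLine.split(d).length - 1)
  return candidates[counts.indexOf(Math.max(...counts))]
}

console.log(JSON.stringify(detectDelimiter('product_id\tname\tprice'))) // "\t"
console.log(JSON.stringify(detectDelimiter('id,name,price')))           // ","
```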
If your source document is a GeoJSON text file, `couchimport` can be used. Let's say your JSON looks like this:
```js
{ "features": [ { "a":1}, {"a":2}] }
```
If we need to import each feature object into CouchDB as a separate document, this can be done using the `type="json"` argument and specifying the JSON path using the `jsonpath` argument:
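
In effect, the JSON path selects the array whose members become individual documents. The in-memory sketch below shows the idea with a hypothetical `docsAtPath` helper; couchimport itself uses a streaming JSON parser so the whole file never has to fit in memory:

```javascript
const source = { features: [{ a: 1 }, { a: 2 }] }

// Resolve a dotted path against an object and return what is found there.
const docsAtPath = (obj, path) =>
  path.split('.').reduce((o, key) => o[key], obj)

// Each element of this array would be written as its own document.
console.log(docsAtPath(source, 'features')) // [ { a: 1 }, { a: 2 } ]
```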
## Parallelism

Using the `COUCH_PARALLELISM` environment variable or the `--parallelism` command-line option, *couchimport* can be configured to write data in multiple parallel operations. If you have the network bandwidth, this can significantly speed up large data imports, e.g.
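
The principle behind a parallelism limit — at most N writes in flight at once — can be sketched with a worker-pool pattern. This is an illustration of the concept, not couchimport's implementation, and `runParallel` is a hypothetical helper:

```javascript
// Run async jobs with at most `parallelism` of them in flight:
// start that many workers, each pulling the next job off a shared index.
const runParallel = async (jobs, parallelism) => {
  const results = []
  let next = 0
  const worker = async () => {
    while (next < jobs.length) {
      const i = next++            // claim a job (safe: JS is single-threaded)
      results[i] = await jobs[i]()
    }
  }
  await Promise.all(Array.from({ length: parallelism }, worker))
  return results
}

// Example: ten pretend "bulk write" jobs, at most five in flight at once.
const jobs = Array.from({ length: 10 }, (_, i) => async () => i * 2)
runParallel(jobs, 5).then(r => console.log(r)) // results stay in job order
```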