modules/ROOT/pages/tools/neo4j-admin/neo4j-admin-import.adoc
Lines changed: 34 additions & 14 deletions
@@ -25,6 +25,17 @@ Change Data Capture does **not** capture any data changes resulting from the use
 See link:{neo4j-docs-base-uri}/cdc/current/get-started/self-managed/#non-tx-log-changes/[Change Data Capture -> Key considerations] for more information.
 ====
 
+[role=label--beta]
+== Parquet file support
+Starting with Neo4j 5.24, Neo4j provides support for the Parquet file format in a public beta version.
+The additional parameter `--input-type [csv|parquet]` was introduced to explicitly tell the importer to use either CSV or Parquet.
+Its value defaults to CSV if it is not defined.
+
+Most of the parameters that can be used to configure the import are also valid for the Parquet format.
+There are indicators in the parameter overview to point out which parameters are supported.
+
+The xref:tools/neo4j-admin/neo4j-admin-import.adoc#import-tool-examples[examples] for CSV can also be used with Parquet.
+
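As a rough illustration of the `--input-type` parameter added above, the following sketch assembles a full-import command for Parquet input. The file names `persons.parquet` and `knows.parquet`, the `import/` directory, and the database name `neo4j` are assumptions made for this example and do not come from the change itself. The command is only printed, not executed, so it can be reviewed before use.

```shell
# Sketch only: file names, directory, and database name are hypothetical.
NODES_FILE="import/persons.parquet"
RELS_FILE="import/knows.parquet"
DATABASE="neo4j"

# --input-type=parquet tells the importer to read Parquet (beta);
# if the option is omitted, the importer defaults to CSV.
IMPORT_CMD="neo4j-admin database import full --input-type=parquet --nodes=${NODES_FILE} --relationships=${RELS_FILE} ${DATABASE}"

# Dry run: print the assembled command instead of running it.
echo "${IMPORT_CMD}"
```

Options marked with ^1^ in the option tables, such as `--delimiter` or `--quote`, are ignored by the Parquet import, so they are left out of this sketch.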
 == Overview
 
 The `neo4j-admin database import` command has two modes both used for initial data import:
@@ -149,15 +160,15 @@ For horizontal tabulation (HT), use `\t` or the Unicode character ID `\9`.
 Unicode character ID can be used if prepended by `\`.
 |;
 
-| --auto-skip-subsequent-headers[=true\|false]
+| --auto-skip-subsequent-headers[=true\|false]^1^
 |Automatically skip accidental header lines in subsequent files in file groups with more than one file.
 |false
 
 |--bad-tolerance=<num>
 |Number of bad entries before the import is aborted. The import process is optimized for error-free data. Therefore, cleaning the data before importing it is highly recommended. If you encounter any bad entries during the import process, you can set the number of bad entries to a specific value that suits your needs. However, setting a high value may affect the performance of the tool.
 |1000
 
-|--delimiter=<char>
+|--delimiter=<char>^1^
 |Delimiter character between values in CSV data. Also accepts `TAB` and e.g. `U+20AC` for specifying a character using Unicode.
 
 ====
@@ -206,14 +217,19 @@ Possible values are:
 |Whether or not empty string fields, i.e. "" from input source are ignored, i.e. treated as null.
 |false
 
-|--ignore-extra-columns[=true\|false]
+|--ignore-extra-columns[=true\|false]^1^
 |If unspecified columns should be ignored during the import.
 |false
 
-|--input-encoding=<character-set>
+|--input-encoding=<character-set>^1^
 |Character set that input data is encoded in.
 |UTF-8
 
+|--input-type[=csv\|parquet]
+label:beta[]
+|File type to import.
+|csv
+
 |--legacy-style-quoting[=true\|false]
 |Whether or not a backslash-escaped quote e.g. \" is interpreted as an inner quote.
 |false
@@ -225,7 +241,7 @@ Values can be plain numbers, such as `10000000`, or `20G` for 20 gigabytes.
 It can also be specified as a percentage of the available memory, for example `70%`.
 |90%
 
-|--multiline-fields[=true\|false]
+|--multiline-fields[=true\|false]^1^
 |Whether or not fields from an input source can span multiple lines, i.e. contain newline characters.
 
 Setting `--multiline-fields=true` can severely degrade the performance of the importer.
@@ -253,7 +269,7 @@ For an example, see <<import-tool-multiple-input-files-regex-example>>.
 |Delete any existing database files prior to the import.
 |false
 
-|--quote=<char>
+|--quote=<char>^1^
 |Character to treat as quotation character for values in CSV data.
 
 Quotes can be escaped as per link:{rfc-4180}[RFC 4180] by doubling them, for example `""` would be interpreted as a literal `"`.
@@ -328,7 +344,7 @@ If enabled all those relationships will be found but at the cost of lower perfor
 performance, this value should not be greater than the number of available processors.
 |20
 
-|--trim-strings[=true\|false]
+|--trim-strings[=true\|false]^1^
 |Whether or not strings should be trimmed for whitespaces.
 |false
 
@@ -337,6 +353,8 @@ performance, this value should not be greater than the number of available proce
 |
 |===
 
+^1^ __Ignored by Parquet import label:beta[].__ +
+
 [NOTE]
 .Heap size for the import
 ====
@@ -666,15 +684,15 @@ For horizontal tabulation (HT), use `\t` or the Unicode character ID `\9`.
 Unicode character ID can be used if prepended by `\`.
 |;
 
-| --auto-skip-subsequent-headers[=true\|false]
+| --auto-skip-subsequent-headers[=true\|false]^1^
 |Automatically skip accidental header lines in subsequent files in file groups with more than one file.
 |false
 
 |--bad-tolerance=<num>
 |Number of bad entries before the import is aborted. The import process is optimized for error-free data. Therefore, cleaning the data before importing it is highly recommended. If you encounter any bad entries during the import process, you can set the number of bad entries to a specific value that suits your needs. However, setting a high value may affect the performance of the tool.
 |1000
 
-|--delimiter=<char>
+|--delimiter=<char>^1^
 |Delimiter character between values in CSV data. Also accepts `TAB` and e.g. `U+20AC` for specifying a character using Unicode.
 
 ====
@@ -721,11 +739,11 @@ Possible values are:
 |Whether or not empty string fields, i.e. "" from input source are ignored, i.e. treated as null.
 |false
 
-|--ignore-extra-columns[=true\|false]
+|--ignore-extra-columns[=true\|false]^1^
 |If unspecified columns should be ignored during the import.
 |false
 
-|--input-encoding=<character-set>
+|--input-encoding=<character-set>^1^
 |Character set that input data is encoded in.
 |UTF-8
 
@@ -740,7 +758,7 @@ Values can be plain numbers, such as `10000000`, or `20G` for 20 gigabytes.
 It can also be specified as a percentage of the available memory, for example `70%`.
 |90%
 
-|--multiline-fields[=true\|false]
+|--multiline-fields[=true\|false]^1^
 |Whether or not fields from an input source can span multiple lines, i.e. contain newline characters.
 
 Setting `--multiline-fields=true` can severely degrade the performance of the importer.
@@ -764,7 +782,7 @@ For an example, see <<import-tool-multiple-input-files-regex-example>>.
 |When `true`, non-array property values are converted to their equivalent Cypher types. For example, all integer values will be converted to 64-bit long integers.
 | true
 
-|--quote=<char>
+|--quote=<char>^1^
 |Character to treat as quotation character for values in CSV data.
 
 Quotes can be escaped as per link:{rfc-4180}[RFC 4180] by doubling them, for example `""` would be interpreted as a literal `"`.
@@ -844,7 +862,7 @@ If enabled all those relationships will be found but at the cost of lower perfor
 performance, this value should not be greater than the number of available processors.
 |20
 
-|--trim-strings[=true\|false]
+|--trim-strings[=true\|false]^1^
 |Whether or not strings should be trimmed for whitespaces.
 |false
 
@@ -853,6 +871,8 @@ performance, this value should not be greater than the number of available proce
 |
 |===
 
+^1^ __Ignored by Parquet import label:beta[].__ +
+
 [NOTE]
 .Using both a multi-value option and a positional parameter