modules/ROOT/pages/tools/neo4j-admin/neo4j-admin-import.adoc
Lines changed: 23 additions & 62 deletions
@@ -5,7 +5,7 @@
5
5
:rfc-4180: https://tools.ietf.org/html/rfc4180
6
6
7
7
`neo4j-admin database import` writes CSV data into Neo4j's native file format as fast as possible. +
8
-
From Neo4j 5.25, Neo4j also provides support for the Parquet file format in a public beta version.
8
+
Starting with version 5.26, Neo4j also provides support for the Parquet file format.
9
9
10
10
You should use this tool when:
11
11
@@ -128,27 +128,10 @@ For more information, please contact Neo4j Professional Services.
128
128
129
129
=== Options
130
130
131
-
[role=label--beta]
132
-
.Parquet file support
133
-
[NOTE]
134
-
====
135
-
Starting with Neo4j 5.25, Neo4j provides support for the Parquet file format in a public beta version.
136
-
The additional parameter `--input-type [csv|parquet]` is introduced to explicitly tell the importer to use either CSV or Parquet.
137
-
Its value defaults to CSV if it is not defined.
138
-
139
-
Most of the parameters that can be used to configure the import are also valid for the Parquet format.
140
-
The following parameters are not supported (see <<full-import-options-table, `neo4j-admin database import full` options>> table for more details):
141
-
142
-
- `--auto-skip-subsequent-headers`
143
-
- `--delimiter`
144
-
- `--ignore-extra-columns`
145
-
- `--input-encoding`
146
-
- `--multiline-fields`
147
-
- `--quote`
148
-
- `--trim-strings`
149
-
131
+
Starting from Neo4j 5.26, the importer also supports the Parquet file format.
132
+
An additional parameter `--input-type=csv|parquet` has been introduced to explicitly specify whether to use CSV or Parquet for the importer.
133
+
If not defined, the default value will be CSV.
150
134
The xref:tools/neo4j-admin/neo4j-admin-import.adoc#import-tool-examples[examples] for CSV can also be used with Parquet.
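As a rough illustration of the `--input-type` parameter described above, the following sketch assembles a full-import command for Parquet input and prints it rather than running it. The Parquet file paths are hypothetical placeholders, and the `--nodes` and `--relationships` usage simply mirrors the CSV examples; treat it as a starting point under those assumptions, not a verified invocation.

```shell
# Hypothetical Parquet input files; replace with your own paths.
nodes_file="import/movies.parquet"
rels_file="import/roles.parquet"

# Same shape as the CSV examples, plus --input-type=parquet.
# Without --input-type, the importer defaults to CSV.
import_cmd="bin/neo4j-admin database import full --input-type=parquet --nodes ${nodes_file} --relationships ${rels_file}"

# Dry run: print the command for review instead of executing it.
echo "${import_cmd}"
```

CSV-only options such as `--delimiter` and `--quote` are left out here because the options table marks them as ignored by the Parquet import.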
151
-
====
152
135
153
136
[[full-import-options-table]]
154
137
.`neo4j-admin database import full` options
@@ -177,15 +160,15 @@ For horizontal tabulation (HT), use `\t` or the Unicode character ID `\9`.
177
160
Unicode character ID can be used if prepended by `\`.
178
161
|;
179
162
180
-
| --auto-skip-subsequent-headers[=true\|false]
163
+
| --auto-skip-subsequent-headers[=true\|false] footnote:ingnoredByParquet1[Ignored by Parquet import.]
181
164
|Automatically skip accidental header lines in subsequent files in file groups with more than one file.
182
165
|false
183
166
184
167
|--bad-tolerance=<num>
185
168
|Number of bad entries before the import is aborted. The import process is optimized for error-free data. Therefore, cleaning the data before importing it is highly recommended. If you encounter any bad entries during the import process, you can set the number of bad entries to a specific value that suits your needs. However, setting a high value may affect the performance of the tool.
186
169
|1000
187
170
188
-
|--delimiter=<char>
171
+
|--delimiter=<char> footnote:ingnoredByParquet1[]
189
172
|Delimiter character between values in CSV data. Also accepts `TAB` and e.g. `U+20AC` for specifying a character using Unicode.
190
173
191
174
====
@@ -234,16 +217,16 @@ Possible values are:
234
217
|Whether empty string fields (`""`) from the input source are ignored, that is, treated as null.
|label:changed[Changed in 5.26] In v1, whether or not fields from an input source can span multiple lines, i.e. contain newline characters. Setting `--multiline-fields=true` can severely degrade the performance of the importer. Therefore, use it with care, especially with large imports. In v2, this option will specify the list of files that contain multiline fields. Files can also be specified using regular expressions.
|label:new[Introduced in 5.26] Controls the parsing of input source fields that can span multiple lines, i.e. contain newline characters. When set to v1, the value for `--multiline-fields` can only be true or false. When set to v2, the value for `--multiline-fields` should be the list of files that contain multiline fields.
266
249
|null
267
250
@@ -286,7 +269,7 @@ For an example, see <<import-tool-multiple-input-files-regex-example>>.
286
269
|Delete any existing database files prior to the import.
287
270
|false
288
271
289
-
|--quote=<char>
272
+
|--quote=<char> footnote:ingnoredByParquet1[]
290
273
|Character to treat as quotation character for values in CSV data.
291
274
292
275
Quotes can be escaped as per link:{rfc-4180}[RFC 4180] by doubling them, for example `""` would be interpreted as a literal `"`.
@@ -361,7 +344,7 @@ If enabled all those relationships will be found but at the cost of lower perfor
361
344
performance, this value should not be greater than the number of available processors.
|Whether or not whitespace should be trimmed from strings.
366
349
|false
367
350
@@ -465,7 +448,7 @@ bin/neo4j-admin database import full --nodes import/movies_header.csv,import/mov
465
448
[[indexes-constraints-import]]
466
449
==== Provide indexes and constraints during import
467
450
468
-
Starting with Neo4j 5.24, you can use the `--schema` option that allows Cypher commands to be provided to create indexes/constraints during the initial import process.
451
+
Starting from Neo4j 5.24, you can use the `--schema` option that allows Cypher commands to be provided to create indexes/constraints during the initial import process.
469
452
It currently only works for the block format and full import.
470
453
471
454
You should have a Cypher script containing only `CREATE INDEX|CONSTRAINT` commands to be parsed and executed.
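A minimal sketch of what such a script and invocation might look like is shown below. The index and constraint names, the labels and properties, and the CSV path are all hypothetical, and passing the script as `--schema=<path>` together with `--format=block` is an assumption based on the description above; check the options table for the exact syntax. The command is only printed, not executed.

```shell
# Write a hypothetical schema script that contains only CREATE INDEX|CONSTRAINT commands.
schema_file="$(mktemp)"
cat > "${schema_file}" <<'EOF'
CREATE INDEX person_name FOR (p:Person) ON (p.name);
CREATE CONSTRAINT movie_id FOR (m:Movie) REQUIRE m.movieId IS UNIQUE;
EOF

# Assumed invocation: full import, block format, schema script passed via --schema.
schema_cmd="bin/neo4j-admin database import full --format=block --schema=${schema_file} --nodes import/movies.csv"

# Dry run: print the command for review instead of executing it.
echo "${schema_cmd}"
```

Keeping the schema in a separate file makes it possible to review the index and constraint definitions before the import starts.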
@@ -677,28 +660,6 @@ If the database into which you import does not exist prior to importing, you mus
677
660
678
661
=== Options
679
662
680
-
[role=label--beta]
681
-
.Parquet file support
682
-
[NOTE]
683
-
====
684
-
Starting with Neo4j 5.25, Neo4j provides support for the Parquet file format in a public beta version.
685
-
The additional parameter `--input-type [csv|parquet]` is introduced to explicitly tell the importer to use either CSV or Parquet.
686
-
Its value defaults to CSV if it is not defined.
687
-
688
-
Most of the parameters that can be used to configure the import are also valid for the Parquet format.
689
-
The following parameters are not supported (see <<incremental-import-options-table, `neo4j-admin database import incremental` options>> table for more details):
690
-
691
-
- `--auto-skip-subsequent-headers`
692
-
- `--delimiter`
693
-
- `--ignore-extra-columns`
694
-
- `--input-encoding`
695
-
- `--multiline-fields`
696
-
- `--quote`
697
-
- `--trim-strings`
698
-
699
-
The xref:tools/neo4j-admin/neo4j-admin-import.adoc#import-tool-examples[examples] for CSV can also be used with Parquet.
@@ -726,15 +687,15 @@ For horizontal tabulation (HT), use `\t` or the Unicode character ID `\9`.
726
687
Unicode character ID can be used if prepended by `\`.
727
688
|;
728
689
729
-
| --auto-skip-subsequent-headers[=true\|false]
690
+
| --auto-skip-subsequent-headers[=true\|false] footnote:ingnoredByParquet2[Ignored by Parquet import.]
730
691
|Automatically skip accidental header lines in subsequent files in file groups with more than one file.
731
692
|false
732
693
733
694
|--bad-tolerance=<num>
734
695
|Number of bad entries before the import is aborted. The import process is optimized for error-free data. Therefore, cleaning the data before importing it is highly recommended. If you encounter any bad entries during the import process, you can set the number of bad entries to a specific value that suits your needs. However, setting a high value may affect the performance of the tool.
735
696
|1000
736
697
737
-
|--delimiter=<char>
698
+
|--delimiter=<char> footnote:ingnoredByParquet2[]
738
699
|Delimiter character between values in CSV data. Also accepts `TAB` and e.g. `U+20AC` for specifying a character using Unicode.
739
700
740
701
====
@@ -781,16 +742,16 @@ Possible values are:
781
742
|Whether empty string fields (`""`) from the input source are ignored, that is, treated as null.
|label:changed[Changed in 5.26] In v1, whether or not fields from an input source can span multiple lines, i.e. contain newline characters. Setting `--multiline-fields=true` can severely degrade the performance of the importer. Therefore, use it with care, especially with large imports. In v2, this option will specify the list of files that contain multiline fields. Files can also be specified using regular expressions.
|label:new[Introduced in 5.26] Controls the parsing of input source fields that can span multiple lines, i.e. contain newline characters. When set to v1, the value for `--multiline-fields` can only be true or false. When set to v2, the value for `--multiline-fields` should be the list of files that contain multiline fields.
813
774
|null
814
775
@@ -829,7 +790,7 @@ For an example, see <<import-tool-multiple-input-files-regex-example>>.
829
790
|When `true`, non-array property values are converted to their equivalent Cypher types. For example, all integer values will be converted to 64-bit long integers.
830
791
| true
831
792
832
-
|--quote=<char>
793
+
|--quote=<char> footnote:ingnoredByParquet2[]
833
794
|Character to treat as quotation character for values in CSV data.
834
795
835
796
Quotes can be escaped as per link:{rfc-4180}[RFC 4180] by doubling them, for example `""` would be interpreted as a literal `"`.
@@ -913,7 +874,7 @@ If enabled all those relationships will be found but at the cost of lower perfor
913
874
performance, this value should not be greater than the number of available processors.