diff --git a/modules/ROOT/content-nav.adoc b/modules/ROOT/content-nav.adoc index ecf09b0f9..3e130954b 100644 --- a/modules/ROOT/content-nav.adoc +++ b/modules/ROOT/content-nav.adoc @@ -113,6 +113,7 @@ ** Standard databases *** xref:database-administration/standard-databases/naming-databases.adoc[] *** xref:database-administration/standard-databases/create-databases.adoc[] +*** xref:database-administration/standard-databases/seed-from-uri.adoc[] *** xref:database-administration/standard-databases/listing-databases.adoc[] *** xref:database-administration/standard-databases/alter-databases.adoc[] *** xref:database-administration/standard-databases/delete-databases.adoc[] diff --git a/modules/ROOT/pages/backup-restore/planning.adoc b/modules/ROOT/pages/backup-restore/planning.adoc index 42d1e38ee..96f15821e 100644 --- a/modules/ROOT/pages/backup-restore/planning.adoc +++ b/modules/ROOT/pages/backup-restore/planning.adoc @@ -92,7 +92,7 @@ See xref:clustering/monitoring/show-databases-monitoring.adoc#show-databases-mon However, _restoring_ a database in a cluster is different since it is not known in advance how a database is going to be allocated to the servers in a cluster. This method relies on the seed already existing on one of the servers. -The recommended way to restore a database in a cluster is to xref:clustering/databases.adoc#cluster-seed-uri[seed from URI]. +The recommended way to restore a database in a cluster is to xref::database-administration/standard-databases/seed-from-uri.adoc[seed from URI]. [NOTE] ==== diff --git a/modules/ROOT/pages/clustering/databases.adoc b/modules/ROOT/pages/clustering/databases.adoc index 5408ae738..3a41ca751 100644 --- a/modules/ROOT/pages/clustering/databases.adoc +++ b/modules/ROOT/pages/clustering/databases.adoc @@ -299,7 +299,7 @@ See <> for mor If you provide a URI to a backup or a dump, the stores on all allocations will be replaced by the backup or the dump at the given URI. 
The new allocations can be put on any `ENABLED` server in the cluster. -See <> for more details. +See xref::database-administration/standard-databases/seed-from-uri.adoc[Seed from URI] for more details. [source, shell] @@ -371,9 +371,12 @@ CALL dbms.cluster.recreateDatabase("neo4j", {seedingServers: [], primaries: 3, s [[cluster-seed]] == Seed a cluster -There are two different ways to seed a cluster with data. -The first option is to use a _designated seeder_, where a designated server is used to create a backed-up database on other servers in the cluster. -The other options is to seed the cluster from URI, where all servers to host a database are seeded with an identical seed from an external source specified by the URI. +There are two different ways to seed a cluster with data: + +* The first option is to use a _designated seeder_, where a designated server is used to create a backed-up database on other servers in the cluster. +* The other option is to seed the cluster from a URI, where all servers to host the database are seeded with an identical seed from an external source specified by that URI. +For more details, see xref:database-administration/standard-databases/seed-from-uri.adoc[Create a database from a URI]. + Keep in mind that using a designated seeder can be problematic in some situations as it is not known in advance how a database is going to be allocated to the servers in a cluster. Also, this method relies on the seed already existing on one of the servers. @@ -450,227 +453,6 @@ SHOW DATABASE foo; 9 rows available after 3 ms, consumed after another 1 ms ---- -[[cluster-seed-uri]] -=== Seed from URI - -This method seeds all servers with an identical seed from an external source, specified by the URI. -The seed can either be a full backup, a differential backup (see xref:clustering/databases.adoc#cloud-seed-provider[`CloudSeedProvider`]), or a dump from an existing database. -The sources of seeds are called _seed providers_. 
- -The mechanism is pluggable, allowing new sources of seeds to be supported (see link:https://www.neo4j.com/docs/java-reference/current/extending-neo4j/project-setup/#extending-neo4j-plugin-seed-provider[Java Reference -> Implement custom seed providers] for more information). -The product has built-in support for seed from a mounted file system (file), FTP server, HTTP/HTTPS server, Amazon S3, Google Cloud Storage, and Azure Cloud Storage. - -[NOTE] -==== -Amazon S3, Google Cloud Storage, and Azure Cloud Storage are supported by default, but the other providers require configuration of xref:configuration/configuration-settings.adoc#config_dbms.databases.seed_from_uri_providers[`dbms.databases.seed_from_uri_providers`]. -==== - -The URI of the seed is specified when the `CREATE DATABASE` command is issued: - -[source, cypher, role="noplay"] ----- -CREATE DATABASE foo OPTIONS {existingData: 'use', seedURI:'s3://myBucket/myBackup.backup'} ----- - -Download and validation of the seed is only performed as the new database is started. -If it fails, the database is not available and it has the `statusMessage`: `Unable to start database` of the `SHOW DATABASES` command. 
- -[source, cypher, role="noplay"] ----- -neo4j@neo4j> SHOW DATABASES; -+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| name | type | aliases | access | address | role | writer | requestedStatus | currentStatus | statusMessage | default | home | constituents | -+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| "seed3" | "standard" | [] | "read-write" | "localhost:7682" | "unknown" | FALSE | "online" | "offline" | "Unable to start database `DatabaseId{3fe1a59b[seed3]}`" | FALSE | FALSE | [] | -+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ ----- - -To determine the cause of the problem, it is recommended to look at the `debug.log`. - -[NOTE] -==== -Starting from Neo4j 2025.01, seed from URI can also be used in combination with xref:database-administration/standard-databases/create-databases.adoc[`CREATE OR REPLACE DATABASE`]. -==== - - -[[file-seed-provider]] -==== FileSeedProvider - -The `FileSeedProvider` supports: - -** `file:` - -[[url-connection-seed-provider]] -==== URLConnectionSeedProvider - -The `URLConnectionSeedProvider` supports the following: - -** `ftp:` -** `http:` -** `https:` - -Starting from Neo4j 2025.01, the `URLConnectionSeedProvider` does not support `file`. -// This is true for both Cypher 5 and Cypher 25. 
- -[[cloud-seed-provider]] -==== CloudSeedProvider - -The `CloudSeedProvider` supports: - -** `s3:` -** `gs:` -** `azb:` - -The `CloudSeedProvider` supports using xref:backup-restore/modes.adoc#differential-backup[differential backup] files as seeds. -With the provided differential backup file, the `CloudSeedProvider` searches the directory containing differential backup files for a xref:backup-restore/online-backup.adoc#backup-chain[backup chain] ending at the specified differential backup, and then seeds using this backup chain. - -[.tabbed-example] -===== -[role=include-with-AWS-S3] -====== - -include::partial$/aws-s3-overrides.adoc[] - -include::partial$/aws-s3-credentials.adoc[] - -. Create database from `myBackup.backup`. -+ -[source,shell, role="nocopy"] ----- -CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup' } ----- - -====== -[role=include-with-Google-cloud-storage] -====== - -include::partial$/gcs-credentials.adoc[] - -. Create database from `myBackup.backup`. -+ -[source,shell] ----- -CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 'gs://myBucket/myBackup.backup' } ----- -====== -[role=include-with-Azure-cloud-storage] -====== - -include::partial$/azb-credentials.adoc[] - -. Create database from `myBackup.backup`. -+ -[source,shell] ----- -CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 'azb://myStorageAccount/myContainer/myBackup.backup' } ----- -====== -===== - -Starting from Neo4j 2025.01, the `CloudSeedProvider` supports seeding up to a specific date or transaction ID using the `seedRestoreUntil` option. - -[role=label--new-2025.01] -Seed up to a specific date:: - -To seed up to a specific date, you need to pass the differential backup, which contains the data up to that date. 
-+ -[source,shell] ----- -CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedRestoreUntil: datetime("2019-06-01T18:40:32.142+0100") } ----- -+ -This will seed the database with transactions committed before the provided timestamp. - -[role=label--new-2025.01] -Seed up to a specific transaction ID:: - -To seed up to a specific transaction ID, you need to pass the differential backup that contains the data up to that transaction ID. -+ -[source,shell] ----- -CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedRestoreUntil: 123 } ----- -+ -This will seed the database with transactions up to, but not including transaction 123. - -[role=label--deprecated] -[[s3-seed-provider]] -==== S3SeedProvider - -// When Cypher 25 is released, we have to label this section 'Cypher 5' as this functionality is only available in Cypher 5. - -The `S3SeedProvider` supports: - -** `s3:` label:deprecated[Deprecated in 5.26] - - -[NOTE] -==== -Neo4j comes bundled with necessary libraries for AWS S3 connectivity. -Therefore, if you use `S3SeedProvider`,`aws cli` is not required but can be used with the `CloudSeedProvider`. -==== - -The `S3SeedProvider` requires additional configuration. -This is specified with the `seedConfig` option. -This option expects a comma-separated list of configurations. -Each configuration value is specified as a name followed by `=` and the value, as such: - -[source, cypher, role="noplay"] ----- -CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedConfig: 'region=eu-west-1' } ----- - -`S3SeedProvider` also requires passing in credentials. -These are specified with the `seedCredentials` option. -Seed credentials are securely passed from the Cypher command to each server hosting the database. -For this to work, Neo4j on each server in the cluster must be configured with identical keystores. 
-This is identical to the configuration required by remote aliases, see xref:database-administration/aliases/remote-database-alias-configuration.adoc#remote-alias-config-DBMS_admin-A[Configuration of DBMS with remote database alias]. -If this configuration is not performed, the `seedCredentials` option fails. - -[source, cypher, role="noplay"] ----- -CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedConfig: 'region=eu-west-1', seedCredentials: [accessKey];[secretKey] } ----- -Where `accessKey` and `secretKey` are provided by AWS. - -==== Seed provider reference - -[cols="1,2,2",options="header"] -|=== -| URL scheme -| Seed provider -| URI example - -| `file:` -| `FileSeedProvider` -| `file://tmp/backup1.backup` - -| `ftp:` -| `URLConnectionSeedProvider` -| `ftp://myftp.com/backups/backup1.backup` - -| `http:` -| `URLConnectionSeedProvider` -| `\http://myhttp.com/backups/backup1.backup` - -| `https:` -| `URLConnectionSeedProvider` -| `\https://myhttp.com/backups/backup1.backup` - -| `s3:` -| `S3SeedProvider` label:deprecated[Deprecated in 5.26], + -`CloudSeedProvider` -| `s3://mybucket/backups/backup1.backup` - -| `gs:` -| `CloudSeedProvider` -| `gs://mybucket/backups/backup1.backup` - -| `azb:` -| `CloudSeedProvider` -| `azb://mystorageaccount.blob/backupscontainer/backup1.backup` -|=== - [[cluster-allow-deny-db]] == Controlling locations with allowed/denied databases diff --git a/modules/ROOT/pages/configuration/configuration-settings.adoc b/modules/ROOT/pages/configuration/configuration-settings.adoc index f87138b06..62e71d6e4 100644 --- a/modules/ROOT/pages/configuration/configuration-settings.adoc +++ b/modules/ROOT/pages/configuration/configuration-settings.adoc @@ -2143,7 +2143,7 @@ The following values are available: `CloudSeedProvider`, `FileSeedProvider`, `S3 This list specifies enabled seed providers. 
If a seed source (URI scheme) is supported by multiple providers in the list, the first matching provider will be used. If the list is set to empty, the seed from URI functionality is effectively disabled. -See xref:/clustering/databases.adoc#cluster-seed-uri[Seed from URI] for more information. +See xref::database-administration/standard-databases/seed-from-uri.adoc[Seed from a URI] for more information. |Valid values a|A comma-separated list where each element is a string. |Default value diff --git a/modules/ROOT/pages/cypher-shell.adoc b/modules/ROOT/pages/cypher-shell.adoc index 322651de0..36936d24c 100644 --- a/modules/ROOT/pages/cypher-shell.adoc +++ b/modules/ROOT/pages/cypher-shell.adoc @@ -74,7 +74,7 @@ The syntax for running Cypher Shell is: |auto |-P PARAM, --param PARAM -|Add a parameter to this session. Example: `-P {a: 1}`, `-P '{a: 1, b: duration({seconds: 1})}'`, or using arrow syntax `-P 'a => 1'`. This argument can be specified multiple times. +|Add a parameter to this session. Example: `-P '{a: 1}'`, `-P '{a: 1, b: duration({seconds: 1})}'`, or using arrow syntax `-P 'a => 1'`. This argument can be specified multiple times. |[] |--non-interactive diff --git a/modules/ROOT/pages/database-administration/standard-databases/create-databases.adoc b/modules/ROOT/pages/database-administration/standard-databases/create-databases.adoc index 23c20b13e..6df8bfd53 100644 --- a/modules/ROOT/pages/database-administration/standard-databases/create-databases.adoc +++ b/modules/ROOT/pages/database-administration/standard-databases/create-databases.adoc @@ -132,19 +132,19 @@ Replaced by `existingDataSeedServer`. | `seedURI` | URI to a backup or a dump from an existing database. | -Defines a seed from an external source, which will be used to seed all servers. +Defines an identical seed from an external source which will be used to seed all servers. +For more information, see xref::database-administration/standard-databases/seed-from-uri.adoc[Seed from a URI]. 
| `seedConfig` | Comma-separated list of configuration values. | -For more information see xref::clustering/databases.adoc#cluster-seed-uri[Seed from URI]. | `seedCredentials` label:deprecated[Deprecated in 5.26] | credentials | Defines credentials that need to be passed into certain seed providers. It is recommended to use the `CloudSeedProvider` seed provider, which does not require this configuration when seeding from cloud storage. -For more information see xref::clustering/databases.adoc#cloud-seed-provider[CloudSeedProvider]. +For more information see xref::database-administration/standard-databases/seed-from-uri.adoc#cloud-seed-provider[CloudSeedProvider]. | `txLogEnrichment` | `FULL` \| `DIFF` \| `OFF` diff --git a/modules/ROOT/pages/database-administration/standard-databases/delete-databases.adoc b/modules/ROOT/pages/database-administration/standard-databases/delete-databases.adoc index 76f60adf8..7eef2427f 100644 --- a/modules/ROOT/pages/database-administration/standard-databases/delete-databases.adoc +++ b/modules/ROOT/pages/database-administration/standard-databases/delete-databases.adoc @@ -86,7 +86,7 @@ DROP DATABASE movies DUMP DATA ---- In Neo4j, dumps can be stored in the directory specified by the xref:configuration/configuration-settings.adoc#config_server.directories.dumps.root[`server.directories.dumps.root`] setting (by default, the path for storing dumps is xref:configuration/file-locations.adoc#data[`/data/dumps`]). -You can use dumps to create databases through the xref:clustering/databases.adoc#cluster-seed-uri[Seed from URI approach]. +You can use dumps to create databases using the xref::database-administration/standard-databases/seed-from-uri.adoc[seed from a URI] approach. The option `DESTROY DATA` explicitly requests the default behavior of the command. 
diff --git a/modules/ROOT/pages/database-administration/standard-databases/seed-from-uri.adoc b/modules/ROOT/pages/database-administration/standard-databases/seed-from-uri.adoc new file mode 100644 index 000000000..e824a149a --- /dev/null +++ b/modules/ROOT/pages/database-administration/standard-databases/seed-from-uri.adoc @@ -0,0 +1,229 @@ +:page-role: enterprise-edition +:description: How to create a database using a seed from a URI. + +[[database-seed-uri]] += Create a database from a URI + +This method seeds all servers that host the database with an identical seed from an external source, specified by a URI. + +You specify the seed URI as an option of the `CREATE DATABASE` command: + +[source, cypher, role="noplay"] +---- +CREATE DATABASE foo OPTIONS {existingData: 'use', seedURI:'s3://myBucket/myBackup.backup'} +---- + +The seed is downloaded and validated only when the new database is started. +If this fails, the database is not available and the `SHOW DATABASES` command reports the `statusMessage` `Unable to start database`. 
+ +[source, cypher, role="noplay"] +---- +neo4j@neo4j> SHOW DATABASES; ++---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| name | type | aliases | access | address | role | writer | requestedStatus | currentStatus | statusMessage | default | home | constituents | ++---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| "seed3" | "standard" | [] | "read-write" | "localhost:7682" | "unknown" | FALSE | "online" | "offline" | "Unable to start database `DatabaseId{3fe1a59b[seed3]}`" | FALSE | FALSE | [] | ++---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +---- + +To determine the cause of the problem, it is recommended to look at the `debug.log`. + +[NOTE] +==== +Starting from Neo4j 2025.01, seed from URI can also be used in combination with xref:database-administration/standard-databases/create-databases.adoc[`CREATE OR REPLACE DATABASE`]. +==== + +[[neo4j-seed-providers]] +== Seed providers in Neo4j + +The seed can either be a full backup, a differential backup (see <<cloud-seed-provider,`CloudSeedProvider`>>), or a dump from an existing database. +The sources of seeds are called _seed providers_. + +The mechanism is pluggable, allowing new sources of seeds to be supported (see link:https://www.neo4j.com/docs/java-reference/current/extending-neo4j/project-setup/#extending-neo4j-plugin-seed-provider[Java Reference -> Implement custom seed providers] for more information). 
+ +The product has built-in support for seed from a mounted file system (file), FTP server, HTTP/HTTPS server, Amazon S3, Google Cloud Storage, and Azure Cloud Storage. + +[NOTE] +==== +Amazon S3, Google Cloud Storage, and Azure Cloud Storage are supported by default, but the other providers require configuration of xref:configuration/configuration-settings.adoc#config_dbms.databases.seed_from_uri_providers[`dbms.databases.seed_from_uri_providers`]. +==== + +[[file-seed-provider]] +=== FileSeedProvider + +The `FileSeedProvider` supports: + +** `file:` + +[[url-connection-seed-provider]] +=== URLConnectionSeedProvider + +The `URLConnectionSeedProvider` supports the following: + +** `ftp:` +** `http:` +** `https:` + +Starting from Neo4j 2025.01, the `URLConnectionSeedProvider` does not support `file`. +// This is true for both Cypher 5 and Cypher 25. + +[[cloud-seed-provider]] +=== CloudSeedProvider + +The `CloudSeedProvider` supports: + +** `s3:` +** `gs:` +** `azb:` + +The `CloudSeedProvider` supports using xref:backup-restore/modes.adoc#differential-backup[differential backup] files as seeds. +With the provided differential backup file, the `CloudSeedProvider` searches the directory containing differential backup files for a xref:backup-restore/online-backup.adoc#backup-chain[backup chain] ending at the specified differential backup, and then seeds using this backup chain. + +[.tabbed-example] +===== +[role=include-with-AWS-S3] +====== + +include::partial$/aws-s3-overrides.adoc[] + +include::partial$/aws-s3-credentials.adoc[] + +. Create database from `myBackup.backup`. ++ +[source,shell, role="nocopy"] +---- +CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup' } +---- + +====== +[role=include-with-Google-cloud-storage] +====== + +include::partial$/gcs-credentials.adoc[] + +. Create database from `myBackup.backup`. 
++ +[source,shell] +---- +CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 'gs://myBucket/myBackup.backup' } +---- +====== +[role=include-with-Azure-cloud-storage] +====== + +include::partial$/azb-credentials.adoc[] + +. Create database from `myBackup.backup`. ++ +[source,shell] +---- +CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 'azb://myStorageAccount/myContainer/myBackup.backup' } +---- +====== +===== + +==== Support for seeding up to a date or a transaction ID + +Starting from Neo4j 2025.01, the `CloudSeedProvider` supports seeding up to a specific date or transaction ID using the `seedRestoreUntil` option. + +[role=label--new-2025.01] +Seed up to a specific date:: + +To seed up to a specific date, you need to pass the differential backup that contains the data up to that date. ++ +[source,shell] +---- +CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedRestoreUntil: datetime("2019-06-01T18:40:32.142+0100") } +---- ++ +This will seed the database with transactions committed before the provided timestamp. + +[role=label--new-2025.01] +Seed up to a specific transaction ID:: + +To seed up to a specific transaction ID, you need to pass the differential backup that contains the data up to that transaction ID. ++ +[source,shell] +---- +CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedRestoreUntil: 123 } +---- ++ +This will seed the database with transactions up to, but not including, transaction 123. + +[role=label--deprecated] +[[s3-seed-provider]] +=== S3SeedProvider + +// When Cypher 25 is released, we have to label this section 'Cypher 5' as this functionality is only available in Cypher 5. + +The `S3SeedProvider` supports: + +** `s3:` label:deprecated[Deprecated in 5.26] + + +[NOTE] +==== +Neo4j comes bundled with the necessary libraries for AWS S3 connectivity. 
+Therefore, if you use `S3SeedProvider`, `aws cli` is not required but can be used with the `CloudSeedProvider`. +==== + +The `S3SeedProvider` requires additional configuration. +This is specified with the `seedConfig` option. +This option expects a comma-separated list of configurations. +Each configuration value is specified as a name followed by `=` and the value, for example: + +[source, cypher, role="noplay"] +---- +CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedConfig: 'region=eu-west-1' } +---- + +`S3SeedProvider` also requires passing in credentials. +These are specified with the `seedCredentials` option. +Seed credentials are securely passed from the Cypher command to each server hosting the database. +For this to work, Neo4j on each server in the cluster must be configured with identical keystores. +This is identical to the configuration required by remote aliases; see xref:database-administration/aliases/remote-database-alias-configuration.adoc#remote-alias-config-DBMS_admin-A[Configuration of DBMS with remote database alias]. +If this configuration is not performed, the `seedCredentials` option fails. + +[source, cypher, role="noplay"] +---- +CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3://myBucket/myBackup.backup', seedConfig: 'region=eu-west-1', seedCredentials: [accessKey];[secretKey] } +---- +Where `accessKey` and `secretKey` are provided by AWS. 
+ +=== Seed provider reference + +[cols="1,2,2",options="header"] +|=== +| URL scheme +| Seed provider +| URI example + +| `file:` +| `FileSeedProvider` +| `\file://tmp/backup1.backup` + +| `ftp:` +| `URLConnectionSeedProvider` +| `\ftp://myftp.com/backups/backup1.backup` + +| `http:` +| `URLConnectionSeedProvider` +| `\http://myhttp.com/backups/backup1.backup` + +| `https:` +| `URLConnectionSeedProvider` +| `\https://myhttp.com/backups/backup1.backup` + +| `s3:` +| `S3SeedProvider` label:deprecated[Deprecated in 5.26], + +`CloudSeedProvider` +| `s3://mybucket/backups/backup1.backup` + +| `gs:` +| `CloudSeedProvider` +| `gs://mybucket/backups/backup1.backup` + +| `azb:` +| `CloudSeedProvider` +| `azb://mystorageaccount.blob/backupscontainer/backup1.backup` +|=== \ No newline at end of file diff --git a/modules/ROOT/pages/import.adoc b/modules/ROOT/pages/import.adoc index 83abed3d7..05ea3efa8 100644 --- a/modules/ROOT/pages/import.adoc +++ b/modules/ROOT/pages/import.adoc @@ -135,15 +135,19 @@ The xref:import.adoc#import-tool-examples[examples] for CSV can also be used wit [[full-import-options-table]] .`neo4j-admin database import full` options -[options="header", cols="5m,10a,2m"] +[options="header", cols="4m,6a,2m,1,2"] |=== | Option | Description | Default +| CSV +| Parquet |--additional-config=footnote:[See xref:neo4j-admin-neo4j-cli.adoc#_configuration[Neo4j Admin and Neo4j CLI -> Configuration] for details.] |Configuration file with additional configuration. | +| {check-mark} +| {check-mark} |--array-delimiter= |Delimiter character between array elements within a value in CSV data. Also accepts `TAB` and e.g. `U+20AC` for specifying a character using Unicode. @@ -159,16 +163,22 @@ For horizontal tabulation (HT), use `\t` or the Unicode character ID `\9`. Unicode character ID can be used if prepended by `\`. |; +| {check-mark} +| {check-mark} -| --auto-skip-subsequent-headers[=true\|false]footnote:ingnoredByParquet1[Ignored by Parquet import.] 
+| --auto-skip-subsequent-headers[=true\|false] |Automatically skip accidental header lines in subsequent files in file groups with more than one file. |false +| {check-mark} +| |--bad-tolerance= |Number of bad entries before the import is aborted. The import process is optimized for error-free data. Therefore, cleaning the data before importing it is highly recommended. If you encounter any bad entries during the import process, you can set the number of bad entries to a specific value that suits your needs. However, setting a high value may affect the performance of the tool. |1000 +| {check-mark} +| {check-mark} -|--delimiter=footnote:ingnoredByParquet1[] +|--delimiter= |Delimiter character between values in CSV data. Also accepts `TAB` and e.g. `U+20AC` for specifying a character using Unicode. ==== @@ -182,25 +192,35 @@ For horizontal tabulation (HT), use `\t` or the Unicode character ID `\9`. Unicode character ID can be used if prepended by `\`. |, +| {check-mark} +| |--expand-commands |Allow command expansion in config value evaluation. | +| {check-mark} +| {check-mark} |--format= |Name of database format. The imported database will be created in the specified format or use the format set in the configuration. Valid formats are `standard`, `aligned`, `high_limit`, and `block`. | +| {check-mark} +| {check-mark} |-h, --help |Show this help message and exit. | +| {check-mark} +| {check-mark} |--high-parallel-io=on\|off\|auto |Ignore environment-based heuristics and indicate if the target storage subsystem can support parallel IO with high throughput or auto detect. Typically this is `on` for SSDs, large raid arrays, and network-attached storage. |auto +| {check-mark} +| {check-mark} |--id-type=string\|integer\|actual |Each node must provide a unique ID. @@ -212,26 +232,38 @@ Possible values are: * `integer` -- arbitrary integer values for identifying nodes. * `actual` -- (advanced) actual node IDs. 
|string +| {check-mark} +| {check-mark} |--ignore-empty-strings[=true\|false] |Whether or not empty string fields, i.e. "" from input source are ignored, i.e. treated as null. |false +| {check-mark} +| {check-mark} -|--ignore-extra-columns[=true\|false]footnote:ingnoredByParquet1[] +|--ignore-extra-columns[=true\|false] |If unspecified columns should be ignored during the import. |false +| {check-mark} +| -|--input-encoding=footnote:ingnoredByParquet1[] +|--input-encoding= |Character set that input data is encoded in. |UTF-8 +| {check-mark} +| |--input-type=csv\|parquet |File type to import from. Can be csv or parquet. Defaults to csv. | +| {check-mark} +| {check-mark} |--legacy-style-quoting[=true\|false] |Whether or not a backslash-escaped quote e.g. \" is interpreted as an inner quote. |false +| {check-mark} +| {check-mark} |--max-off-heap-memory= |Maximum memory that `neo4j-admin` can use for various data structures and caching to improve performance. @@ -239,14 +271,20 @@ Possible values are: Values can be plain numbers, such as `10000000`, or `20G` for 20 gigabytes. It can also be specified as a percentage of the available memory, for example `70%`. |90% +| {check-mark} +| {check-mark} -|--multiline-fields=true\|false\|[,]footnote:ingnoredByParquet1[] +|--multiline-fields=true\|false\|[,] |label:changed[Changed in 5.26] In v1, whether or not fields from an input source can span multiple lines, i.e. contain newline characters. Setting `--multiline-fields=true` can severely degrade the performance of the importer. Therefore, use it with care, especially with large imports. In v2, this option will specify the list of files that contain multiline fields. Files can also be specified using regular expressions. |null +| {check-mark} +| -|--multiline-fields-format=v1\|v2footnote:ingnoredByParquet1[] +|--multiline-fields-format=v1\|v2 |Controls the parsing of input source that can span multiple lines, i.e. contain newline characters. 
When set to v1, the value for `--multiline-fields` can only be true or false. When set to v2, the value for `--multiline-fields` should be the list of files that contain multiline fields. |null +| {check-mark} +| |--nodes=[