Commit 25b8b2e

Safe process for recreating database after a incremental import (#2704)
Fixes CONTROL-351 --------- Co-authored-by: Natalia Ivakina <[email protected]>
1 parent 5a74a0e commit 25b8b2e

File tree

1 file changed

+18
-8
lines changed

modules/ROOT/pages/import.adoc

Lines changed: 18 additions & 8 deletions
@@ -12,7 +12,16 @@ You should use this tool when:
 
 * Import performance is important because you have a large amount of data (millions/billions of entities).
 * The database can be taken offline and you have direct access to one of the servers hosting your Neo4j DBMS.
-* The database is either empty or its content is unchanged since a previous incremental import.
+* The database is non-existent or empty and you need to perform the initial data load.
+* You need to update your graph with large amount of data.
+In this case, importing data incrementally can be more performant than transactional insertion.
++
+[NOTE]
+====
+The incremental import can be done either within a single command or in stages.
+For details, see <<_incremental_import_in_a_single_command>> and <<incremental-import-stages>>.
+====
++
 * The CSV data is clean/fault-free (nodes are not duplicated and relationships' start and end nodes exist).
 This tool can handle data faults but performance is not optimized.
 If your data has a lot of faults, it is recommended to clean it using a dedicated tool before import.
@@ -686,16 +695,17 @@ Incremental import into an existing database.
 
 === Usage and limitations
 
-[WARNING]
-====
 The importer works well on standalone servers.
 
-In clustering environments with multiple copies of the database, the updated database must be used as a source to reseed the rest of the database copies.
-You can use the procedure xref:procedures.adoc#procedure_dbms_recreateDatabase[`dbms.recreateDatabase()`].
-For details, see xref:database-administration/standard-databases/recreate-database.adoc[Recreate databases].
+To safely perform an incremental import in a clustered environment, follow these steps:
 
-Starting the clustered database after an incremental import without reseeding or performing the incremental import on a single server while the database remains online on other clustered members may result in unpredictable consequences, including data inconsistency between cluster members.
-====
+. Run the incremental import command on a single server in the cluster.
+This server can then be used as the xref:clustering/databases.adoc#cluster-designated-seeder[designated seeder] from which other cluster members can copy the database.
+. Reconfigure the database topology to a single primary by running the xref:procedures.adoc#procedure_dbms_recreateDatabase[`dbms.recreateDatabase()`] procedure.
+. Then stop the database using xref::database-administration/standard-databases/start-stop-databases.adoc#manage-databases-stop[STOP DATABASE].
+. Perform the incremental import on the server that hosts the database.
+. Then start the database with xref::database-administration/standard-databases/start-stop-databases.adoc#manage-databases-start[START DATABASE].
+. Lastly, restore the desired database topology using xref::database-administration/standard-databases/alter-databases.adoc#[ALTER DATABASE].
 
 The incremental import command can be used to add:
 
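The clustered procedure added in the second hunk can be read as the following rough Cypher sketch. It is illustrative only and not part of the commit: the database name `mydb` and the original topology of three primaries are hypothetical, and the diff gives no arguments for `dbms.recreateDatabase()` or for the import command, so both are left as comments rather than guessed.

```cypher
// Hypothetical database name `mydb`; the 3-primary topology is also an assumption.

// Step 2: reduce the topology to a single primary with dbms.recreateDatabase().
// The diff does not show the arguments, so the call is intentionally omitted here;
// consult the linked procedure reference for the exact signature.

// Step 3: stop the database so that its store is offline during the import.
STOP DATABASE mydb;

// Step 4: run the incremental import from a shell on the server hosting the database,
// for example with `neo4j-admin database import incremental` (options elided).

// Step 5: bring the database back online on that server.
START DATABASE mydb;

// Step 6: restore the desired topology so the other members copy the updated store.
ALTER DATABASE mydb SET TOPOLOGY 3 PRIMARIES;
```

The ordering is the point of the sketch: the database is reduced to one copy and stopped before the import touches it, and the wider topology is restored only after the updated database has started.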