| 1 | +# Neo4j Plugin for Nodestream |
| 2 | + |
| 3 | +This plugin provides a [Nodestream](https://github.com/nodestream-proj/nodestream) interface to Neo4j. |
| 4 | + |
| 5 | +**THIS PLUGIN IS IN BETA. USE AT YOUR OWN RISK.** |
| 6 | + |
| 7 | +## Installation |
| 8 | + |
| 9 | +```bash |
| 10 | +pip install nodestream-plugin-neo4j |
| 11 | +``` |
| 12 | + |
| 13 | +## Usage |
| 14 | + |
| 15 | +```yaml |
| 16 | +# nodestream.yaml |
| 17 | +targets: |
| 18 | +  neo4j: |
| 19 | +    database: neo4j |
| 20 | +    uri: bolt://localhost:7687 |
| 21 | +    username: neo4j |
| 22 | +    password: neo4j |
| 23 | +    database_name: neo4j # optional; name of the database to use. |
| 24 | +    use_enterprise_features: false # optional; use enterprise features (e.g. node key constraints) |
| 25 | +``` |
| 26 | + |
| 27 | +### Extractor |
| 28 | + |
| 29 | +The `Neo4jExtractor` class represents an extractor that reads records from a Neo4j database. It takes a single Cypher query as |
| 30 | +input and yields the records read from the database. The extractor automatically paginates through the results until the query returns no more records. Therefore, the query must include a `SKIP $offset` and a `LIMIT $limit` clause. For example: |
| 31 | + |
| 32 | +```yaml |
| 33 | +- implementation: nodestream_plugin_neo4j.extractor:Neo4jExtractor |
| 34 | +  arguments: |
| 35 | +    query: MATCH (p:Person) WHERE p.name = $name RETURN p.name SKIP $offset LIMIT $limit |
| 36 | +    uri: bolt://localhost:7687 |
| 37 | +    username: neo4j |
| 38 | +    password: neo4j |
| 39 | +    database_name: my_database # Optional; defaults to neo4j |
| 40 | +    limit: 100000 # Optional; defaults to 100 |
| 41 | +    parameters: |
| 42 | +      # Optional; defaults to {} |
| 43 | +      # Any parameters to be passed to the query |
| 44 | +      # For example, if you want to pass a parameter called "name" with the value "John Doe", you would do this: |
| 45 | +      name: John Doe |
| 46 | +``` |
| 47 | + |
| 48 | +The extractor automatically supplies the `offset` and `limit` parameters referenced by the query's `SKIP` and `LIMIT` clauses. It starts with `offset` set to `0` and `limit` set to `100` (unless overridden by setting `limit`), and it continues to paginate through the database until the query returns no results. |
| 49 | + |
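The pagination behaviour described above can be sketched in a few lines of Python. This is an illustrative sketch, not the plugin's actual implementation: `paginate`, `QueryRunner`, and `fake_runner` are hypothetical names, and advancing `offset` by `limit` after each page is an assumption about how the offset is incremented.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Record = Dict[str, Any]
# Hypothetical stand-in for "run this Cypher query with these parameters".
QueryRunner = Callable[[str, Dict[str, Any]], List[Record]]


def paginate(
    run_query: QueryRunner,
    query: str,
    parameters: Optional[Dict[str, Any]] = None,
    limit: int = 100,
) -> Iterator[Record]:
    """Yield records page by page until a page comes back empty."""
    offset = 0
    while True:
        # User-supplied parameters are merged with the pagination parameters.
        page_parameters = {**(parameters or {}), "offset": offset, "limit": limit}
        records = run_query(query, page_parameters)
        if not records:
            return  # An empty page means the end of the results.
        yield from records
        offset += limit  # Assumption: the offset advances by one full page.


def fake_runner(query: str, parameters: Dict[str, Any]) -> List[Record]:
    """In-memory substitute for a database so the sketch runs without Neo4j."""
    people = [{"p.name": f"person-{i}"} for i in range(5)]
    start = parameters["offset"]
    return people[start : start + parameters["limit"]]


if __name__ == "__main__":
    for record in paginate(fake_runner, "MATCH ... SKIP $offset LIMIT $limit", limit=2):
        print(record["p.name"])
```

Because the loop stops only when a page is empty, a query that omits `SKIP $offset LIMIT $limit` would return the same records on every iteration, which is why the query has to reference both parameters.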
| 50 | +## Concepts |
| 51 | + |
| 52 | +### Migrations |
| 53 | + |
| 54 | +The plugin supports migrations. Migrations are used to create indexes and constraints on the database. |
| 55 | + |
| 56 | +As part of the migration process, the plugin will create `__NodestreamMigration__` nodes in the database. |
| 57 | +Each of these nodes has a `name` property that is set to the name of the migration it records. |
| 58 | + |
| 59 | +Additionally, the plugin will create a `__NodestreamMigrationLock__` node in the database. |
| 60 | +This node exists while the migration process is running and is deleted when the migration process is complete. |
| 61 | +This is used to prevent multiple migration processes from running at the same time. |
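
The lock-node idea can be illustrated with a small Python sketch. It is not the plugin's code: an in-memory set of labels stands in for the database, `migration_lock` and `MigrationAlreadyRunning` are hypothetical names, and raising an error when the lock is already held (rather than waiting) is an assumption made for the example.

```python
from contextlib import contextmanager
from typing import Iterator, Set

LOCK_LABEL = "__NodestreamMigrationLock__"


class MigrationAlreadyRunning(RuntimeError):
    """Raised when a lock node is already present (hypothetical behaviour)."""


@contextmanager
def migration_lock(node_labels: Set[str]) -> Iterator[None]:
    """Hold the lock for the duration of the `with` block.

    `node_labels` is an in-memory stand-in for the labels present in the database.
    """
    if LOCK_LABEL in node_labels:
        raise MigrationAlreadyRunning("another migration process holds the lock")
    node_labels.add(LOCK_LABEL)  # Create the lock node.
    try:
        yield
    finally:
        node_labels.discard(LOCK_LABEL)  # Delete the lock node, even on failure.


if __name__ == "__main__":
    database: Set[str] = set()
    with migration_lock(database):
        print(LOCK_LABEL in database)  # The lock exists while migrations run.
    print(LOCK_LABEL in database)  # The lock is gone once they finish.
```

In a real database, the existence check and the creation of the lock node would need to happen atomically; the sketch ignores that concern.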