Commit 8cba570

Add concurrency in GenAI example
1 parent d87a110 commit 8cba570

1 file changed: +4 -2 lines changed

modules/ROOT/pages/genai-integrations.adoc

Lines changed: 4 additions & 2 deletions
@@ -180,24 +180,26 @@ MATCH (m:Movie WHERE m.plot IS NOT NULL)
 WITH collect(m) AS moviesList, // <1>
 count(*) AS total,
 100 AS batchSize // <2>
-UNWIND range(0, total, batchSize) AS batchStart // <3>
+UNWIND range(0, total-1, batchSize) AS batchStart // <3>
 CALL (moviesList, batchStart, batchSize) { // <4>
 WITH [movie IN moviesList[batchStart .. batchStart + batchSize] | movie.title || ': ' || movie.plot] AS batch // <5>
 CALL genai.vector.encodeBatch(batch, 'OpenAI', { token: $token }) YIELD index, vector
 CALL db.create.setNodeVectorProperty(moviesList[batchStart + index], 'embedding', vector) // <6>
-} IN TRANSACTIONS OF 1 ROW <7>
+} IN CONCURRENT TRANSACTIONS OF 1 ROW <7>
 ----

 <1> xref:functions/aggregating.adoc#functions-collect[Collect] all returned `Movie` nodes into a `LIST<NODE>`.
 <2> `batchSize` defines the number of nodes in `moviesList` to be processed at once.
 Because vector embeddings can be very large, a larger batch size may require significantly more memory on the Neo4j server.
 Too large a batch size may also exceed the provider's threshold.
 <3> Process `Movie` nodes in increments of `batchSize`.
+The end range `total-1` is due to `range` being inclusive on both ends.
 <4> A xref:subqueries/subqueries-in-transactions.adoc[`CALL` subquery] executes a separate transaction for each batch.
 Note that this `CALL` subquery uses a xref:subqueries/call-subquery.adoc#variable-scope-clause[variable scope clause].
 <5> `batch` is a list of strings, each being the concatenation of `title` and `plot` of one movie.
 <6> The procedure sets `vector` as value for the property named `embedding` for the node at position `batchStart + index` in the `moviesList`.
 <7> Set to `1` the amount of batches to be processed at once.
+For more information on concurrency in transactions, see xref:subqueries/subqueries-in-transactions.adoc#concurrent-transactions[`CALL` subqueries -> Concurrent transactions]).

 [NOTE]
 This example may not scale to larger datasets, as `collect(m)` requires the whole result set to be loaded in memory.
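
The change from `range(0, total, batchSize)` to `range(0, total-1, batchSize)` only matters when `total` is an exact multiple of `batchSize` (or zero). The sketch below is not part of the commit. It imitates the behaviour in Python, assuming that Cypher's `range()` includes its end value and that a list slice `[a .. b]` excludes `b`. The helper names `cypher_range` and `batch_starts` are hypothetical and exist only for this illustration.

```python
def cypher_range(start, end, step):
    """Imitate Cypher's range(): `end` is included when a step lands on it."""
    return list(range(start, end + 1, step))


def batch_starts(total, batch_size, inclusive_fix):
    """Batch start offsets, before (False) and after (True) the commit's fix."""
    end = total - 1 if inclusive_fix else total
    return cypher_range(0, end, batch_size)


# Hypothetical data set: 200 items, an exact multiple of the batch size.
movies = [f"movie-{i}" for i in range(200)]
batch_size = 100

for fixed in (False, True):
    starts = batch_starts(len(movies), batch_size, fixed)
    # Python slices are end-exclusive, like moviesList[batchStart .. batchStart + batchSize].
    sizes = [len(movies[s:s + batch_size]) for s in starts]
    print(fixed, starts, sizes)
```

Under these assumptions the run prints `False [0, 100, 200] [100, 100, 0]` and then `True [0, 100] [100, 100]`: the old end value yields a trailing empty batch, which would be passed to `genai.vector.encodeBatch` for no benefit, while `total-1` does not. For a total such as 250 both variants give the same offsets, `[0, 100, 200]`.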
