Commit 6ec0532

Add concurrency in GenAI example
1 parent 8cc97e5 commit 6ec0532

1 file changed, +4 -2 lines changed

modules/ROOT/pages/genai-integrations.adoc

Lines changed: 4 additions & 2 deletions
@@ -183,25 +183,27 @@ MATCH (m:Movie WHERE m.plot IS NOT NULL)
 WITH collect(m) AS moviesList, // <1>
 count(*) AS total,
 100 AS batchSize // <2>
-UNWIND range(0, total, batchSize) AS batchStart // <3>
+UNWIND range(0, total-1, batchSize) AS batchStart // <3>
 CALL (moviesList, batchStart, batchSize) { // <4>
 WITH [movie IN moviesList[batchStart .. batchStart + batchSize] | movie.title || ': ' || movie.plot] AS batch // <5>
 CALL genai.vector.encodeBatch(batch, 'OpenAI', { token: $token }) YIELD index, vector
 CALL db.create.setNodeVectorProperty(moviesList[batchStart + index], 'embedding', vector) // <6>
-} IN TRANSACTIONS OF 1 ROW <7>
+} IN CONCURRENT TRANSACTIONS OF 1 ROW <7>
 ----

 <1> xref:functions/aggregating.adoc#functions-collect[Collect] all returned `Movie` nodes into a `LIST<NODE>`.
 <2> `batchSize` defines the number of nodes in `moviesList` to be processed at once.
 Because vector embeddings can be very large, a larger batch size may require significantly more memory on the Neo4j server.
 Too large a batch size may also exceed the provider's threshold.
 <3> Process `Movie` nodes in increments of `batchSize`.
+The end range `total-1` is due to `range` being inclusive on both ends.
 <4> A xref:subqueries/subqueries-in-transactions.adoc[`CALL` subquery] executes a separate transaction for each batch.
 Note that this `CALL` subquery uses a xref:subqueries/call-subquery.adoc#variable-scope-clause[variable scope clause] (introduced in Neo4j 5.23) to import variables.
 If you are using an older version of Neo4j, use an xref:subqueries/call-subquery.adoc#importing-with[importing `WITH` clause] instead.
 <5> `batch` is a list of strings, each being the concatenation of `title` and `plot` of one movie.
 <6> The procedure sets `vector` as value for the property named `embedding` for the node at position `batchStart + index` in the `moviesList`.
 <7> Set to `1` the amount of batches to be processed at once.
+Concurrency in transactions was introduced in Cypher 5.21 (see xref:subqueries/subqueries-in-transactions.adoc#concurrent-transactions[`CALL` subqueries -> Concurrent transactions]).

 [NOTE]
 This example may not scale to larger datasets, as `collect(m)` requires the whole result set to be loaded in memory.
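
A note on the `range` change in this commit: Cypher's `range(start, end, step)` includes both endpoints, so when `total` is an exact multiple of `batchSize`, the old bound `total` yields one extra batch start whose slice of `moviesList` is empty. A minimal sketch, assuming 200 matching movies and a batch size of 100 (the count of 200 is chosen here for illustration and is not taken from the commit):

```cypher
// Assumed example values: 200 movies, processed in batches of 100.
WITH 200 AS total, 100 AS batchSize
RETURN
  // Old bound: includes 200, and moviesList[200 .. 300] is an empty list.
  range(0, total, batchSize)     AS oldBatchStarts,  // [0, 100, 200]
  // New bound: stops at the last valid list index, 199.
  range(0, total - 1, batchSize) AS newBatchStarts   // [0, 100]
```

When `total` is not a multiple of `batchSize`, both forms return the same list, so the change only removes the trailing empty batch.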
