JPryce-Aklundh (Collaborator) commented Mar 3, 2025

Note: This feature is no longer earmarked for Cypher 25, but will be part of Cypher 5 in the 2025.03 release.

JPryce-Aklundh marked this pull request as ready for review March 4, 2025 09:49

neo-technology-commit-status-publisher (Collaborator) commented Mar 6, 2025

Thanks for the documentation updates.

The preview documentation has now been torn down - reopening this PR will republish it.

henriknyman (Contributor) left a comment

Nice work!
Most of my comments relate to the fact that the default fallback behaviour is actually ON ERROR RETRY THEN FAIL.

=== `ON ERROR RETRY`

`ON ERROR RETRY` uses an exponential delay between retry attempts for transaction batches that fail due to transient errors (i.e. errors where retrying a transaction can be expected to give a different result), with an optional xref:subqueries/subqueries-in-transactions.adoc#specify-retry-duration[maximum retry duration].
If the transaction still fails after the maximum duration, the failure is handled according to an optionally specified xref:subqueries/subqueries-in-transactions.adoc#fallback-error-handling[fallback error handling mode] (`THEN CONTINUE` (default), `THEN BREAK`, `THEN FAIL`).

Contributor

The default is THEN FAIL.

`ON ERROR RETRY` increases query robustness by handling transient errors without manual intervention.
It is particularly suitable for xref:subqueries/subqueries-in-transactions.adoc#concurrent-transactions[concurrent transactions], reducing the likelihood of xref:subqueries/subqueries-in-transactions.adoc#deadlocks[deadlocks].

Contributor

To be precise, it doesn't reduce the likelihood of deadlocks, but makes the query not fail because of deadlocks.

The below example demonstrates a basic retry scenario.
If a transient error occurs during the creation of a `User` node, the transaction will be retried for the default maximum retry duration (`30` seconds).
If the retry succeeds, the query continues. If the retry fails after the default duration, it behaves like xref:subqueries/subqueries-in-transactions#on-error-continue[`ON ERROR CONTINUE`] (the default fallback).

Contributor

..., the query fails because it behaves like ON ERROR FAIL (the default fallback)
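
For reference, a minimal sketch of the basic retry scenario with the fallback written out, assuming the behaviour described in this review (a bare `ON ERROR RETRY` falls back to `THEN FAIL`). The `range(1, 100)` input and the `id` property are placeholders, not taken from the PR.

```cypher
// Placeholder input: 100 integers standing in for real user data.
UNWIND range(1, 100) AS i
CALL (i) {
  CREATE (:User {id: i})
} IN TRANSACTIONS OF 10 ROWS
  // No FOR clause, so the default maximum retry duration applies
  // (30 seconds according to the text under review).
  // THEN FAIL is spelled out for clarity; per the review it is also
  // the fallback used when no THEN clause is given.
  ON ERROR RETRY THEN FAIL
```

If a batch still fails once the retry duration has elapsed, the whole query fails, as with `ON ERROR FAIL`.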

See xref:subqueries/subqueries-in-transactions.adoc#on-error-continue[`ON ERROR CONTINUE`] for more information about this behavior.

[NOTE]
Because `THEN CONTINUE` is the default fallback option it does not have to be specified.

Contributor

THEN FAIL is the default fallback option

* `ON ERROR RETRY ... THEN BREAK`: the query will ignore recoverable errors and stop the execution of subsequent inner transactions.
The out transaction succeeds, and `null` will be returned for the failed inner transaction and all subsequent ones.

Contributor

Is "The out transaction succeeds" an incomplete sentence or meant to be "The outer transaction succeeds"?

Collaborator Author

The outer transaction succeeds, and ...

See xref:subqueries/subqueries-in-transactions.adoc#on-error-break[`ON ERROR BREAK`] for more information about this behavior.

* `ON ERROR RETRY ... THEN FAIL`: the query will acknowledge a recoverable error and stop the execution of subsequent inner transactions, causing the outer transaction to fail.
all subsequent ones.

Contributor

"all subsequent ones."
Incomplete sentence or leftover?

Collaborator Author

leftover :/

It returns the `transactionID`, `commitStatus` and `errorMessage` of the failed transactions.
.Query using `ON ERROR CONTINUE` to ignore deadlocks and complete outer transaction
.Query using `ON ERROR RETRY` to ignore deadlocks and complete outer transaction

Contributor

"to ignore deadlocks" -> "to retry deadlocked inner transactions"

Collaborator Author

But this query does not use ON ERROR RETRY - the following one does.

To retry any failed inner transactions, use the error option `ON ERROR RETRY`, which retries any failing transactions until the maximum retry duration has been reached.
The following query uses `ON ERROR RETRY` to retry the above query for a maximum of `3` seconds.
Note that `ON ERROR RETRY` by default falls back to the error option `THEN CONTINUE`, which ensures that any deadlocks are bypassed and that subsequent inner transactions are executed.

Contributor

The default fallback is ON ERROR FAIL.
To be precise, deadlocks are not bypassed. Among the conflicting transactions (say there are two), one (A) is selected to fail with a deadlock detected exception, while the other (B), typically the one holding the most locks, is allowed to continue. B may then acquire the locks held by A and eventually commit. When A is retried after B has committed, it may also succeed in committing.

It may be useful to add a note or warning here that deadlock detection and retries are time-consuming. If the import data contains a significant number of relationships to be merged between the same nodes but occurring in different batches, increasing the concurrency is not necessarily beneficial and could instead slow down the overall import.

Collaborator Author

Have added a note just before the example box

MERGE (m:Movie {movieId: row.movieId})
MERGE (y:Year {year: row.year})
MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS ON ERROR RETRY FOR 3 SECONDS REPORT STATUS AS status

henriknyman (Contributor), Mar 17, 2025


You need to explicitly write ON ERROR RETRY THEN CONTINUE or otherwise REPORT STATUS will fail with a syntax error as the default is ON ERROR FAIL.
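
A sketch of the query with the fallback made explicit, as suggested above. It assumes the fragment under review is the body of a `CALL (row) { ... }` subquery; the `UNWIND $rows AS row` line and the `$rows` parameter are hypothetical stand-ins for the input clause, which is not shown in this thread.

```cypher
// Hypothetical row source; the PR's actual input clause is not shown here.
UNWIND $rows AS row
CALL (row) {
  MERGE (m:Movie {movieId: row.movieId})
  MERGE (y:Year {year: row.year})
  MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS
  // THEN CONTINUE is written out because REPORT STATUS is not
  // accepted together with the default THEN FAIL fallback.
  ON ERROR RETRY FOR 3 SECONDS THEN CONTINUE
  REPORT STATUS AS status
RETURN status.transactionId AS transactionId,
       status.committed AS committed,
       status.errorMessage AS errorMessage
```

With `THEN CONTINUE`, a batch that still fails after 3 seconds of retries is reported in `status` and the remaining batches are still executed.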

----

| New error handling option for `CALL { ... } IN TRANSACTIONS`: `ON ERROR RETRY`.
This option applies an exponential delay between retries for transaction batches failing due to transient errors, with an optional maximum retry duration, and handles failure based on a specified fallback error mode if the transaction does not succeed within the given time.

Contributor

The retry logic is currently exponential backoff with some random jitter, but I am not sure if we need to go into that level of detail anywhere (maybe compare with driver docs?)

Contributor

I think we can skip this. The driver docs don't go into details either.

recrwplay (Contributor)

This PR includes documentation updates
View the updated docs at https://neo4j-docs-cypher-1203.surge.sh

JPryce-Aklundh merged commit 3cfdb1b into neo4j:cypher-25 Mar 19, 2025
4 checks passed
JPryce-Aklundh deleted the on_error_retry branch March 19, 2025 07:59
JPryce-Aklundh added a commit to JPryce-Aklundh/docs-cypher that referenced this pull request Mar 19, 2025
Note: This feature is no longer earmarked for Cypher 25, but will be
part of Cypher 5 in the 2025.03 release.
4 participants