Commit 05c6286

Improve docs around lost updates (#2386) (#2412)
1 parent 6574ee5 commit 05c6286

1 file changed

+46
-36
lines changed

modules/ROOT/pages/database-internals/concurrent-data-access.adoc

Lines changed: 46 additions & 36 deletions
@@ -25,41 +25,51 @@ All the anomalies listed here can only occur with the read-committed isolation l
2525
In Cypher, it is possible to acquire write locks to simulate improved isolation in some cases.
2626
Consider the case where multiple concurrent Cypher queries increment the value of a property.
2727
Due to the limitations of the _read-committed isolation level_, the increments might not result in a deterministic final value.
28-
If there is a direct dependency, Cypher automatically acquires a write lock before reading.
29-
A direct dependency is when the right-hand side of `SET` has a dependent property read in the expression or the value of a key-value pair in a literal map.
3028

31-
For example, if you run the following query by one hundred concurrent clients, it is very likely not to increment the property `n.prop` to 100, unless a write lock is acquired before reading the property value.
32-
This is because all queries read the value of `n.prop` within their own transaction, and cannot see the incremented value from any other transaction that has not yet been committed.
33-
In the worst-case scenario, the final value would be as low as 1 if all threads perform the read before any has committed their transaction.
29+
Cypher automatically acquires write locks in some cases, but not in others.
30+
When a Cypher query uses the `SET` clause to update a property, it may or may not acquire a write lock on the node or relationship being updated, depending on whether there is a direct dependency on the property being read.
3431

35-
.Cypher can acquire a write lock
36-
====
37-
The following example requires a write lock, and Cypher automatically acquires one:
32+
==== Acquiring a write lock automatically
3833

34+
When a Cypher query has a direct dependency on the property being read, Cypher automatically acquires a write lock before reading the property.
35+
This is the case when the query uses the `SET` clause to update a property on a node or relationship, and the right-hand side of the `SET` clause has a dependency on the property being read.
36+
For example, in the following queries, the right-hand side of `SET` has a dependent property read in an expression or a value of a key-value pair in a literal map.
37+
38+
.Incrementing a property using an expression
39+
====
3940
[source, cypher, role="noheader"]
4041
----
4142
MATCH (n:Example {id: 42})
4243
SET n.prop = n.prop + 1
4344
----
45+
This query increments the property `n.prop` by 1.
46+
In this case, Cypher automatically acquires a write lock on the node `n` before reading the value of `n.prop`.
47+
This ensures that no other concurrent queries can modify the node `n` while this query is running, thus preventing lost updates.
4448
====
4549

46-
.Cypher can acquire a write lock
50+
.Incrementing a property using a map literal
4751
====
48-
This example also requires a write lock, and Cypher automatically acquires one:
49-
5052
[source, cypher, role="noheader"]
5153
----
5254
MATCH (n)
5355
SET n += {prop: n.prop + 1}
5456
----
57+
58+
This query also increments the property `n.prop` by 1, but it does so using a map literal.
59+
In this case, Cypher also acquires a write lock on the node `n` before reading the value of `n.prop`.
5560
====
5661

57-
Due to the complexity of determining such a dependency in the general case, Cypher does not cover any of the following example cases:
62+
==== No direct dependency to acquire a write lock
5863

59-
.Complex Cypher
60-
====
61-
Variable depending on results from reading the property in an earlier statement:
64+
When a query does not have a direct dependency on the property being read, Cypher does not automatically acquire a write lock.
65+
This means that if you run multiple concurrent queries that read and write the same property, updates can be lost, because other concurrent queries may modify the property value at the same time.
6266

67+
For example, if you run the following queries by one hundred concurrent clients, it is very likely not to increment the property `n.prop` to 100, unless a write lock is acquired before reading the property value.
68+
This is because all queries read the value of `n.prop` within their own transaction, and cannot see the incremented value from any other transaction that has not yet been committed.
69+
In the worst-case scenario, the final value would be as low as 1 if all threads perform the read before any has committed their transaction.
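
The worst-case scenario described above can be modelled outside Neo4j. The following is a minimal Python sketch, not Neo4j code: the dictionary `node`, the functions `run_without_lock` and `run_with_lock`, and the use of `threading.Lock` as a stand-in for a write lock are all assumptions made for illustration. A barrier forces every client to read before any client writes, which reproduces the final value of 1. Taking the lock before the read serializes the clients and gives 100.

```python
import threading


def run_without_lock(n_clients):
    """Every client reads the property before any client writes it back."""
    node = {"prop": 0}  # hypothetical stand-in for a node with property `prop`
    barrier = threading.Barrier(n_clients)

    def client():
        seen = node["prop"]      # read: no other client's write is visible yet
        barrier.wait()           # force the worst case: all reads happen first
        node["prop"] = seen + 1  # write based on the stale value

    threads = [threading.Thread(target=client) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return node["prop"]


def run_with_lock(n_clients):
    """Each client takes a lock before reading, so read and write are serialized."""
    node = {"prop": 0}
    write_lock = threading.Lock()  # models the write lock taken before the read

    def client():
        with write_lock:
            node["prop"] = node["prop"] + 1

    threads = [threading.Thread(target=client) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return node["prop"]


if __name__ == "__main__":
    print("without lock:", run_without_lock(100))
    print("with lock:", run_with_lock(100))
```

The barrier is only there to make the worst case deterministic. In practice the interleaving varies from run to run, so the final value without a lock can fall anywhere between 1 and 100.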
70+
71+
.Variable depending on results from reading the property in an earlier statement
72+
====
6373
[source, cypher, role="noheader"]
6474
----
6575
MATCH (n)
@@ -69,47 +79,45 @@ SET n.prop = k + 1
6979
----
7080
====
7181

72-
.Complex Cypher
82+
.Circular dependency between properties read and written in the same query
7383
====
74-
Circular dependency between properties read and written in the same query:
75-
7684
[source, cypher, role="noheader"]
7785
----
7886
MATCH (n)
7987
SET n += {propA: n.propB + 1, propB: n.propA + 1}
8088
----
8189
====
8290

91+
Workaround::
8392
To ensure deterministic behavior also in the more complex cases, it is necessary to explicitly acquire a write lock on the node in question.
8493
In Cypher there is no explicit support for this, but it is possible to work around this limitation by writing to a temporary property.
85-
86-
.Explicitly acquire a write lock
94+
For example, the following query acquires a write lock for the node by writing to a *dummy* property (`n._dummy_`) before reading the requested value (`n.prop`).
95+
When acquired, the write lock ensures that no other concurrent queries can modify the node until the transaction is committed or rolled back.
96+
The dummy property is used only to acquire the write lock, so it can be removed immediately after the lock is acquired.
97+
+
98+
.Dummy property to acquire a write lock
8799
====
88-
This example acquires a write lock for the node by writing to a dummy property before reading the requested value:
89-
90100
[source, cypher, role="noheader"]
91101
----
92102
MATCH (n:Example {id: 42})
93-
SET n._LOCK_ = true
103+
SET n._dummy_ = true
104+
REMOVE n._dummy_
94105
WITH n.prop AS p
95106
// ... operations depending on p, producing k
96107
SET n.prop = k + 1
97-
REMOVE n._LOCK_
98108
----
99109
====
100110

101-
The existence of the `+SET n._LOCK_+` statement before the read of the `n.prop` read ensures the lock is acquired before the read action, and no updates are lost due to enforced serialization of all concurrent queries on that specific node.
102-
103111
=== Non-repeatable reads
104112

105113
A non-repeatable read is when the same transaction reads the same data but gets inconsistent results.
106114
This can easily happen if reading the same data twice in a query and the data gets modified in-between by another concurrent query.
107115

108-
.Non-repeatable read
109-
====
110-
The following example query shows that reading the same property twice can give inconsistent results.
116+
For example, the following query shows that reading the same property twice can give inconsistent results.
111117
If there are other queries running concurrently, it is not guaranteed that `p1` and `p2` have the same value.
112118

119+
.Non-repeatable read
120+
====
113121
[source, cypher, role="noheader"]
114122
----
115123
MATCH (n:Example {id: 42})
@@ -132,17 +140,17 @@ Similarly, the entity may not appear at all if the property is changed to a prev
132140

133141
This anomaly can only occur with operators that scan an index, or parts of an index, for example link:{neo4j-docs-base-uri}/cypher-manual/{page-version}/planning-and-tuning/operators/operators-detail/#query-plan-node-index-scan[`NodeIndexScan`] or link:{neo4j-docs-base-uri}/cypher-manual/{page-version}/planning-and-tuning/operators/operators-detail/#query-plan-directed-relationship-index-seek-by-range[`DirectedRelationshipIndexSeekByRange`].
134142

135-
.Missing and double read
136-
====
137143
In the following query, each node `n` that has the property `prop` is expected to appear exactly once.
138144
However, concurrent updates that modify the `prop` property during index scanning may cause a node to appear multiple times or not at all in the result set.
145+
146+
.Missing and double read
147+
====
139148
[source, cypher, role="noheader"]
140149
----
141150
MATCH (n:Example) WHERE n.prop IS NOT NULL
142151
RETURN n
143152
----
144153
====
145-
146154
== Locks
147155

148156
When a write transaction occurs, Neo4j takes locks to preserve data consistency while updating.
@@ -279,15 +287,13 @@ Setting `db.lock.acquisition.timeout` to `0` -- which is the default value -- di
279287

280288
This feature cannot be set dynamically.
281289

282-
.Configure lock acquisition timeout
290+
.Set the timeout to ten seconds
283291
====
284-
Set the timeout to ten seconds.
285292
[source, parameters]
286293
----
287294
db.lock.acquisition.timeout=10s
288295
----
289296
====
290-
291297
[[deadlocks]]
292298
== Deadlocks
293299

@@ -319,6 +325,7 @@ Other code that requires synchronization should be synchronized in such a way th
319325
For example, running the following two queries in https://neo4j.com/docs/operations-manual/current/tools/cypher-shell/[Cypher-shell] at the same time will result in a deadlock because they are attempting to modify the same node properties concurrently:
320326

321327
.Transaction A
328+
====
322329
[source, cypher, indent=0, role=nocopy noplay]
323330
----
324331
:begin
@@ -327,8 +334,9 @@ WITH collect(n) as nodes
327334
CALL apoc.util.sleep(5000)
328335
MATCH (m:Test2) SET m.prop = 1;
329336
----
330-
337+
====
331338
.Transaction B
339+
====
332340
[source, cypher, indent=0, role=nocopy noplay]
333341
----
334342
:begin
@@ -347,6 +355,8 @@ The transaction will be rolled back and terminated. Error: ForsetiClient[transac
347355
Client[6697] waits for [ForsetiClient[transactionId=6698, clientId=1]]]
348356
----
349357
358+
====
359+
350360
[NOTE]
351361
====
352362
The Cypher clause `MERGE` takes locks out of order to ensure the uniqueness of the data, and this may prevent Neo4j's internal sorting operations from ordering transactions in a way that avoids deadlocks.
