UPDATE { logins: OLD.logins + 1 } IN users
RETURN { doc: NEW, type: OLD ? 'update' : 'insert' }
```

## Transactionality and Limitations

- On a single server, upserts are generally executed transactionally in an
  all-or-nothing fashion.

  For sharded collections in cluster deployments, the entire query and/or upsert
  operation may not be transactional, especially if it involves different shards,
  DB-Servers, or both.

- Queries may execute intermediate transaction commits if the running
  transaction (AQL query) hits the specified size thresholds. An intermediate
  commit writes the data that has been modified so far, and this data is not
  rolled back in case of a later abort/rollback of the transaction.

  Such **intermediate commits** can occur for `UPSERT` operations over all
  documents of a large collection, for instance. This has the side effect that
  the atomicity of such an operation can no longer be guaranteed, and ArangoDB
  cannot guarantee that "read your own writes" semantics work for upserts.

  This is only an issue if you write a query whose search condition would hit
  the same document multiple times, and only if you have large transactions.
  You can adjust the behavior of the RocksDB storage engine by increasing the
  `intermediateCommit` thresholds for data size and operation counts.

- The lookup and the insert/update/replace parts are executed one after
  another, so that other operations in other threads can happen in
  between. This means if multiple `UPSERT` queries run concurrently, they
  may all determine that the target document does not exist and then
  create it multiple times!

  Note that due to this gap between the lookup and insert/update/replace,
  even with a unique index, duplicate key errors or conflicts can occur.
  But if they occur, the application/client code can execute the same query
  again.

  To prevent this from happening, you should add a unique index to the lookup
  attribute(s). Note that in the cluster a unique index can only be created if
  it is equal to the shard key attribute of the collection or at least contains
  it as a part.
- An alternative to making an `UPSERT` statement work atomically is to use the
  `exclusive` option to limit write concurrency for this collection to 1, which
  helps avoid conflicts but is bad for throughput! A sketch of this option
  follows after this list.

- `UPSERT` operations do not observe their own writes correctly in cluster
  deployments. They only do so for OneShard databases with the
  `cluster-one-shard` optimizer rule active.

  If upserts in a query create new documents and would then semantically hit the
  same documents again, the operation may incorrectly use the `INSERT` branch to
  create more documents instead of the `UPDATE`/`REPLACE` branch to update the
  previously created documents. A sketch of this caveat follows after this list.

  If upserts find existing documents to update/replace, you can access the
  current document via the `OLD` pseudo-variable, but it may hold the initial
  version of the document from before the query, even if the document has been
  modified by `UPSERT` in the meantime.

- The lookup attribute(s) from the search expression should be indexed in order
  to improve `UPSERT` performance. Ideally, the search expression contains the
  shard key, as this allows the lookup to be restricted to a single shard
  (see the last sketch after this list).
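
Below is a minimal sketch of the `exclusive` option mentioned above. It reuses
the `users` collection and `logins` attribute from the earlier example; the
`name` lookup attribute is only assumed for illustration:

```aql
// Request an exclusive lock on the collection for this query, so that
// concurrent UPSERTs cannot both take the INSERT branch for the same
// lookup value. This avoids conflicts at the cost of write throughput.
UPSERT { name: "jane" }
INSERT { name: "jane", logins: 1 }
UPDATE { logins: OLD.logins + 1 } IN users
OPTIONS { exclusive: true }
```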
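
The following sketch shows how a search condition can hit the same, not yet
existing document more than once within a single query. On a single server this
typically inserts once and then updates; in a cluster deployment (without the
OneShard optimization), the second iteration may not observe the first
iteration's write. The `name` attribute is again just an assumption:

```aql
// First iteration: no match, so the INSERT branch creates the document.
// Second iteration: should take the UPDATE branch, but in a cluster it may
// not see the freshly inserted document and insert a second one instead.
// Likewise, OLD may still reflect the state from before the query started.
FOR i IN 1..2
  UPSERT { name: "jane" }
  INSERT { name: "jane", logins: 1 }
  UPDATE { logins: OLD.logins + 1 } IN users
```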
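
Finally, a sketch of a search expression that contains the shard key, so that
the lookup can be restricted to a single shard. This assumes, purely for
illustration, that the `users` collection is sharded by a `country` attribute:

```aql
// The search expression includes the assumed shard key `country`, so the
// lookup only needs to consider the single shard responsible for "ie".
UPSERT { country: "ie", name: "jane" }
INSERT { country: "ie", name: "jane", logins: 1 }
UPDATE { logins: OLD.logins + 1 } IN users
```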