Commit 66302f0

committed
More edits on the STM posts
1 parent 8e4f0c3 commit 66302f0

2 files changed

+264
-6
lines changed

_drafts/2024-12-25-STM-design.md

Lines changed: 84 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -68,14 +68,13 @@ You should look at the API documentation for relevant functions.
6868

6969
I'm going to ignore some aspects of Refs and transactions such as validators and watchers. And I'm going to ignore agents. Though they are important tools in using Refs, they are incidental to the main logic of the STM implementation.
7070

71-
## Design thinking
71+
## Design thinking - in reverse
7272

73-
I don't want to derive all of MVCC from first principles, but a little thinking through scenarios will help motivate what we are going to see in the code.
73+
I don't want to derive all of MVCC from first principles. In fact, the analysis below is really reverse design; I looked at the code and came up with rationales. So let's get started.
7474

7575
The timeline of Ref values proceeds in discrete steps. At any given point, in the world outside of any ongoing transaction, we see a consistent set of values across all the Refs that have been created. There may be transactions running, but we do not see anything they are up to until they commit. When a transaction commits, it will update the Ref world atomically; from the point of view of anyone outside that transaction, the changes will appear to have happened all at once. We will never see a state where only a subset of the changes to Ref values made by the transaction have been applied.
7676

77-
Internally to transactions and to the Refs themselves, timestamps are used to keep track of the order of events. Timestamps are integers and are monotically increasing with time. Transactions get assigned timestamps when they are created, when they are forced to retry execution after a conflict and when they commit. They are used internally to determine the relative age of transactions.
78-
In addition, the value currently assigned to a Ref will have a timestamp associated with it, in fact, the timestamp assigned to the commit action that set the value. For reasons we will go into later, a Ref can keep a history of values with older timestamps. A snapshot of the "Ref world" would be a set of values in effect at a particular time.
77+
Internally to transactions and to the Refs themselves, timestamps are used to keep track of the order of events. Timestamps are integers and are monotonically increasing with time. Transactions get assigned timestamps when they are created, when they are forced to retry execution after a conflict, and when they commit. The timestamps are used internally to determine the relative age of transactions. In addition, the value currently assigned to a Ref will have a timestamp associated with it: the timestamp assigned to the commit action that set the value. For reasons we will go into later, a Ref can keep a history of values with older timestamps. A snapshot of the "Ref world" would be a set of values in effect at a particular time.
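
To make the idea of a snapshot concrete, here is a minimal sketch of looking up the value of a Ref "as of" a given timestamp. The `Versioned` record and the `valueAt` function are hypothetical names invented for this illustration; they are not the types used in the actual implementation, which we will see in the code post.

```F#
// Hypothetical type for illustration only: one <value, timestamp> entry.
type Versioned<'T> = { Value: 'T; Point: int64 }

// The history is kept newest-first.
// Return the newest value committed at or before readPoint, if there is one.
let valueAt (readPoint: int64) (history: Versioned<'T> list) : 'T option =
    history
    |> List.tryFind (fun v -> v.Point <= readPoint)
    |> Option.map (fun v -> v.Value)

// A made-up history for a single Ref.
let r1 =
    [ { Value = "c"; Point = 30L }
      { Value = "b"; Point = 20L }
      { Value = "a"; Point = 10L } ]

let seenAt24 = valueAt 24L r1   // Some "b": the value committed at 20 is still in effect at 24
let seenAt5 = valueAt 5L r1     // None: the history does not reach back that far
```

The `None` case is the interesting one: it corresponds to a transaction whose snapshot is older than anything the Ref still remembers. This sketch only reports that case; it does not model what the STM does about it.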
7978

8079
Let's say we have Refs `r1`, `r2`, and `r3` with the following histories, indicated with the notation '<value, timestamp>':
8180

@@ -107,6 +106,8 @@ Atomicity requires that either both changes are made or neither. Consistency re
107106

108107
You might have heard that MVCC doesn't involve locks. It's a bit more nuanced than that. What MVCC avoids is coarse-grained locking and locks with non-trivial temporal extent. While T1 is running, before it gets to the point of committing, it is doing its various computations. It is _not_ locking the refs it is updating, except briefly when it has to read a value or do a little bookkeeping. Only at the point of committing does it lock the refs it is updating. And there is no user code being run while those refs are locked. The locks are held just long enough to update the data structures.
109108

109+
Or so I believed. In the code today, Refs that have been ensured hold read locks for an indefinite period, with observable results. More below.
110+
110111
Isolation is achieved by making any changes made to a ref (via `ref-set`, `alter`, or `commute`) invisible to the outside until the transaction commits. While executing, the transaction _will_ see the changes it has made. This is accomplished by having the transaction keep track of the 'in-transaction-value' of any ref it has updated. Thus, the `deref` operation operates differently inside a transaction than outside.
111112

112113
### Stepping on toes
@@ -146,4 +147,82 @@ T1 will still see the world as it was at timestamp 24 if (a) we require Refs to
146147

147148
Because keeping every value across all time would be quite expensive, the size of the history is limited. One can set the `:min-history` and `:max-history` when the Ref is created (defaulting to 0 and 10, respectively) or call `set-min-history` and `set-max-history` to change the values on the fly. The size of the list will be dynamically adjusted to keep the history within the bounds. (As we shall see.)
148149

149-
Time to [look at the code]({{site.baseurl}}{% post_url 2024-12-26-STM-code %}).
150+
## Locking peril
151+
152+
As I was working through the code, I became very puzzled by how locking was working. The control flow is not exactly transparent -- not surprising given that we are working with multi-threading and expect there to be conflicts, and that conflicts result in throwing exceptions. That's complexity times three.
153+
154+
Eventually I discovered that a premise I was working with was incorrect. The premise was that all locking was short-term. And that is just not true. Doing an ensure operation on a Ref gives rise to a long-term read lock that can actually block other transactions and force retries. I discovered this when looking for clues in the [online documentation for ensure](https://clojure.github.io/clojure/clojure.core-api.html#clojure.core/ensure). If you scroll to the bottom, you find this comment:
155+
156+
The doc string says:
157+
158+
Allows for more concurrency than (ref-set ref @ref)
159+
160+
What it doesn’t say is that ensure may degrade performance very seriously, as it is prone to livelock: see CLJ-2301. (ref-set ref @ref) is certainly more reliable than (ensure ref).
161+
162+
The issue [CLJ-2301](https://clojure.atlassian.net/jira/software/c/projects/CLJ/issues/CLJ-2301) has an example that demonstrates the problem. (It also has links to two old mailing list discussions that resulted in this issue being created.) These items are from 2017. The issue is ranked Minor -- don't hold your breath.
163+
164+
Before we hit actual code, it will be helpful to work through the relationships among `ensure`, `commute`, and `ref-set` and how locking comes into play. (`alter` is identical to `ref-set` in its effects, so we will ignore it.)
165+
166+
The three operations -- let's refer to them as `E`(nsure), `C`(ommute), and (ref-)`S`(et) -- represent three levels of commitment.
167+
168+
- E = I depend on this value. I'm not going to change it. But if someone changes it while I'm working, I'll need to retry.
169+
- C = I will be changing the value of this Ref. I will compute an in-transaction value for it. However, when we commit, I'll recompute the value for it based on the then-current value, so it is okay if someone changes it while I'm working.
170+
- S = I will be changing the value of this Ref. If someone else changes it while I'm working, I'll need to retry.
171+
172+
The three do interact. In particular, let's consider a pair of operations in sequence.
173+
174+
| First op | Second op | Result |
175+
|:---:|:---:|:---------------------------|
176+
| S | S | The second value set will be used. |
177+
| S | E | The ensure is a no-op. We are already committed to change the Ref. |
178+
| S | C | This is okay. We will re-run the commute function at the end of the commit phase. |
179+
| C | S | Invalid operation, exception thrown that will abort the transaction. |
180+
| C | E | These are independent. Both operations will be in effect. |
181+
| C | C | We have two functions to call at commit time. The more the merrier. |
182+
| E | S | We remove the Ref from the list of ensured Refs and put it in the list of set Refs. |
183+
| E | E | The second is a no-op. |
184+
| E | C | These are independent. Both operations will be in effect. |
185+
186+
I don't know of anywhere these interactions are discussed. Determining them from the code takes work.
187+
C->S is easy -- it throws an exception and there is actually a comment in the code. S->E requires looking inside a double-nested conditional and seeing what _doesn't_ happen.
188+
189+
At minimum, a `LockingTransaction` will need to track:
190+
191+
- `ensureds` - the set of ensured Refs
192+
- `sets` - the set of `ref-set` / `altered` Refs
193+
- `vals` - a dictionary mapping a Ref to its in-transaction value
194+
- `commutes` - a dictionary mapping a Ref to a list of commute actions
195+
196+
The operations do the following:
197+
198+
| Op | Effect |
199+
|:---:|:-------|
200+
| C | Add the action to the list of actions for the Ref. |
201+
| E | If the Ref is in `sets` do nothing; else add to `ensureds`. |
202+
| S | If the Ref is in `commutes`, throw an invalid operation exception; otherwise, if the Ref is in `ensureds`, remove it. Add the Ref to `sets` and record the new value for it in `vals`. |
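
The two tables above can be sketched as a small bookkeeping class. This is an illustration only, under loud assumptions: `RefId` and `TxBookkeeping` are hypothetical names, a Ref is reduced to a string, and there is no locking, no timestamps, and no computation of the in-transaction value for a commute. The real `LockingTransaction` does considerably more.

```F#
open System
open System.Collections.Generic

// Hypothetical stand-in: a Ref is identified by its name only.
type RefId = string

type TxBookkeeping() =
    let ensureds = HashSet<RefId>()
    let sets = HashSet<RefId>()
    let vals = Dictionary<RefId, obj>()
    let commutes = Dictionary<RefId, ResizeArray<obj -> obj>>()

    member _.Ensureds = ensureds
    member _.Sets = sets
    member _.Vals = vals
    member _.Commutes = commutes

    // C: add the action to the list of actions for the Ref.
    member _.Commute(r: RefId, f: obj -> obj) =
        match commutes.TryGetValue r with
        | true, fs -> fs.Add f
        | _ ->
            let fs = ResizeArray<obj -> obj>()
            fs.Add f
            commutes.[r] <- fs

    // E: a no-op if the Ref has already been set; otherwise record it as ensured.
    member _.Ensure(r: RefId) =
        if not (sets.Contains r) then ensureds.Add r |> ignore

    // S: invalid after a commute; otherwise move the Ref from ensureds to sets
    // and record its new in-transaction value.
    member _.Set(r: RefId, v: obj) =
        if commutes.ContainsKey r then
            raise (InvalidOperationException("Can't set after commute"))
        ensureds.Remove r |> ignore
        sets.Add r |> ignore
        vals.[r] <- v

let tx = TxBookkeeping()
tx.Ensure "r1"
tx.Set("r1", box 42)                                // E then S: r1 moves from ensureds to sets
tx.Ensure "r1"                                      // S then E: no-op
tx.Commute("r2", fun v -> box (unbox<int> v + 1))
tx.Ensure "r2"                                      // C then E: independent, both in effect
// tx.Set("r2", box 0) would throw here: C then S is invalid
```

Keeping the four collections separate is what makes the pairwise rules cheap to check: each rule is a membership test on one collection followed by an update to another.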
203+
204+
## Let's lock this down
205+
206+
A `Ref` is quite a bit simpler. Primarily, it holds the history list of values and a lock to control access to that list so we don't have multiple transactions tromping all over the list with no discipline. In addition, as mentioned above, it holds the 'stamp' of a transaction that currently has primacy on changing its value. The lock is a reader-writer lock. It supports multiple readers and only one writer. New readers can lock if there is no write lock. A write lock can be acquired only if the lock is free (no readers and no writer).
207+
208+
The operations do the following:
209+
210+
| Op | Locking |
211+
|:---:|:-------|
212+
| C | Get a read lock, update data structures, release lock. |
213+
| E | If already ensured, do nothing. Else: Get a read lock. If someone has set the value on the Ref after our snapshot timestamp, release the lock and cause a retry. If the Ref has a stamp on it, release the lock; if the stamper is not us, cause a retry. Otherwise, add the Ref to `ensureds` and _DO NOT RELEASE THE LOCK_. |
214+
| S | If the Ref is in `ensureds` release the read lock. (Anything in `ensureds` has acquired a read lock.) Try to get a write lock. Do updates, throw, whatever needs to be done. Release the write lock if you acquired it. |
215+
216+
217+
Note that for E, when successful, we are holding a read lock. Other people can read, but no one can write. This leads to the problem mentioned in CLJ-2301.
218+
219+
Note that for S, we said 'try to get a write lock'. The `LT.TryWriteLock` operation tries to acquire the write lock with a timeout. If the timeout elapses, it throws a retry exception -- we have resource contention here and need to retry.
220+
This operation also checks for an existing stamp of another transaction that is still running and does barge-or-(block-and-bail) as mentioned above.
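
The effect of a held read lock on a timed write-lock attempt can be sketched in isolation. This is a sketch under assumptions: it uses .NET's `ReaderWriterLockSlim` directly as a stand-in for the lock inside a Ref, the helper `tryWriteFromOtherThread` is a hypothetical name, and it simply reports failure where the STM would throw a retry exception.

```F#
open System.Threading

// Stand-in for the reader-writer lock held inside a Ref (an assumption for this sketch).
let rwLock = new ReaderWriterLockSlim()

// From a second thread, try to take the write lock, giving up after timeoutMs.
// Returns true if the write lock was acquired (it is released again immediately).
let tryWriteFromOtherThread (timeoutMs: int) : bool =
    let got = ref false
    let t =
        Thread(ThreadStart(fun () ->
            got.Value <- rwLock.TryEnterWriteLock(timeoutMs)
            if got.Value then rwLock.ExitWriteLock()))
    t.Start()
    t.Join()
    got.Value

// A successful "ensure": the read lock is taken and NOT released.
rwLock.EnterReadLock()
let whileEnsured = tryWriteFromOtherThread 50   // false: the held read lock blocks the writer

// End of the transaction attempt: the read lock is finally released.
rwLock.ExitReadLock()
let afterRelease = tryWriteFromOtherThread 50   // true: the lock is free again
```

The writer here loses only 50 milliseconds. In a transaction, the same failure means throwing away the work done so far and starting over, which is why a long-held read lock is so costly.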
221+
222+
When do the read locks on ensured Refs get released? That happens at the end of each iteration of the (re)try loop, whether we have successfully committed or been thrown into doing a retry.
223+
224+
## Finally ...
225+
226+
We are dealing with a multi-threaded computational structure with significant chance of resource contention that results in retrying operations. The control flow is implicit and spread across maybe a dozen methods. We have locks being acquired in one place and released far away (both temporally and in the code). There are a few comments. I did not know through the first five readings of the code how key the comment -- "// The set of Refs holding read locks." -- actually was; it really meant what it said.
227+
228+
Enough analysis. Time to [look at the code]({{site.baseurl}}{% post_url 2024-12-26-STM-code %}).

_drafts/2024-12-26-STM-code.md

Lines changed: 180 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -295,4 +295,183 @@ type LTInfo(initState: LTState, startPoint: int64) =
295295
s = LTState.Running || s = LTState.Committing
296296
```
297297

298-
One will not that the
298+
Though the state is an `LTState` value, we store it as an `int64`.
299+
We are careful to make changes to the `status` field using `Interlocked` methods because other threads may mutate this field.
300+
Unfortunately, the `Interlocked` methods won't work on `LTState` values, even though they are represented as `int64` values.
301+
I decided to stick with using the `LTState` enumeration rather than integer constants,
302+
trading a little more complexity here for clarity in the rest of the program.
303+
The `EnumOfValue` calls take care of translating from an `int64` back to an `LTState`.
304+
305+
Now we can define the `LockingTransaction` class.
306+
307+
```F#
308+
[<Sealed>]
309+
type LockingTransaction private () =
310+
...
311+
```
312+
313+
Let's get some bookkeeping out of the way.
314+
We will need to generate timestamps.
315+
A new timestamp is needed each time an LT attempts to execute a transaction, including each retry.
316+
We also generate a new timestamp for each commit point.
317+
The timestamp is a 64-bit integer; we use an `AtomicLong` to generate them as multiple threads will be accessing the static field holding the most recent value.
318+
The transaction will hold a _read point_, the timestamp assigned as it begins execution (including for each retry).
319+
We keep track of its first-assigned read point, the _start point_, as a marker of its age relative to other transactions.
320+
321+
322+
```F#
323+
// The current point.
324+
// Used to provide a total ordering on transactions for the purpose of determining preference on transactions when there are conflicts.
325+
// Transactions consume a point for init, for each retry, and on commit if writing.
326+
static member val private lastPoint : AtomicLong = AtomicLong() with get
327+
328+
// The point at the start of the current retry (or first try).
329+
let mutable readPoint : int64 = 0L
330+
331+
// The point at the start of the transaction.
332+
let mutable startPoint : int64 = 0L
333+
334+
// Get a new read point value.
335+
member this.getReadPoint() =
336+
readPoint <- LockingTransaction.lastPoint.incrementAndGet()
337+
338+
// Get a commit point value.
339+
static member getCommitPoint() =
340+
LockingTransaction.lastPoint.incrementAndGet()
341+
```
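
Why the counter has to be atomic can be shown with a small stand-alone sketch. It uses `Interlocked.Increment` from the .NET base library as a stand-in for `AtomicLong`, and `PointSource` is a hypothetical name for this illustration, not a type from the implementation.

```F#
open System.Threading

// Hypothetical stand-in for the shared point counter.
type PointSource() =
    let mutable last = 0L
    // Atomically advance the counter and return the new value.
    member _.Next() : int64 = Interlocked.Increment(&last)

let points = PointSource()

// Eight threads each draw 1000 points from the same source.
let threads =
    [ for _ in 1 .. 8 ->
        Thread(ThreadStart(fun () ->
            for _ in 1 .. 1000 do
                points.Next() |> ignore)) ]

threads |> List.iter (fun t -> t.Start())
threads |> List.iter (fun t -> t.Join())

// 8000 points have been handed out, so the next one is 8001.
let next = points.Next()
```

With a plain `last <- last + 1L` instead, two threads could read the same value and both hand out the same point, which would break the total ordering on transactions.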
342+
343+
For age computations, we also keep track of the `Environment.TickCount` value at the start of the transaction.
344+
345+
```F#
346+
// The system ticks at the start of the transaction.
347+
let mutable startTime : int64 = 0L
348+
```
349+
350+
We keep track of the transaction's state in an `LTInfo` object.
351+
This will be changed on every retry, hence the `mutable`.
352+
353+
```F#
354+
// The state of the transaction.
355+
// This is an option so we can zero it out when the transaction is done.
356+
let mutable info : LTInfo option = None
357+
358+
member _.Info = info
359+
```
360+
361+
We need to detect if a transaction is running on the current thread and access it.
362+
We use thread-local storage for this. Note that the type is `LockingTransaction option`.
363+
364+
```F#
365+
// The transaction running on the current thread. (Thread-local.)
366+
[<DefaultValue;ThreadStatic>]
367+
static val mutable private currentTransaction : LockingTransaction option
368+
```
369+
370+
We provide several accessors. `getRunning` returns a `LockingTransaction option`.
371+
`getEx` returns a `LockingTransaction` or throws an exception if there is no transaction running.
372+
And we have a simple boolean check of existence.
373+
374+
```F#
375+
// Get the transaction running on this thread (throw exception if no transaction).
376+
static member getEx() =
377+
let transOpt = LockingTransaction.currentTransaction
378+
match transOpt with
379+
| None ->
380+
raise <| InvalidOperationException("No transaction running")
381+
| Some t ->
382+
match t.Info with
383+
| None ->
384+
raise <| InvalidOperationException("No transaction running")
385+
| Some info -> t
386+
387+
// Get the transaction running on this thread (or None if no transaction).
388+
static member getRunning() =
389+
let transOpt = LockingTransaction.currentTransaction
390+
match transOpt with
391+
| None -> None
392+
| Some t ->
393+
match t.Info with
394+
| None -> None
395+
| Some info -> transOpt
396+
397+
// Is there a transaction running on this thread?
398+
static member isRunning() = LockingTransaction.getRunning().IsSome
399+
```
400+
401+
Note the special case of seeing an LT with no `LTInfo`.
402+
This is a transaction that has been stopped and is working on cleanup.
403+
We haven't gotten rid of it yet but we are working on it -- it is not running.
404+
405+
### Running a transaction
406+
407+
The primary entry point for running a transaction is the `runInTransaction` static method.
408+
Basically, it creates a new `LockingTransaction` object, sets it as the current transaction, runs the supplied code, and then tries to commit.
409+
However, if a transaction is already running on the current thread, we will join the existing transaction.
410+
This happens if the body of an outer `dosync` calls `dosync` again, either directly or indirectly.
411+
412+
```F#
413+
static member runInTransaction(fn:IFn) : obj =
414+
let transOpt = LockingTransaction.currentTransaction
415+
match transOpt with
416+
| None ->
417+
// no transaction running. Start one, put it on the thread, and run the code.
418+
let newTrans = LockingTransaction()
419+
LockingTransaction.currentTransaction <- Some newTrans
420+
try
421+
newTrans.run(fn)
422+
finally
423+
LockingTransaction.currentTransaction <- None
424+
| Some t ->
425+
match t.Info with
426+
// I'm not sure when this case would happen. We have a transaction object that is effectively uninitialized.
427+
// We use 'run' to take care of initializing it.
428+
| None -> t.run(fn)
429+
// In this case, the transaction is properly running. We can just invoke the body.
430+
| _ -> fn.invoke()
431+
```
432+
433+
So `runInTransaction` takes care of seeing if we are already in a transaction and creating / registering one if not.
434+
The real action is in `run`:
435+
436+
```F#
437+
member this.run(fn:IFn) : obj =
438+
let mutable finished = false
439+
let mutable ret = null
440+
let locked = ResizeArray<Ref>()
441+
let notify = ResizeArray<LTNotify>()
442+
443+
let mutable i = 0
444+
445+
while not finished && i < RetryLimit do
446+
try
447+
try
448+
this.getReadPoint()
449+
450+
if i = 0 then
451+
startPoint <- readPoint
452+
startTime <- int64 Environment.TickCount
453+
454+
let newLTInfo = LTInfo(LTState.Running, startPoint)
455+
456+
info <- Some <| newLTInfo
457+
ret <- fn.invoke()
458+
459+
// CODE TO COMMIT COMES HERE
460+
461+
with
462+
| :? RetryEx -> ()
463+
| ex when not (LockingTransaction.containsNestedRetryEx(ex)) -> reraise()
464+
finally
465+
// CODE TO CLEAN UP AFTER EACH ATTEMPT, whether successful or not
466+
467+
468+
i <- i + 1
469+
470+
if not finished then
471+
raise <| InvalidOperationException("Transaction failed after reaching retry limit")
472+
473+
ret
474+
```
475+
476+
The `finished` flag will be set to `true` in the commit code if the transaction is successful. We'll see that code in just a moment.
477+
