Commit 0334cfb

Add documentation for informal Ruby memory model and synchronisation extensions.
1 parent 3f56ea0 commit 0334cfb

doc/synchronization.md

Lines changed: 356 additions & 0 deletions
@@ -0,0 +1,356 @@
1+
> Quotations are used for notes.

> This document is a work in progress.
> The intentions of this effort and document are: to sum up the behavior
> of Ruby in concurrent and parallel environments, initiate discussion,
> identify problems in the document, find flaws in the Ruby
> implementations if any, suggest what has to be enhanced in Ruby itself,
> and cooperate towards the goal in all implementations (using
> `concurrent-ruby` as a compatibility layer).
10+
11+
# Synchronization
12+
13+
This layer provides tools to write concurrent abstractions independent of any
particular Ruby implementation. It is built on top of the Ruby memory model, which is
also described here. `concurrent-ruby` abstractions are built using this layer.
16+
17+
**Why?** Ruby is a great expressive language, but it lacks support for
concurrent and parallel computation. It's hoped that this document will lay
the groundwork for Ruby to become as good in this area as in others.
20+
21+
Without a memory model and this layer it's very hard to write concurrent
abstractions for Ruby. Writing a proper concurrent abstraction often means
reimplementing it more than once for different Ruby runtimes, which is very
time-consuming and error-prone.
25+
26+
# Ruby memory model
27+
28+
The Ruby memory model is a framework for reasoning about programs in concurrent
and parallel environments. It makes it possible to identify what is and what is not a [race
condition](https://en.wikipedia.org/wiki/Race_condition). The memory model is also a
contract: if a program is written without race conditions, it behaves in a
[sequentially consistent](https://en.wikipedia.org/wiki/Sequential_consistency)
manner.
34+
35+
Sources:
36+
37+
- [Java memory model](http://www.cs.umd.edu/~pugh/java/memoryModel/),
  and its [FAQ](http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html)
- [atomic<> Weapons 1](https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2)
  and
  [2](https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-2-of-2)
- [JRuby's wiki page about concurrency](https://github.com/jruby/jruby/wiki/Concurrency-in-jruby)
43+
44+
This memory model was created by comparing
[MRI](https://www.ruby-lang.org/en/), [JRuby](http://jruby.org/),
[JRuby+Truffle](https://github.com/jruby/jruby/wiki/Truffle), and
[Rubinius](http://rubini.us/); taking into account the limitations of the implementations
or their platforms; and drawing inspiration from other existing memory models (Java,
C++11). This is not a formal model.
50+
51+
Key properties are:
52+
53+
- **volatility (V)** - Any written value is immediately visible to any
  subsequent volatile reads, including all writes leading to this value. (Same
  meaning as in Java.)
- **atomicity (A)** - An operation is either done or not as a whole.
- **serialized (S)** - Operations are serialized in some order (they
  cannot disappear). This is a new property not mentioned in other memory
  models, since Java and C++ do not have dynamically defined fields. All
  operations on one line in the table are serialized with each other.
61+
62+
### Core behavior:
63+
64+
| Operation | V | A | S | Notes |
|:----------|:-:|:-:|:-:|:-----|
| local variable read/write/definition | - | x | x | Local variables are determined during parsing; they are not usually dynamically added (with the exception of `local_variable_set`). Therefore definition is quite rare. |
| instance variable read/write/(un)definition | - | x | x ||
| class variable read/write/(un)definition | x | x | x ||
| global variable read/write/definition | x | x | x | Un-definition is not currently possible. |
| constant variable read/write/(un)definition | x | x | x ||
| `Thread` local variable read/write/definition | - | x | x | Un-definition is not currently possible. |
| `Fiber` local variable read/write/definition | - | x | x | Un-definition is not currently possible. |
| method creation/redefinition/removal | x | x | x ||
| include/extend | x | x | x | If `AClass` includes `AModule`, `AClass` gets all of `AModule`'s methods at once. |
75+
76+
77+
Notes:
78+
79+
- Variable read reads a value from a preexisting variable.
- Variable definition creates a new variable (the operation is serialized with
  writes, which implies an update cannot be lost).
- Module/Class definition is actually a constant definition. It is defined
  instantly, however its methods are then processed sequentially.
- `||=`, `+=`, etc. are actually two operations, a read and a write, which implies
  that they are not atomic. See volatile variables
  with compare-and-set.
- Method invocation does not have any special properties; that includes
  object initialization.
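
To make the note about `+=` concrete, the sketch below first replays the read and the write of an increment by hand in an unlucky order, then shows one way to keep every update by putting both steps into a single critical section. `LockedCounter` and the local variable names are illustrative only and are not part of `concurrent-ruby`.

```ruby
# `count += 1` is a read followed by a write. Replaying those two steps
# by hand, in an unlucky order, shows how an update gets lost.
count = 0
read_by_a = count      # thread A reads 0
read_by_b = count      # thread B reads 0 before A has written
count = read_by_a + 1  # A writes 1
count = read_by_b + 1  # B also writes 1, so A's increment is lost
lost_update_result = count

# Illustrative fix: make the read and the write one critical section.
class LockedCounter
  def initialize
    @mutex = Mutex.new
    @count = 0
  end

  def increment
    @mutex.synchronize { @count += 1 }
  end

  def count
    @mutex.synchronize { @count }
  end
end

counter = LockedCounter.new
4.times.map { Thread.new { 1_000.times { counter.increment } } }.each(&:join)

puts lost_update_result # two increments were attempted, one survived
puts counter.count      # every increment is kept
```

The lock trades some throughput for correctness; the compare-and-set accessors described later in this document are the non-blocking alternative.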
89+
90+
Implementation differences from the model:
91+
92+
- MRI: everything is volatile.
- JRuby: `Thread` and `Fiber` local variables are volatile. Instance
  variables are volatile on x86 and people may un/intentionally depend
  on the fact.
- Class variables require investigation.
97+
98+
### Source loading:
99+
100+
| Operation | V | A | S | Notes |
|:----------|:-:|:-:|:-:|:-----|
| requiring | x | x | x | A file will not be required twice; classes and modules are still defined gradually. |
| autoload | x | x | x | Only one autoload at a time. |
104+
105+
Notes:
106+
107+
- Beware of requiring and autoloading in concurrent programs; it's possible to
  see partially defined classes. Eager loading or blocking until a class is
  fully loaded should be used to mitigate this.
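
One way to apply the eager-loading advice is to resolve every pending autoload on the main thread before any worker thread starts. The sketch below is only an illustration: `EagerLoadDemo`, `Shapes` and the temporary file are made up for the example, and it relies on `Module#constants` listing registered autoloads and `Module#const_get` triggering them.

```ruby
require "tmpdir"

module EagerLoadDemo
  # Touch every constant of `mod` so that pending autoloads run now,
  # on the calling thread.
  def self.eager_load!(mod)
    mod.constants.each { |name| mod.const_get(name) }
  end
end

module Shapes; end

areas = nil
Dir.mktmpdir do |dir|
  # A throwaway source file standing in for a lazily loaded class.
  path = File.join(dir, "circle.rb")
  File.write(path, <<~RUBY)
    module Shapes
      class Circle
        def area(radius)
          3 * radius * radius
        end
      end
    end
  RUBY

  Shapes.autoload(:Circle, path)
  EagerLoadDemo.eager_load!(Shapes) # load before any thread is started

  # Workers now only ever see the fully defined class.
  threads = 4.times.map { Thread.new { Shapes::Circle.new.area(2) } }
  areas = threads.map(&:value)
end

puts areas.inspect
```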
110+
111+
### Core classes
112+
113+
`Mutex`, `Monitor`, `Queue` have to work correctly on each implementation. Ruby
implementations should not crash when e.g. `Array` is used in a parallel environment,
but it may lose updates etc.
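
As a small illustration of relying on the guaranteed classes rather than on `Array`, the sketch below lets several threads hand results over through a `Queue` and only builds the `Array` on a single thread afterwards. The thread and item counts are arbitrary.

```ruby
# Four producers push into a shared Queue; Queue#push and Queue#pop are
# safe to call from several threads at once.
results = Queue.new

producers = 4.times.map do |i|
  Thread.new do
    250.times { |j| results << (i * 250 + j) }
  end
end
producers.each(&:join)

# Drain on a single thread, once all producers are done, so the Array
# is never touched concurrently.
collected = []
collected << results.pop until results.empty?

puts collected.size
```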
116+
117+
> TODO: This section needs more work: e.g. Thread.raise and similar is an open
> issue, better not to be used.
119+
120+
### Standard libraries
121+
122+
Standard libraries were written for MRI, so unless they are rewritten in a
particular Ruby implementation they may contain hidden problems. Therefore it's
better to assume that they are not safe.
125+
126+
> TODO: This section needs more work.
127+
128+
# Extensions
129+
130+
The memory model described above is quite weak, e.g. a thread-safe immutable
object cannot be created. That requires final or volatile instance variables.
132+
133+
## Final instance variable
134+
135+
Objects inheriting from `Synchronization::Object` provide a way to ensure
that all instance variables that are set only once in the constructor (and are therefore
effectively final) are safely published to all readers (assuming proper
construction, i.e. the object instance does not escape during construction).
139+
140+
``` ruby
class ImmutableTreeNode < Concurrent::Synchronization::Object
  # mark this class to publish final instance variables safely
  safe_initialization!

  def initialize(left, right)
    # Call super to allow proper initialization.
    super()
    # By convention final instance variables have CamelCase names
    # to distinguish them from ordinary instance variables.
    @Left  = left
    @Right = right
  end

  # Define thread-safe readers.
  def left
    # No need to synchronize or otherwise protect, it's already
    # guaranteed to be visible.
    @Left
  end

  def right
    @Right
  end
end
```
166+
167+
Once `safe_initialization!` is called on a class it transitively applies to all
its children.
169+
170+
> It's implemented by adding `new`, when `safe_initialization!` is called, as
> follows:
>
> ``` ruby
> def self.new(*)
>   object = super
> ensure
>   object.ensure_ivar_visibility! if object
> end
> ```
>
> therefore `new` should not be overridden.
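
The following plain-Ruby sketch illustrates why a `new` hook installed this way also reaches subclasses. It is a stand-in, not `concurrent-ruby`'s code: `PublishOnNew`, `SketchObject`, `Point` and `LabeledPoint` are hypothetical names, and `ensure_ivar_visibility!` here only records that it ran, where a real implementation would perform the runtime-specific publication step.

```ruby
# Hypothetical stand-in for the hook described above.
module PublishOnNew
  def new(*args, &block)
    object = super
  ensure
    object.ensure_ivar_visibility! if object
  end
end

class SketchObject
  # Extending the class puts `new` on its singleton class, which the
  # singleton classes of all subclasses inherit from.
  def self.safe_initialization!
    extend PublishOnNew
  end

  # A real implementation would publish the ivars here. This sketch
  # only records that the hook ran.
  def ensure_ivar_visibility!
    @visibility_ensured = true
  end

  def visibility_ensured?
    @visibility_ensured == true
  end
end

class Point < SketchObject
  safe_initialization!

  def initialize(x, y)
    @X = x
    @Y = y
  end
end

# A child that never calls `safe_initialization!` itself.
class LabeledPoint < Point
  def initialize(x, y, label)
    super(x, y)
    @Label = label
  end
end

puts Point.new(1, 2).visibility_ensured?
puts LabeledPoint.new(1, 2, "a").visibility_ensured?
puts SketchObject.new.visibility_ensured?
```

Because the hook runs after `initialize` returns, a subclass that defined its own `self.new` without calling `super` would bypass it, which is why `new` should be left alone.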
182+
183+
## Volatile instance variable
184+
185+
`Synchronization::Object` children can have volatile instance variables. A Ruby
library cannot alter the meaning of the `@a_name` expression, therefore when
`attr_volatile :a_name` is called, declaring the instance variable named
`a_name` to be volatile, it creates accessor methods.
189+
190+
> However there is a Ruby [issue](https://redmine.ruby-lang.org/issues/11539)
> filed to address this.
192+
193+
``` ruby
# Simple counter with cheap reads.
class Counter < Concurrent::Synchronization::Object
  # Declare instance variable value to be volatile and its
  # reader and writer to be private. `attr_volatile` returns
  # names of created methods.
  private *attr_volatile(:value)
  safe_initialization!

  def initialize(value)
    # Call super to allow proper initialization.
    super()
    # Create a reentrant lock instance held in final ivar
    # to be able to protect writer.
    @Lock = Concurrent::Synchronization::Lock.new
    # volatile write
    self.value = value
  end

  # Very cheap reader of the Counter's current value, just a volatile read.
  def count
    # volatile read
    value
  end

  # Safely increment the value without losing updates
  # (as would happen if just += were used).
  def increment(add)
    # Wrap the two volatile operations to make them atomic.
    @Lock.synchronize do
      # volatile read and write
      self.value = self.value + add
    end
  end
end
```
229+
230+
> This is currently planned to be migrated to a module to be able to add
> volatile fields to any object, not just `Synchronization::Object` children. The
> instance variable itself is named `"@volatile_#{name}"` to distinguish it and
> to prevent direct access by name.
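
To show the shape of such a declaration, here is a plain-Ruby stand-in for `attr_volatile`. It is a sketch under an explicit assumption: it uses ordinary instance variable access, which this document states is already volatile on MRI only, so it demonstrates the accessor naming and the returned method names rather than real cross-runtime volatility. `AttrVolatileSketch` and `Flag` are hypothetical names.

```ruby
# Illustrative stand-in, not concurrent-ruby's implementation.
module AttrVolatileSketch
  def attr_volatile(*names)
    names.flat_map do |name|
      # Backing ivar gets a prefixed name so `@name` does not reach it.
      ivar = :"@volatile_#{name}"
      define_method(name) { instance_variable_get(ivar) }
      define_method(:"#{name}=") { |value| instance_variable_set(ivar, value) }
      [name, :"#{name}="] # names of the created methods
    end
  end
end

class Flag
  extend AttrVolatileSketch
  # The returned names can be passed straight to `private`.
  private *attr_volatile(:raised)

  def initialize
    self.raised = false
  end

  def raise!
    self.raised = true
  end

  def raised?
    raised
  end
end

flag = Flag.new
flag.raise!
puts flag.raised?
puts flag.instance_variables.inspect
```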
234+
235+
## Volatile instance variable with compare-and-set
236+
237+
Some concurrent abstractions may need to do compare-and-set on volatile
instance variables to avoid synchronization; in that case `attr_volatile_with_cas` is
used.
240+
241+
``` ruby
# Simplified implementation of Clojure's Atom
class Atom < Concurrent::Synchronization::Object
  safe_initialization!
  # Make all methods private
  private *attr_volatile_with_cas(:value)
  # with the exception of the reader
  public :value

  def initialize(value, validator = -> (v) { true })
    # Call super to allow proper initialization.
    super()
    # volatile write
    self.value = value
    @Validator = validator
  end

  # Allows swapping in a value computed from old_value by the function
  # without using blocking synchronization.
  def swap(*args, &function)
    loop do
      old_value = self.value # volatile read
      begin
        # compute new value
        new_value = function.call(old_value, *args)
        # return old_value if validation fails
        break old_value unless valid?(new_value)
        # return new_value only if compare-and-set is successful
        # on value instance variable, otherwise repeat
        break new_value if compare_and_set_value(old_value, new_value)
      rescue
        break old_value
      end
    end
  end

  private

  def valid?(new_value)
    @Validator.call(new_value) rescue false
  end
end
```
284+
285+
`attr_volatile_with_cas` defines five methods for a given instance variable
name. For the name `value` they are:
287+
288+
``` ruby
self.value                                      #=> the_value
self.value=(new_value)                          #=> new_value
self.swap_value(new_value)                      #=> old_value
self.compare_and_set_value(expected, new_value) #=> true || false
self.update_value(&function)                    #=> function.call(old_value)
```
295+
296+
Three of them were used in the example above.
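
The return values listed above can be modelled with a small `Mutex`-based class. This is only a behavioral sketch: `CasCellSketch` is a hypothetical name, the lock makes it blocking where the real accessors are meant to avoid synchronization, and comparing by identity in `compare_and_set_value` is an assumption of the sketch.

```ruby
# Mutex-based model of the five generated methods. It mimics their
# return values only; it is not lock-free.
class CasCellSketch
  def initialize(value)
    @mutex = Mutex.new
    @value = value
  end

  def value
    @mutex.synchronize { @value }
  end

  def value=(new_value)
    @mutex.synchronize { @value = new_value }
  end

  # Unconditionally store new_value and return the previous value.
  def swap_value(new_value)
    @mutex.synchronize do
      old_value = @value
      @value = new_value
      old_value
    end
  end

  # Store new_value only if the current value is `expected`
  # (identity comparison is an assumption of this sketch).
  def compare_and_set_value(expected, new_value)
    @mutex.synchronize do
      next false unless @value.equal?(expected)
      @value = new_value
      true
    end
  end

  # Retry until the computed value is installed without interference.
  def update_value
    loop do
      old_value = value
      new_value = yield(old_value)
      return new_value if compare_and_set_value(old_value, new_value)
    end
  end
end

cell = CasCellSketch.new(1)
previous  = cell.swap_value(2)                # old value comes back
cas_ok    = cell.compare_and_set_value(2, 3)  # current value matches
cas_stale = cell.compare_and_set_value(2, 4)  # current value is now 3
updated   = cell.update_value { |v| v * 10 }
```

`update_value` is the same retry loop that `Atom#swap` spells out by hand in the example above, minus the validation step.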
297+
298+
> The current implementation relies on final instance variables where an instance of
> `AtomicReference` is held to provide compare-and-set operations. That creates
> an extra indirection, which will hopefully be removed over time as better
> support becomes available in Ruby implementations.
302+
303+
## Concurrent Ruby Notes
304+
305+
### Locks
306+
307+
Concurrent Ruby also has an internal extension of `Object` called
`LockableObject`, which provides the same synchronization primitives as Java's
Object: `synchronize(&block)`, `wait(timeout = nil)`,
`wait_until(timeout = nil, &condition)`, `signal`, `broadcast`. This class is
intended for internal use in `concurrent-ruby` only and it does not support
subclassing (since it cannot protect its lock from its children; for more
details see [this article](http://wiki.apidesign.org/wiki/Java_Monitor)). It has
a minimal interface so that it can directly use the locking available on a given
platform.
316+
317+
For non-internal use there are `Lock` and `Condition` implementations in the
`Synchronization` namespace; a condition can be obtained with the `new_condition`
method on `Lock`. So far their implementation is naive and requires more work.
The API is not expected to change.
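
For readers unfamiliar with the `synchronize` / `wait_until` / `broadcast` style, the sketch below shows the same pattern using the standard library's `Monitor` and its condition variable. It is not the `LockableObject` or `Lock` implementation; `ResultSlot` is a hypothetical class used only for illustration.

```ruby
require "monitor"

# A one-shot result holder built on the standard library's Monitor.
class ResultSlot
  def initialize
    @monitor = Monitor.new
    @filled  = @monitor.new_cond
    @set     = false
    @value   = nil
  end

  def put(value)
    @monitor.synchronize do
      @value = value
      @set   = true
      @filled.broadcast # wake every waiting reader
    end
  end

  def take
    @monitor.synchronize do
      # Re-checks the condition after each wakeup, so a reader that
      # arrives after `put` does not wait at all.
      @filled.wait_until { @set }
      @value
    end
  end
end

slot = ResultSlot.new
readers = 3.times.map { Thread.new { slot.take } }
slot.put(:done)
taken = readers.map(&:value)
puts taken.inspect
```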
321+
322+
### Method naming conventions
323+
324+
Methods starting with `ns_` do not use synchronization by themselves; they
have to be called inside a `synchronize` block.
They are usually used in pairs to separate the synchronization from behavior and
to allow calling methods on the same object without double locking.
328+
329+
``` ruby
class Node
  # ...
  def left
    synchronize { ns_left }
  end

  def right
    synchronize { ns_right }
  end

  def to_a
    # avoids double locking
    synchronize { [ns_left, ns_right] }
  end

  private

  def ns_left
    @left
  end

  def ns_right
    @right
  end
  # ...
end
```
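
The convention matters most with a non-reentrant lock. The runnable variant below uses Ruby's `Mutex` directly, which raises `ThreadError` when the thread that holds it tries to lock it again. `Pair` and `to_a_double_locking` are hypothetical names, and the second method is wrong on purpose to show the failure the `ns_` methods avoid.

```ruby
# Plain-Ruby variant of the example above, using a Mutex directly.
class Pair
  def initialize(left, right)
    @mutex = Mutex.new
    @left  = left
    @right = right
  end

  def left
    @mutex.synchronize { ns_left }
  end

  def right
    @mutex.synchronize { ns_right }
  end

  # Correct: one lock acquisition, then only ns_ methods.
  def to_a
    @mutex.synchronize { [ns_left, ns_right] }
  end

  # Incorrect on purpose: calls the locking readers while holding the lock.
  def to_a_double_locking
    @mutex.synchronize { [left, right] }
  end

  private

  def ns_left
    @left
  end

  def ns_right
    @right
  end
end

pair = Pair.new(1, 2)
double_locking_error =
  begin
    pair.to_a_double_locking
    nil
  rescue ThreadError => e
    e
  end

puts pair.to_a.inspect
puts double_locking_error.class
```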
