|
1 |
| -> Quotations are used for notes. |
2 |
| -
|
3 |
| -> This is an outdated version of |
4 |
| -> [this document](https://docs.google.com/document/d/1pVzU8w_QF44YzUCCab990Q_WZOdhpKolCIHaiXG-sPw/edit?usp=sharing) |
5 |
| -> maintained on Google documents. |
6 |
| -
|
7 |
| -> This document is work-in-progress. |
8 |
| -> Intentions of this effort and document are: to summarize the behavior |
9 |
| -> of Ruby in concurrent and parallel environments, initiate discussion, |
10 |
| -> identify problems in the document, find flaws in the Ruby |
11 |
| -> implementations if any, suggest what has to be enhanced in Ruby itself |
12 |
| -> and cooperate towards the goal in all implementations (using |
13 |
| -> `concurrent-ruby` as compatibility layer). |
14 |
| -> |
15 |
| -> It's not the intention of this effort to introduce high-level concurrency |
16 |
| -> abstractions like actors to the language, but rather to improve low-level |
17 |
| -> concurrency support to add many more concurrency abstractions through gems. |
18 |
| -
|
19 | 1 | # Synchronization
|
20 | 2 |
|
21 |
| -This layer provides tools to write concurrent abstractions independent of any |
22 |
| -particular Ruby implementation. It is built on top of the Ruby memory model |
23 |
| -which is also described here. `concurrent-ruby` abstractions are built using |
24 |
| -this layer. |
25 |
| - |
26 |
| -**Why?** Ruby is a great expressive language, but it lacks support for |
27 |
| -well-defined low-level concurrent and parallel computation. It's hoped that this |
28 |
| -document will provide ground steps for Ruby to become as good in this area as |
29 |
| -in others. |
30 |
| - |
31 |
| -Without a memory model and this layer it's very hard to write concurrent |
32 |
| -abstractions for Ruby. Writing a proper concurrent abstraction often means |
33 |
| -reimplementing it more than once for different Ruby runtimes, which is very |
34 |
| -time-consuming and error-prone. |
35 |
| - |
36 |
| -# Ruby memory model |
37 |
| - |
38 |
| -The Ruby memory model is a framework for reasoning about programs in |
39 |
| -concurrent and parallel environments. It defines which variable writes can be |
40 |
| -observed by a particular variable read, which is essential to be able to |
41 |
| -determine if a program is correct. It is achieved by defining what subset of |
42 |
| -all possible program execution orders is allowed. |
43 |
| - |
44 |
| -Memory model sources: |
45 |
| - |
46 |
| -- [Java memory model](http://www.cs.umd.edu/~pugh/java/memoryModel/), |
47 |
| - and its [FAQ](http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html) |
48 |
| -- [Java Memory Model Pragmatics](http://shipilev.net/blog/2014/jmm-pragmatics/) |
49 |
| -- [atomic<> Weapons 1](https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2) |
50 |
| -and |
51 |
| -[2](https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-2-of-2) |
52 |
| - |
53 |
| -Concurrent behavior sources of Ruby implementations: |
54 |
| - |
55 |
| -- Source code. |
56 |
| -- [JRuby's wiki page](https://github.com/jruby/jruby/wiki/Concurrency-in-jruby) |
57 |
| -- [Rubinius's wiki page](http://rubini.us/doc/en/systems/concurrency/) |
58 |
| - |
59 |
| -> A similar document for MRI was not found. The key fact about MRI is the GVL (Global |
60 |
| -> VM Lock), which ensures that only one thread can interpret Ruby code at any |
61 |
| -> given time. When the GVL is handed from one thread to another, a mutex is |
62 |
| -> released by the first and acquired by the second thread, implying that everything |
63 |
| -> done by the first thread is visible to the second thread. See |
64 |
| -> [thread_pthread.c](https://github.com/ruby/ruby/blob/ruby_2_2/thread_pthread.c#L101-L107) |
65 |
| -> and |
66 |
| -> [thread_win32.c](https://github.com/ruby/ruby/blob/ruby_2_2/thread_win32.c#L95-L100). |
67 |
| -
|
68 |
| -This memory model was created by: comparing |
69 |
| -[MRI](https://www.ruby-lang.org/en/), [JRuby](http://jruby.org/), |
70 |
| -[JRuby+Truffle](https://github.com/jruby/jruby/wiki/Truffle), |
71 |
| -[Rubinius](http://rubini.us/); taking into account limitations of the implementations |
72 |
| -or their platforms; and drawing inspiration from other existing memory models (Java, |
73 |
| -C++11). This is not a formal model. |
74 |
| - |
75 |
| -Key properties are: |
76 |
| - |
77 |
| -- **volatility (V)** - A written value is immediately visible to any |
78 |
| - subsequent volatile read of the same variable on any Thread. (It has the same |
79 |
| - meaning as in Java.) |
80 |
| -- **atomicity (A)** - Operation is either done or not as a whole. |
81 |
| -- **serialized (S)** - Operations are serialized in some order (they |
82 |
| - cannot disappear). This is a new property not mentioned in other memory |
83 |
| - models, since Java and C++ do not have dynamically defined fields. All |
84 |
| - operations in one row of the tables below are serialized with |
85 |
| - each other. |
86 |
| - |
87 |
| -### Core behavior: |
88 |
| - |
89 |
| -| Operation | V | A | S | Notes | |
90 |
| -|:----------|:-:|:-:|:-:|:-----| |
91 |
| -| local variable read/write/definition | - | x | x | Local variables are determined during parsing, they are not usually dynamically added (with exception of `local_variable_set`). Therefore definition is quite rare. | |
92 |
| -| instance variable read/write/(un)definition | - | x | x | Newly defined instance variables have to become visible eventually. | |
93 |
| -| class variable read/write/(un)definition | x | x | x || |
94 |
| -| global variable read/write/definition | x | x | x | un-define is not possible currently. | |
95 |
| -| constant variable read/write/(un)definition | x | x | x || |
96 |
| -| `Thread` local variable read/write/definition | - | x | x | un-define is not possible currently. | |
97 |
| -| `Fiber` local variable read/write/definition | - | x | x | un-define is not possible currently. | |
98 |
| -| method creation/redefinition/removal | x | x | x || |
99 |
| -| include/extend | x | x | x | If `AClass` includes `AModule`, `AClass` gets all `AModule`'s methods at once. | |
100 |
| - |
101 |
| - |
102 |
| -Notes: |
103 |
| - |
104 |
| -- Variable read reads value from preexisting variable. |
105 |
| -- Variable definition creates new variable (operation is serialized with |
106 |
| - writes, implies an update cannot be lost). |
107 |
| -- A Module or a Class definition is actually a constant definition. |
108 |
| - The definition is atomic, it assigns the Module or the Class to the |
109 |
| - constant, then its methods are defined atomically one by one. |
110 |
| -- `||=`, `+=`, etc. are actually two operations, a read and a write, which implies |
111 |
| - that they are not atomic operations. See volatile variables |
112 |
| - with compare-and-set. |
113 |
| -- Method invocation does not have any special properties; that includes |
114 |
| - object initialization. |
115 |
| - |
116 |
| -Current Implementation differences from the model: |
117 |
| - |
118 |
| -- MRI: everything is volatile. |
119 |
| -- JRuby: `Thread` and `Fiber` local variables are volatile. Instance |
120 |
| - variables are volatile on x86 and people may un/intentionally depend |
121 |
| - on the fact. |
122 |
| -- Class variables require investigation. |
123 |
| - |
124 |
| -> TODO: update with specific versions of the implementations. |
125 |
| -
|
126 |
| -### Threads |
127 |
| - |
128 |
| -> TODO: add description of `Thread.new`, `#join`, etc. |
129 |
| -
|
130 |
| -### Source loading: |
131 |
| - |
132 |
| -| Operation | V | A | S | Notes | |
133 |
| -|:----------|:-:|:-:|:-:|:-----| |
134 |
| -| requiring | x | x | x | File will not be required twice, classes and modules are still defined gradually. | |
135 |
| -| autoload | x | x | - | Only one autoload at a time for a given constant, others will be blocked until first triggered autoload is done. Different constants may be loaded concurrently. | |
136 |
| - |
137 |
| -Notes: |
138 |
| - |
139 |
| -- Beware of requiring and autoloading in concurrent programs, it's possible to |
140 |
| - see partially defined classes. Eager loading or blocking until class is |
141 |
| - fully loaded should be used to mitigate. |
142 |
| - |
143 |
| -### Core classes |
144 |
| - |
145 |
| -`Mutex`, `Monitor`, `Queue` have to work correctly on each implementation. Ruby |
146 |
| -implementation VMs should not crash when, for example, `Array` or `Hash` is used |
147 |
| -in a parallel environment, but they may lose updates or raise exceptions. (If |
148 |
| -`Array` or `Hash` were synchronized it would have too much overhead when used |
149 |
| -in a single thread.) |
150 |
| - |
151 |
| -> `concurrent-ruby` contains synchronized versions of `Array` and `Hash` and |
152 |
| -> other thread-safe data structures. |
153 |
| -
|
154 |
| -> TODO: This section needs more work: e.g. Thread.raise and similar is an open |
155 |
| -> issue, better not to be used. |
156 |
| -
|
157 |
| -### Standard libraries |
158 |
| - |
159 |
| -Standard libraries were written for MRI, so unless they are rewritten in a |
160 |
| -particular Ruby implementation they may contain hidden problems. Therefore it's |
161 |
| -better to assume that they are not safe. |
162 |
| - |
163 |
| -> TODO: This section needs more work. |
164 |
| -
|
165 |
| -# Extensions |
166 |
| - |
167 |
| -The above described memory model is quite weak, e.g. a thread-safe immutable |
168 |
| -object cannot be created. It requires final or volatile instance variables. |
169 |
| - |
170 |
| -## Final instance variable |
171 |
| - |
172 |
| -Objects inheriting from `Synchronization::Object` provide a way to ensure |
173 |
| -that all instance variables that are set only once in constructor (therefore |
174 |
| -effectively final) are safely published to all readers (assuming proper |
175 |
| -construction - object instance does not escape during construction). |
176 |
| - |
177 |
| -``` ruby |
178 |
| -class ImmutableTreeNode < Concurrent::Synchronization::Object |
179 |
| - # mark this class to publish final instance variables safely |
180 |
| - safe_initialization! |
181 |
| - |
182 |
| - def initialize(left, right) |
183 |
| - # Call super to allow proper initialization. |
184 |
| - super() |
185 |
| - # By convention final instance variables have CamelCase names |
186 |
| - # to distinguish them from ordinary instance variables. |
187 |
| - @Left = left |
188 |
| - @Right = right |
189 |
| - end |
190 |
| - |
191 |
| - # Define thread-safe readers. |
192 |
| - def left |
193 |
| - # No need to synchronize or otherwise protect, it's already |
194 |
| - # guaranteed to be visible. |
195 |
| - @Left |
196 |
| - end |
197 |
| - |
198 |
| - def right |
199 |
| - @Right |
200 |
| - end |
201 |
| -end |
202 |
| -``` |
203 |
| - |
204 |
| -Once `safe_initialization!` is called on a class it transitively applies to all |
205 |
| -its children. |
206 |
| - |
207 |
| -> It's implemented by adding `new`, when `safe_initialization!` is called, as |
208 |
| -> follows: |
209 |
| -> |
210 |
| -> ``` ruby |
211 |
| -> def self.new(*) |
212 |
| -> object = super |
213 |
| -> ensure |
214 |
| -> object.ensure_ivar_visibility! if object |
215 |
| -> end |
216 |
| -> ``` |
217 |
| -> |
218 |
| -> therefore `new` should not be overridden. |
219 |
| -
|
220 |
| -## Volatile instance variable |
221 |
| -
|
222 |
| -`Synchronization::Object` children can have volatile instance variables. A Ruby |
223 |
| -library cannot alter the meaning of the `@a_name` expression; therefore, when |
224 |
| -`attr_volatile :a_name` is called, declaring the instance variable named |
225 |
| -`a_name` to be volatile, it creates method accessors. |
226 |
| -
|
227 |
| -> However there is Ruby [issue](https://redmine.ruby-lang.org/issues/11539) |
228 |
| -> filed to address this. |
229 |
| -
|
230 |
| -``` ruby |
231 |
| -# Simple counter with cheap reads. |
232 |
| -class Counter < Concurrent::Synchronization::Object |
233 |
| - # Declare instance variable value to be volatile and its |
234 |
| - # reader and writer to be private. `attr_volatile` returns |
235 |
| - # names of created methods. |
236 |
| - private *attr_volatile(:value) |
237 |
| - safe_initialization! |
238 |
| -
|
239 |
| - def initialize(value) |
240 |
| - # Call super to allow proper initialization. |
241 |
| - super() |
242 |
| - # Create a reentrant lock instance held in final ivar |
243 |
| - # to be able to protect writer. |
244 |
| - @Lock = Concurrent::Synchronization::Lock.new |
245 |
| - # volatile write |
246 |
| - self.value = value |
247 |
| - end |
248 |
| -
|
249 |
| - # Very cheap reader of the Counter's current value, just a volatile read. |
250 |
| - def count |
251 |
| - # volatile read |
252 |
| - value |
253 |
| - end |
254 |
| -
|
255 |
| - # Safely increments the value without losing updates |
256 |
| - # (as it would happen with just += used). |
257 |
| - def increment(add) |
258 |
| - # Wrap the two volatile operations to make them atomic. |
259 |
| - @Lock.synchronize do |
260 |
| - # volatile write and read |
261 |
| - self.value = self.value + add |
262 |
| - end |
263 |
| - end |
264 |
| -end |
265 |
| -``` |
266 |
| -
|
267 |
| -> This is currently planned to be migrated to a module to be able to add |
268 |
| -> volatile fields to any object, not just `Synchronization::Object` children. The |
269 |
| -> instance variable itself is named `"@volatile_#{name}"` to distinguish it and |
270 |
| -> to prevent direct access by name. |
271 |
| -
|
272 |
| -## Volatile instance variable with compare-and-set |
273 |
| - |
274 |
| -Some concurrent abstractions may need to do compare-and-set on the volatile |
275 |
| -instance variables to avoid synchronization; in that case `attr_volatile_with_cas` is |
276 |
| -used. |
277 |
| - |
278 |
| -``` ruby |
279 |
| -# Simplified implementation of Clojure's Atom |
280 |
| -class Atom < Concurrent::Synchronization::Object |
281 |
| - safe_initialization! |
282 |
| - # Make all methods private |
283 |
| - private *attr_volatile_with_cas(:value) |
284 |
| - # with exception of reader |
285 |
| - public :value |
286 |
| - |
287 |
| - def initialize(value, validator = -> (v) { true }) |
288 |
| - # Call super to allow proper initialization. |
289 |
| - super() |
290 |
| - # volatile write |
291 |
| - self.value = value |
292 |
| - @Validator = validator |
293 |
| - end |
294 |
| - |
295 |
| - # Swaps in a new value computed from old_value by the given function |
296 |
| - # without using blocking synchronization. |
297 |
| - def swap(*args, &function) |
298 |
| - loop do |
299 |
| - old_value = self.value # volatile read |
300 |
| - begin |
301 |
| - # compute new value |
302 |
| - new_value = function.call(old_value, *args) |
303 |
| - # return old_value if validation fails |
304 |
| - break old_value unless valid?(new_value) |
305 |
| - # return new_value only if compare-and-set is successful |
306 |
| - # on value instance variable, otherwise repeat |
307 |
| - break new_value if compare_and_set_value(old_value, new_value) |
308 |
| - rescue |
309 |
| - break old_value |
310 |
| - end |
311 |
| - end |
312 |
| - end |
313 |
| - |
314 |
| - private |
315 |
| - |
316 |
| - def valid?(new_value) |
317 |
| - @Validator.call(new_value) rescue false |
318 |
| - end |
319 |
| -end |
320 |
| -``` |
321 |
| - |
322 |
| -`attr_volatile_with_cas` defines five methods for a given instance variable |
323 |
| -name. For name `value` they are: |
324 |
| - |
325 |
| -``` ruby |
326 |
| -self.value #=> the_value |
327 |
| -self.value=(new_value) #=> new_value |
328 |
| -self.swap_value(new_value) #=> old_value |
329 |
| -self.compare_and_set_value(expected, new_value) #=> true || false |
330 |
| -self.update_value(&function) #=> function.call(old_value) |
331 |
| -``` |
332 |
| - |
333 |
| -Three of them were used in the example above. |
| 3 | +[This document](https://docs.google.com/document/d/1pVzU8w_QF44YzUCCab990Q_WZOdhpKolCIHaiXG-sPw/edit?usp=sharing) |
| 4 | +has moved to Google Docs. It will be moved back here once it is final and stable. |
334 | 5 |
|
335 |
| -> The current implementation relies on final instance variables where an instance of |
336 |
| -> `AtomicReference` is held to provide compare-and-set operations. That creates |
337 |
| -> extra indirection, which is hoped to be removed over time when a better |
338 |
| -> implementation becomes available in Ruby implementations. The |
339 |
| -> instance variable itself is named `"@VolatileCas#{camelized name}"` to |
340 |
| -> distinguish it and to prevent direct access by name. |
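
The removed text notes that `||=`, `+=`, etc. are a read followed by a write and are therefore not atomic, and its `Counter` example wraps both steps in a lock. A minimal sketch of the same idea using only the standard `Mutex` and `Thread` classes is shown here. `LockedCounter` is a hypothetical name chosen for illustration and is not part of `concurrent-ruby`; unlike the removed example, reads also take the lock, because plain Ruby has no volatile instance variables.

```ruby
# Hypothetical illustration class; not part of concurrent-ruby.
class LockedCounter
  def initialize(value = 0)
    @value = value
    @lock  = Mutex.new
  end

  # Read under the lock so the latest write is visible on any implementation.
  def count
    @lock.synchronize { @value }
  end

  # The read and the write hidden inside `+=` run in one critical section,
  # so a concurrent increment cannot be lost.
  def increment(add = 1)
    @lock.synchronize { @value += add }
  end
end

counter = LockedCounter.new
threads = Array.new(4) do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)
puts counter.count # prints 4000
```

Without the `synchronize` block, implementations that run threads in parallel could interleave the read and the write and lose updates, which is the failure mode the removed notes warn about.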