@@ -1,69 +1,65 @@
 module ThreadSafe
   module Util
-    # A Ruby port of the Doug Lea's jsr166e.Striped64 class version 1.6 available in public domain.
-    # Original source code available here: http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/jsr166e/Striped64.java?revision=1.6
+    # A Ruby port of the Doug Lea's jsr166e.Striped64 class version 1.6
+    # available in public domain.
     #
-    # Class holding common representation and mechanics for classes supporting dynamic striping on 64bit values.
+    # Original source code available here:
+    # http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/jsr166e/Striped64.java?revision=1.6
     #
-    # This class maintains a lazily-initialized table of atomically
-    # updated variables, plus an extra +base+ field. The table size
-    # is a power of two. Indexing uses masked per-thread hash codes.
-    # Nearly all methods on this class are private, accessed directly
-    # by subclasses.
+    # Class holding common representation and mechanics for classes supporting
+    # dynamic striping on 64bit values.
     #
-    # Table entries are of class +Cell+; a variant of AtomicLong padded
-    # to reduce cache contention on most processors. Padding is
-    # overkill for most Atomics because they are usually irregularly
-    # scattered in memory and thus don't interfere much with each
-    # other. But Atomic objects residing in arrays will tend to be
-    # placed adjacent to each other, and so will most often share
-    # cache lines (with a huge negative performance impact) without
+    # This class maintains a lazily-initialized table of atomically updated
+    # variables, plus an extra +base+ field. The table size is a power of two.
+    # Indexing uses masked per-thread hash codes. Nearly all methods on this
+    # class are private, accessed directly by subclasses.
+    #
+    # Table entries are of class +Cell+; a variant of AtomicLong padded to
+    # reduce cache contention on most processors. Padding is overkill for most
+    # Atomics because they are usually irregularly scattered in memory and thus
+    # don't interfere much with each other. But Atomic objects residing in
+    # arrays will tend to be placed adjacent to each other, and so will most
+    # often share cache lines (with a huge negative performance impact) without
     # this precaution.
     #
-    # In part because +Cell+s are relatively large, we avoid creating
-    # them until they are needed. When there is no contention, all
-    # updates are made to the +base+ field. Upon first contention (a
-    # failed CAS on +base+ update), the table is initialized to size 2.
-    # The table size is doubled upon further contention until
-    # reaching the nearest power of two greater than or equal to the
-    # number of CPUS. Table slots remain empty (+nil+) until they are
+    # In part because +Cell+s are relatively large, we avoid creating them until
+    # they are needed. When there is no contention, all updates are made to the
+    # +base+ field. Upon first contention (a failed CAS on +base+ update), the
+    # table is initialized to size 2. The table size is doubled upon further
+    # contention until reaching the nearest power of two greater than or equal
+    # to the number of CPUS. Table slots remain empty (+nil+) until they are
     # needed.
     #
-    # A single spinlock (+busy+) is used for initializing and
-    # resizing the table, as well as populating slots with new +Cell+s.
-    # There is no need for a blocking lock: When the lock is not
-    # available, threads try other slots (or the base). During these
-    # retries, there is increased contention and reduced locality,
-    # which is still better than alternatives.
+    # A single spinlock (+busy+) is used for initializing and resizing the
+    # table, as well as populating slots with new +Cell+s. There is no need for
+    # a blocking lock: When the lock is not available, threads try other slots
+    # (or the base). During these retries, there is increased contention and
+    # reduced locality, which is still better than alternatives.
     #
-    # Per-thread hash codes are initialized to random values.
-    # Contention and/or table collisions are indicated by failed
-    # CASes when performing an update operation (see method
-    # +retry_update+). Upon a collision, if the table size is less than
-    # the capacity, it is doubled in size unless some other thread
-    # holds the lock. If a hashed slot is empty, and lock is
-    # available, a new +Cell+ is created. Otherwise, if the slot
-    # exists, a CAS is tried. Retries proceed by "double hashing",
-    # using a secondary hash (XorShift) to try to find a
-    # free slot.
+    # Per-thread hash codes are initialized to random values. Contention and/or
+    # table collisions are indicated by failed CASes when performing an update
+    # operation (see method +retry_update+). Upon a collision, if the table size
+    # is less than the capacity, it is doubled in size unless some other thread
+    # holds the lock. If a hashed slot is empty, and lock is available, a new
+    # +Cell+ is created. Otherwise, if the slot exists, a CAS is tried. Retries
+    # proceed by "double hashing", using a secondary hash (XorShift) to try to
+    # find a free slot.
     #
-    # The table size is capped because, when there are more threads
-    # than CPUs, supposing that each thread were bound to a CPU,
-    # there would exist a perfect hash function mapping threads to
-    # slots that eliminates collisions. When we reach capacity, we
-    # search for this mapping by randomly varying the hash codes of
-    # colliding threads. Because search is random, and collisions
-    # only become known via CAS failures, convergence can be slow,
-    # and because threads are typically not bound to CPUS forever,
-    # may not occur at all. However, despite these limitations,
-    # observed contention rates are typically low in these cases.
+    # The table size is capped because, when there are more threads than CPUs,
+    # supposing that each thread were bound to a CPU, there would exist a
+    # perfect hash function mapping threads to slots that eliminates collisions.
+    # When we reach capacity, we search for this mapping by randomly varying the
+    # hash codes of colliding threads. Because search is random, and collisions
+    # only become known via CAS failures, convergence can be slow, and because
+    # threads are typically not bound to CPUS forever, may not occur at all.
+    # However, despite these limitations, observed contention rates are
+    # typically low in these cases.
     #
-    # It is possible for a +Cell+ to become unused when threads that
-    # once hashed to it terminate, as well as in the case where
-    # doubling the table causes no thread to hash to it under
-    # expanded mask. We do not try to detect or remove such cells,
-    # under the assumption that for long-running instances, observed
-    # contention levels will recur, so the cells will eventually be
+    # It is possible for a +Cell+ to become unused when threads that once hashed
+    # to it terminate, as well as in the case where doubling the table causes no
+    # thread to hash to it under expanded mask. We do not try to detect or
+    # remove such cells, under the assumption that for long-running instances,
+    # observed contention levels will recur, so the cells will eventually be
     # needed again; and for short-lived ones, it does not matter.
     class Striped64
       # Padded variant of AtomicLong supporting only raw accesses plus CAS.
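
The comment above says indexing uses masked per-thread hash codes against a power-of-two table. A minimal plain-Ruby sketch of that masking (illustrative only, not the gem's code; the table size and the use of the thread's object_id as a stand-in hash are assumptions):

    # Because the table size is a power of two, ANDing the hash with
    # (size - 1) is equivalent to a modulo and picks a slot in one step.
    table_size = 8                          # always a power of two
    mask       = table_size - 1             # => 7 (0b111)
    hash_code  = Thread.current.object_id   # stand-in for a per-thread random hash
    slot       = hash_code & mask
    puts "this thread maps to slot #{slot} of #{table_size}"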
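Table entries are described as padded variants of AtomicLong so that adjacent cells in the array do not end up sharing cache lines. A rough sketch of the idea in plain Ruby (illustrative only; the dummy padding fields and the Mutex standing in for CAS are assumptions, and how much padding actually helps depends on the Ruby implementation's object layout):

    # A value holder with unused instance variables whose only purpose is to
    # enlarge the object so neighbouring cells are less likely to share a
    # cache line. CAS is emulated with a Mutex for the sake of the sketch.
    class PaddedCellSketch
      attr_reader :value

      def initialize(value)
        @value = value
        @lock  = Mutex.new
        @pad0 = @pad1 = @pad2 = @pad3 = @pad4 = @pad5 = @pad6 = 0 # padding only
      end

      # compare-and-set: succeed only if the current value equals +expected+
      def cas(expected, new_value)
        @lock.synchronize do
          return false unless @value == expected
          @value = new_value
          true
        end
      end
    end

    cell = PaddedCellSketch.new(0)
    cell.cas(0, 5)   # => true
    cell.cas(0, 9)   # => false, the value is already 5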
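The growth policy in the comment is: uncontended updates go to +base+; the first failed CAS creates a table of size 2, which then doubles under further contention up to the nearest power of two greater than or equal to the CPU count. A small sketch of just that sizing arithmetic (illustrative only; the CPU count is hard-coded here, whereas real code would detect it):

    ncpu     = 6          # stand-in for the detected number of CPUs
    capacity = 2          # first contention creates a table of size 2
    capacity <<= 1 while capacity < ncpu
    puts capacity         # => 8, the nearest power of two >= 6

    # Doubling stops once the table reaches +capacity+; beyond that point
    # colliding threads change their hash instead of growing the table.
    sizes = [2]
    sizes << sizes.last * 2 while sizes.last < capacity
    p sizes               # => [2, 4, 8]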
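The +busy+ field is described as a spinlock acquired via CAS with no blocking: a thread that cannot take it simply tries another slot or the base. A sketch of that non-blocking acquire in plain Ruby (illustrative only; Mutex#try_lock stands in for a CAS from 0 to 1):

    busy = Mutex.new

    if busy.try_lock            # CAS-style attempt, never blocks
      begin
        # owner's work: initialize/resize the table or install a new Cell
      ensure
        busy.unlock
      end
    else
      # lock unavailable: retry against another slot or fall back to +base+
    end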
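Retries proceed by "double hashing" with an XorShift step, so a colliding thread probes a different slot on its next attempt. A sketch using one common 32-bit XorShift variant (illustrative only; the 13/17/5 shift constants and the seed are assumptions, not necessarily what the gem's XorShift uses):

    # One XorShift round: cheap, stateless, and good enough to scatter a
    # colliding thread's hash to a different slot.
    def xorshift32(h)
      h ^= (h << 13) & 0xffffffff
      h ^=  h >> 17
      h ^= (h << 5)  & 0xffffffff
      h
    end

    hash = 0x9e3779b9                    # a nonzero per-thread seed
    mask = 7                             # table of size 8
    3.times do
      hash = xorshift32(hash)
      puts "retry probes slot #{hash & mask}"
    end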
@@ -85,8 +81,8 @@ def cas_computed

       extend Volatile
       attr_volatile :cells, # Table of cells. When non-null, size is a power of 2.
-                    :base,  # Base value, used mainly when there is no contention, but also as a fallback during table initialization races. Updated via CAS.
-                    :busy   # Spinlock (locked via CAS) used when resizing and/or creating Cells.
+        :base,  # Base value, used mainly when there is no contention, but also as a fallback during table initialization races. Updated via CAS.
+        :busy   # Spinlock (locked via CAS) used when resizing and/or creating Cells.

       alias_method :busy?, :busy
