
Commit 1ab6c12

Merge #16986: doc: Doxygen-friendly CuckooCache comments
7aad3b6 doc: Doxygen-friendly CuckooCache comments (Jon Layton)

Pull request description:

  Similar theme to #16947.

  - `invalid`, `contains` now appear in Doxygen docs
  - `setup` refers to correct argument name `b`
  - Argument references in `code blocks`
  - Lists markdown conformant, uniform line endings

  Tested with `make docs`

ACKs for top commit:
  laanwj:
    ACK 7aad3b6
  practicalswift:
    ACK 7aad3b6

Tree-SHA512: 70b38c10e534bad9c6ffcd88cc7a4797644afba5956d47a6c7cc655fcd5857a91f315d6da60e28ce9678d420ed4a51e22267eb8b89e26002b99cad63373dd349
2 parents badca85 + 7aad3b6 commit 1ab6c12

File tree

2 files changed: +55 −54 lines changed


src/cuckoocache.h

Lines changed: 50 additions & 49 deletions
@@ -14,42 +14,40 @@
 #include <vector>
 
 
-/** namespace CuckooCache provides high performance cache primitives
+/** High-performance cache primitives.
  *
  * Summary:
  *
- * 1) bit_packed_atomic_flags is bit-packed atomic flags for garbage collection
+ * 1. @ref bit_packed_atomic_flags is bit-packed atomic flags for garbage collection
  *
- * 2) cache is a cache which is performant in memory usage and lookup speed. It
- * is lockfree for erase operations. Elements are lazily erased on the next
- * insert.
+ * 2. @ref cache is a cache which is performant in memory usage and lookup speed. It
+ * is lockfree for erase operations. Elements are lazily erased on the next insert.
  */
 namespace CuckooCache
 {
-/** bit_packed_atomic_flags implements a container for garbage collection flags
+/** @ref bit_packed_atomic_flags implements a container for garbage collection flags
  * that is only thread unsafe on calls to setup. This class bit-packs collection
  * flags for memory efficiency.
  *
- * All operations are std::memory_order_relaxed so external mechanisms must
+ * All operations are `std::memory_order_relaxed` so external mechanisms must
  * ensure that writes and reads are properly synchronized.
  *
- * On setup(n), all bits up to n are marked as collected.
+ * On setup(n), all bits up to `n` are marked as collected.
  *
  * Under the hood, because it is an 8-bit type, it makes sense to use a multiple
  * of 8 for setup, but it will be safe if that is not the case as well.
- *
  */
 class bit_packed_atomic_flags
 {
     std::unique_ptr<std::atomic<uint8_t>[]> mem;
 
 public:
-    /** No default constructor as there must be some size */
+    /** No default constructor, as there must be some size. */
     bit_packed_atomic_flags() = delete;
 
     /**
      * bit_packed_atomic_flags constructor creates memory to sufficiently
-     * keep track of garbage collection information for size entries.
+     * keep track of garbage collection information for `size` entries.
      *
      * @param size the number of elements to allocate space for
      *
@@ -68,7 +66,7 @@ class bit_packed_atomic_flags
     };
 
     /** setup marks all entries and ensures that bit_packed_atomic_flags can store
-     * at least size entries
+     * at least `b` entries.
      *
      * @param b the number of elements to allocate space for
      * @post bit_set, bit_unset, and bit_is_set function properly forall x. x <
@@ -84,19 +82,18 @@ class bit_packed_atomic_flags
 
     /** bit_set sets an entry as discardable.
      *
-     * @param s the index of the entry to bit_set.
+     * @param s the index of the entry to bit_set
      * @post immediately subsequent call (assuming proper external memory
      * ordering) to bit_is_set(s) == true.
-     *
      */
     inline void bit_set(uint32_t s)
     {
         mem[s >> 3].fetch_or(1 << (s & 7), std::memory_order_relaxed);
     }
 
-    /** bit_unset marks an entry as something that should not be overwritten
+    /** bit_unset marks an entry as something that should not be overwritten.
      *
-     * @param s the index of the entry to bit_unset.
+     * @param s the index of the entry to bit_unset
      * @post immediately subsequent call (assuming proper external memory
      * ordering) to bit_is_set(s) == false.
      */
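Aside: a minimal standalone sketch (illustrative only, not part of this patch; the names are made up) of the bit-packing arithmetic these comments describe — flag `s` lives in byte `s >> 3` at bit `s & 7`, so `(size + 7) / 8` bytes are enough for `size` flags:

```cpp
#include <cstdint>
#include <cstdio>

int main()
{
    uint32_t s = 13;                         // flag index
    uint32_t byte_index = s >> 3;            // 13 / 8  == 1
    unsigned bit_mask = 1u << (s & 7);       // 1 << 5  == 0x20
    uint32_t size = 100;                     // number of flags tracked
    uint32_t bytes_needed = (size + 7) / 8;  // 13 bytes, rounded up to whole bytes
    std::printf("flag %u -> byte %u, mask 0x%02x; %u flags need %u bytes\n",
                s, byte_index, bit_mask, size, bytes_needed);
    return 0;
}
```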
@@ -105,26 +102,26 @@ class bit_packed_atomic_flags
         mem[s >> 3].fetch_and(~(1 << (s & 7)), std::memory_order_relaxed);
     }
 
-    /** bit_is_set queries the table for discardability at s
+    /** bit_is_set queries the table for discardability at `s`.
      *
-     * @param s the index of the entry to read.
-     * @returns if the bit at index s was set.
+     * @param s the index of the entry to read
+     * @returns true if the bit at index `s` was set, false otherwise
      * */
     inline bool bit_is_set(uint32_t s) const
     {
         return (1 << (s & 7)) & mem[s >> 3].load(std::memory_order_relaxed);
     }
 };
 
-/** cache implements a cache with properties similar to a cuckoo-set
+/** @ref cache implements a cache with properties similar to a cuckoo-set.
  *
  *  The cache is able to hold up to `(~(uint32_t)0) - 1` elements.
  *
  *  Read Operations:
- *      - contains(*, false)
+ *      - contains() for `erase=false`
  *
  *  Read+Erase Operations:
- *      - contains(*, true)
+ *      - contains() for `erase=true`
  *
  *  Erase Operations:
  *      - allow_erase()
@@ -141,10 +138,10 @@ class bit_packed_atomic_flags
  *
  *  User Must Guarantee:
  *
- *  1) Write Requires synchronized access (e.g., a lock)
- *  2) Read Requires no concurrent Write, synchronized with the last insert.
- *  3) Erase requires no concurrent Write, synchronized with last insert.
- *  4) An Erase caller must release all memory before allowing a new Writer.
+ *  1. Write requires synchronized access (e.g. a lock)
+ *  2. Read requires no concurrent Write, synchronized with last insert.
+ *  3. Erase requires no concurrent Write, synchronized with last insert.
+ *  4. An Erase caller must release all memory before allowing a new Writer.
  *
  *
  *  Note on function names:
@@ -177,7 +174,7 @@ class cache
     mutable std::vector<bool> epoch_flags;
 
     /** epoch_heuristic_counter is used to determine when an epoch might be aged
-     * & an expensive scan should be done.  epoch_heuristic_counter is
+     * & an expensive scan should be done. epoch_heuristic_counter is
      * decremented on insert and reset to the new number of inserts which would
      * cause the epoch to reach epoch_size when it reaches zero.
      */
@@ -194,24 +191,25 @@ class cache
     uint32_t epoch_size;
 
     /** depth_limit determines how many elements insert should try to replace.
-     * Should be set to log2(n)*/
+     * Should be set to log2(n).
+     */
     uint8_t depth_limit;
 
     /** hash_function is a const instance of the hash function. It cannot be
      * static or initialized at call time as it may have internal state (such as
      * a nonce).
-     * */
+     */
     const Hash hash_function;
 
     /** compute_hashes is convenience for not having to write out this
      * expression everywhere we use the hash values of an Element.
      *
      * We need to map the 32-bit input hash onto a hash bucket in a range [0, size) in a
-     * manner which preserves as much of the hash's uniformity as possible.  Ideally
+     * manner which preserves as much of the hash's uniformity as possible. Ideally
      * this would be done by bitmasking but the size is usually not a power of two.
      *
      * The naive approach would be to use a mod -- which isn't perfectly uniform but so
-     * long as the hash is much larger than size it is not that bad.  Unfortunately,
+     * long as the hash is much larger than size it is not that bad. Unfortunately,
      * mod/division is fairly slow on ordinary microprocessors (e.g. 90-ish cycles on
      * haswell, ARM doesn't even have an instruction for it.); when the divisor is a
      * constant the compiler will do clever tricks to turn it into a multiply+add+shift,
@@ -223,10 +221,10 @@ class cache
      * somewhat complicated and the result is still slower than other options:
      *
      * Instead we treat the 32-bit random number as a Q32 fixed-point number in the range
-     * [0,1) and simply multiply it by the size. Then we just shift the result down by
-     * 32-bits to get our bucket number.  The result has non-uniformity the same as a
+     * [0, 1) and simply multiply it by the size. Then we just shift the result down by
+     * 32-bits to get our bucket number. The result has non-uniformity the same as a
      * mod, but it is much faster to compute. More about this technique can be found at
-     * http://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
+     * http://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/ .
      *
      * The resulting non-uniformity is also more equally distributed which would be
      * advantageous for something like linear probing, though it shouldn't matter
@@ -237,8 +235,8 @@ class cache
      * 32*32->64 multiply, which means the operation is reasonably fast even on a
      * typical 32-bit processor.
      *
-     * @param e the element whose hashes will be returned
-     * @returns std::array<uint32_t, 8> of deterministic hashes derived from e
+     * @param e The element whose hashes will be returned
+     * @returns Deterministic hashes derived from `e` uniformly mapped onto the range [0, size)
      */
     inline std::array<uint32_t, 8> compute_hashes(const Element& e) const
     {
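For reference, a minimal standalone sketch (not from this patch; `fast_range` is an illustrative name) of the fixed-point bucket mapping that the compute_hashes comment above describes:

```cpp
#include <cstdint>

// Treat the 32-bit hash as a Q32 fraction in [0, 1) and scale it by the
// table size: the multiply-and-shift alternative to `hash % size`.
inline uint32_t fast_range(uint32_t hash, uint32_t size)
{
    return static_cast<uint32_t>((static_cast<uint64_t>(hash) * static_cast<uint64_t>(size)) >> 32);
}

// e.g. fast_range(0x80000000u, 1000) == 500: a hash of "one half" lands in
// the middle bucket, and the result is always strictly less than size.
```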
@@ -252,14 +250,14 @@ class cache
                 (uint32_t)(((uint64_t)hash_function.template operator()<7>(e) * (uint64_t)size) >> 32)}};
     }
 
-    /* end
-     * @returns a constexpr index that can never be inserted to */
+    /** invalid returns a special index that can never be inserted to
+     * @returns the special constexpr index that can never be inserted to */
     constexpr uint32_t invalid() const
     {
         return ~(uint32_t)0;
     }
 
-    /** allow_erase marks the element at index n as discardable. Threadsafe
+    /** allow_erase marks the element at index `n` as discardable. Threadsafe
      * without any concurrent insert.
      * @param n the index to allow erasure of
      */
@@ -268,7 +266,7 @@ class cache
         collection_flags.bit_set(n);
     }
 
-    /** please_keep marks the element at index n as an entry that should be kept.
+    /** please_keep marks the element at index `n` as an entry that should be kept.
      * Threadsafe without any concurrent insert.
      * @param n the index to prioritize keeping
      */
@@ -336,7 +334,7 @@ class cache
      *
      * @param new_size the desired number of elements to store
      * @returns the maximum number of elements storable
-     **/
+     */
     uint32_t setup(uint32_t new_size)
     {
         // depth_limit must be at least one otherwise errors can occur.
@@ -360,7 +358,7 @@ class cache
      * negligible compared to the size of the elements.
      *
      * @param bytes the approximate number of bytes to use for this data
-     * structure.
+     * structure
      * @returns the maximum number of elements storable (see setup()
      * documentation for more detail)
      */
@@ -376,18 +374,19 @@ class cache
      * It drops the last tried element if it runs out of depth before
      * encountering an open slot.
      *
-     * Thus
+     * Thus:
      *
+     * ```
      * insert(x);
      * return contains(x, false);
+     * ```
      *
      * is not guaranteed to return true.
      *
      * @param e the element to insert
      * @post one of the following: All previously inserted elements and e are
      * now in the table, one previously inserted element is evicted from the
      * table, the entry attempted to be inserted is evicted.
-     *
      */
     inline void insert(Element e)
     {
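A hedged usage sketch of the insert()/contains() contract documented above (not from this patch; `ToyHasher` is a hypothetical stand-in for a real 8-way hasher such as the one the signature cache supplies):

```cpp
#include <cuckoocache.h>
#include <cstdint>

// Hypothetical hasher: the Hash type must expose a templated
// operator()<0..7>() that returns one uint32_t per hash function.
struct ToyHasher {
    template <uint8_t hash_select>
    uint32_t operator()(const uint64_t& e) const
    {
        // Deliberately crude; only here to keep the sketch self-contained.
        return static_cast<uint32_t>((e * 0x9E3779B97F4A7C15ULL) >> hash_select) ^ hash_select;
    }
};

void example()
{
    CuckooCache::cache<uint64_t, ToyHasher> c;
    c.setup_bytes(1 << 20); // reserve roughly 1 MiB of backing storage
    c.insert(42);
    // As the comment above warns, this is *not* guaranteed to be true:
    // insert() may drop an element (possibly 42 itself) when it runs
    // out of cuckoo depth.
    bool maybe_hit = c.contains(42, /* erase= */ false);
    (void)maybe_hit;
}
```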
@@ -416,9 +415,9 @@
             /** Swap with the element at the location that was
               * not the last one looked at. Example:
               *
-              * 1) On first iteration, last_loc == invalid(), find returns last, so
+              * 1. On first iteration, last_loc == invalid(), find returns last, so
               *    last_loc defaults to locs[0].
-              * 2) On further iterations, where last_loc == locs[k], last_loc will
+              * 2. On further iterations, where last_loc == locs[k], last_loc will
               *    go to locs[k+1 % 8], i.e., next of the 8 indices wrapping around
               *    to 0 if needed.
               *
@@ -439,17 +438,19 @@
         }
     }
 
-    /* contains iterates through the hash locations for a given element
+    /** contains iterates through the hash locations for a given element
      * and checks to see if it is present.
      *
      * contains does not check garbage collected state (in other words,
      * garbage is only collected when the space is needed), so:
      *
+     * ```
      * insert(x);
      * if (contains(x, true))
      *     return contains(x, false);
      * else
      *     return true;
+     * ```
      *
      * executed on a single thread will always return true!
      *
@@ -458,7 +459,7 @@
      * contains returns a bool set true if the element was found.
      *
      * @param e the element to check
-     * @param erase
+     * @param erase whether to attempt setting the garbage collect flag
      *
      * @post if erase is true and the element is found, then the garbage collect
      * flag is set
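And a small follow-on sketch of the `erase` flag semantics described in the contains() documentation (again illustrative; `ToyHasher` is the hypothetical hasher from the sketch further up):

```cpp
// "Check and consume": erase=true only sets the garbage-collect flag;
// the entry is physically reclaimed by a later insert that needs the slot.
bool check_and_consume(CuckooCache::cache<uint64_t, ToyHasher>& c, uint64_t e)
{
    return c.contains(e, /* erase= */ true);
}
```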

src/test/cuckoocache_tests.cpp

Lines changed: 5 additions & 5 deletions
@@ -10,11 +10,11 @@
 
 /** Test Suite for CuckooCache
  *
- *  1) All tests should have a deterministic result (using insecure rand
+ *  1. All tests should have a deterministic result (using insecure rand
  *  with deterministic seeds)
- *  2) Some test methods are templated to allow for easier testing
+ *  2. Some test methods are templated to allow for easier testing
  *  against new versions / comparing
- *  3) Results should be treated as a regression test, i.e., did the behavior
+ *  3. Results should be treated as a regression test, i.e., did the behavior
  *  change significantly from what was expected. This can be OK, depending on
  *  the nature of the change, but requires updating the tests to reflect the new
  *  expected behavior. For example improving the hit rate may cause some tests
@@ -82,9 +82,9 @@ static double test_cache(size_t megabytes, double load)
  *
  * Examples:
  *
- * 1) at load 0.5, we expect a perfect hit rate, so we multiply by
+ * 1. at load 0.5, we expect a perfect hit rate, so we multiply by
  * 1.0
- * 2) at load 2.0, we expect to see half the entries, so a perfect hit rate
+ * 2. at load 2.0, we expect to see half the entries, so a perfect hit rate
  * would be 0.5. Therefore, if we see a hit rate of 0.4, 0.4*2.0 = 0.8 is the
  * normalized hit rate.
  *
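A small illustrative helper (not from this patch; the name is hypothetical, though it mirrors the normalization the test comment describes) for the hit-rate scaling above:

```cpp
#include <algorithm>

// At load > 1.0 the cache can only ever hold 1/load of what was inserted,
// so the observed hit rate is scaled up by the load factor; at load <= 1.0
// it is left unchanged.  e.g. normalized_hit_rate(0.4, 2.0) == 0.8
double normalized_hit_rate(double observed_hit_rate, double load)
{
    return observed_hit_rate * std::max(load, 1.0);
}
```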
