@@ -59,6 +59,7 @@ is distributed under the [ISC license](LICENSE.md).
   - [A transactional LRU cache](#a-transactional-lru-cache)
   - [Programming with primitive operations](#programming-with-primitive-operations)
 - [Designing lock-free algorithms with k-CAS](#designing-lock-free-algorithms-with-k-cas)
+  - [Understand performance](#understand-performance)
   - [Minimize accesses](#minimize-accesses)
   - [Prefer compound accesses](#prefer-compound-accesses)
   - [Log updates optimistically](#log-updates-optimistically)
@@ -1103,6 +1104,39 @@ that it allows developing lock-free algorithms compositionally. In the following
 sections we discuss a number of basic tips and approaches for making best use of
 k-CAS.

+### Understand performance
+
+It is possible to convert imperative sequential data structures to lock-free
+data structures [almost](#beware-of-torn-reads) just by using
+[shared memory locations](https://ocaml-multicore.github.io/kcas/doc/kcas/Kcas/Loc/)
+and wrapping everything inside
+[transactions](https://ocaml-multicore.github.io/kcas/doc/kcas/Kcas/Xt/), but
+doing so will likely not lead to good performance.
+
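As an illustration of the mechanical conversion described in the paragraph above, here is a minimal sketch. It assumes the `kcas` library is installed and uses `Loc.make`, `Xt.get`, `Xt.set`, `Xt.modify`, and `Xt.commit`. The `stack` type and the `push` and `pop_opt` functions are hypothetical names chosen for this example and are not part of the library.

```ocaml
(* Requires the kcas library, e.g. `opam install kcas`. *)
open Kcas

(* Sequential original, for comparison:
     type 'a stack = { mutable items : 'a list; mutable size : int }
   The conversion replaces each mutable field with a shared memory location. *)
type 'a stack = { items : 'a list Loc.t; size : int Loc.t }

let create () = { items = Loc.make []; size = Loc.make 0 }

(* Every access goes through the transaction log [xt]. *)
let push ~xt s x =
  Xt.modify ~xt s.items (List.cons x);
  Xt.modify ~xt s.size succ

let pop_opt ~xt s =
  match Xt.get ~xt s.items with
  | [] -> None
  | x :: rest ->
    Xt.set ~xt s.items rest;
    Xt.modify ~xt s.size pred;
    Some x

let () =
  let s = create () in
  Xt.commit { tx = (fun ~xt -> push ~xt s 1) };
  Xt.commit { tx = (fun ~xt -> push ~xt s 2) };
  (* Both locations are updated atomically by each committed transaction. *)
  assert (Xt.commit { tx = (fun ~xt -> pop_opt ~xt s) } = Some 2);
  assert (Loc.get s.size = 1)
```

The sketch is correct, but every operation pays for the transaction log and for a multi-location update, which is the overhead this section warns about.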
+On the other hand, if you have a non-blocking data structure implemented using
+plain `Atomic`s, then by simply replacing `Atomic` with
+[`Loc`](https://ocaml-multicore.github.io/kcas/doc/kcas/Kcas/Loc/) you should
+get a data structure that works the same, but takes somewhat more memory and
+operates somewhat more slowly. However, adding transactional operations simply
+by wrapping all accesses of a non-blocking data structure implementation will
+likely not lead to well-performing transactional operations.
+
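To make the "replace `Atomic` with `Loc`" idea concrete, here is a minimal sketch that uses only the OCaml standard library. The `ATOMIC` signature, the `Make_stack` functor, and the `Atomic_stack` module are hypothetical names introduced for this example. The stack is a plain Treiber stack written against the small set of atomic operations it needs, so the atomic implementation can be swapped without touching the algorithm.

```ocaml
(* The subset of atomic operations the stack needs.
   [Stdlib.Atomic] matches this signature directly. *)
module type ATOMIC = sig
  type 'a t

  val make : 'a -> 'a t
  val get : 'a t -> 'a
  val compare_and_set : 'a t -> 'a -> 'a -> bool
end

(* A Treiber stack parameterized over the atomic implementation. *)
module Make_stack (A : ATOMIC) = struct
  type 'a t = 'a list A.t

  let create () : 'a t = A.make []

  (* Retry until the head is swapped from the value we read. *)
  let rec push s x =
    let before = A.get s in
    if not (A.compare_and_set s before (x :: before)) then push s x

  let rec pop_opt s =
    match A.get s with
    | [] -> None
    | x :: rest as before ->
      if A.compare_and_set s before rest then Some x else pop_opt s
end

(* Instantiated with plain atomics. Assumption: with kcas installed, a thin
   adapter module over [Kcas.Loc] exposing the same three operations could be
   passed instead; it is not shown because the exact signature of [Loc.make]
   may differ between kcas versions. *)
module Atomic_stack = Make_stack (Atomic)

let () =
  let s = Atomic_stack.create () in
  Atomic_stack.push s 1;
  Atomic_stack.push s 2;
  assert (Atomic_stack.pop_opt s = Some 2);
  assert (Atomic_stack.pop_opt s = Some 1);
  assert (Atomic_stack.pop_opt s = None)
```

The algorithm itself never changes, which is why such a swap should preserve behavior while adding only the per-location cost.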
+[Shared memory locations](https://ocaml-multicore.github.io/kcas/doc/kcas/Kcas/Loc/)
+take more memory than ordinary mutable fields or mutable references and mutating
+operations on shared memory locations allocate. The
+[transaction mechanism](https://ocaml-multicore.github.io/kcas/doc/kcas/Kcas/Xt/)
+also allocates and adds lookup overhead to accesses. Updating multiple locations
+in a transaction is more expensive than updating individual locations
+atomically. Contention can cause transactions to retry and perform poorly.
+
+With that said, it is possible to create composable and reasonably
+well-performing data structures using **kcas**. If a **kcas** based data
+structure is performing much worse than a similar lock-free or lock-based data
+structure, then there is likely room to improve. Doing so will require a good
+understanding of and careful attention to algorithmic details, such as which
+accesses need to be performed transactionally and which do not, the operation
+of the transaction mechanism, and the performance of individual low-level
+operations.
+
 ### Minimize accesses

 Accesses of shared memory locations inside transactions consult the transaction