@@ -113,7 +113,7 @@ references:
113113
114114# Introduction
115115
116- As part of the project to reduce `cardano-node`’s memory use[@utxo-db], by
116+ As part of the project to reduce `cardano-node`’s memory use [@utxo-db], by
117117storing the bulk of the ledger state on disk (colloquially known as
118118"UTxO-HD"[^1]), a high-performance disk backend has been developed as an
119119arm’s-length project by Well-Typed LLP on behalf of Intersect MBO[^2]. The
@@ -133,7 +133,7 @@ meets all its performance requirements, including stretch targets.
133133
134134[^2]: And previously on behalf of Input Output Global, Inc. (IOG).
135135
136- The backend is implemented as a Haskell library called `lsm-tree`[@lsm-tree],
136+ The backend is implemented as a Haskell library called `lsm-tree` [@lsm-tree],
137137which provides efficient on-disk key–value storage using log-structured
138138merge-trees, or LSM-trees for short. An LSM-tree is a data structure for
139139key–value mappings that is optimized for large tables with a high insertion
@@ -144,15 +144,15 @@ community as well.
144144
145145Currently, a UTxO-HD `cardano-node` already exists, but it is an MVP that uses
146146off-the-shelf database software (LMDB) to store a part of the ledger state on
147- disk[@utxo-db-api]. Though the LMDB-based solution is suitable for the current
147+ disk [@utxo-db-api]. Though the LMDB-based solution is suitable for the current
148148state of the Cardano blockchain, it is not suitable to achieve Cardano’s
149- long-term business requirements[@utxo-db, Section 3], such as high throughput
149+ long-term business requirements [@utxo-db, Section 3], such as high throughput
150150with limited system resources. The goal of `lsm-tree` is to pave the way for
151151achieving said business requirements, providing the necessary foundation on
152152which technologies like Ouroboros Leios can build.
153153
154154Prior to development, an analysis was conducted, leading to a comprehensive
155- requirements document[@utxo-db-lsm] outlining the functional and non-functional
155+ requirements document [@utxo-db-lsm] outlining the functional and non-functional
156156(performance) requirements for the `lsm-tree` library. The requirements document
157157includes recommendations for the disk backend to meet the following criteria:
158158
@@ -173,19 +173,19 @@ It should be noted that the requirements of the `lsm-tree` component were
173173specified in isolation from the consensus layer and `cardano-node`, but these
174174requirements were of course chosen with the larger system in mind. This report
175175only reviews the development of `lsm-tree` as a standalone component, while
176- integration notes are provided in an accompanying document[@integration-notes].
176+ integration notes are provided in an accompanying document [@integration-notes].
177177Integration of `lsm-tree` with the consensus layer will happen as a separate
178178phase of the UTxO-HD project.
179179
180180Readers are advised to familiarise themselves with the API of the library by
181181reading through the Haddock documentation of the public API. A version of the
182182Haddock documentation that tracks the `main` branch of the repository is hosted
183- using GitHub Pages[@lsm-tree-api-docs]. There are two modules that make up the
183+ using GitHub Pages [@lsm-tree-api-docs]. There are two modules that make up the
184184public API: the `Database.LSMTree` module contains the full-featured public API,
185185whereas the `Database.LSMTree.Simple` module offers a simplified version that is
186186aimed at new users and use cases that do not require advanced features.
187187Additional documentation can be found in the package
188- description[@lsm-tree-package-desc]. This and the simple module should be good
188+ description [@lsm-tree-package-desc]. This and the simple module should be good
189189places to start at before moving on to the full-featured module.
190190
191191The version of the library that is used as the basis for this report is tagged
@@ -270,15 +270,15 @@ for each completed feature.
270270
271271In the final stages, we reviewed and improved the public API, tests, benchmarks,
272272documentation and library packaging. We constructed the final deliverables, such
273- as this report and additional integration notes[@integration-notes], which
273+ as this report and additional integration notes [@integration-notes], which
274274should guide the integration of `lsm-tree` with the consensus layer. In
275275April 2025, we reached the final milestone.
276276
277277# Functional requirements
278278
279279This section outlines the functional requirements for the `lsm-tree` library and
280280how they are satisfied. These requirements are described in the original
281- requirements document[@utxo-db-lsm, Section 18.2].
281+ requirements document [@utxo-db-lsm, Section 18.2].
282282
283283Several requirements specify that an appropriate test should demonstrate the
284284desired functionality. Though the internals of the library are extensively
@@ -324,7 +324,7 @@ The tests are written in three styles:
324324> interface used by the existing consensus layer for its on-disk backends.
325325
326326For the analysis of this functional requirement, we use a fixed version of the
327- `ouroboros-consensus` repository[@ouroboros-consensus]. This version can be
327+ `ouroboros-consensus` repository [@ouroboros-consensus]. This version can be
328328checked out using the following commands:
329329
330330``` sh
@@ -334,7 +334,7 @@ git checkout 9d41590555954c511d5f81682ccf7bc963659708
334334```
335335
336336The consensus interface that has to be implemented using `lsm-tree` is given by
337- the `LedgerTablesHandle` record type[@ouroboros-consensus-LedgerTablesHandle].
337+ the `LedgerTablesHandle` record type [@ouroboros-consensus-LedgerTablesHandle].
338338This type provides an abstract view on the table storage, so that the rest of
339339the consensus layer does not have to concern itself with the concrete
340340implementation of that storage, be it based on `lsm-tree` or not;
@@ -348,7 +348,7 @@ However, this is considered out of scope for the current phase of the UTxO-HD
348348project.
349349
350350Currently, the consensus layer has one implementation of table storage for the
351- ledger, which stores all data in main memory[@ouroboros-consensus-InMemory].
351+ ledger, which stores all data in main memory [@ouroboros-consensus-InMemory].
352352This implementation preserves much of the behaviour of a pre-UTxO-HD node. A
353353closer look at it shows that there are two pieces of implementation-specific
354354functionality that are not covered by the `LedgerTablesHandle` record: creating
@@ -397,7 +397,7 @@ accounted for it by using `Maybe Int` as the return type of `tablesSize`.
397397
398398The analysis above offers a simplified view on how the `lsm-tree` and consensus
399399interfaces fit together; so this report is accompanied by integration
400- notes[@integration-notes] that provide further guidance. These notes include,
400+ notes [@integration-notes] that provide further guidance. These notes include,
401401for example, an explanation of the need to store a session context in the ledger
402402database. However, implementation details like these are not considered to be
403403blockers for the integration efforts, as there are clear paths forward.
@@ -416,7 +416,7 @@ We generally advise to prefer the bulk operations over the elementary ones. On
416416Linux systems, lookups in particular will better utilise the storage bandwidth
417417when the bulk version is used, especially in a concurrent setting. This is due
418418to the method used to perform batches of I/O, which employs the `blockio-uring`
419- package[@blockio-uring]: submitting many batches of I/O concurrently will lead
419+ package [@blockio-uring]: submitting many batches of I/O concurrently will lead
420420to many I/O requests being in flight at once, so that the SSD bandwidth can be
421421saturated. This is particularly relevant for the consensus layer, which will
422422have to employ concurrent batching to meet higher performance targets, for
@@ -436,12 +436,12 @@ about as expensive as if the blobs’ contents were included in the values.
436436A naive implementation of updates entails latency spikes due to table merging,
437437but the `lsm-tree` library can avoid such spikes by spreading out I/O over time,
438438using an incremental merge algorithm: the algorithm that we prototyped at the
439- start of the `lsm-tree` project[@lsm-tree-prototype]. Avoiding latency spikes is
440- essential for `cardano-node` because `cardano-node` is a real-time system, which
441- has to respond to input promptly. The use of the incremental merge algorithm
442- does not improve the time complexity of updates as such, but it turns the
443- *amortised* time complexity of the naive solution into a *worst-case* time
444- complexity.
439+ start of the `lsm-tree` project [@lsm-tree-prototype]. Avoiding latency spikes
440+ is essential for `cardano-node` because `cardano-node` is a real-time system,
441+ which has to respond to input promptly. The use of the incremental merge
442+ algorithm does not improve the time complexity of updates as such, but it turns
443+ the *amortised* time complexity of the naive solution into a *worst-case* time
444+ complexity.
445445
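The difference between amortised and worst-case cost can be illustrated with a toy model. The sketch below is not the `lsm-tree` algorithm (which operates on on-disk runs and has its own scheduling); it only assumes two sorted in-memory lists and a hypothetical `IncrementalMerge` class that performs a bounded amount of merge work each time an update arrives.

```python
class IncrementalMerge:
    """Toy model: merge two sorted lists a bounded number of steps at a time.

    Hypothetical illustration only; names and structure are assumptions,
    not part of the lsm-tree API.
    """

    def __init__(self, left, right):
        self.left, self.right = left, right
        self.i = self.j = 0  # read positions in the two input runs
        self.out = []        # merged output produced so far

    @property
    def done(self):
        return self.i == len(self.left) and self.j == len(self.right)

    def step(self, budget):
        """Move at most `budget` elements to the output; return work done."""
        work = 0
        while work < budget and not self.done:
            take_left = self.j == len(self.right) or (
                self.i < len(self.left)
                and self.left[self.i] <= self.right[self.j]
            )
            if take_left:
                self.out.append(self.left[self.i])
                self.i += 1
            else:
                self.out.append(self.right[self.j])
                self.j += 1
            work += 1
        return work


left = list(range(0, 16, 2))   # run with the even keys 0..14
right = list(range(1, 16, 2))  # run with the odd keys 1..15

# Naive approach: one update pays for the whole merge (a latency spike).
naive = IncrementalMerge(left, right)
naive_spike = naive.step(budget=len(left) + len(right))

# Incremental approach: each of the next 8 updates pays for 2 merge steps.
merge = IncrementalMerge(left, right)
per_update_work = [merge.step(budget=2) for _ in range(8)]
```

In this model both approaches perform the same total amount of merge work, but the incremental variant caps the work attached to any single update at the chosen budget, which is the sense in which an amortised bound becomes a worst-case bound.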
446446## Requirement 3
447447
@@ -646,7 +646,7 @@ simulation provided by the `fs-sim` package.
646646We made some smaller changes to `fs-api` and `fs-sim` to facilitate the
647647development of `lsm-tree`. Furthermore, we created an extension to `HasFS`
648648called `HasBlockIO`. It captures both the submission of batches of I/O, for
649- example using `blockio-uring`[@blockio-uring], and some functionality unrelated
649+ example using `blockio-uring` [@blockio-uring], and some functionality unrelated
650650to batching that is nonetheless useful for `lsm-tree`. The latter could
651651eventually be included in `fs-api` and `fs-sim`.
652652
@@ -719,7 +719,7 @@ exception-safe, then this machinery should help achieve this goal.
719719
720720This section outlines the performance requirements for the `lsm-tree` library
721721and how they are satisfied. These requirements are described in the original
722- requirements document[@utxo-db-lsm, Section 18.3] and are reproduced in full
722+ requirements document [@utxo-db-lsm, Section 18.3] and are reproduced in full
723723[below](#the-requirements-in-detail).
724724
725725The summary of our performance results is that we can meet all the targets: