Commit c268113

ch1bo authored and yveshauser committed
Sketch risks and mitigation chapter
Also drops AI-refined risk summaries from impact analysis in favor of two key threat scenarios.
1 parent d50d43c commit c268113

1 file changed: +21 -125 lines changed

docs/leios-design/README.md

Lines changed: 21 additions & 125 deletions
@@ -41,6 +41,9 @@ Ouroboros Leios is introducing _(high-)throughput_ as a third key property and i
 
 As it was the case for the Praos variant of Ouroboros (TODO: cite shelley network-design), the specification embodied in the published and peer-reviewed paper for Ouroboros Leios (TODO: cite leios paper) was not intended to be directly implementable. This was confirmed during initial R&D and feasibility studies, which identified several unsolved problems with the fully concurrent block production design proposed in the paper. The latest design presented in CIP-164, also known as "Linear Leios", focuses on the core idea of better utilizing resources in between the necessary "calm" periods of the Praos protocol and presents an immediately implementable design.
 
+> [!WARNING]
+> TODO: (re-)introduce the main protocol flow of Leios?
+
 > [!WARNING]
 > TODO: Notes on what could go here
 > - Node is a concurrent, reactive (real-time) system
@@ -609,138 +612,31 @@ Genesis (Ouroboros Genesis) enables nodes to bootstrap from the genesis block wi
 > [!WARNING]
 > TODO: Introduce chapter as being the bridge between changes and implementation plan; also, these are only selected aspects that inform the implementation (and not cover principal risks to the protocol or things that are avoided by design)
 
+> [!NOTE]
+> Alternative: Move this chapter between Introduction/Overview and Architecture/Changes? Understanding the key threats does not require intimate understanding of node-level components, but having the key threats enumerated allows us to reference them when discussing details in the architecture chapter.
+
 ## Key threats
 
 > [!WARNING]
-> TODO: Selection of key threats that further inform the design and/or implementation plan. Incorporate / reference the [threat model](./threat-model.md)?
-
-> [!CAUTION]
-> FIXME: The following content is AI generated and based on the CIP + impact analysis. Reduce number of "key threats"
-
-#### RSK-LeiosPraosContentionGC
-
-**Description:** Leios components allocating in the same GHC heap as Praos might increase GC pauses, delaying RB diffusion.
-
-**Impact:** HIGH - Could violate Praos $\Delta$ assumptions and compromise chain security.
-
-**Mitigation Strategies:**
-1. **Early validation via EXP-LeiosLedgerDbAnalyser:**
-   - Measure GC behavior for realistic EB transaction sequences
-   - Quantify mutator time and GC overhead
-   - Establish safe EB size limits
-
-2. **Process isolation (if needed):**
-   - Separate Leios validation into dedicated process
-   - UTxO-HD-like IPC for ledger state access
-   - Accept overhead cost for GC isolation
-
-3. **Monitoring and alerting:**
-   - Instrument GC statistics in node telemetry
-   - Alert on anomalous GC pause times
-   - Adaptive EB production throttling
-
-#### RSK-LeiosPraosContentionDiskBandwidth
-
-**Description:** Simultaneous Leios writes (EBs, votes, transactions) and Praos/Ledger operations might saturate disk I/O.
-
-**Impact:** MEDIUM-HIGH - Especially with UTxO-HD where ledger state is on disk.
-
-**Mitigation Strategies:**
-1. **Rate limiting:**
-   - Limit Leios disk write rate with back-pressure to network
-   - Priority I/O scheduling for Praos operations
-
-2. **Buffering and batching:**
-   - Memory buffer for EB writes before flushing
-   - Batch vote storage writes
-
-3. **Validation via EXP-LeiosDiffusionOnly:**
-   - Measure disk I/O under ATK-LeiosProtocolBurst
-   - Quantify impact on Praos operations
-   - Tune buffer sizes and rate limits
-
-#### RSK-LeiosLedgerOverheadLatency
-
-**Description:** Processing 15000% of a Praos block worth of transactions in bursts might introduce unexpected latency.
+> TODO: Selection of key threats and attacks that further inform the design and/or implementation plan. Incorporate / reference the full [threat model](../threat-model.md)
 
-**Impact:** MEDIUM - Affects vote timing and certificate generation.
+### Protocol bursts
 
-**Mitigation Strategies:**
-1. **Benchmarking via EXP-LeiosLedgerDbAnalyser:**
-   - Process realistic EB-sized transaction sequences
-   - Measure both full validation and reapplication
-   - Profile CPU and memory pressure
-
-2. **Implementation optimization:**
-   - Lazy evaluation where safe
-   - Transaction validation result caching
-   - Parallel validation of independent transactions (future)
-
-#### RSK-TimingViolation
-
-**Description:** Network conditions might violate timing assumptions ($\Delta_\text{EB}$, $\Delta_\text{RB}$, etc.).
-
-**Impact:** CRITICAL - Could compromise Praos security properties.
-
-**Mitigation Strategies:**
-1. **Conservative parameterization:**
-   - Use 95th percentile network measurements
-   - Add safety margins to all timing parameters
-   - Start with conservative values and tighten based on empirical data
-
-2. **Monitoring and detection:**
-   - Track diffusion times in telemetry
-   - Alert on timing violations
-   - Adaptive parameter adjustment (future)
-
-3. **Testnet validation:**
-   - Run adversarial scenarios on testnet
-   - Measure worst-case diffusion times
-   - Validate timing assumptions under load
-
-#### RSK-CertificateVulnerability
-
-**Description:** BLS signature scheme vulnerabilities or implementation bugs could compromise vote integrity.
-
-**Impact:** CRITICAL - Could enable invalid EB certification.
-
-**Mitigation Strategies:**
-1. **Formal verification:**
-   - Agda specification of certificate validation
-   - Property-based testing of BLS operations
-   - Cross-implementation test vectors
-
-2. **External audit:**
-   - Third-party cryptographic review
-   - Penetration testing of voting system
-   - Bug bounty program
-
-3. **Gradual rollout:**
-   - Enable voting but don't rely on certificates initially
-   - Parallel validation in early testnets
-   - Incremental trust in cryptographic components
-
-#### RSK-SPOAdoption
-
-**Description:** SPOs might not adopt Leios due to increased operational complexity or resource requirements.
-
-**Impact:** MEDIUM - Reduced decentralization or delayed rollout.
-
-**Mitigation Strategies:**
-1. **Clear documentation and tooling:**
-   - Step-by-step upgrade guides
-   - Automated BLS key generation and registration
-   - Monitoring dashboards for Leios-specific metrics
+> [!WARNING]
+> TODO: important because
+> - was a prominent case in research
+> - acknowledges the wealth of data to be processed
+> - motivates freshest-first delivery / prioritization between praos and leios traffic, and experiments/features revolving around this
+> - reference/include/move related RSK-.. items from impact analysis
 
-2. **Phased rollout:**
-   - Initial testnet deployment with early adopter SPOs
-   - Mainnet activation only after >80% testnet participation
-   - Fallback mechanisms if adoption is insufficient
+### Data withholding
 
-3. **Incentive alignment:**
-   - Ensure Leios improves SPO profitability at scale
-   - No penalties for non-participation in early phases
-   - Clear communication of long-term economic benefits
+> [!WARNING]
+> TODO: important because
+> - can be done from stake- and network-based attackers
+> - trivially impacts high-throughput because no certifications happening
+> - however, more advanced, potential avenue to attack blockchain safety (impact praos security argument) when carefully partitioning the network
+> - motivates validation of optimistic and worst-case diffusion paths
 
 ## Assumptions to validate early
 