diff --git a/website/docs/developers/future-work.md b/website/docs/developers/future-work.md new file mode 100644 index 000000000..c12815a6b --- /dev/null +++ b/website/docs/developers/future-work.md @@ -0,0 +1,84 @@ +--- +sidebar_position: 1 +title: Overview +description: + Overview of planned enhancements and design proposals for the Mina Rust node +slug: /developers/future-work +--- + +# Future Work + +This section outlines planned enhancements and design proposals for the Mina +Rust node that are not yet implemented but are critical for the project's +continued development and mainnet readiness. + +## Overview + +The Mina Rust node is an active development project with several key areas +identified for future enhancement. These proposals represent thoughtful designs +for addressing current limitations and preparing the node for production +deployment at scale. + +## Key Areas + +### P2P Networking Evolution + +The [P2P Evolution Plan](p2p-evolution) builds on the successful pull-based +design already implemented for webnodes. Key planned improvements include: + +- **QUIC Transport Integration**: Adding QUIC as an alternative transport to + WebRTC for improved performance and reduced complexity +- **Block Propagation Optimization**: Reducing bandwidth usage by sending only + block headers and missing data +- **OCaml Node Integration**: Potential unification of networking layers across + Mina implementations + +### Persistent Storage + +The [Persistence Design](persistence-design) addresses one of the most critical +requirements for mainnet readiness. 
Current challenges include: + +- **Memory Usage**: The entire ledger is currently kept in memory, creating + scalability issues +- **Startup Time**: Nodes must reconstruct the full ledger, which is + time-consuming +- **SNARK Verification**: Redundant verification work across blocks and pools + +## Implementation Status + +**Important**: All items in this section are design proposals only and are not +yet implemented. These documents serve as: + +- **Technical Specifications**: Detailed designs for future implementation +- **Discussion Starting Points**: Basis for technical discussions and + refinements +- **Roadmap Guidance**: Priority areas for development effort + +## Contribution + +These design proposals benefit from community review and input. Developers +interested in contributing to these areas should: + +1. **Review the designs** thoroughly to understand the proposed approaches +2. **Check the + [project dashboard](https://github.com/orgs/o1-labs/projects/24/)** for + current development status and active work items +3. **Provide feedback** on technical feasibility and implementation details +4. **Participate in discussions** about priorities and trade-offs +5. **Contribute to implementation** when development begins + +The designs may evolve based on community feedback, technical constraints, and +changing requirements as the Mina ecosystem develops. + +## Timeline Considerations + +While these documents provide detailed implementation plans, actual development +timelines depend on: + +- **Resource allocation** and team priorities +- **Community needs** and feedback +- **Technical dependencies** and prerequisites +- **Coordination requirements** with other Mina implementations + +The goal is to ensure these enhancements are implemented thoughtfully and with +proper consideration for the broader Mina ecosystem. 
diff --git a/website/docs/developers/getting-started.mdx b/website/docs/developers/getting-started.mdx index ecdfddced..580451256 100644 --- a/website/docs/developers/getting-started.mdx +++ b/website/docs/developers/getting-started.mdx @@ -23,6 +23,12 @@ import NixDevelop from "!!raw-loader!./scripts/setup/nix-develop.sh"; Welcome to Mina Rust Node development! This guide will help you set up your development environment and build the Mina Rust Node from source. +This documentation covers the current implementation, architecture, and +networking approaches. The [**Future Work**](future-work) section outlines +planned enhancements and design proposals for upcoming features like persistent +storage and P2P protocol evolution that are not yet implemented but are critical +for the node's continued development and mainnet readiness. + ## Prerequisites ### System Requirements diff --git a/website/docs/developers/p2p-evolution.md b/website/docs/developers/p2p-evolution.md new file mode 100644 index 000000000..00e86a3d9 --- /dev/null +++ b/website/docs/developers/p2p-evolution.md @@ -0,0 +1,293 @@ +--- +sidebar_position: 5 +title: P2P Evolution Plan +description: Evolution plan for Mina's P2P networking layer +slug: /developers/p2p-evolution +--- + +# P2P Layer Evolution Plan + +This document outlines the evolution plan for Mina's P2P networking layer, +building on the successful pull-based design already implemented for the Mina +Rust node webnodes. The idea of using QUIC as a transport was originally +proposed by George in his "Networking layer 2.0" document. + +**Status**: The pull-based P2P protocol is implemented and operational. This +document proposes enhancements including QUIC transport, block propagation +optimizations, and integration with the OCaml node to create a unified +networking layer across all Mina implementations. Coordination with OCaml Mina +team required for ecosystem-wide adoption. 
+
+## Current State
+
+### The Problem: Divergent P2P Architectures
+
+The Mina ecosystem currently has divergent P2P implementations:
+
+1. **Mina (OCaml) nodes**
+   - Use libp2p exclusively via an external Go helper process (`libp2p_helper`)
+   - Push-based GossipSub protocol
+   - Known weaknesses in network performance and scalability
+
+2. **The Mina Rust node**
+   - Supports both libp2p (for OCaml compatibility) and pull-based WebRTC
+   - Must internally normalize between push and pull models, adding complexity
+   - Webnodes use WebRTC exclusively and require Rust nodes as bridges to the
+     libp2p network
+   - Carries the maintenance burden of supporting two different protocol designs
+
+This creates significant complexity:
+
+- The Mina Rust node maintains two protocol implementations
+- Webnodes cannot directly communicate with OCaml nodes
+- Different security and performance characteristics
+- Inconsistent behavior and debugging challenges
+
+## Vision: Unified Pull-Based P2P Layer
+
+The goal is to evolve the Mina Rust node's pull-based P2P design to improve
+webnode networking immediately and potentially become the universal networking
+layer for all Mina nodes (both Rust and OCaml), with multiple transport options.
+Full ecosystem adoption would require coordination and agreement with the OCaml
+Mina team.
+
+### Core Design Principles
+
+The pull-based model addresses fundamental problems in traditional push-based
+systems:
+
+#### Problems with Push-Based Systems
+
+- **Resource Exhaustion**: Message queues grow without bound during high traffic
+- **Message Loss**: Dropped messages break eventual consistency
+- **DDoS Vulnerability**: Attackers can flood nodes with messages
+- **Fairness Issues**: Some peers can monopolize resources
+
+#### Pull-Based Advantages
+
+- **Flow Control**: Recipients control message flow through permits
+- **Resource Protection**: A recipient must finish processing a message before
+  requesting the next one
+- **Eventual Consistency**: Guaranteed message delivery and processing
+- **Fairness**: Equal resource allocation across peers
+
+## Evolution Phases
+
+### Phase 1: Enhanced WebRTC Implementation (Current)
+
+**Status**: ✅ Complete
+
+- Pull-based messaging with WebRTC transport
+- Multiple signaling methods (HTTP, relay-based)
+- Channel isolation per protocol type
+- Efficient pool propagation
+- NAT traversal and encryption
+
+### Phase 2: QUIC Transport Integration
+
+**Goals**:
+
+- Add QUIC as an alternative transport to WebRTC
+- Maintain pull-based protocol semantics
+- Improve performance and reduce complexity
+
+**Benefits**:
+
+- **Simplified NAT Traversal**: QUIC handles NAT better than WebRTC setup
+- **Lower Latency**: Reduced connection establishment time
+- **Better Multiplexing**: Native stream multiplexing without complex setup
+- **Standardized Protocol**: Well-defined, battle-tested transport
+
+**Implementation**:
+
+- QUIC streams map to current WebRTC data channels
+- Same pull-based messaging protocol
+- Gradual rollout alongside existing WebRTC
+
+### Phase 3: Block Propagation Optimization
+
+**Current Challenge**: Blocks contain redundant data (transactions, SNARKs)
+already in local pools.
+
+**Solution**:
+
+- Send block headers + Merkle proofs + missing data only
+- Nodes reconstruct full blocks from local pools
+- Dramatic reduction in block transmission size
+- Faster propagation across network
+
+**Benefits**:
+
+- Reduced bandwidth usage
+- Lower memory overhead
+- Faster block propagation
+- Improved scalability
+
+### Phase 4: OCaml Node Integration (Future)
+
+**Vision**: Enable OCaml nodes to use the pull-based protocol
+
+**Approach Options**:
+
+1. **FFI Integration**
+   - Bind Rust P2P implementation to OCaml
+   - Gradual migration from libp2p
+   - Maintains OCaml node architecture
+
+2. **Protocol Standardization**
+   - Define language-agnostic pull-based protocol specification
+   - OCaml native implementation
+   - Both implementations interoperate
+
+3. **Hybrid Bridge**
+   - Enhanced bridge between protocols
+   - Improved push-to-pull translation
+   - Maintains backward compatibility
+
+## Technical Implementation Details
+
+### Transport Layer Abstraction
+
+```rust
+// Sketch of the proposed abstraction. `Address`, `Connection`, `Listener`, and
+// the one-parameter `Result<T>` alias are placeholder names assumed here, not
+// existing APIs.
+trait Transport {
+    // Open an outgoing connection to a peer.
+    async fn connect(&self, addr: Address) -> Result<Connection>;
+    // Start accepting incoming connections on the given address.
+    async fn listen(&self, addr: Address) -> Result<Listener>;
+}
+
+impl Transport for WebRtcTransport { ... }
+impl Transport for QuicTransport { ... }
+```
+
+### Protocol Compatibility
+
+The pull-based protocol remains transport-agnostic:
+
+- Same message formats
+- Same flow control semantics
+- Same channel abstractions
+- Transport selection via configuration
+
+### Migration Strategy
+
+1. **Parallel Operation**: Run both transports simultaneously
+2. **Gradual Adoption**: Nodes advertise transport capabilities
+3. **Preference System**: Prefer QUIC when both peers support it
+4. 
**Fallback Support**: Maintain WebRTC for compatibility + +## Performance Expectations + +### QUIC Benefits Over WebRTC + +- **Connection Time**: ~50% reduction in handshake time +- **Memory Usage**: Lower per-connection overhead +- **CPU Usage**: Reduced encryption/decryption overhead +- **Multiplexing**: More efficient stream management + +### Block Propagation Improvements + +- **Size Reduction**: 60-80% smaller block messages +- **Propagation Speed**: 2-3x faster across network +- **Resource Usage**: Significant reduction in bandwidth and parsing + +## Ecosystem Integration + +### Webnode Improvements + +- Direct QUIC connections without complex WebRTC setup +- Better performance behind restrictive networks +- Simplified debugging and monitoring + +### OCaml Node Benefits (Future) + +- Access to optimized pull-based protocol +- Improved network performance +- Unified P2P behavior across implementations + +### Network-Wide Effects + +- More efficient resource utilization +- Better resistance to network attacks +- Improved consistency guarantees +- Enhanced scalability + +## Implementation Timeline + +### Immediate (Current Release Cycle) + +- ✅ WebRTC pull-based implementation +- ✅ Multi-transport abstraction foundation + +### Short Term (Next 2-3 Releases) + +- QUIC transport implementation +- Block propagation optimization +- Performance benchmarking + +### Medium Term (6-12 Months) + +- Production QUIC deployment +- Advanced block reconstruction +- Protocol refinements based on real-world usage + +### Long Term (12+ Months) + +- OCaml integration exploration +- Protocol standardization +- Ecosystem-wide adoption planning + +## Success Metrics + +### Technical Metrics + +- Connection establishment time reduction +- Block propagation latency improvement +- Bandwidth usage reduction +- Memory and CPU usage optimization + +### Network Health + +- Improved consensus convergence time +- Reduced network partitions +- Better handling of high-traffic periods +- 
Enhanced resistance to attacks + +### Developer Experience + +- Simplified debugging +- Unified protocol behavior +- Better monitoring and observability +- Reduced maintenance burden + +## Risks and Mitigation + +### Technical Risks + +- **QUIC Implementation Complexity**: Mitigate with gradual rollout and + extensive testing +- **Transport Compatibility**: Maintain WebRTC fallback during transition +- **Protocol Changes**: Ensure backward compatibility during evolution + +### Ecosystem Risks + +- **Adoption Resistance**: Demonstrate clear benefits before proposing ecosystem + changes +- **Fragmentation**: Maintain compatibility with existing implementations +- **Coordination Complexity**: Start with Mina Rust node-only improvements + +### Mitigation Strategies + +- Incremental rollout with feature flags +- Comprehensive testing across different network conditions +- Close coordination with stakeholders +- Clear migration paths and documentation + +## Conclusion + +The P2P layer evolution builds on the Mina Rust node's successful pull-based +design to create a more efficient, secure, and unified networking layer for the +Mina ecosystem. While immediate improvements benefit the Mina Rust node and +webnodes, the long-term vision of ecosystem-wide adoption would require +coordination with the OCaml Mina team and careful migration planning. + +The phased approach allows for immediate improvements while keeping future +integration possibilities open, ensuring that the Mina network can evolve toward +better performance and consistency regardless of implementation language. 
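
To make the permit-based flow control from the Core Design Principles more concrete, here is a minimal sketch of the sender side of one pull-based channel. It is an illustration under stated assumptions, not the Mina Rust node's actual P2P code: the `PullSender` type and its methods are hypothetical names, and messages are plain strings for brevity.

```rust
use std::collections::VecDeque;

/// Sender-side state for one pull-based channel.
/// Hypothetical type for illustration; not part of the Mina Rust node API.
#[derive(Debug, Default)]
pub struct PullSender {
    /// Messages waiting locally until the receiver asks for them.
    pending: VecDeque<String>,
    /// Permits granted by the receiver that have not been used yet.
    permits: usize,
}

impl PullSender {
    pub fn new() -> Self {
        Self::default()
    }

    /// Queue a message locally; nothing is transmitted yet.
    pub fn enqueue(&mut self, msg: impl Into<String>) {
        self.pending.push_back(msg.into());
    }

    /// Record that the receiver has requested one more message.
    pub fn grant_permit(&mut self) {
        self.permits += 1;
    }

    /// Return the next message to transmit, but only if the receiver has
    /// granted a permit. A permit is consumed only when a message is sent.
    pub fn try_send(&mut self) -> Option<String> {
        if self.permits == 0 {
            return None;
        }
        let msg = self.pending.pop_front()?;
        self.permits -= 1;
        Some(msg)
    }
}
```

Because the receiver grants a permit only after it has processed the previous message, a sender cannot flood it: unsent messages stay in the sender's own queue, which is the resource-protection property the pull-based model aims for.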
diff --git a/website/docs/developers/persistence-design.md b/website/docs/developers/persistence-design.md new file mode 100644 index 000000000..eeac3d582 --- /dev/null +++ b/website/docs/developers/persistence-design.md @@ -0,0 +1,295 @@ +--- +sidebar_position: 3 +title: Persistence Design +description: Design proposal for persistent ledger storage +slug: /developers/persistence-design +--- + +# Persistence Design (Not Yet Implemented) + +This document outlines the proposed design for persisting the Mina ledger and +other critical state to disk, reducing memory usage and enabling faster node +restarts. + +**Status**: Not yet implemented - this is a design proposal only. + +**Critical for Mainnet**: This is one of the most important changes required to +make the webnode mainnet-ready. + +## Overview + +Currently, the Mina Rust node keeps the entire ledger in memory, which creates +scalability issues for mainnet deployment where the ledger can be large. A +persistent storage solution is needed to: + +- Reduce memory usage for both server-side nodes and webnodes +- Enable faster node restarts by avoiding full ledger reconstruction +- Deduplicate SNARK verification work across blocks and pools +- Support partial ledger storage for light clients + +## Design Reference + +A draft design for the persistence database is outlined in +[Issue #522](https://github.com/o1-labs/mina-rust/issues/522), which proposes an +approach for efficiently storing, updating, and retrieving accounts and hashes. + +**Note**: There is a very old implementation for on-disk storage in +`ledger/src/ondisk/` that was never used - a lightweight key-value store +implemented to avoid the RocksDB dependency. + +**Database Design Resources**: For those implementing persistence, "Database +Internals" and "Designing Data-Intensive Applications" are excellent books on +database design and implementation. 
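
As a rough illustration of the SNARK verification deduplication goal listed in the Overview, the sketch below caches verification results by a 32-byte work identifier so that each piece of work is verified at most once. The `VerificationCache` type, the identifier format, and the closure-based verifier are assumptions made for this example. An in-memory `HashMap` stands in for whatever persistent store the final design chooses.

```rust
use std::collections::HashMap;

/// Caches SNARK verification outcomes keyed by a work identifier.
/// Hypothetical type; an in-memory map stands in for persistent storage.
#[derive(Debug, Default)]
pub struct VerificationCache {
    results: HashMap<[u8; 32], bool>,
    verifier_calls: usize,
}

impl VerificationCache {
    pub fn new() -> Self {
        Self::default()
    }

    /// Return the cached result for `work_id` if present; otherwise run
    /// `verify` once, store its result, and return it.
    pub fn verify_once<F>(&mut self, work_id: [u8; 32], verify: F) -> bool
    where
        F: FnOnce() -> bool,
    {
        if let Some(&cached) = self.results.get(&work_id) {
            return cached;
        }
        self.verifier_calls += 1;
        let ok = verify();
        self.results.insert(work_id, ok);
        ok
    }

    /// Number of times the underlying verifier actually ran.
    pub fn verifier_calls(&self) -> usize {
        self.verifier_calls
    }
}
```

The same work appearing in a block and in the SNARK pool would hit the cache on the second lookup, so the expensive verifier runs only for identifiers that have not been seen before.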
+
+## Key Design Principles
+
+Based on [Issue #522](https://github.com/o1-labs/mina-rust/issues/522), the
+persistence design follows these principles:
+
+1. **Simplicity First**: The design prioritizes simplicity over optimal
+   performance
+2. **Fixed-Size Storage**: Most data (except zkApp accounts) uses fixed-size
+   slots for predictable access patterns
+3. **Sequential Account Creation**: Mina creates accounts sequentially, filling
+   leaves from left to right in the Merkle tree, enabling an append-only design
+4. **Selective Persistence**: Only epoch ledgers and the root ledger need
+   persistence; masks can remain in-memory
+5. **Infrequent Updates**: Root ledger updates occur only when the transition
+   frontier root moves
+6. **Hashes in Memory**: All Merkle tree hashes remain in RAM for quick access
+7. **Recoverable**: Data corruption is not catastrophic as ledgers can be
+   reconstructed from the network
+
+## Problems to be Solved
+
+### Memory Usage
+
+**Current State**: The entire ledger is kept in memory, which can be substantial
+on mainnet:
+
+- Account data includes balances, nonces, zkApp state
+- Merkle tree structure for cryptographic proofs
+- Multiple ledger versions for different blockchain heights
+
+**Solution**: Move account data to persistent storage while keeping frequently
+accessed data (like Merkle tree hashes) in memory.
+
+### Startup Time
+
+**Current State**: Nodes must reconstruct the full ledger from genesis or sync
+from peers, which is time-consuming.
+
+**Solution**: Persist confirmed ledger states to enable fast startup by loading
+from disk rather than network reconstruction.
+
+### SNARK Verification Deduplication
+
+**Current State**: The same SNARK work may be verified multiple times across
+different blocks and transaction pools.
+
+**Solution**: Cache verification results persistently to avoid redundant
+computation.
+
+## Proposed Architecture
+
+### Storage Layers
+
+#### 1. 
Root Ledger Storage
+
+- **Purpose**: Store the confirmed ledger at the root of the transition frontier
+- **Update frequency**: Only when transition frontier advances
+- **Data**: Account balances, nonces, zkApp state
+- **Access pattern**: Random reads, infrequent writes
+
+#### 2. Epoch Ledger Storage
+
+- **Purpose**: Store ledger snapshots for staking epoch calculations
+- **Update frequency**: Once per epoch
+- **Data**: Complete ledger state at epoch boundaries
+- **Access pattern**: Sequential reads during epoch transitions
+
+#### 3. Verification Cache
+
+- **Purpose**: Store SNARK verification results
+- **Update frequency**: High during block processing
+- **Data**: Verification status keyed by work specification
+- **Access pattern**: High read/write frequency
+
+### Data Structures
+
+#### Account Storage Format
+
+```rust
+// Illustrative layout only. `ZkAppState` is a placeholder name assumed here
+// for the zkApp account state type.
+struct PersistedAccount {
+    public_key: PublicKey,           // 32 bytes
+    balance: u64,                    // 8 bytes
+    nonce: u32,                      // 4 bytes
+    delegate: Option<PublicKey>,     // 33 bytes (1 + 32)
+    voting_for: StateHash,           // 32 bytes
+    zkapp_state: Option<ZkAppState>, // Variable size
+    // ... other fields
+}
+```
+
+#### Index Structure
+
+- **Account Index**: Maps public keys to storage locations
+- **Merkle Index**: Maps tree positions to account locations
+- **Height Index**: Maps blockchain heights to ledger versions
+
+### Memory vs Disk Trade-offs
+
+#### Keep in Memory
+
+- **Merkle Tree Hashes**: Fast cryptographic proof generation
+- **Recent Transactions**: Active processing requirements
+- **Connection State**: Network and consensus data
+- **Indices**: Fast lookup structures
+
+#### Move to Disk
+
+- **Account Data**: Large, infrequently accessed in bulk
+- **Historical Ledgers**: Epoch snapshots and old states
+- **Verification Cache**: Large datasets with locality
+
+## Implementation Strategy
+
+### Phase 1: Foundation
+
+1. **Storage Interface**: Define abstract storage traits
+2. **Account Serialization**: Implement efficient encoding/decoding
+3. 
**Index Management**: Create lookup structures +4. **Testing Framework**: Comprehensive test suite + +### Phase 2: Basic Persistence + +1. **Root Ledger Storage**: Implement basic account persistence +2. **Startup Recovery**: Load ledger from disk on startup +3. **Incremental Updates**: Efficient account modifications +4. **Corruption Recovery**: Handle storage failures gracefully + +### Phase 3: Advanced Features + +1. **Epoch Ledgers**: Historical snapshot storage +2. **Verification Cache**: SNARK result persistence +3. **Compaction**: Optimize storage usage over time +4. **Partial Loading**: Support for light client scenarios + +### Phase 4: Optimization + +1. **Performance Tuning**: Optimize for real-world usage patterns +2. **Memory Management**: Fine-tune memory vs disk balance +3. **Concurrent Access**: Support multiple readers/writers +4. **Monitoring**: Add persistence-related metrics + +## Technical Considerations + +### Storage Backend Options + +#### File-Based Storage + +**Pros**: Simple, no external dependencies, full control **Cons**: Must +implement indexing, compression, concurrent access + +#### Embedded Database (e.g., RocksDB) + +**Pros**: Battle-tested, efficient indexing, concurrent access **Cons**: +Additional dependency, larger binary size + +#### Custom Key-Value Store + +**Pros**: Optimized for Mina's specific needs, lightweight **Cons**: More +development effort, needs thorough testing + +### Consistency Guarantees + +- **Atomic Updates**: Ensure ledger state changes are atomic +- **Crash Recovery**: Handle interruptions during writes +- **Checksum Validation**: Detect storage corruption +- **Version Management**: Track ledger version compatibility + +### Performance Requirements + +- **Read Latency**: Account lookups must remain fast +- **Write Throughput**: Handle block processing rates +- **Memory Usage**: Significant reduction from current levels +- **Startup Time**: Faster than network reconstruction + +## Migration Strategy + +### 
Development Phase + +1. **Parallel Implementation**: Build alongside current in-memory system +2. **Feature Flags**: Enable persistence selectively +3. **Testing**: Extensive testing with mainnet data +4. **Benchmarking**: Performance comparison with current system + +### Deployment Phase + +1. **Opt-in**: Initially optional for testing +2. **Gradual Rollout**: Enable for specific node types +3. **Full Migration**: Make persistence default +4. **Legacy Support**: Maintain fallback to in-memory mode + +## Success Metrics + +### Memory Usage + +- **Target**: 50-80% reduction in memory usage +- **Measurement**: RSS and heap size monitoring +- **Threshold**: Must support mainnet ledger sizes + +### Performance + +- **Startup Time**: <5 minutes for full ledger load +- **Query Latency**: <1ms for account lookups +- **Block Processing**: No degradation in processing speed + +### Reliability + +- **Data Integrity**: Zero data loss during normal operation +- **Crash Recovery**: <30 seconds to restore consistent state +- **Storage Corruption**: Graceful degradation and recovery + +## Risks and Mitigation + +### Technical Risks + +- **Performance Degradation**: Mitigate with extensive benchmarking +- **Data Corruption**: Implement checksums and validation +- **Storage Space**: Monitor and optimize storage usage + +### Operational Risks + +- **Migration Complexity**: Provide clear upgrade paths +- **Backup Requirements**: Document backup and recovery procedures +- **Monitoring Needs**: Add persistence-specific observability + +## Related Work + +### Existing Implementations + +- **OCaml Node**: Uses RocksDB for ledger persistence +- **Other Blockchains**: Study approaches from Ethereum, Bitcoin +- **Database Systems**: Learn from established database designs + +### Design References + +- [Issue #522](https://github.com/o1-labs/mina-rust/issues/522): Original + persistence design proposal +- [Ledger Implementation](https://o1-labs.github.io/mina-rust/api-docs/ledger/): + 
Current in-memory ledger code +- [Database Internals](https://databass.dev/): Database design principles +- [DDIA](https://dataintensive.net/): Data-intensive application patterns + +## Conclusion + +Implementing persistent storage is critical for the Mina Rust node's mainnet +readiness. The proposed design balances simplicity with performance, enabling +significant memory usage reduction while maintaining the fast query performance +required for blockchain operations. + +The phased implementation approach allows for careful validation and +optimization, ensuring that persistence improves rather than degrades node +performance. Success in this area will enable the Mina Rust node to scale to +mainnet requirements and support a broader range of deployment scenarios. diff --git a/website/sidebars.ts b/website/sidebars.ts index 96831516e..101bddf9e 100644 --- a/website/sidebars.ts +++ b/website/sidebars.ts @@ -85,6 +85,15 @@ const sidebars: SidebarsConfig = { 'developers/libp2p', ], }, + { + type: 'category', + label: 'Future Work', + items: [ + 'developers/future-work', + 'developers/p2p-evolution', + 'developers/persistence-design', + ], + }, ], // Sidebar for researchers - focus on protocol and cryptography