diff --git a/.github/workflows/markdown-format.yml b/.github/workflows/markdown-format.yml new file mode 100644 index 000000000..ebe97a5b3 --- /dev/null +++ b/.github/workflows/markdown-format.yml @@ -0,0 +1,26 @@ +name: Markdown Format Check + +on: + push: + branches: [ main, develop ] + pull_request: + +jobs: + markdown-format: + name: Check Markdown Formatting + runs-on: ubuntu-latest + steps: + - name: Checkout code + uses: actions/checkout@v4 + + - name: Setup Node.js + uses: actions/setup-node@v4 + with: + node-version: '20' + cache: 'npm' + + - name: Install dependencies + run: npm install + + - name: Check markdown formatting + run: make check-md diff --git a/.gitignore b/.gitignore index 78d9e6a0c..171c99ac4 100644 --- a/.gitignore +++ b/.gitignore @@ -2,3 +2,4 @@ /node/testing/res/ .DS_Store *.profraw +node_modules \ No newline at end of file diff --git a/.prettierignore b/.prettierignore new file mode 100644 index 000000000..adaa8d1f4 --- /dev/null +++ b/.prettierignore @@ -0,0 +1,40 @@ +# Dependencies +node_modules/ +**/node_modules/ + +# Build outputs +target/ +**/target/ +dist/ +build/ + +# Generated files +**/*.generated.md +**/*.lock + +# Frontend specific +frontend/src/assets/ +frontend/public/ +frontend/dist/ + +# Test files +**/test-output/ +**/coverage/ + +# Docker +Dockerfile* +docker-compose*.yml + +# Cargo.toml and other config files +*.toml +*.lock +*.log + +# Binary files +*.bin +*.wasm +*.so +*.dylib + +# Documentation that should not be auto-formatted +CHANGELOG.md \ No newline at end of file diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index c5bc94c50..4653ab74f 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -22,12 +22,14 @@ pub enum P2pAction { ... } ``` + link to the definition of the node("root") action: [openmina_node::Action](node/src/action.rs). ## Enabling Condition Each [Action](#action) must implement the trait + ```Rust pub trait EnablingCondition { /// Enabling condition for the Action. 
@@ -39,39 +41,38 @@ pub trait EnablingCondition { } ``` -`is_enabled(state, time)` must return `false`, if action doesn't make sense given -the current state and, optionally, time. +`is_enabled(state, time)` must return `false` if the action doesn't make sense +given the current state and, optionally, time. For example, message action from peer that isn't connected or we don't know about in the state, must not be enabled. -Or timeout action. If according to state, duration for timeout hasn't -passed, then we shouldn't enable that timeout action. +The same goes for a timeout action: if, according to the state, the timeout +duration hasn't passed, then we shouldn't enable that timeout action. -Thanks to enabling condition, if properly written, impossible/unexpected -state transition can be made impossible. +Thanks to properly written enabling conditions, impossible/unexpected state +transitions can be ruled out. #### Avoiding code duplication -We can also utilize this property to avoid code duplication, by simply -trying to dispatch an action instead of checking something in the effects -before dispatch, knowing that enabling condition will filter out actions -that shouldn't happen. +We can also utilize this property to avoid code duplication, by simply trying to +dispatch an action instead of checking something in the effects before dispatch, +knowing that the enabling condition will filter out actions that shouldn't happen. So for example for checking [timeouts](#timeouts), in the `CheckTimeoutsAction` -effects, we can simply dispatch timeout action. If timeout duration -hasn't passed yet, then action will be dropped, if it has passed, then -action will be dispatched as expected. +effects, we can simply dispatch the timeout action. If the timeout duration hasn't +passed yet, the action will be dropped; if it has passed, the action will be +dispatched as expected.
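The enabling-condition pattern described above can be sketched in isolation. All names below (`Timestamp`, `State` fields, `MyTimeoutAction`) are hypothetical stand-ins for illustration, not the actual openmina or `redux-rs` types:

```rust
// Hypothetical, self-contained sketch of the enabling-condition pattern.
// `Timestamp`, the `State` fields, and `MyTimeoutAction` are illustrative
// names only, not the actual openmina definitions.

#[derive(Clone, Copy)]
struct Timestamp(u64);

struct State {
    request_sent_at: Option<Timestamp>,
    timeout_ms: u64,
}

trait EnablingCondition<S> {
    fn is_enabled(&self, state: &S, time: Timestamp) -> bool;
}

struct MyTimeoutAction;

impl EnablingCondition<State> for MyTimeoutAction {
    // Enabled only if a request is pending and its timeout duration elapsed.
    fn is_enabled(&self, state: &State, time: Timestamp) -> bool {
        match state.request_sent_at {
            Some(sent) => time.0.saturating_sub(sent.0) >= state.timeout_ms,
            None => false,
        }
    }
}

fn main() {
    let state = State { request_sent_at: Some(Timestamp(1_000)), timeout_ms: 500 };
    // Too early: a dispatched `MyTimeoutAction` would simply be dropped.
    assert!(!MyTimeoutAction.is_enabled(&state, Timestamp(1_200)));
    // Timeout elapsed: the action is enabled and would get dispatched.
    assert!(MyTimeoutAction.is_enabled(&state, Timestamp(1_600)));
}
```

This is what makes the "just try to dispatch" idiom safe: the condition, not the caller, decides whether the timeout action happens.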
#### Did action get dispatched -If it's important to know in the effects by dispatcher, if action was -enabled and got dispatched, it can be checked by checking the return value -of the `store.dispatch(..)` call. -It will return `true` if action did get dispatched, otherwise `false`. +If it's important for the dispatching effects to know whether an action was enabled +and got dispatched, they can check the return value of the +`store.dispatch(..)` call. It will return `true` if the action did get dispatched, +otherwise `false`. -Sometimes we want to dispatch one action or the other. In such case we -can write: +Sometimes we want to dispatch one action or the other. In such a case we can +write: ```Rust if !store.dispatch(A) { @@ -81,17 +82,18 @@ if !store.dispatch(A) { ## Reducer -Responsible for state management. Only function that's able to change -the [state](node/src/state.rs) is the reducer. +Responsible for state management. The only function that's able to change the +[state](node/src/state.rs) is the reducer. Takes current `State` and an `Action` and computes new `State`. Pseudocode: `reducer(state, action) -> state` -We don't really need to take state immutably, in JavaScript/frontend -it makes sense, but in our case, it doesn't. +We don't really need to take the state immutably; in JavaScript/frontend it makes +sense, but in our case, it doesn't. So reducer now looks like this: + ```Rust type Reducer = fn(&mut State, &Action); ``` @@ -102,15 +104,16 @@ Main reducer function that gets called on every action: ## Effects(side-effects) Effects are a way to: + 1. interact with the [Service](#service). 1. manage [control-flow](#control-flow). -`Effects` run after every action and triggers side-effects (calls to the -service or dispatches some other action). +`Effects` run after every action and trigger side-effects (calls to the service +or dispatches of other actions). -It has access to global `State`, as well as services.
-It can't mutate `Action` or the `State` directly. It can however dispatch -another action, which can in turn mutate the state. +It has access to global `State`, as well as services. It can't mutate `Action` +or the `State` directly. It can, however, dispatch another action, which can in +turn mutate the state. ```Rust type Effects = fn(&mut Store, &Action); @@ -121,14 +124,14 @@ Main effects function that gets called on every action: ## Service -Services are mostly just IO or computationally heavy tasks that we want -to run in another thread. +Services are mostly just IO or computationally heavy tasks that we want to run +in another thread. -- `Service` should have a minimal state! As a rule of thumb, - anything that can be serialized, should go inside our global `State`. -- Logic in service should be minimal. No decision-making should be done - there. It should mostly only act as a common interface for interacting - with the outside platform (OS, Browser, etc...). +- `Service` should have a minimal state! As a rule of thumb, anything that can + be serialized should go inside our global `State`. +- Logic in service should be minimal. No decision-making should be done there. + It should mostly only act as a common interface for interacting with the + outside platform (OS, Browser, etc...). ## State @@ -143,8 +146,8 @@ and calls `effects` after it. ## State Machine Inputs -State machine's execution is fully predictable/deterministic. If it -receives same inputs in the same order, it's behaviour will be the same. +State machine's execution is fully predictable/deterministic. If it receives +the same inputs in the same order, its behaviour will be the same. State machine has 3 kinds of inputs: @@ -152,44 +155,50 @@ State machine has 3 kinds of inputs: 1. Time, which is an extension of every action that gets dispatched. see: [ActionMeta](deps/redux-rs/src/action/action_meta.rs) + 1. Synchronously returned values by services.
- Idially this one should be avoided as much as possible, because it - introduces non-determinism in the `effects` function. + Ideally this one should be avoided as much as possible, because it introduces + non-determinism in the `effects` function. -Thanks to this property, state machine's inputs can be [recorded](node/src/recorder/recorder.rs), -to then be [replayed](cli/src/commands/replay/replay_state_with_input_actions.rs), -which is very useful for debugging. +Thanks to this property, the state machine's inputs can be +[recorded](node/src/recorder/recorder.rs), to then be +[replayed](cli/src/commands/replay/replay_state_with_input_actions.rs), which is +very useful for debugging. ## Event An [Event](node/src/event_source/event.rs), represents all types of data/input comming from outside the state machine (mostly from services). -It is wrapped by [EventSourceNewEventAction](node/src/event_source/event_source_actions.rs), -and dispatched when new event arrives. +It is wrapped by +[EventSourceNewEventAction](node/src/event_source/event_source_actions.rs), and +dispatched when a new event arrives. -Only events can be dispatched outside the state machine. For case when -there isn't any events, waiting for events will timeout and +Only events can be dispatched outside the state machine. For the case when there +aren't any events, waiting for events will time out and [EventSourceWaitTimeoutAction](node/src/event_source/event_source_actions.rs) will be dispatched. This way we won't get stuck forever waiting for events. ## Control Flow -[event_source](node/src/event_source/event_source_actions.rs) actions -are the only "root" actions that get dispatched. The rest get dispatched -directly or indirectly by event's effects. We could have one large -reducer/effects function for each of the event and it would work as it -does now, but it would be very hard to write and debug.
+[event_source](node/src/event_source/event_source_actions.rs) actions are the +only "root" actions that get dispatched. The rest get dispatched directly or +indirectly by the event's effects. We could have one large reducer/effects function +for each of the events and it would work as it does now, but it would be very +hard to write and debug. -By having "non-root"(or effect) actions, we make it easier to represent -complex logic and all the state transitions. +By having "non-root" (or effect) actions, we make it easier to represent complex +logic and all the state transitions. There are 2 types of effects: + 1. **Local Effects** Effects which make sense in local scope/subslice of the state machine. - Ideally/Mostly such functions are written as `effects` function on the action: + Ideally/Mostly such functions are written as an `effects` function on the + action: + ```Rust impl P2pConnectionOutgoingSuccessAction { pub fn effects(self, _: &ActionMeta, store: &mut Store) @@ -199,86 +208,87 @@ There are 2 types of effects: } } ``` + 1. **Global Effects** Effects that don't fit in the local scope. - For example, we could receive rpc request on p2p layer to send our - current best tip. Since [p2p](p2p/) is in a separate crate, it's not even - possible to answer that rpc there, as in that crate, we only have - partial view (p2p part) of the state. But we do have that access - in [openmina-node](node/) crate, so we write effects to respond to that - rpc [in there](https://github.com/openmina/openmina/blob/f6bde2138157dcdacd4baa0cd07c22506dc2a7c0/node/src/p2p/p2p_effects.rs#L517). + For example, we could receive an rpc request on the p2p layer to send our current + best tip. Since [p2p](p2p/) is in a separate crate, it's not even possible to + answer that rpc there, as in that crate, we only have a partial view (the p2p part) + of the state.
But we do have that access in the [openmina-node](node/) crate, so + we write effects to respond to that rpc + [in there](https://github.com/openmina/openmina/blob/f6bde2138157dcdacd4baa0cd07c22506dc2a7c0/node/src/p2p/p2p_effects.rs#L517). Examples of the flow: + - [Sync staged ledger](node/src/transition_frontier/sync/ledger/staged) ## Timeouts -To ensure that the node is low in complexity, easy to test, to reason -about and debug, all timeouts must be triggered inside state machine. +To ensure that the node is low in complexity, easy to test, to reason about and +debug, all timeouts must be triggered inside the state machine. -If timeouts happen in the service, then it's beyond our full control -during testing, which will limit testing possibilities and will make -tests flaky. +If timeouts happen in the service, then it's beyond our full control during +testing, which will limit testing possibilities and will make tests flaky. -We have a special action: [CheckTimeoutsAction](node/src/action.rs). -It triggers bunch of actions which will check timeouts and if timeout is -detected, timeout action will be dispatched which will trigger other effects. +We have a special action: [CheckTimeoutsAction](node/src/action.rs). It triggers a +bunch of actions which will check timeouts, and if a timeout is detected, a timeout +action will be dispatched, which will trigger other effects. # Mental Model Tips -When working with this architecture, conventional means of reading and -writing the code goes out the window. A lot of code that we write, -won't be a state machine, but the code that is, completely differs from -what developers are used to. +When working with this architecture, conventional means of reading and writing +the code go out the window. A lot of code that we write won't be a state +machine, but the code that is completely differs from what developers are used +to.
In order to be productive with it, the mental model needs to be adjusted. For example, when developer needs to understand the logic written in a -particular state machine, it might be tempting to use conventional methods, -like finding a starting point in the code and trying to follow the code. -One who'll attempt that, will quickly realise that the process is -extremely taxing and almost impossible. The amount of things you need -to keep in memory, number of jumps you need to perform, makes it very -difficult. This hinders one's productivity and in the end, most likely, -whatever change developer makes in the code will be buggy, due to -partial understanding obtained through tedious process. - -Instead we need to shift our mental model and change the overall way we -process the code written with this architecture. We need to leverage -the way state machines are written, in order to minimize the cognitive load. -Instead of trying to follow the code, we should start with the state -of the state machine, look at it carefully and try to deduce the flow -using it. Our deductions just based on state will have flaws and holes, -but we can fix that by filtering through actions and finding ones based -on name, which most likely relates to those holes. By checking those -and their enabling conditions, we will have a more clearer idea about -the state machine. Sometimes it still might not be enough and we will -need to check (only relevant) pieces of the reducer and effects. +particular state machine, it might be tempting to use conventional methods, like +finding a starting point in the code and trying to follow the code. One who'll +attempt that will quickly realise that the process is extremely taxing and +almost impossible. The amount of things you need to keep in memory and the number of +jumps you need to perform make it very difficult.
This hinders one's +productivity and in the end, most likely, whatever change a developer makes in the +code will be buggy, due to partial understanding obtained through a tedious +process. + +Instead we need to shift our mental model and change the overall way we process +the code written with this architecture. We need to leverage the way state +machines are written, in order to minimize the cognitive load. Instead of trying +to follow the code, we should start with the state of the state machine, look at +it carefully and try to deduce the flow using it. Our deductions just based on +state will have flaws and holes, but we can fix that by filtering through +actions and finding ones based on name, which most likely relates to those +holes. By checking those and their enabling conditions, we will have a +clearer idea about the state machine. Sometimes it still might not be enough and +we will need to check (only relevant) pieces of the reducer and effects. Main reason why conventional method doesn't work, is because nothing is -abstracted/hidden away. When you follow the code, you follow actual -execution that the cpu will perform. We have a single threaded state machine, -responsible for business logic + managing concurrent/parallel processes, -so following it's flow is like attempting to jump into async executor -code at every `.await` point. +abstracted/hidden away. When you follow the code, you follow actual execution +that the cpu will perform. We have a single threaded state machine, responsible +for business logic + managing concurrent/parallel processes, so following its +flow is like attempting to jump into async executor code at every `.await` +point. ## Importance of State -The `State` of the state machine, sits at the core of everything. It is -the first thing we carefully design, actions with enabling conditions, -effects and reducers come later. +The `State` of the state machine sits at the core of everything.
It is the +first thing we carefully design; actions with enabling conditions, effects and +reducers come later. -`State` is supposed to be a declarative way to describe a flow, then -enabling conditions, reducers and effects enable that flow. Even if we -remove all the rest of the code, simply having a `State` definition should -be enough to get a general idea about the purpose and the flow of the -state machine. Each sub-state/sub-statemachine must follow this rule. +`State` is supposed to be a declarative way to describe a flow, then enabling +conditions, reducers and effects enable that flow. Even if we remove all the +rest of the code, simply having a `State` definition should be enough to get a +general idea about the purpose and the flow of the state machine. Each +sub-state/sub-statemachine must follow this rule. -E.g. if we look at the state of the [snark_pool_candidate](node/src/snark_pool/candidate/snark_pool_candidate_state.rs) +E.g. if we look at the state of the +[snark_pool_candidate](node/src/snark_pool/candidate/snark_pool_candidate_state.rs) state machine, which is responsible for processing received snark work. (added comments represent thought process while reading the state). + ```Rust pub enum SnarkPoolCandidateState { // some info received regarding the candidate? @@ -324,135 +334,153 @@ pub enum SnarkPoolCandidateState { ... ``` -Above is just a demo to show how much we can deduce just by looking at -the state. Of course it isn't enough to have a full understanding of the -individual state machine, and it might leave some holes, but they can -easily be filled by looking at the actions most related to those holes. +Above is just a demo to show how much we can deduce just by looking at the +state. Of course it isn't enough to have a full understanding of the individual +state machine, and it might leave some holes, but they can easily be filled by +looking at the actions most related to those holes. Examples of the holes left by above is: -1. 
State machine starts with snark work info already received. How do we - know we even need that snark? Where is the filtering happening? - To find out we can look for the action name that would be "notifying" - this state machine that info was received. If we check, it's - `SnarkPoolCandidateInfoReceivedAction`. If we check it's enabling - condition, we will see that unneeded snarks will get filtered out there. +1. State machine starts with snark work info already received. How do we know we + even need that snark? Where is the filtering happening? + + To find out we can look for the action name that would be "notifying" this + state machine that info was received. If we check, it's + `SnarkPoolCandidateInfoReceivedAction`. If we check its enabling condition, + we will see that unneeded snarks will get filtered out there. + 2. Do snarks get verified one by one? Or is there batching involved? - To find that out, we can look for actions regarding work verification - in this state machine. We can see `SnarkPoolCandidateWorkVerifyPendingAction`, - it's definition being: - ```Rust - pub struct SnarkPoolCandidateWorkVerifyPendingAction { - pub peer_id: PeerId, - pub job_ids: Vec, - pub verify_id: SnarkWorkVerifyId, - } - ``` + To find that out, we can look for actions regarding work verification in this + state machine. We can see `SnarkPoolCandidateWorkVerifyPendingAction`, its + definition being: + + ```Rust + pub struct SnarkPoolCandidateWorkVerifyPendingAction { + pub peer_id: PeerId, + pub job_ids: Vec, + pub verify_id: SnarkWorkVerifyId, + } + ``` - We can clearly see from above that we have an array of `job_ids`, - while only having a single `verify_id`. Meaning we do have batching. + We can clearly see from above that we have an array of `job_ids`, while only + having a single `verify_id`. Meaning we do have batching. - Also we can see `peer_id` there, which might not be obvious why it's - there.
To understand that and how snarks are batched together, we - need to check out a place where - `SnarkPoolCandidateWorkVerifyPendingAction` gets dispatched(in the effects). + Also we can see `peer_id` there, which might not be obvious why it's there. + To understand that and how snarks are batched together, we need to check out + a place where `SnarkPoolCandidateWorkVerifyPendingAction` gets dispatched (in + the effects). ## Designing new state machine #### Where it belongs -When creating a new state machine, first we need to figure out where -it belongs. Whether we should add new statemachine in the root dir of -the node, or if it needs to be a sub-statemachine. +When creating a new state machine, first we need to figure out where it belongs. +Whether we should add a new state machine in the root dir of the node, or if it +needs to be a sub-statemachine. Once we decide, we need to make a new module there. #### Designing state -Once we have that down, we need to think about the flow and logic we -wish to introduce and use that in order to carefully craft the -definition of the state. This is where the most amount of thought needs -to go to. If we start with the state and make it represent the flow, -we will make our new state machine: - -1. Easy to debug, since state represents the flow and if we want to debug - it, we can easily follow the flow by observing state transitions. -2. Easy to read/process, since a lot of information will be conveyed - just with state definition. -3. Minimized or non-existent impossible/duplicate states, since state - represents the actual flow, we can use it to restrict the flow with - enabling conditions as much as possible. - -When designing a state, above expectations must be taken into account. -If `State` doesn't represent the flow and hides it, it will take other -developers much longer to process the code and impossible states could -become an issue.
+Once we have that down, we need to think about the flow and logic we wish to +introduce and use that in order to carefully craft the definition of the state. +This is where most of the thought needs to go. If we start with the +state and make it represent the flow, we will make our new state machine: + +1. Easy to debug, since state represents the flow and if we want to debug it, we + can easily follow the flow by observing state transitions. +2. Easy to read/process, since a lot of information will be conveyed just with + state definition. +3. Minimized or non-existent impossible/duplicate states, since state represents + the actual flow, we can use it to restrict the flow with enabling conditions + as much as possible. + +When designing a state, the above expectations must be taken into account. If +`State` doesn't represent the flow and hides it, it will take other developers +much longer to process the code and impossible states could become an issue. #### Designing actions and enabling conditions -Actions should be a reflection of the `State`. After designing the state, -what actions we need to create should be clear, since we will simply be -adding actions which will cause those state transitions we described above. +Actions should be a reflection of the `State`. After designing the state, what +actions we need to create should be clear, since we will simply be adding +actions which will cause those state transitions we described above. -Most of the action names should match state transition names. E.g. -If we have state transition: `SnarkPoolCandidateState::WorkVerifyPending { .. }`, -action which causes state to transition to that specific state, should be -named: `SnarkPoolCandidateWorkVerifyPendingAction`. That way it's easy -for developer reading it later on, to filter through actions more easily.
-Actions not following this pattern should be as rare as possible as they -will need special attention while going through the code, in order to not -miss anything. +Most of the action names should match state transition names. E.g. if we have +the state transition `SnarkPoolCandidateState::WorkVerifyPending { .. }`, the action +which causes the state to transition to that specific state should be named +`SnarkPoolCandidateWorkVerifyPendingAction`. That way it's easy for a developer +reading it later on to filter through actions more easily. Actions not +following this pattern should be as rare as possible as they will need special +attention while going through the code, in order to not miss anything. -Action's enabling condition should be as limiting as possible, in order -to avoid impossible state transitions, which could break the node. +An action's enabling condition should be as limiting as possible, in order to avoid +impossible state transitions, which could break the node. #### Designing reducers -Most of the time, reducers goal will simply be to facilitate state -transitions and nothing more. Pretty much grabbing data from one enum -variant and moving it to another, transforming any values that will need it. +Most of the time, the reducer's goal will simply be to facilitate state transitions +and nothing more. Pretty much grabbing data from one enum variant and moving it +to another, transforming any values that will need it. -In order to do those transitions however, we have to destructure current -enum variant and extract fields from there, so we need to make sure that -enabling condition guarantees that the action won't be triggered, unless -our current state variant is indeed what we expect in the reducer.
+In order to do those transitions however, we have to destructure the current enum +variant and extract fields from there, so we need to make sure that the enabling +condition guarantees that the action won't be triggered, unless our current +state variant is indeed what we expect in the reducer. #### Designing effects -Simpler the effects are, better it is. Mostly they should just declare -what actions may be dispatched after the current action. If checks need -to be done before dispatching some action, those checks belong in the -enabling condition, not the effects. Simpler and smaller the effects are, -easier it is to traverse them. +The simpler the effects are, the better. Mostly they should just declare what +actions may be dispatched after the current action. If checks need to be done +before dispatching some action, those checks belong in the enabling condition, +not the effects. The simpler and smaller the effects are, the easier it is to traverse +them. ## Substate access, Queued Reducer-Dispatch, and Callbacks -The state machine is being refactored to make the code easier to follow and work with. While the core concepts remain mostly unchanged, the organization is evolving. +The state machine is being refactored to make the code easier to follow and work +with. While the core concepts remain mostly unchanged, the organization is +evolving. -**Note**: For reference on the direction of these changes, see [this document](https://github.com/openmina/state_machine_exp/blob/main/node/README.md). +**Note**: For reference on the direction of these changes, see +[this document](https://github.com/openmina/state_machine_exp/blob/main/node/README.md). **What is new**: -- `SubstateAccess` trait: Specifies how specific substate slices are obtained from a parent state. -- `Substate` context: Provides fine-grained control over state and dispatcher access, ensuring clear separation of concerns.
-- `Dispatcher`: Manages the queuing and execution of actions, allowing reducers to queue additional actions for dispatch after the current state update phase. -- `Callback` handlers: Facilitate flexible control flow by enabling actions to specify follow-up actions at dispatch time, reducing coupling between components and making control flow more local. -- *Stateful* vs *Effectful* actions: - - Stateful actions update the state and dispatch other actions. These are processed by `reducer` functions. - - Effectful actions are very thing layers over services that expose them as actions and can dispatch callback actions. These are processed by `effects` functions. + +- `SubstateAccess` trait: Specifies how specific substate slices are obtained + from a parent state. +- `Substate` context: Provides fine-grained control over state and dispatcher + access, ensuring clear separation of concerns. +- `Dispatcher`: Manages the queuing and execution of actions, allowing reducers + to queue additional actions for dispatch after the current state update phase. +- `Callback` handlers: Facilitate flexible control flow by enabling actions to + specify follow-up actions at dispatch time, reducing coupling between + components and making control flow more local. +- _Stateful_ vs _Effectful_ actions: + - Stateful actions update the state and dispatch other actions. These are + processed by `reducer` functions. + - Effectful actions are very thin layers over services that expose them as + actions and can dispatch callback actions. These are processed by `effects` + functions. ### New-Style Reducers -New-style reducers accept a `Substate` context as their first argument instead of the state they act on. +New-style reducers accept a `Substate` context as their first argument instead +of the state they act on. This substate context provides the reducer function with access to: + - A mutable reference to the substate that the reducer will mutate.
- An immutable reference to the global state. - A mutable reference to a `Dispatcher`. -The reducer function cannot access both the substate and the dispatcher/global state references simultaneously. This enforces a separation between the state update phase and the further action dispatching phase. +The reducer function cannot access both the substate and the dispatcher/global +state references simultaneously. This enforces a separation between the state +update phase and the further action dispatching phase. -This setup allows us to combine, the reducer function and the effect handler function into one, removing a level of flow indirection while keeping the phases separate. +This setup allows us to combine the reducer function and the effect handler +function into one, removing a level of flow indirection while keeping the phases +separate. ```rust impl WatchedAccountsState { @@ -515,10 +543,15 @@ pub fn reducer( ### Effectful Actions -Actions and their handling code are divided into two categories: *stateful* actions and *effectful* actions. +Actions and their handling code are divided into two categories: _stateful_ +actions and _effectful_ actions. -- **Stateful Actions**: These actions update the state and have a `reducer` function. They closely resemble the traditional state machine code, and most of the state machine logic should reside here. -- **Effectful Actions**: These actions involve calling external services and have an `effects` function. They should serve as thin layers for handling service interactions. +- **Stateful Actions**: These actions update the state and have a `reducer` + function. They closely resemble the traditional state machine code, and most + of the state machine logic should reside here. +- **Effectful Actions**: These actions involve calling external services and + have an `effects` function. They should serve as thin layers for handling + service interactions.
Example effectful action:

@@ -556,13 +589,23 @@ impl TransitionFrontierGenesisEffectfulAction {

 ### Callbacks

-Callbacks are a new construct that permit the uncoupling of state machine components by enabling the dynamic composition of actions and their sequencing.
+Callbacks are a new construct that permits the uncoupling of state machine
+components by enabling the dynamic composition of actions and their sequencing.

-With callbacks, a caller (handling actions of type `A`) can dispatch an action of type `B` that will produce a result. The action `B` includes callback values that specify how to return the result to `A`. When the result of processing `B` is ready (either further down the action chain or asynchronously from a service call), the callback is invoked with the result.
+With callbacks, a caller (handling actions of type `A`) can dispatch an action
+of type `B` that will produce a result. The action `B` includes callback values
+that specify how to return the result to `A`. When the result of processing `B`
+is ready (either further down the action chain or asynchronously from a service
+call), the callback is invoked with the result.

-This is particularly useful when implementing effectful actions to interact with services, but also for composing multiple components without introducing inter-dependencies (with callbacks we can avoid the *global effects* pattern that was described before in this document).
+This is particularly useful when implementing effectful actions to interact with
+services, but also for composing multiple components without introducing
+inter-dependencies (with callbacks we can avoid the _global effects_ pattern
+that was described before in this document).

-Callback blocks are declared with the `redux::callback!` macro and are described by a uniquely named code block with a single input and a single output, which must produce an `Action` value as a result.
+Callback blocks are declared with the `redux::callback!` macro and are described
+by a uniquely named code block with a single input and a single output, which
+must produce an `Action` value as a result.

 Example:

@@ -614,16 +657,19 @@ For a given `LocalState` that is a substate of `State`:

 #### Implement `SubstateAccess`

-Implement in `node/src/state.rs` the `SubstateAccess` trait for `State` if it is not defined already. For trivial cases use the `impl_substate_access!` macro.
+Implement in `node/src/state.rs` the `SubstateAccess` trait for
+`State` if it is not defined already. For trivial cases use the
+`impl_substate_access!` macro.

 #### Update the `reducer` function

 Update the `reducer` function so that:

 1. It is implemented as a method on `LocalState`.
-2. It accepts as it's first argument `mut state_context: crate::Substate` instead of `&mut self`.
+2. It accepts as its first argument `mut state_context: crate::Substate`
+   instead of `&mut self`.
 3. It obtains `state` by calling `state_context.get_state_mut()`.
-3. All references to `self` are updated to instead reference `state`.
+4. All references to `self` are updated to instead reference `state`.

 Example:

@@ -660,7 +706,9 @@ impl ConsensusState {

 #### Move dispatches from `effects` to `reducer`

-For each action that doesn't call a service in it's effect handler, delete it's body from the effect handler and move it to the end of the body of the reducer's match branch that handles that action:
+For each action that doesn't call a service in its effect handler, delete its
+body from the effect handler and move it to the end of the body of the reducer's
+match branch that handles that action:

 Example:

@@ -669,7 +717,7 @@ Example:
 pub fn consensus_effects(store: &mut Store, action: ConsensusActionWithMeta) {
     let (action, _) = action.split();
-
+
     match action {
         ConsensusAction::BlockReceived { hash, block, ..
} => {
-            let req_id = store.state().snark.block_verify.next_req_id();
@@ -725,7 +773,8 @@ Example:

 #### Update the reducer invocation in the parent reducer

-Replace the call in the parent reducer so that it creates a new `Substate` instance.
+Replace the call in the parent reducer so that it creates a new `Substate`
+instance.

 Example:

@@ -750,23 +799,33 @@ Example:

 #### Define effectful actions for service interactions

-Actions that interact with services must be updated so that the interaction is performed by dispatching a new effectful action. No reducer function should be implemented for these new effectful actions.
+Actions that interact with services must be updated so that the interaction is
+performed by dispatching a new effectful action. No reducer function should be
+implemented for these new effectful actions.

 Example:

-See `node/src/transition_frontier/genesis{_effectful}`, `snark/src/block_verify{_effectful}` and `snark/src/work_verify{_effectful}`.
+See `node/src/transition_frontier/genesis{_effectful}`,
+`snark/src/block_verify{_effectful}` and `snark/src/work_verify{_effectful}`.

 #### Add callbacks

 There are 3 main situations in which callbacks are an improvement:

 - Passing them to effectful actions that will call a service
-- Cross-component calls, to make the flow clearer and avoid inter-dependencies (eg. interactions between the transition frontier, and the p2p layer).
-- Abstraction of lower level layers (e.g. higher level p2p abstractions over lower level tcp and mio implementations).
+- Cross-component calls, to make the flow clearer and avoid inter-dependencies
+  (e.g. interactions between the transition frontier and the p2p layer).
+- Abstraction of lower level layers (e.g. higher level p2p abstractions over
+  lower level tcp and mio implementations).

-Example: when a block is received, the consensus state machine will dispatch an action to verify the block.
This action will trigger an asynchronous snark verification process that will complete (or fail) some time in the future, and we are interested in its result. +Example: when a block is received, the consensus state machine will dispatch an +action to verify the block. This action will trigger an asynchronous snark +verification process that will complete (or fail) some time in the future, and +we are interested in its result. -The `SnarkBlockVerifyAction::Init` action gets updated with the addition of two callbacks, one that will be called after a successful verification, and another when an error occurs: +The `SnarkBlockVerifyAction::Init` action gets updated with the addition of two +callbacks, one that will be called after a successful verification, and another +when an error occurs: ```diff pub enum SnarkBlockVerifyAction { @@ -780,7 +839,10 @@ The `SnarkBlockVerifyAction::Init` action gets updated with the addition of two } ``` -The consensus reducer, after receiving a block, initializes the asynchronous block snark verification process specifying the callbacks, and sets the state to "pending". The dispatching of `SnarkBlockVerifyAction::Init` gets updated with the required callbacks: +The consensus reducer, after receiving a block, initializes the asynchronous +block snark verification process specifying the callbacks, and sets the state to +"pending". The dispatching of `SnarkBlockVerifyAction::Init` gets updated with +the required callbacks: ```diff match action { @@ -806,8 +868,10 @@ The consensus reducer, after receiving a block, initializes the asynchronous blo // ... ``` -Then when handling `SnarkBlockVerifyAction::Init` a job is added to the state, with the callbacks stored there. Then the effectful action that will interact with the service is dispatched (**NOTE:** not shown here, see `snark/src/block_verify_effectful/`). - +Then when handling `SnarkBlockVerifyAction::Init` a job is added to the state, +with the callbacks stored there. 
Then the effectful action that will interact +with the service is dispatched (**NOTE:** not shown here, see +`snark/src/block_verify_effectful/`). ```rust // when matching `SnarkBlockVerifyAction::Init` @@ -831,7 +895,9 @@ dispatcher.push(SnarkBlockVerifyEffectfulAction::Init { dispatcher.push(SnarkBlockVerifyAction::Pending { req_id: *req_id }); ``` -Finally, on the handling of the `SnarkBlockVerifyAction::Success` action, the internal state is updated, and the callback fetched and dispatched with the block hash as input. +Finally, on the handling of the `SnarkBlockVerifyAction::Success` action, the +internal state is updated, and the callback fetched and dispatched with the +block hash as input. ```rust let callback_and_arg = state.jobs.get_mut(*req_id).and_then(|req| { @@ -871,9 +937,16 @@ if store.dispatch(SomeAction) { } ``` -The equivalent with queueing is to use `dispatcher.push_if_enabled` which will return `true` if the enabling condition for that action returns `true`. This will work most of the time, but it is possible for the state to change between the time the action was enqueued and when it is finally going to be dispatched, so the enabling condition may not be `true` anymore. This means that the equivalence is not strict. +The equivalent with queueing is to use `dispatcher.push_if_enabled` which will +return `true` if the enabling condition for that action returns `true`. This +will work most of the time, but it is possible for the state to change between +the time the action was enqueued and when it is finally going to be dispatched, +so the enabling condition may not be `true` anymore. This means that the +equivalence is not strict. -A better approach is to add a callback to the first action. This ensures that the second action only happens when intended, avoiding the potential race condition of the state changing between enqueuing and dispatching. +A better approach is to add a callback to the first action. 
This ensures that
+the second action only happens when intended, avoiding the potential race
+condition of the state changing between enqueuing and dispatching.

 First a callback is added to the action:

@@ -903,7 +976,8 @@ Then in the handling code is updated to dispatch the callback:
 }
 ```

-Finally the dispatching of that action is update to provide a callback that will return the same action that was inside the body of the conditional dispatch:
+Finally the dispatching of that action is updated to provide a callback that
+will return the same action that was inside the body of the conditional
+dispatch:

 ```diff
 - if store.dispatch(LedgerWriteAction::Init {
@@ -926,7 +1000,9 @@ Finally the dispatching of that action is update to provide a callback that will
 +    });
 ```

-In the above example the passed argument is not used, but for other callbacks it is useful. Consider this example where we need the block hash for the next action, which can be extracted from the data contained in the request:
+In the above example the passed argument is not used, but for other callbacks it
+is useful. Consider this example where we need the block hash for the next
+action, which can be extracted from the data contained in the request:

 ```diff
 - if store.dispatch(LedgerWriteAction::Init {
@@ -949,4 +1025,4 @@ In the above example the passed argument is not used, but for other callbacks it
 +        }
 +      ),
 +    });
-```
\ No newline at end of file
+```
diff --git a/CLAUDE.md b/CLAUDE.md
index 83f97d106..3dea9b10d 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -4,16 +4,19 @@ This file helps understand and navigate the OpenMina codebase structure.

 ## Project Overview

-OpenMina is a Rust implementation of the Mina Protocol, a lightweight
-blockchain using zero-knowledge proofs. It follows a Redux-style state machine
-architecture for predictable, debuggable behavior.
+OpenMina is a Rust implementation of the Mina Protocol, a lightweight blockchain
+using zero-knowledge proofs.
It follows a Redux-style state machine architecture +for predictable, debuggable behavior. -*For detailed architecture documentation, see [`docs/handover/`](docs/handover/)* +_For detailed architecture documentation, see +[`docs/handover/`](docs/handover/)_ ## Architecture Overview ### State Machine Pattern + The codebase follows Redux principles: + - **State** - Centralized, immutable data structure - **Actions** - Events that trigger state changes - **Enabling Conditions** - Guards that prevent invalid state transitions @@ -22,7 +25,9 @@ The codebase follows Redux principles: - **Services** - Separate threads handling I/O and heavy computation ### Architecture Styles -- **New Style**: Unified reducers that handle both state updates and action dispatch + +- **New Style**: Unified reducers that handle both state updates and action + dispatch - **Old Style**: Separate reducer and effects files (being migrated) ## Project Structure @@ -30,6 +35,7 @@ The codebase follows Redux principles: ### Core Components **node/** - Main node logic + - `block_producer/` - Block production - `transaction_pool/` - Transaction mempool - `transition_frontier/` - Consensus and blockchain state @@ -40,6 +46,7 @@ The codebase follows Redux principles: - `service/` - Service implementations **p2p/** - Networking layer + - Dual transport: libp2p and WebRTC - Channel abstractions for message types - Peer discovery and connection management @@ -54,6 +61,7 @@ logic, staged ledger, scan state, and proof verification ## Code Organization ### File Patterns + - `*_state.rs` - State definitions - `*_actions.rs` - Action types - `*_reducer.rs` - State transitions @@ -62,6 +70,7 @@ logic, staged ledger, scan state, and proof verification - `summary.md` - Component documentation and technical debt notes ### Key Files + - `node/src/state.rs` - Global state structure - `node/src/action.rs` - Top-level action enum - `node/src/reducer.rs` - Main reducer dispatch @@ -77,14 +86,17 @@ logic, staged ledger, 
scan state, and proof verification ## Key Patterns ### Defensive Programming + - `bug_condition!` macro marks theoretically unreachable code paths - Used after enabling condition checks for invariant validation ### State Methods + - Complex logic extracted from reducers into state methods - Keeps reducers focused on orchestration ### Callbacks + - Enable decoupled component communication - Used for async operation completion @@ -107,6 +119,7 @@ find . -name "summary.md" -path "*/component/*" ## Component Documentation Each component directory contains a `summary.md` file documenting: + - Component purpose and responsibilities - Known technical debt - Implementation notes diff --git a/Makefile b/Makefile index e22d822c1..704434f8e 100644 --- a/Makefile +++ b/Makefile @@ -32,6 +32,12 @@ check-tx-fuzzing: ## Check the transaction fuzzing tools, requires nightly Rust check-format: ## Check code formatting cargo +nightly fmt -- --check +.PHONY: check-md +check-md: ## Check if markdown files are properly formatted + @echo "Checking markdown formatting..." + npx prettier --check "**/*.md" + @echo "Markdown format check completed." + .PHONY: clean clean: ## Clean build artifacts cargo clean @@ -40,6 +46,12 @@ clean: ## Clean build artifacts format: ## Format code using rustfmt cargo +nightly fmt +.PHONY: format-md +format-md: ## Format all markdown files to wrap at 80 characters + @echo "Formatting markdown files..." + npx prettier --write "**/*.md" + @echo "Markdown files have been formatted to 80 characters." 
+ .PHONY: lint lint: ## Run linter (clippy) cargo clippy --all-targets -- -D warnings --allow clippy::mutable_key_type diff --git a/README.md b/README.md index c68515f76..ceead3f96 100644 --- a/README.md +++ b/README.md @@ -7,10 +7,14 @@ width="152px"> -![Beta][beta-badge] [![release-badge]][release-link] [![Changelog][changelog-badge]][changelog] [![Apache licensed]][Apache link] +![Beta][beta-badge] [![release-badge]][release-link] +[![Changelog][changelog-badge]][changelog] [![Apache licensed]][Apache link] -_The **Open Mina Node** is a fast and secure implementation of the Mina protocol in **Rust**._ -_Currently in **public beta**, join our [Discord community](https://discord.com/channels/484437221055922177/1290662938734231552) to help test future releases._ +_The **Open Mina Node** is a fast and secure implementation of the Mina protocol +in **Rust**._ +_Currently in **public beta**, join our +[Discord community](https://discord.com/channels/484437221055922177/1290662938734231552) +to help test future releases._ @@ -20,15 +24,20 @@ _Currently in **public beta**, join our [Discord community](https://discord.com/ ### Building from Source -- [Rust Node](/docs/building-from-source-guide.md#how-to-build-and-launch-a-node-from-source) and [Dashboards](./docs/building-from-source-guide.md#how-to-launch-the-ui) +- [Rust Node](/docs/building-from-source-guide.md#how-to-build-and-launch-a-node-from-source) + and [Dashboards](./docs/building-from-source-guide.md#how-to-launch-the-ui) - [Web Node](/docs/local-webnode.md) ### Run Node on Devnet via Docker -- [Non-Block Producing Node](/docs/alpha-testing-guide.md) Connect to peers and sync a node on the devnet; no devnet stake needed. -- [Block Producing Node](/docs/block-producer-guide.md) Produce blocks on the devnet; sufficient devnet stake needed. -- [Local Block Production Demo](/docs/local-demo-guide.md) Produce blocks on a custom local chain without devnet stake. 
-- [Devnet Archive Node](/docs/archive-node-guide.md) Run an archive node on devnet. +- [Non-Block Producing Node](/docs/alpha-testing-guide.md) Connect to peers and + sync a node on the devnet; no devnet stake needed. +- [Block Producing Node](/docs/block-producer-guide.md) Produce blocks on the + devnet; sufficient devnet stake needed. +- [Local Block Production Demo](/docs/local-demo-guide.md) Produce blocks on a + custom local chain without devnet stake. +- [Devnet Archive Node](/docs/archive-node-guide.md) Run an archive node on + devnet. Block production Node UI @@ -36,25 +45,27 @@ _Currently in **public beta**, join our [Discord community](https://discord.com/ ## Release Process -**This project is in beta**. We maintain a monthly release cycle, providing [updates every month](https://github.com/openmina/openmina/releases). - - +**This project is in beta**. We maintain a monthly release cycle, providing +[updates every month](https://github.com/openmina/openmina/releases). ## Core Features - **Mina Network**: Connect to peers, sync up, broadcast messages -- **Block Production**: Produces, validates, and applies blocks according to Mina's consensus. +- **Block Production**: Produces, validates, and applies blocks according to + Mina's consensus. - **SNARK Generation**: Produce SNARK proofs for transactions - **Debugging**: A block replayer that uses data from the archive nodes ## Repository Structure -- [core/](core) - Provides basic types needed to be shared across different components of the node. +- [core/](core) - Provides basic types needed to be shared across different + components of the node. - [ledger/](ledger) - Mina ledger implementation in Rust. - [snark/](snark) - Snark/Proof verification. - [p2p/](p2p) - P2p implementation for OpenMina node. - [node/](node) - Combines all the business logic of the node. - - [native/](node/native) - OS specific pieces of the node, which is used to run the node natively (Linux/Mac/Windows). 
+ - [native/](node/native) - OS specific pieces of the node, which is used to + run the node natively (Linux/Mac/Windows). - [testing/](node/testing) - Testing framework for OpenMina node. - [cli/](cli) - OpenMina cli. - [frontend/](frontend) - OpenMina frontend. diff --git a/docker.md b/docker.md index 7c8e3e578..1ad8ab9b6 100644 --- a/docker.md +++ b/docker.md @@ -8,5 +8,5 @@ Images are available at the Docker Hub repository `openmina/openmina`. For the branch itself, two tags produced, one derived from the commit hash (first 8 chars), and another one is `latest`. -For a PR from a branch named `some/branch`, two tags are produced, one is `some-branch-` and another one is `some-branch-latest`. - +For a PR from a branch named `some/branch`, two tags are produced, one is +`some-branch-` and another one is `some-branch-latest`. diff --git a/docs/alpha-testing-guide.md b/docs/alpha-testing-guide.md index adaf7a9b9..f652dd14b 100644 --- a/docs/alpha-testing-guide.md +++ b/docs/alpha-testing-guide.md @@ -1,6 +1,8 @@ # Run Non-Block Producing Node on Devnet -This guide will walk you through running the **Alpha Rust Node** on Devnet using Docker. Follow these steps to set up the node and [Provide Feedback](#4-provide-feedback) on this Alpha release. +This guide will walk you through running the **Alpha Rust Node** on Devnet using +Docker. Follow these steps to set up the node and +[Provide Feedback](#4-provide-feedback) on this Alpha release. ## 1. Prerequisites @@ -11,8 +13,8 @@ Ensure you have **Docker** installed: ## 2. Download & Start the Node 1. **Download the Latest Release**: - - - Visit the [Open Mina Releases](https://github.com/openmina/openmina/releases). + - Visit the + [Open Mina Releases](https://github.com/openmina/openmina/releases). - Download the latest `openmina-vX.Y.Z-docker-compose.zip`. 2. 
**Extract the Files**: @@ -24,28 +26,32 @@ Ensure you have **Docker** installed: Additional optional parameters: - `OPENMINA_LIBP2P_EXTERNAL_IP` Sets your node’s external IP address to help other nodes find it. + `OPENMINA_LIBP2P_EXTERNAL_IP` Sets your node’s external IP address to help + other nodes find it. `OPENMINA_LIBP2P_PORT` Sets the port for Libp2p communication. -3. **Start the Node on Devnet and Save Logs**: - Start the node and save the logs for later analysis: +3. **Start the Node on Devnet and Save Logs**: Start the node and save the logs + for later analysis: ```bash docker compose up --pull always && docker compose logs > openmina-node.log ``` -4. **Access the Dashboard**: - Open `http://localhost:8070` in your browser. +4. **Access the Dashboard**: Open `http://localhost:8070` in your browser. The dashboard will show the syncing process in real time. image - > **1. Connecting to Peers:** The node connects to peers. You’ll see the number of connected, connecting, and disconnected peers grow. + > **1. Connecting to Peers:** The node connects to peers. You’ll see the + > number of connected, connecting, and disconnected peers grow. > - > **2. Fetching Ledgers:** The node downloads key data: Staking ledger, Next epoch ledger, and Snarked ledger. Progress bars show the download status. + > **2. Fetching Ledgers:** The node downloads key data: Staking ledger, Next + > epoch ledger, and Snarked ledger. Progress bars show the download status. > - > **3. Fetching & Applying Blocks:** The node downloads recent blocks to match the network’s current state. The dashboard tracks how many blocks are fetched and applied. + > **3. Fetching & Applying Blocks:** The node downloads recent blocks to + > match the network’s current state. The dashboard tracks how many blocks are + > fetched and applied. ## 3. Monitoring and troubleshooting @@ -68,10 +74,15 @@ docker compose up --pull always ## 4. Provide Feedback -This Alpha release is for testing purposes. 
Your feedback is essential. Follow these steps to report any issues: +This Alpha release is for testing purposes. Your feedback is essential. Follow +these steps to report any issues: -1. **Collect Logs**: Use the [commands above to save logs](#2-download--start-the-node) -2. **Visit Discord**: [Open Mina Discord Channel](https://discord.com/channels/484437221055922177/1290662938734231552/1290667779317305354) +1. **Collect Logs**: Use the + [commands above to save logs](#2-download--start-the-node) +2. **Visit Discord**: + [Open Mina Discord Channel](https://discord.com/channels/484437221055922177/1290662938734231552/1290667779317305354) 3. **Describe the Issue**: Briefly explain the problem and steps to reproduce it -4. **Attach Logs**: Discord allows files up to 25MB. If your logs are larger, use Google Drive or similar -5. **Include a Screenshot**: A dashboard screenshot provides details about node status, making it easier to diagnose the issue +4. **Attach Logs**: Discord allows files up to 25MB. If your logs are larger, + use Google Drive or similar +5. **Include a Screenshot**: A dashboard screenshot provides details about node + status, making it easier to diagnose the issue diff --git a/docs/archive-node-guide.md b/docs/archive-node-guide.md index 0bae18dad..2945a349f 100644 --- a/docs/archive-node-guide.md +++ b/docs/archive-node-guide.md @@ -1,16 +1,19 @@ # Run Archive Node on Devnet -This guide is intended for setting up archive nodes on **Mina Devnet** only. Do not use this guide for Mina Mainnet +This guide is intended for setting up archive nodes on **Mina Devnet** only. 
Do +not use this guide for Mina Mainnet ## Archive Mode Configuration -We start archive mode in openmina by setting one of the following flags along with their associated environment variables: +We start archive mode in openmina by setting one of the following flags along +with their associated environment variables: ### Archiver Process (`--archive-archiver-process`) Stores blocks in a database by receiving them directly from the openmina node **Required Environment Variables**: + - `OPENMINA_ARCHIVE_ADDRESS`: Network address for the archiver service ### Local Storage (`--archive-local-storage`) @@ -18,16 +21,20 @@ Stores blocks in a database by receiving them directly from the openmina node Stores blocks in the local filesystem **Required Environment Variables**: + - (None) **Optional Environment Variables**: -- `OPENMINA_ARCHIVE_LOCAL_STORAGE_PATH`: Custom path for block storage (default: ~/.openmina/archive-precomputed) + +- `OPENMINA_ARCHIVE_LOCAL_STORAGE_PATH`: Custom path for block storage (default: + ~/.openmina/archive-precomputed) ### GCP Storage (`--archive-gcp-storage`) Uploads blocks to a Google Cloud Platform bucket **Required Environment Variables**: + - `GCP_CREDENTIALS_JSON`: Service account credentials JSON - `GCP_BUCKET_NAME`: Target storage bucket name @@ -36,6 +43,7 @@ Uploads blocks to a Google Cloud Platform bucket Uploads blocks to an AWS S3 bucket **Required Environment Variables**: + - `AWS_ACCESS_KEY_ID`: IAM user access key - `AWS_SECRET_ACCESS_KEY`: IAM user secret key - `AWS_DEFAULT_REGION`: AWS region name @@ -44,17 +52,22 @@ Uploads blocks to an AWS S3 bucket ## Redundancy -The archive mode is designed to be redundant. We can combine the flags to have multiple options running simultaneously. +The archive mode is designed to be redundant. We can combine the flags to have +multiple options running simultaneously. 
## Prerequisites -Ensure Docker and Docker Compose are installed on your system - [Docker Installation Guide](./docker-installation.md) +Ensure Docker and Docker Compose are installed on your system - +[Docker Installation Guide](./docker-installation.md) ## Docker compose setup (with archiver process) -The compose file sets up a PG database, the archiver process and the openmina node. The archiver process is responsible for storing the blocks in the database by receiving the blocks from the openmina node. +The compose file sets up a PG database, the archiver process and the openmina +node. The archiver process is responsible for storing the blocks in the database +by receiving the blocks from the openmina node. -See [docker-compose.archive.devnet.yml](../docker-compose.archive.devnet.yml) for more details. +See [docker-compose.archive.devnet.yml](../docker-compose.archive.devnet.yml) +for more details. ### Starting the setup diff --git a/docs/block-producer-guide.md b/docs/block-producer-guide.md index 166c34faa..5fc2413dc 100644 --- a/docs/block-producer-guide.md +++ b/docs/block-producer-guide.md @@ -1,13 +1,16 @@ # Run Block Producing Node on Devnet -This guide is intended for setting up block producer nodes on **Mina Devnet** only. -Do not use this guide for Mina Mainnet until necessary security audits are complete. +This guide is intended for setting up block producer nodes on **Mina Devnet** +only. +Do not use this guide for Mina Mainnet until necessary security audits are +complete. --- ## Prerequisites -Ensure Docker and Docker Compose are installed on your system - [Docker Installation Guide](./docker-installation.md) +Ensure Docker and Docker Compose are installed on your system - +[Docker Installation Guide](./docker-installation.md) ## Download & Start the Node @@ -25,18 +28,22 @@ Ensure Docker and Docker Compose are installed on your system - [Docker Installa 2. 
**Prepare Your Keys** - [Docker Compose](../docker-compose.block-producer.yml) references `openmina-workdir`. It stores a private key and logs for block production. - Place your block producer's private key into the `openmina-workdir` directory and name it `producer-key`: + [Docker Compose](../docker-compose.block-producer.yml) references + `openmina-workdir`. It stores a private key and logs for block production. + Place your block producer's private key into the `openmina-workdir` directory + and name it `producer-key`: ```bash cp /path/to/your/private_key openmina-workdir/producer-key ``` - Replace `/path/to/your/private_key` with the actual path to your private key file. + Replace `/path/to/your/private_key` with the actual path to your private key + file. 3. **Launch Block Producer** - Use `MINA_PRIVKEY_PASS` to set the private key password. Optionally, use `COINBASE_RECEIVER` to set a different coinbase receiver: + Use `MINA_PRIVKEY_PASS` to set the private key password. Optionally, use + `COINBASE_RECEIVER` to set a different coinbase receiver: ```bash env COINBASE_RECEIVER="YourWalletAddress" MINA_PRIVKEY_PASS="YourPassword" \ @@ -45,18 +52,24 @@ Ensure Docker and Docker Compose are installed on your system - [Docker Installa Optional parameters: - `OPENMINA_LIBP2P_EXTERNAL_IP` Sets your node’s external IP address to help other nodes find it. + `OPENMINA_LIBP2P_EXTERNAL_IP` Sets your node’s external IP address to help + other nodes find it. `OPENMINA_LIBP2P_PORT` Sets the port for Libp2p communication. 4. **Go to Dashboard** - Visit [http://localhost:8070](http://localhost:8070) to [monitor sync](http://localhost:8070/dashboard) and [block production](http://localhost:8070/block-production). + Visit [http://localhost:8070](http://localhost:8070) to + [monitor sync](http://localhost:8070/dashboard) and + [block production](http://localhost:8070/block-production). 
### Access Logs -Logs are stored in `openmina-workdir` with filenames like `openmina.log.2024-10-14`, `openmina.log.2024-10-15`, etc. +Logs are stored in `openmina-workdir` with filenames like +`openmina.log.2024-10-14`, `openmina.log.2024-10-15`, etc. ### Provide Feedback -Collect logs from `openmina-workdir` and report issues on the [rust-node-testing](https://discord.com/channels/484437221055922177/1290662938734231552) discord channel. Include reproduction steps if possible. +Collect logs from `openmina-workdir` and report issues on the +[rust-node-testing](https://discord.com/channels/484437221055922177/1290662938734231552) +discord channel. Include reproduction steps if possible. diff --git a/docs/building-from-source-guide.md b/docs/building-from-source-guide.md index 231e77e8e..1df3192eb 100644 --- a/docs/building-from-source-guide.md +++ b/docs/building-from-source-guide.md @@ -1,6 +1,7 @@ # How to build and launch a node from source -This installation guide has been tested on Debian and Ubuntu and should work on most distributions of Linux. +This installation guide has been tested on Debian and Ubuntu and should work on +most distributions of Linux. ## Prerequisites @@ -73,7 +74,8 @@ nvm install 20.11.1 #### Windows -Download [Node.js v20.11.1](https://nodejs.org/) from the official website, open the installer and follow the prompts to complete the installation. +Download [Node.js v20.11.1](https://nodejs.org/) from the official website, open +the installer and follow the prompts to complete the installation. ### 2. Angular CLI v16.2.0 diff --git a/docs/docker-installation.md b/docs/docker-installation.md index d4e7ee635..d568055ab 100644 --- a/docs/docker-installation.md +++ b/docs/docker-installation.md @@ -10,46 +10,47 @@ 1. 
Set up Docker's apt repository: - ```bash - # Add Docker's official GPG key: - sudo apt-get update - sudo apt-get install ca-certificates curl - sudo install -m 0755 -d /etc/apt/keyrings - sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc - sudo chmod a+r /etc/apt/keyrings/docker.asc - - # Add the repository to Apt sources: - echo \ - "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \ - $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \ - sudo tee /etc/apt/sources.list.d/docker.list > /dev/null - sudo apt-get update - ``` + ```bash + # Add Docker's official GPG key: + sudo apt-get update + sudo apt-get install ca-certificates curl + sudo install -m 0755 -d /etc/apt/keyrings + sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc + sudo chmod a+r /etc/apt/keyrings/docker.asc + + # Add the repository to Apt sources: + echo \ + "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \ + $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \ + sudo tee /etc/apt/sources.list.d/docker.list > /dev/null + sudo apt-get update + ``` 2. Install the Docker packages: - ```bash - sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin - ``` + ```bash + sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin + ``` 3. Add your user to the `docker` group: - ```bash - sudo usermod -aG docker $USER - newgrp docker - ``` + ```bash + sudo usermod -aG docker $USER + newgrp docker + ``` 4. Verify the installation: - ```bash - docker run hello-world - ``` + ```bash + docker run hello-world + ``` --- ## Docker Installation on Windows -1. Download and Install [Docker Desktop for Windows](https://www.docker.com/products/docker-desktop/). +1. 
Download and Install + [Docker Desktop for Windows](https://www.docker.com/products/docker-desktop/). 2. Ensure Docker Desktop is running (check the system tray icon). @@ -57,9 +58,10 @@ ### Docker Installation on macOS -1. Download and Install [Docker Desktop for Mac](https://www.docker.com/products/docker-desktop/). +1. Download and Install + [Docker Desktop for Mac](https://www.docker.com/products/docker-desktop/). 2. Verify the installation in Terminal: - ```bash - docker --version - ``` + ```bash + docker --version + ``` diff --git a/docs/local-demo-guide.md b/docs/local-demo-guide.md index 487c0646b..f26e68815 100644 --- a/docs/local-demo-guide.md +++ b/docs/local-demo-guide.md @@ -1,16 +1,17 @@ # Run Block Producers on local network -Once you have completed the [pre-requisites](./docs/docker-installation.md) for your operating system, follow these steps: +Once you have completed the [pre-requisites](./docs/docker-installation.md) for +your operating system, follow these steps: ## Setup Option 1: Download Docker Compose Files from the Release 1. **Download the Docker Compose files:** - - - Go to the [Releases page](https://github.com/openmina/openmina/releases) of this repository. - - Download the latest `openmina-vX.Y.Z-docker-compose.zip` (or `.tar.gz`) file corresponding to the release version (available since v0.8.0). + - Go to the [Releases page](https://github.com/openmina/openmina/releases) of + this repository. + - Download the latest `openmina-vX.Y.Z-docker-compose.zip` (or `.tar.gz`) + file corresponding to the release version (available since v0.8.0). 2. 
**Extract the files:** - - Unzip or untar the downloaded file: ```bash unzip openmina-vX.Y.Z-docker-compose.zip diff --git a/docs/local-webnode.md b/docs/local-webnode.md index 9050798d4..8c902a6ec 100644 --- a/docs/local-webnode.md +++ b/docs/local-webnode.md @@ -3,6 +3,7 @@ ## Steps Install the rust and node: + ```sh # Install rustup and set the default Rust toolchain to 1.84 (newer versions work too) curl https://sh.rustup.rs -sSf | sh -s -- -y --default-toolchain 1.84 @@ -18,42 +19,49 @@ nvm install 21 ``` Clone the repo + ```sh git clone https://github.com/openmina/openmina.git cd openmina/ ``` Build the web node, `wasm-pack` command should take a bit, around 10min + ```sh -cargo install wasm-pack -cd node/web +cargo install wasm-pack +cd node/web rustup toolchain install nightly -rustup override set nightly -wasm-pack build --target web --out-dir pkg +rustup override set nightly +wasm-pack build --target web --out-dir pkg cp -rf pkg ../../frontend/src/assets/webnode ``` Download `circuit-blobs`, from the root of project run: + ```sh cd frontend/src/assets/webnode git clone --depth 1 https://github.com/openmina/circuit-blobs.git ``` -And create `web-node-secrets.json` in `frontend/src/assets/webnode` it should be in following format: +And create `web-node-secrets.json` in `frontend/src/assets/webnode` it should be +in following format: + ```json { - "publicKey": "B62qk1UDzvtw82kiSznZEtSdFUg9oW8di5p53cVr2FxDBzjv9bW2Wf6", - "privateKey": "EKLf3tJd7aKegmhr5qyghagM25LQ98Cu413f1a5e18ubUMgGZY8x" + "publicKey": "B62qk1UDzvtw82kiSznZEtSdFUg9oW8di5p53cVr2FxDBzjv9bW2Wf6", + "privateKey": "EKLf3tJd7aKegmhr5qyghagM25LQ98Cu413f1a5e18ubUMgGZY8x" } ``` These keys can be generated by running: + ```sh cargo run -r --bin openmina -- misc mina-key-pair ``` Install frontend dependencies and run the webnode: + ```sh cd frontend npm run start:webnode -``` \ No newline at end of file +``` diff --git a/docs/scan-state.md b/docs/scan-state.md index f7fb8a091..f89adef62 100644 --- 
a/docs/scan-state.md
+++ b/docs/scan-state.md
@@ -1,63 +1,82 @@
-
 # The Scan State

-This is a data structure that queues transactions requiring transaction snark proofs and allows parallel processing of these transaction snarks by snark workers.
+This is a data structure that queues transactions requiring transaction snark
+proofs and allows parallel processing of these transaction snarks by snark
+workers.

-It is known as a _scan_ _state_ because it combines a scan-like operation with the updating of the Mina blockchain state. In functional programming, scans apply a specific operation to a sequence of elements, keeping track of the intermediate results at each step and producing a sequence of these accumulated values.
+It is known as a _scan_ _state_ because it combines a scan-like operation with
+the updating of the Mina blockchain state. In functional programming, scans
+apply a specific operation to a sequence of elements, keeping track of the
+intermediate results at each step and producing a sequence of these accumulated
+values.

-The sequence of elements is in this case a stream of incoming blocks of transactions, and the operation
+The sequence of elements is in this case a stream of incoming blocks of
+transactions, and the operation applied at each step is the production and
+merging of SNARK proofs for those transactions.

-The scan state represents a stack of binary trees (data structures in which each node has two children), where each node in the tree is a snark job to be completed by a snark worker. The scan state periodically returns a single proof from the top of a tree that attests to the correctness of all transactions at the base of the tree. The scan state defines the number of transactions in a block.
+The scan state represents a stack of binary trees (data structures in which each
+node has two children), where each node in the tree is a snark job to be
+completed by a snark worker. The scan state periodically returns a single proof
+from the top of a tree that attests to the correctness of all transactions at
+the base of the tree.
The scan state defines the number of transactions in a
+block.

-Currently, Mina allows 128 transactions to enter a block. A block producer, rather than completing the work themselves, may purchase the completed work from any snark workers from bids available in the snark pool.
+Currently, Mina allows 128 transactions to enter a block. A block producer,
+rather than completing the work themselves, may purchase the completed work from
+any snark worker whose bid is available in the snark pool.

**Possible states:**

**todo** - SNARK is requested, but hasn’t been completed yet.

-**pending** - SNARK for this transaction has been completed, but not included in the block
+**pending** - SNARK for this transaction has been completed, but not included in
+the block

**done** - completed and included in the block

-At the bottom of the binary tree are 128 SNARK jobs that may be attached to one of the following entries:
+At the bottom of the binary tree are 128 SNARK jobs that may be attached to one
+of the following entries:

-**Payments** - Transfers of Mina funds (non-ZK App transactions) and stake delegations
+**Payments** - Transfers of Mina funds (non-ZK App transactions) and stake
+delegations

-**Zk Apps** - Mina Protocol smart contracts powered by zk-SNARKs (ZK App transactions).
+**Zk Apps** - Mina Protocol smart contracts powered by zk-SNARKs (ZK App
+transactions).

-**Coinbases** - the transactions that create (or issue) new tokens that did not exist before. These new tokens are used to pay the block producer reward.
+**Coinbases** - the transactions that create (or issue) new tokens that did not
+exist before. These new tokens are used to pay the block producer reward.

**Fee Transfers** - Used to pay for SNARK work.
-**Merges** - Merging two existing SNARK jobs into one (SNARKing two SNARK jobs together) +**Merges** - Merging two existing SNARK jobs into one (SNARKing two SNARK jobs +together) **Empty** - Slot for SNARK job is empty -In the scan state, two basic constants define its behavior: +In the scan state, two basic constants define its behavior: -**transaction_capacity_log_2** - the maximum number of transactions that can be included in a block +**transaction_capacity_log_2** - the maximum number of transactions that can be +included in a block -**work_delay** - a certain delay (in terms of blocks) that ensures there is enough time for the snark work to be completed by the snark workers. +**work_delay** - a certain delay (in terms of blocks) that ensures there is +enough time for the snark work to be completed by the snark workers. **The number of trees in the scan state is affected by these constants:** - ## Maximum Number of Transactions - ## TxnsMax=2^(TCL) **TxnsMax** - maximum number of transactions, **TCL** - transaction_capacity_log_2 -Transactions form the childless leaves of the binary tree. While this determines the max number of possible transactions, it will contain other actions that require proofs such as ZK Apps, Fee Transfers and Coinbases. - +Transactions form the childless leaves of the binary tree. While this determines +the max number of possible transactions, it will contain other actions that +require proofs such as ZK Apps, Fee Transfers and Coinbases. ## Maximum Number of Trees - -## TreesMax=(TCL+1)*WD, +## TreesMax=(TCL+1)\*WD, TreesMax = maximum number of trees, @@ -65,10 +84,8 @@ TCL = transaction_capacity_log_2, WD = work_delay - ## Maximum number of proofs (per block) - ## MNP = 2^(TN/TCL+1)-1 MNP = max_number_of_proofs, @@ -77,221 +94,227 @@ TN= number of transactions, TCL = transaction_capacity_log_2 -Each stack of binary trees represents one _state_ of the Mina blockchain. 
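As a concrete illustration of the formulas above, the two capacity constants can be computed directly. This is an illustrative sketch, not actual Openmina code; the function names are hypothetical:

```rust
/// TxnsMax = 2^TCL: maximum number of transactions per block
/// (these form the leaves of one binary tree in the scan state).
fn txns_max(transaction_capacity_log_2: u32) -> u64 {
    1u64 << transaction_capacity_log_2
}

/// TreesMax = (TCL + 1) * WD: maximum number of trees in the scan state,
/// following the formula as stated in this document.
fn trees_max(transaction_capacity_log_2: u64, work_delay: u64) -> u64 {
    (transaction_capacity_log_2 + 1) * work_delay
}
```

With Mina's `transaction_capacity_log_2` of 7, `txns_max` yields the 128 transactions per block mentioned above; the simplified walkthrough later in this document uses 8 leaves, i.e. a capacity log of 3.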
The Scan State’s binary trees are filled up with SNARK jobs, starting from their leafs to their root.
-
+Each stack of binary trees represents one _state_ of the Mina blockchain. The
+Scan State’s binary trees are filled up with SNARK jobs, starting from their
+leaves to their root.

### How the Scan State works

-Here we have an illustrated example of how the scan state works.
+Here we have an illustrated example of how the scan state works.

-**Important: Note that this is a very simplified version in which we only use the concept of Payments as pending SNARK jobs, even though other actions may request SNARK work.**
+**Important: Note that this is a very simplified version in which we only use
+the concept of Payments as pending SNARK jobs, even though other actions may
+request SNARK work.**

-Currently, Mina’s scan state permits up to 128 transactions to be in a block, which means there may be up to 128 leaf nodes at the end of the binary tree. However, for the sake of explaining the concept, we only use a binary tree with a `max_no_of_transactions` = 8 (meaning there can be up to 8 transactions or other operations included in one block) and a `work_delay` = 1.
+Currently, Mina’s scan state permits up to 128 transactions to be in a block,
+which means there may be up to 128 leaf nodes at the end of the binary tree.
+However, for the sake of explaining the concept, we only use a binary tree with
+a `max_no_of_transactions` = 8 (meaning there can be up to 8 transactions or
+other operations included in one block) and a `work_delay` = 1.

-Also, note that the screenshots are taken from the Decentralized Snark Worker UI light mode, but by default the browser will open it up in dark mode.
+Also, note that the screenshots are taken from the Decentralized Snark Worker UI
+light mode, but by default the browser will open it up in dark mode.

Possible states:

**Todo** - SNARK is requested, but hasn’t been completed yet.
-**ongoing** - There is a commitment to produce a snark, the SNARK is not completed yet +**ongoing** - There is a commitment to produce a snark, the SNARK is not +completed yet -**pending** - SNARK for this transaction has been completed (is ready in SNARK pool), but has not been included in a block +**pending** - SNARK for this transaction has been completed (is ready in SNARK +pool), but has not been included in a block **done** - completed and included in the block -**available jobs** - When a new block comes, the scan state is updated with new available jobs - - +**available jobs** - When a new block comes, the scan state is updated with new +available jobs 1. Empty Scan State At Genesis, the scan state is empty and looks like this: - - ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/28fe2b45-fffa-4e36-8ad3-ef257a509c58) - - -2. Tree 0 is filled with todo SNARK jobs for 8 transactions +2. Tree 0 is filled with todo SNARK jobs for 8 transactions We add 8 Todo SNARK jobs (requests for SNARK jobs): - - ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/99eb76eb-355e-4aed-a601-bc73837d2d6a) +In the bottom row, the color scheme describes the various types of transactions +for which we have requested SNARK proofs. In this example, there are 5 payments +(dark grey in light mode, white in dark mode), 2 fee transfers (purple) and 1 +coinbase (cyan in light mode, light blue in dark mode). -In the bottom row, the color scheme describes the various types of transactions for which we have requested SNARK proofs. In this example, there are 5 payments (dark grey in light mode, white in dark mode), 2 fee transfers (purple) and 1 coinbase (cyan in light mode, light blue in dark mode). - -Above are 8 yellow blocks representing todo SNARK jobs, i.e. SNARK jobs that have been requested, but have yet to be completed. - - - -3. 
Tree 1 is filled with todo SNARK jobs for 8 transactions
-
-Now another block of 8 todo SNARK jobs is added, filling the leaves of the binary Tree 2. Because there is a `work_delay` of 1, no SNARK work is required at this stage.
+Above are 8 yellow blocks representing todo SNARK jobs, i.e. SNARK jobs that
+have been requested, but have yet to be completed.
+3. Tree 1 is filled with todo SNARK jobs for 8 transactions
+Now another block of 8 todo SNARK jobs is added, filling the leaves of the
+binary Tree 1. Because there is a `work_delay` of 1, no SNARK work is required
+at this stage.

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/3615e6d4-f053-4bf6-901e-49064af639f8)

+At the bottom of the two binary trees, we again see the various types of
+transactions for which we have requested SNARK proofs. In the new tree, there
+are 2 payments, 2 fee transfers and 4 coinbases.

-At the bottom of the two binary trees, we again see the various types of transactions for which we have requested SNARK proofs. In the new tree, there are 2 payments , 2 fee transfers and 4 coinbases.

-Above, we see that in Tree 1, some todo SNARK jobs are now pending (dark purple). This means that the SNARK job for that transaction has been completed, but it has not been included in a block yet.
-
-
+Above, we see that in Tree 0, some todo SNARK jobs are now pending (dark
+purple). This means that the SNARK job for that transaction has been completed,
+but it has not been included in a block yet.

-4. Tree 2 is filled with 8 todo SNARK jobs, Tree 0 gets first 4 pending merge jobs
-
-Another binary tree of SNARK jobs forms (Tree 3) and the transactions in Tree 1 are given SNARK proofs. Additionally, in Tree 1, four pending SNARK jobs are created for the merging of the existing 8 SNARKed transactions. Trees 1 and 2 have most of their SNARK jobs completed, but have not yet been added to a block:
+4.
Tree 2 is filled with 8 todo SNARK jobs, Tree 0 gets first 4 pending merge
+   jobs
+Another binary tree of SNARK jobs forms (Tree 2) and the transactions in Tree 0
+are given SNARK proofs. Additionally, in Tree 0, four pending SNARK jobs are
+created for the merging of the existing 8 SNARKed transactions. Trees 0 and 1
+have most of their SNARK jobs completed, but have not yet been added to a block:

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/fe19b898-6dc0-4266-90a5-01035223453e)

+5. Tree 3 is filled with 8 todo SNARK jobs for its transactions, trees 0 and 1
+   have pending merge jobs.
-
-
-5. Tree 3 is filled with 8 todo SNARK jobs for its transactions, trees 0 and 1 have pending merge jobs.
-
-The fourth binary tree of SNARK jobs is filled with 8 new transactions, and Tree 2 has all of its pending SNARK jobs complete, but they are yet to be added to a block. In Tree 0 and Tree 1, the existing transactions are given SNARK proofs, plus four pending SNARK jobs are created for the merging of these transactions.
-
-
+The fourth binary tree of SNARK jobs is filled with 8 new transactions, and Tree
+2 has all of its pending SNARK jobs complete, but they are yet to be added to a
+block. In Tree 0 and Tree 1, the existing transactions are given SNARK proofs,
+plus four pending SNARK jobs are created for the merging of these transactions.

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/1c13c745-8841-4bab-9acb-c31d53a23576)

+6. Tree 4 is filled with todo SNARK jobs for its transactions, trees 0 and 1
+   have pending merge jobs.
+Tree 4 is filled with 8 Todo SNARK jobs for its transactions. Since there is a
+work_delay of 1, there is a ‘latency’ of two preceding trees before the next
+phase of SNARK work or merging is performed.

-6. Tree 4 is filled with todo SNARK jobs for its transactions, trees 0 and 1 have pending merge jobs.
-
-Tree 4 is filled with 8 Todo SNARK jobs for its transactions.
Since there is a work_delay of 1, this means that there is a ‘latency’ of two preceding trees before the next phase of SNARK work or merging is performed. - -In practical terms, it means that Tree 2 will now have 4 Todo SNARK jobs to merge its transactions, while Tree 0 will have 2 Todo SNARK jobs to merge the 4 already once-merged SNARK jobs. - - +In practical terms, it means that Tree 2 will now have 4 Todo SNARK jobs to +merge its transactions, while Tree 0 will have 2 Todo SNARK jobs to merge the 4 +already once-merged SNARK jobs. ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/d17c8b53-978e-4712-8d6e-6fd848896a79) +7. Tree 5 is filled with todo SNARK jobs for its transactions, Tree 0 has almost + all of its SNARKs merged. - - -7. Tree 5 is filled with todo SNARK jobs for its transactions, Tree 0 has almost all of its SNARKs merged. - -This process is repeated until the maximum number of trees (6 in this example) is reached. With each additional tree, merging SNARK work is performed on the n-2 preceding tree, since there is a work_delay of 1. If there is a greater time delay, then this number increases (for example a work_delay of 2, then it would be the n-3 preceding tree). - - +This process is repeated until the maximum number of trees (6 in this example) +is reached. With each additional tree, merging SNARK work is performed on the +n-2 preceding tree, since there is a work_delay of 1. If there is a greater time +delay, then this number increases (for example a work_delay of 2, then it would +be the n-3 preceding tree). ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/7c048c08-31cb-4eae-89fa-8a15f59b25d4) +8. Maximum number of trees is reached, SNARK work is performed on n-2 preceding + trees. - - -8. Maximum number of trees is reached, SNARK work is performed on n-2 preceding trees. 
-
-Once the maximum number of trees is reached, merging SNARK work is performed on the remaining trees:
-
+Once the maximum number of trees is reached, merging SNARK work is performed on
+the remaining trees:

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/8a70e56f-6a7c-4d55-882e-aeec771bfe0f)

-
-In the diagram below, Trees 0 and 1 have had their SNARKs merged the maximum amount of times, so SNARK work in those trees is complete.
-
+In the diagram below, Trees 0 and 1 have had their SNARKs merged the maximum
+number of times, so SNARK work in those trees is complete.

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/2768f8b7-4714-406d-b10e-c11300b50e63)

+Merge jobs of existing SNARKs are performed until the entire scan state consists
+of SNARKs, meaning that all binary trees have had their transactions snarked,
+and the SNARKs have been merged 3 times.

-Merge jobs of existing SNARKs are performed until the entire scan state consists of SNARKs, meaning that all binary trees have had their transactions snarked, and the SNARKs have been merged 3 times.

-In this case, the merging happens 3 times = 8 transactions are added, then SNARKed, then merged into 4 SNARKs, which are then merged into 2 and, finally, 1:
-
-
+In this case, the merging happens 3 times: 8 transactions are added, then
+SNARKed, then merged into 4 SNARKs, which are then merged into 2 and, finally,
+1:

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/da09bca8-2955-454f-85a7-d9acb53e679c)

+When the Scan State looks like the picture above, it means that all binary trees
+have been filled out with transactions, these transactions have been SNARKed,
+and the SNARK proofs have been merged the maximum number of times.
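The repeated halving described above (8 base proofs merged into 4, then 2, then a single root) can be modeled in a few lines of Rust. This is an illustrative sketch, not the node's implementation:

```rust
/// Number of merge rounds needed to reduce `proofs` base proofs to a single
/// root proof; each round merges adjacent pairs of proofs.
fn merge_rounds(mut proofs: u64) -> u32 {
    let mut rounds = 0;
    while proofs > 1 {
        proofs /= 2; // one layer of the binary tree: pairs are merged
        rounds += 1;
    }
    rounds
}
```

For the 8-leaf example this gives 3 merge rounds, matching the text; for Mina's 128 transactions per block it gives 7.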
-When the Scan State looks as the picture above, it means that all binary trees have been filled out with transactions, these transactions have been SNARKed, and the SNARK proofs have been merged into SNARK proofs the maximum amount of times. - -Through the use of SNARK proofs and their merging, we end up with a very compressed and lightweight representation of the Mina blockchain. - +Through the use of SNARK proofs and their merging, we end up with a very +compressed and lightweight representation of the Mina blockchain. ## The latency issue -There is a significant delay (in terms of blocks) between when a transaction has been included in a block, and when a proof for that transaction is included. +There is a significant delay (in terms of blocks) between when a transaction has +been included in a block, and when a proof for that transaction is included. Let’s say a transaction is included in block no. 1 (not yet proved): - - ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/ccd45621-abbc-4c9b-9bc7-4ac34e07802d) - -this will fill a slot representing a transaction in a binary tree in the scan state: - +this will fill a slot representing a transaction in a binary tree in the scan +state: ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/afff4f74-435f-474d-9c27-4b3ad6a95187) +At some point a snark worker will pick up this job and start working on +completing the proof. -At some point a snark worker will pick up this job and start working on completing the proof. 
-The worker then completes the proof and broadcasts it to other peers, as a result, the completed job may be added to their snark pools:
-
-
+The worker then completes the proof and broadcasts it to other peers; as a
+result, the completed job may be added to their snark pools:

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/d88cd86e-2788-4b68-92a6-154fcde36609)

-
-In the Scan State, the binary tree is updated to show that the SNARK for that transaction has been completed (is purple), but has yet to be added to a block (green).
-
+In the Scan State, the binary tree is updated to show that the SNARK for that
+transaction has been completed (is purple), but has yet to be added to a block
+(green).

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/fb7767e1-b8cc-4734-b33f-85553819dafd)

+Eventually, a block producer will pick up this snark from their pool and include
+it in a block.

-Eventually, a block producer will pick up this snark from their pool and include it in a block.

-The block producer produces block no. 5, and includes the proof created for the transaction from block no. 1
-
-
+The block producer produces block no. 5, and includes the proof created for the
+transaction from block no.
1

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/a9b5ab38-0226-436a-bb39-d133799a73d5)

-This only means that the completed job has been added to a tree in the scan state (at the base), what needs to happen now is that this proof needs to be bubbled up to the top (through merges with other proofs):
-
-
+This only means that the completed job has been added to a tree in the scan
+state (at the base); what needs to happen now is that this proof needs to be
+bubbled up to the top (through merges with other proofs):

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/bdb23b8e-3f16-4b89-a2bd-7f2eb8f7d88d)

-Once this completed job got added to the base of a tree, a new merge job was created (which will merge this proof with another base proof):
-
-
+Once this completed job got added to the base of a tree, a new merge job was
+created (which will merge this proof with another base proof):

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/bc53fb64-a9df-437a-ab98-ba103497379c)

-Eventually, some SNARK worker will pick it up, complete it, broadcast it, and some producer will include it in another block. This will then create another merge job, and the process needs to be repeated:
-
-
+Eventually, some SNARK worker will pick it up, complete it, broadcast it, and
+some producer will include it in another block. This will then create another
+merge job, and the process needs to be repeated:

![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/1c2b704a-828b-444b-b7e7-122a9d49da9c)

-
-Eventually, a producer produces a block into which it will include the merge proof at the top of the tree that contained the proof for the transaction we included in the first step.
-
+Eventually, a producer produces a block into which it will include the merge
+proof at the top of the tree that contained the proof for the transaction we
+included in the first step.
![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/772b567d-067e-4fd9-b374-84e2126a855f)

+At this point, the snarked ledger hash is updated, and now the transaction
+(along with others) is part of the snarked ledger.

-At this point, the snarked ledger hash is updated, and now the transaction (along with others) is part of the snarked ledger.

-An alternative to having these trees would be that for each transaction proof we need to produce, we will need to wait for the previous transaction to be proved before we can prove the next. The result would be that the network would get stuck waiting for proofs to be ready all the time. Also, proofs would not be solvable in parallel (with the current setup we can have many snark workers solving different proofs that eventually get merged)
-
+An alternative to having these trees would be to wait, for each transaction
+proof, until the previous transaction has been proved before proving the next.
+The network would then constantly be stuck waiting for proofs to be ready, and
+proofs could not be produced in parallel (with the current setup, many snark
+workers can solve different proofs that eventually get merged).

## Throughput

-The issue with the latency results in a bottleneck for throughput.
At a certain +point, increasing throughput will cause the latency to skyrocket because there +will be too many SNARK jobs waiting to be merged and added to the Snarked +Ledger: ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/ef75ebc9-e1e4-4ba4-b6a2-50324e559567) - diff --git a/docs/snark-work.md b/docs/snark-work.md index c0ce3a1bf..a6d5ed6a9 100644 --- a/docs/snark-work.md +++ b/docs/snark-work.md @@ -1,115 +1,103 @@ - ## Committing and producing new SNARKs -SNARK proofs are the backbone of the Mina blockchain and are used for verifying the validity of transactions, blocks and other SNARKs. We want to optimize the production of SNARKs so that the Mina blockchain can continue operating and expanding. - +SNARK proofs are the backbone of the Mina blockchain and are used for verifying +the validity of transactions, blocks and other SNARKs. We want to optimize the +production of SNARKs so that the Mina blockchain can continue operating and +expanding. -**This is an overview of SNARK workflows. Click on the picture for a higher resolution:** +**This is an overview of SNARK workflows. Click on the picture for a higher +resolution:** [![image](https://github.com/openmina/openmina/assets/60480123/f32f8d6c-c20a-4984-9cab-0dbdc5eec5b1)](https://raw.githubusercontent.com/openmina/openmina/docs/cleanup/docs/OpenMina%20%2B%20ZK%20Diagrams.png) - - ### Receiving a Block to Update Available Jobs -Since blocks contain both transactions and SNARKs, each new block updates not only the staged ledger, but also the scan state (which contains SNARK proofs). - +Since blocks contain both transactions and SNARKs, each new block updates not +only the staged ledger, but also the scan state (which contains SNARK proofs). image - -Via the GossipSub (P2P), a node receives a new block that contains transactions and SNARK proofs. - +Via the GossipSub (P2P), a node receives a new block that contains transactions +and SNARK proofs. 
![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/02f74256-6ac4-420e-8762-bfb39c72d073) - -The work pool, which is a part of the modified SNARK pool and which contains the staged ledger and the scan state, is updated. The staged ledger includes the new blocks. The scan state is updated with the new jobs. - - +The work pool, which is a part of the modified SNARK pool and which contains the +staged ledger and the scan state, is updated. The staged ledger includes the new +blocks. The scan state is updated with the new jobs. ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/ebb3446c-8a26-4c20-9dca-e395e75470e8) - - ### Receiving a Commitment from a Rust Node -We want to avoid wasting time and resources in SNARK generation, specifically, we want to prevent Snarkers from working on the same pending snark job. For that purpose, we have introduced the notion of a _commitment_, in which SNARK workers commit to generating a proof for a pending SNARK job. +We want to avoid wasting time and resources in SNARK generation, specifically, +we want to prevent Snarkers from working on the same pending snark job. For that +purpose, we have introduced the notion of a _commitment_, in which SNARK workers +commit to generating a proof for a pending SNARK job. -This is a message made by SNARK workers that informs other peers in the network that they are committing to generating a proof for a pending SNARK job, so that other SNARK workers do not perform the same task and can instead make commitments to other SNARK jobs. +This is a message made by SNARK workers that informs other peers in the network +that they are committing to generating a proof for a pending SNARK job, so that +other SNARK workers do not perform the same task and can instead make +commitments to other SNARK jobs. -Commitments are made through an extra P2P layer that was created for this purpose. 
+Commitments are made through an extra P2P +purpose. ![image](https://github.com/openmina/openmina/assets/60480123/8966f501-c989-47dc-93e3-3477fbbdf5a3) - -Commitments are sent across WebRTC, which enables direct communication between peers via the P2P network. - +Commitments are sent across WebRTC, which enables direct communication between +peers via the P2P network. ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/9fa32591-0e63-40c1-91ab-c77a74c0e8b4) - Valid commitments are added to the _commitment pool_. -For a commitment to be added here, it has to: - - - -1. Be made for a SNARK job that is still marked as not yet completed in the scan state -2. Have no other prior commitment to that job. Alternatively, if there are other commitments, then only the one with the cheapest fee will be added. +For a commitment to be added here, it has to: +1. Be made for a SNARK job that is still marked as not yet completed in the scan + state +2. Have no other prior commitment to that job. Alternatively, if there are other + commitments, then only the one with the cheapest fee will be added. image - -The work pool, which is a part of the modified SNARK pool, is updated with a commitment (including its fee) for a specific pending SNARK job. - +The work pool, which is a part of the modified SNARK pool, is updated with a +commitment (including its fee) for a specific pending SNARK job. ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/5951772d-f0c0-4ad8-bb0b-089c5f42659e) - -The commitments, once added to the commitment pool, are then broadcasted by the node other peers in the network through direct WebRTC P2P communication. +The commitments, once added to the commitment pool, are then broadcast by the +node to other peers in the network through direct WebRTC P2P communication. image - - ### Receiving a SNARK from an OCaml node The Rust node receives a SNARK proof from an OCaml node (an OCaml SNARK worker).
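The two admission rules for the commitment pool described above can be sketched as a predicate over a simplified work-pool view. This is an illustrative sketch only, not the actual openmina API: the types, field names, and `try_add_commitment` function are hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical, simplified types; the real openmina state is richer.
type JobId = u64;

struct Commitment {
    job: JobId,
    fee: u64,
}

#[derive(Default)]
struct WorkPool {
    /// Scan-state jobs known to the node: job id -> already completed?
    jobs: HashMap<JobId, bool>,
    /// Cheapest commitment accepted so far, per job.
    commitments: HashMap<JobId, Commitment>,
}

impl WorkPool {
    /// Accept a commitment only if (1) the job is still marked as not yet
    /// completed in the scan state, and (2) there is no prior commitment
    /// with a cheaper (or equal) fee for the same job.
    fn try_add_commitment(&mut self, c: Commitment) -> bool {
        match self.jobs.get(&c.job).copied() {
            Some(false) => {}  // known job, not yet completed
            _ => return false, // unknown job or already completed
        }
        match self.commitments.get(&c.job) {
            // A cheaper (or equally cheap) commitment already exists.
            Some(prev) if prev.fee <= c.fee => false,
            // First commitment for this job, or a cheaper one: replace it.
            _ => {
                self.commitments.insert(c.job, c);
                true
            }
        }
    }
}
```

A cheaper commitment replaces an existing one, mirroring the "only the one with the cheapest fee will be added" rule.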
- ![image](https://github.com/openmina/openmina/assets/60480123/fbde0660-df6d-4184-8d8c-b2f8832b711b) - The SNARK is verified. - image - - -If it is the lowest fee SNARK for a specific pending SNARK job, then it is added to the SNARK pool, from where block producers can take SNARKs and add them into blocks. - - +If it is the lowest fee SNARK for a specific pending SNARK job, then it is added +to the SNARK pool, from where block producers can take SNARKs and add them into +blocks. If it is the lowest fee SNARK for that job, then it is added to the SNARK pool - image - - -After this, the updated SNARK pool with the completed (but not yet included in a block) SNARK is broadcast across the PubSub P2P network via the topic `mina/snark-work/1.0.0` (SNARK pool diff) _and_ directly to other nodes via WebRTC. - - +After this, the updated SNARK pool with the completed (but not yet included in a +block) SNARK is broadcast across the PubSub P2P network via the topic +`mina/snark-work/1.0.0` (SNARK pool diff) _and_ directly to other nodes via +WebRTC. ![image](https://github.com/JanSlobodnik/pre-publishing/assets/60480123/f02fc1f4-e30e-4296-9a20-b7b57e2cf4a1) - image - ### Receiving SNARK from Rust node Rust node sends SNARK via P2P. @@ -122,39 +110,38 @@ SNARK is verified. If it is the lowest fee, it will be added to the SNARK pool. - ### Committing and producing a SNARK -Once committed to a pending SNARK job, a SNARK worker will then produce a SNARK. - +Once committed to a pending SNARK job, a SNARK worker will then produce a SNARK. image - -If a commitment is for a SNARK job that is marked as not yet completed in the scan state and there are no prior commitments to that job (Alternatively, if there are other commitments, then it is the commitment with the cheapest fee for the SNARK work), it is added to the SNARK pool. 
- +If a commitment is for a SNARK job that is marked as not yet completed in the +scan state and there are no prior commitments to that job (Alternatively, if +there are other commitments, then it is the commitment with the cheapest fee for +the SNARK work), it is added to the SNARK pool. From the SNARK pool, it can be committed to one of the following: image - - 1. An available job that hasn’t been completed or included in a block 2. A job that has been already performed, but the new commitment has a lower fee - -If the commitment is for the lowest fee available, then the SNARK worker begins working on the SNARK proof, which is performed in OCaml. After it is done, the generated SNARK is sent back to the SNARK worker (Rust). - +If the commitment is for the lowest fee available, then the SNARK worker begins +working on the SNARK proof, which is performed in OCaml. After it is done, the +generated SNARK is sent back to the SNARK worker (Rust). image -A SNARK worker starts working on the committed job. The SNARK proof that is generated is then checked by a prover in OCaml, after which it is sent back to the SNARK worker. +A SNARK worker starts working on the committed job. The SNARK proof that is +generated is then checked by a prover in OCaml, after which it is sent back to +the SNARK worker. The SNARK proof is then sent to the SNARK pool. - image - -From here, it is broadcast to Rust nodes directly via WebRTC P2P, and to OCaml nodes indirectly via the `mina/snark-work/1.0.0` (SNARK pool diff) topic of the PubSub P2P network. +From here, it is broadcast to Rust nodes directly via WebRTC P2P, and to OCaml +nodes indirectly via the `mina/snark-work/1.0.0` (SNARK pool diff) topic of the +PubSub P2P network. 
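The lowest-fee rule that governs the SNARK pool in the sections above can be sketched as follows. This is illustrative only: `accept_snark`, the plain integer fees, and the map-based pool are hypothetical stand-ins for the real verified-SNARK pool.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: keep only the cheapest verified SNARK per job.
/// Returns true when the proof is added (and would then be rebroadcast
/// over the pubsub topic and WebRTC); false when an equally cheap or
/// cheaper proof is already in the pool.
fn accept_snark(pool: &mut HashMap<u64, u64>, job: u64, fee: u64) -> bool {
    match pool.get(&job) {
        // An equally cheap or cheaper proof already exists: drop the new one.
        Some(&best) if best <= fee => false,
        // First proof for this job, or a cheaper one: replace and accept.
        _ => {
            pool.insert(job, fee);
            true
        }
    }
}
```

The same comparison applies both when a SNARK arrives from an OCaml node and when one is produced locally: only an accepted (lowest-fee) proof is broadcast further.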
diff --git a/docs/testing/README.md b/docs/testing/README.md index f1af32925..5e27e9434 100644 --- a/docs/testing/README.md +++ b/docs/testing/README.md @@ -3,161 +3,216 @@ ## Table of contents - [P2P tests](#p2p-tests) - - [RPC](#rpc) - - [Kademlia](#kademlia) - - [Identify](#identify) - - [Connection](#connection) + - [RPC](#rpc) + - [Kademlia](#kademlia) + - [Identify](#identify) + - [Connection](#connection) - [Scenarios](#scenarios) - - [Connection Discovery](#connection-discovery) - - [P2P Connections](#p2p-connections) - - [Kademlia](#p2p-kademlia) - - [Pubsub](#p2p-pubsub) - - [P2P Incoming](#p2p-incoming) - - [P2p Outgoing](#p2p-outgoing) - - [Single Node](#single-node) - - [Multi Node](#multi-node) - - [Record/Reply](#recordreplay) + - [Connection Discovery](#connection-discovery) + - [P2P Connections](#p2p-connections) + - [Kademlia](#p2p-kademlia) + - [Pubsub](#p2p-pubsub) + - [P2P Incoming](#p2p-incoming) + - [P2p Outgoing](#p2p-outgoing) + - [Single Node](#single-node) + - [Multi Node](#multi-node) + - [Record/Replay](#recordreplay) ## P2p tests ### [RPC](../../p2p/tests/rpc.rs) -* `rust_to_rust`: test that rust node can receive and send response to and from another rust node -* `rust_to_many_rust_query`: tests that rust node can respond to many rust peers -* `rust_to_many_rust`: test that rust node can send request to many rust peers -* rpc tests, these tests check if node can correctly communicate over rpc: -* `initial_peers`: check that initial peers are correctly sent and received -* `best_tip_with_proof`: check that best tip is correctly sent and received -* `ledger_query`: check that ledger query is sent correctly and received -* `staged_ledger_aux_and_pending_coinbases_at_block`: fails with `attempt to subtract with overflow` in yamux -* `block`: fails with `attempt to subtract with overflow` in yamux +- `rust_to_rust`: test that rust node can receive and send response to and from + another rust node +- `rust_to_many_rust_query`: tests that rust
node can respond to many rust peers +- `rust_to_many_rust`: test that rust node can send request to many rust peers +- rpc tests: these tests check that the node can communicate correctly over rpc: +- `initial_peers`: check that initial peers are correctly sent and received +- `best_tip_with_proof`: check that best tip is correctly sent and received +- `ledger_query`: check that ledger query is sent correctly and received +- `staged_ledger_aux_and_pending_coinbases_at_block`: fails with + `attempt to subtract with overflow` in yamux +- `block`: fails with `attempt to subtract with overflow` in yamux ### [Kademlia](../../p2p/tests/kademlia.rs) -* `kademlia_routing_table`: tests that node receives peers using kademlia -* `kademlia_incoming_routing_table`: test that kademlia is updated with incoming peer -* `bootstrap_no_peers`: test that kademlia bootstrap finished event if no peers are passed -* `discovery_seed_single_peer`: test nodes discovery over kademlia -* `discovery_seed_multiple_peers`: test node discovery and identify integration -* `test_bad_node`: test that if node gives us invalid peers we handle it +- `kademlia_routing_table`: tests that node receives peers using kademlia +- `kademlia_incoming_routing_table`: test that kademlia is updated with incoming + peer +- `bootstrap_no_peers`: test that kademlia bootstrap finishes even if no peers + are passed +- `discovery_seed_single_peer`: test node discovery over kademlia +- `discovery_seed_multiple_peers`: test node discovery and identify integration +- `test_bad_node`: test that if a node gives us invalid peers we handle it +### [Identify](../../p2p/tests/identify.rs) -* `rust_node_to_rust_node`: test if rust node can identify another rust node +- `rust_node_to_rust_node`: test if rust node can identify another rust node ### [Connection](../../p2p/tests/connection.rs) -* `rust_to_rust`: test if rust node can connect to rust node -* `rust_to_libp2p`: test if out node can connect to rust libp2p -* `libp2p_to_rust`:
test if libp2p node can connect to rust node -* `mutual_rust_to_rust`: test if one rust node can connect to second rust node, while second node is trying to connect to first one -* `mutual_rust_to_rust_many`: test that many rust nodes can connect to each other at the same time -* `mutual_rust_to_libp2p`: test if rust node can connect to libp2p node, while libp2p node is trying to connect to rust node -* `mutual_rust_to_libp2p_port_reuse`: test that rust node can resolve mutual connection between itself and libp2p node, currently failing due to [Issue #399](https://github.com/openmina/openmina/issues/399) +- `rust_to_rust`: test if rust node can connect to rust node +- `rust_to_libp2p`: test if our node can connect to rust libp2p +- `libp2p_to_rust`: test if libp2p node can connect to rust node +- `mutual_rust_to_rust`: test if one rust node can connect to second rust node, + while second node is trying to connect to first one +- `mutual_rust_to_rust_many`: test that many rust nodes can connect to each + other at the same time +- `mutual_rust_to_libp2p`: test if rust node can connect to libp2p node, while + libp2p node is trying to connect to rust node +- `mutual_rust_to_libp2p_port_reuse`: test that rust node can resolve mutual + connection between itself and libp2p node, currently failing due to + [Issue #399](https://github.com/openmina/openmina/issues/399) ## Scenarios ### [Connection Discovery](../../node/testing/src/scenarios/multi_node/connection_discovery.rs) -We want to test whether the Rust node can connect and discover peers from Ocaml node, and vice versa -
-* `RustToOCaml`: -This test ensures that after the Rust node connects to an OCaml node with a known address, it adds its address to its Kademlia state. It also checks that the OCaml node has a peer with the correct peer_id and port corresponding to the Rust node.
- -* `OCamlToRust`: -This test ensures that after an OCaml node connects to the Rust node, its address becomes available in the Rust node’s Kademlia state. It also checks whether the OCaml node has a peer with the correct `peer_id` and a port corresponding to the Rust node. - -* `RustToOCamlViaSeed`: -This test ensures that the Rust node can connect to an OCaml peer, the address of whom can only be discovered from an OCaml seed node, and that the Rust node adds its address to its Kademlia state. It also checks whether the OCaml node has a peer with the correct `peer_id` and port corresponding to the Rust node. Initially, the OCaml seed node has the other two nodes in its peer list, while the OCaml node and the Rust node only have the seed node. The two (OCaml and Rust) non-seed nodes connect to the OCaml seed node. Once connected, they gain information about each other from the seed node. They then make a connection between themselves. If the test is successful, then at the end of this process, each node has each other in its peer list. - -* `OCamlToRustViaSeed`: This test ensures that an OCaml node can connect to the Rust node, the address of which can only be discovered from an OCaml seed node, and its address becomes available in the Rust node’s Kademlia state. It also checks whether the OCaml node has a peer with the correct `peer_id` and a port corresponding to the Rust node. - -* `RustNodeAsSeed`: This test ensures that the Rust node can work as a seed node by running two OCaml nodes that only know about the Rust node’s address. After these nodes connect to the Rust node, the test makes sure that they also have each other’s addresses as their peers. +We want to test whether the Rust node can connect and discover peers from Ocaml +node, and vice versa + +- `RustToOCaml`: This test ensures that after the Rust node connects to an OCaml + node with a known address, it adds its address to its Kademlia state. 
It also + checks that the OCaml node has a peer with the correct peer_id and port + corresponding to the Rust node. + +- `OCamlToRust`: This test ensures that after an OCaml node connects to the Rust + node, its address becomes available in the Rust node’s Kademlia state. It also + checks whether the OCaml node has a peer with the correct `peer_id` and a port + corresponding to the Rust node. + +- `RustToOCamlViaSeed`: This test ensures that the Rust node can connect to an + OCaml peer whose address can only be discovered from an OCaml seed + node, and that the Rust node adds its address to its Kademlia state. It also + checks whether the OCaml node has a peer with the correct `peer_id` and port + corresponding to the Rust node. Initially, the OCaml seed node has the other + two nodes in its peer list, while the OCaml node and the Rust node only have + the seed node. The two (OCaml and Rust) non-seed nodes connect to the OCaml + seed node. Once connected, they gain information about each other from the + seed node. They then make a connection between themselves. If the test is + successful, then at the end of this process, each node has each other in its + peer list. + +- `OCamlToRustViaSeed`: This test ensures that an OCaml node can connect to the + Rust node, whose address can only be discovered from an OCaml seed + node, and its address becomes available in the Rust node’s Kademlia state. It + also checks whether the OCaml node has a peer with the correct `peer_id` and a + port corresponding to the Rust node. + +- `RustNodeAsSeed`: This test ensures that the Rust node can work as a seed node + by running two OCaml nodes that only know about the Rust node’s address. After + these nodes connect to the Rust node, the test makes sure that they also have + each other’s addresses as their peers.
### [P2P Connections](../../node/testing/tests/p2p_basic_connections.rs) -* `SimultaneousConnections`: -Tests if two nodes are connecting to each other at the same time, they should be -connected, so each one has exactly one connection. +- `SimultaneousConnections`: Tests that if two nodes connect to each other at + the same time, they end up connected, so each one has exactly one + connection. -* `AllNodesConnectionsAreSymmetric` -Connections between all peers are symmetric, i.e. if the node1 has the node2 among its active peers, then the node2 should have the node1 as its active peers. +- `AllNodesConnectionsAreSymmetric` Connections between all peers are symmetric, + i.e. if the node1 has the node2 among its active peers, then the node2 should + have the node1 among its active peers. -* `SeedConnectionsAreSymmetric` -Connections with other peers are symmetric for seed node, i.e. if a node is the seed's peer, then it has the node among its peers. +- `SeedConnectionsAreSymmetric` Connections with other peers are symmetric for + the seed node, i.e. if a node is the seed's peer, then it has the node among its + peers. -* `MaxNumberOfPeersIncoming`: -Test that Rust node's incoming connections are limited. +- `MaxNumberOfPeersIncoming`: Test that Rust node's incoming connections are + limited. -* `MaxNumberOfPeersIs1` -Two nodes with max peers = 1 can connect to each other. +- `MaxNumberOfPeersIs1` Two nodes with max peers = 1 can connect to each other. ### [P2P Kademlia](../../node/testing/tests/p2p_kad.rs) Test related to kademlia layer. -* `KademliaBootstrap`: -Test that node discovers peers another rust node and is able to bootstrap +- `KademliaBootstrap`: Test that node discovers peers from another rust node and + is able to bootstrap ### [P2P Pubsub](../../node/testing/tests/p2p_pubsub.rs) Tests related to pubsub layer.
-* `P2pReceiveMessage` -Test that node receives message over meshsub from node +- `P2pReceiveMessage` Test that node receives message over meshsub from node ### [P2P Incoming](../../node/testing/tests/p2p_basic_incoming.rs) Tests related to handling incoming connections. -* `AcceptIncomingConnection`: Node should accept incoming connections. -* `AcceptMultipleIncomingConnections`: Node should accept multiple incoming connections. +- `AcceptIncomingConnection`: Node should accept incoming connections. +- `AcceptMultipleIncomingConnections`: Node should accept multiple incoming + connections. ### [P2P Outgoing](../../node/testing/tests/p2p_basic_outgoing.rs) Tests related to outgoing connections -* `MakeOutgoingConnection`: Node should be able to make an outgoing connection to a listening node. +- `MakeOutgoingConnection`: Node should be able to make an outgoing connection + to a listening node. -* `MakeMultipleOutgoingConnections`: Node should be able to create multiple outgoing connections. +- `MakeMultipleOutgoingConnections`: Node should be able to create multiple + outgoing connections. -* `DontConnectToNodeWithSameId`: Node shouldn't establish connection with a node with the same peer_id. +- `DontConnectToNodeWithSameId`: Node shouldn't establish connection with a node + with the same peer_id. -* `DontConnectToInitialPeerWithSameId`: Node shouldn't connect to a node with the same peer id even if its address specified in initial peers. +- `DontConnectToInitialPeerWithSameId`: Node shouldn't connect to a node with + the same peer id even if its address is specified in initial peers. -* `DontConnectToSelfInitialPeer`: Node shouldn't connect to itself even if its address specified in initial peers. +- `DontConnectToSelfInitialPeer`: Node shouldn't connect to itself even if its + address is specified in initial peers. -* `ConnectToInitialPeers`: Node should be able to connect to all initial peers.
+- `ConnectToInitialPeers`: Node should be able to connect to all initial peers. -* `ConnectToUnavailableInitialPeers`: Node should repeat connecting to unavailable initial peer. +- `ConnectToUnavailableInitialPeers`: Node should repeat connecting to + unavailable initial peer. -* `ConnectToInitialPeersBecomeReady`: Node should be able to connect to all initial peers after they become ready. +- `ConnectToInitialPeersBecomeReady`: Node should be able to connect to all + initial peers after they become ready. ### [Single Node](../../node/testing/tests/single_node.rs): -We want to test whether the Rust node is compatible with the OCaml node. We achieve this by attempting to connect the Openmina node to the existing OCaml testnet. - -For that purpose, we are utilizing a _solo node_, which is a single Open Mina node connected to a network of OCaml nodes. Currently, we are using the public testnet, but later on we want to use our own network of OCaml nodes on our cluster. +We want to test whether the Rust node is compatible with the OCaml node. We +achieve this by attempting to connect the Openmina node to the existing OCaml +testnet. -* `SoloNodeBasicConnectivityAcceptIncoming`: Local test to ensure that the Openmina node can accept a connection from an existing OCaml node. +For that purpose, we are utilizing a _solo node_, which is a single Open Mina +node connected to a network of OCaml nodes. Currently, we are using the public +testnet, but later on we want to use our own network of OCaml nodes on our +cluster. -* `SoloNodeBasicConnectivityInitialJoining`: Local test to ensure that the Openmina node can connect to an existing OCaml testnet. +- `SoloNodeBasicConnectivityAcceptIncoming`: Local test to ensure that the + Openmina node can accept a connection from an existing OCaml node. -* `SoloNodeSyncRootSnarkedLedger`: Set up single Rust node and sync up root snarked ledger. 
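A minimal sketch of the outgoing-connection rules exercised by the tests above (rejecting same-peer-id and self connections). `should_dial` and its arguments are hypothetical illustrations, not the actual openmina API:

```rust
/// Decide whether an outgoing connection attempt should proceed.
/// Covers DontConnectToNodeWithSameId / DontConnectToSelfInitialPeer:
/// a peer advertising our own peer id (including ourselves, even when
/// listed in the initial peers) is never dialed.
fn should_dial(own_peer_id: &str, candidate_peer_id: &str, already_connected: bool) -> bool {
    if candidate_peer_id == own_peer_id {
        return false;
    }
    // Avoid duplicate connections to peers we are already connected to.
    !already_connected
}
```

Unavailable initial peers would simply be re-queued and dialed again later, matching `ConnectToUnavailableInitialPeers`.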
+- `SoloNodeBasicConnectivityInitialJoining`: Local test to ensure that the + Openmina node can connect to an existing OCaml testnet. -* `SoloNodeSyncRootSnarkedLedger`: Set up single Rust node and sync up root snarked ledger. +- `SoloNodeSyncRootSnarkedLedger`: Set up single Rust node and sync up root + snarked ledger. +- `SoloNodeBootstrap`: Set up single Rust node and bootstrap snarked ledger, + bootstrap ledger and blocks. ### [Multi Node](../../node/testing/tests/multi_node.rs): -We also want to test a scenario in which the network consists only of Openmina nodes. If the Openmina node is using a functionality that is implemented only in the OCaml node, and it does not perform it correctly, then we will not be able to see it with solo node test. For that purpose, we utilize a Multi node test, which involves a network of our nodes, without any third party, so that the testing is completely local and under our control. +We also want to test a scenario in which the network consists only of Openmina +nodes. If the Openmina node is using a functionality that is implemented only in +the OCaml node, and it does not perform it correctly, then we will not be able +to see it with a solo node test. For that purpose, we utilize a Multi node test, +which involves a network of our nodes, without any third party, so that the +testing is completely local and under our control. -* `MultiNodeBasicConnectivityPeerDiscovery`: Tests that our node is able to discovery Ocaml nodes through Ocaml seed node. +- `MultiNodeBasicConnectivityPeerDiscovery`: Tests that our node is able to + discover OCaml nodes through an OCaml seed node. -* `MultiNodeBasicConnectivityInitialJoining`: Tests that node maintains number of peers between minimum and maximum allowed peers. +- `MultiNodeBasicConnectivityInitialJoining`: Tests that node maintains number + of peers between minimum and maximum allowed peers.
### [Record/Replay](../../node/testing/tests/record_replay.rs) -* `RecordReplayBootstrap`: Bootstrap a rust node while recorder of state and input actions is enabled and make sure we can successfully replay it. +- `RecordReplayBootstrap`: Bootstrap a rust node while recording of state and + input actions is enabled and make sure we can successfully replay it. -* `RecordReplayBlockProduction`: Makes sure we can successfully record and replay multiple nodes in the cluster + block production. +- `RecordReplayBlockProduction`: Makes sure we can successfully record and + replay multiple nodes in the cluster + block production. diff --git a/docs/testing/bootstrap.md b/docs/testing/bootstrap.md index 34c5a4940..ca3c9151f 100644 --- a/docs/testing/bootstrap.md +++ b/docs/testing/bootstrap.md @@ -10,6 +10,7 @@ The node's HTTP port is accessible at http://1.k8.openmina.com:31001. These are the main steps and checks: First, it performs some checks on the instance deployed previously: + - Node is in sync state - Node's best tip is the same as that of berkeleynet @@ -19,7 +20,5 @@ Then it deploys the new instance of Openmina and waits until it is bootstrapped - The node's best tip is the same as in berkeleynet - There were no restarts for the openmina container - See the [Openmina Daily](../../.github/workflows/daily.yaml) workflow file for further details. - diff --git a/docs/testing/cluster.md b/docs/testing/cluster.md index 0ba75a870..c3ef1f558 100644 --- a/docs/testing/cluster.md +++ b/docs/testing/cluster.md @@ -6,4 +6,3 @@ The namespace `test-openmina-daily` is used. It has a service account `github-tester` with `edit` role that allows it to have full control over the namespace's resources. This account is used by GitHub actions that run daily tests.
- diff --git a/docs/testing/testing.md b/docs/testing/testing.md index fbfc59d10..25b97d9a5 100644 --- a/docs/testing/testing.md +++ b/docs/testing/testing.md @@ -1,9 +1,7 @@ - # A Testing Framework for Mina - - ### Table of Contents + - [Introduction](#introduction) - [What we are testing](#what-we-are-testing) - [1. Network Connectivity and Peer Management](#1-network-connectivity-and-peer-management) @@ -41,394 +39,475 @@ - [Upgradability](#upgradability) - [7. How to run tests](#7-how-to-run-tests) - ### Introduction -Complex systems that handle important information such as blockchain networks must be thoroughly and continuously tested to ensure the highest degree of security, stability, and performance. - -To achieve that, we need to develop a comprehensive testing framework capable of deploying a variety of tests. - -Such a framework plays a pivotal role in assessing a blockchain's resistance to various malicious attacks. By simulating these attack scenarios and vulnerabilities, the framework helps identify weaknesses in the blockchain's security measures, enabling developers to fortify the system's defenses. This proactive approach is essential to maintain trust and integrity within the blockchain ecosystem, as it minimizes the risk of breaches and instills confidence in users and stakeholders. - -Secondly, a robust testing framework is equally crucial for evaluating the blockchain's scalability, speed, and stability. As blockchain networks grow in size and adoption, they must accommodate an increasing number of transactions and users while maintaining a high level of performance and stability. Scalability tests ensure that the system can handle greater workloads without degradation in speed or reliability, helping to avoid bottlenecks and congestion that can hinder transactions and overall network performance. - -Additionally, stability testing assesses the blockchain's ability to operate consistently under various conditions, even amid a protocol upgrade. 
We want to identify potential issues or crashes that could disrupt operations before they have a chance of occurring on the mainnet. - - +Complex systems that handle important information such as blockchain networks +must be thoroughly and continuously tested to ensure the highest degree of +security, stability, and performance. + +To achieve that, we need to develop a comprehensive testing framework capable of +deploying a variety of tests. + +Such a framework plays a pivotal role in assessing a blockchain's resistance to +various malicious attacks. By simulating these attack scenarios and +vulnerabilities, the framework helps identify weaknesses in the blockchain's +security measures, enabling developers to fortify the system's defenses. This +proactive approach is essential to maintain trust and integrity within the +blockchain ecosystem, as it minimizes the risk of breaches and instills +confidence in users and stakeholders. + +Secondly, a robust testing framework is equally crucial for evaluating the +blockchain's scalability, speed, and stability. As blockchain networks grow in +size and adoption, they must accommodate an increasing number of transactions +and users while maintaining a high level of performance and stability. +Scalability tests ensure that the system can handle greater workloads without +degradation in speed or reliability, helping to avoid bottlenecks and congestion +that can hinder transactions and overall network performance. + +Additionally, stability testing assesses the blockchain's ability to operate +consistently under various conditions, even amid a protocol upgrade. We want to +identify potential issues or crashes that could disrupt operations before they +have a chance of occurring on the mainnet. ### What we are testing -Here is a limited overview of test categories. Tests are currently focused on the network and P2P layer, the next steps will be consensus, ledger, and other parts. 
- -We need to work with the assumption that more than one-third of the nodes can be Byzantine for the system to function correctly. +Here is a limited overview of test categories. Tests are currently focused on +the network and P2P layer; the next steps will be consensus, ledger, and other +parts. +We need to work with the assumption that up to one-third of the nodes can be +Byzantine while the system still functions correctly. ## 1. Network Connectivity and Peer Management - ### Network Connectivity -Nodes that get disconnected should eventually be able to reconnect and synchronize with the network. +Nodes that get disconnected should eventually be able to reconnect and +synchronize with the network. -_This test assesses the blockchain node's ability to maintain consistent network connectivity. It evaluates whether a node can gracefully handle temporary disconnections from the network and subsequently reestablish connections._ +_This test assesses the blockchain node's ability to maintain consistent network +connectivity. It evaluates whether a node can gracefully handle temporary +disconnections from the network and subsequently reestablish connections._ -We want to ensure that new nodes can join the network and handle being overwhelmed with connections or data requests, including various resilience and stability conditions (e.g., handling reconnections, latency, intermittent connections, and dynamic IP handling). +We want to ensure that new nodes can join the network and handle being +overwhelmed with connections or data requests, including various resilience and +stability conditions (e.g., handling reconnections, latency, intermittent +connections, and dynamic IP handling). -This is crucial for ensuring no node is permanently isolated and can always participate in the blockchain's consensus process. +This is crucial for ensuring no node is permanently isolated and can always +participate in the blockchain's consensus process.
We are testing two versions of the node: - ### Solo node -We want to test whether the Rust node is compatible with the OCaml node. We achieve this by attempting to connect the Openmina node to the existing OCaml testnet. +We want to test whether the Rust node is compatible with the OCaml node. We +achieve this by attempting to connect the Openmina node to the existing OCaml +testnet. -For that purpose, we are utilizing a _solo node_, which is a single Open Mina node connected to a network of OCaml nodes. Currently, we are using the public testnet, but later on we want to use our own network of OCaml nodes on our cluster. +For that purpose, we are utilizing a _solo node_, which is a single Open Mina +node connected to a network of OCaml nodes. Currently, we are using the public +testnet, but later on we want to use our own network of OCaml nodes on our +cluster. -This test is performed by launching an Openmina node and connecting it to seed nodes of the public (or private) OCaml testnet. +This test is performed by launching an Openmina node and connecting it to seed +nodes of the public (or private) OCaml testnet. _The source code for this test can be found in this repo:_ [https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/solo_node/basic_connectivity_initial_joining.rs](https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/solo_node/basic_connectivity_initial_joining.rs) - - **We are testing the following scenarios:** #### Node Discovery Tests #### Rust node accepts incoming OCaml connection -Whether the Openmina (Rust) node can accept an incoming connection from the native Mina (OCaml) node. This test will prove our Rust node is listening to incoming connections and can accept them. - - +Whether the Openmina (Rust) node can accept an incoming connection from the +native Mina (OCaml) node. This test will prove our Rust node is listening to +incoming connections and can accept them. 
**Tests**
+
- [p2p_basic_incoming(accept_connection)](../node/testing/src/scenarios/p2p/basic_incoming_connections.rs#L16)
- [p2p_basic_incoming(accept_multiple_connections)](../node/testing/src/scenarios/p2p/basic_incoming_connections.rs#L62)
- [solo_node_accept_incoming](../node/testing/src/scenarios/solo_node/basic_connectivity_accept_incoming.rs)
-
#### OCaml connection to advertised Rust node
-Whether the OCaml node can discover and connect to a Rust node that is advertising itself. This is done by advertising the Rust node so that the OCaml node can discover it and connect to the node.
-
-In this test, we do not inform the OCaml node to connect to it explicitly, it should find it automatically and connect using peer discovery (performed through Kademlia). This test will ensure the Rust node uses Kademlia in a way that is compatible with the OCaml node.
-
+Whether the OCaml node can discover and connect to a Rust node that is
+advertising itself. This is done by advertising the Rust node so that the OCaml
+node can discover it and connect to the node.
+
+In this test, we do not inform the OCaml node to connect to it explicitly; it
+should find it automatically and connect using peer discovery (performed through
+Kademlia). This test will ensure the Rust node uses Kademlia in a way that is
+compatible with the OCaml node.
**Test**
-
- [solo_node_accept_incoming](../node/testing/src/scenarios/solo_node/basic_connectivity_accept_incoming.rs)
-
#### Rust and OCaml node discovery via OCaml seed node
-The main goal of this test is to ensure that the Rust node can discover peers in the network, and is discoverable by other peers as well.
+The main goal of this test is to ensure that the Rust node can discover peers in
+the network, and is discoverable by other peers as well.
1. In this test, three nodes are started:
-*OCaml seed node with known address and peer ID
-*OCaml node with the seed node set as the initial peer
-*Rust node with the seed node set as the initial peer
-
-Initially, the OCaml seed node has the other two nodes in its peer list, while the OCaml node and the Rust node only have the seed node.
+- OCaml seed node with known address and peer ID
+- OCaml node with the seed node set as the initial peer
+- Rust node with the seed node set as the initial peer
+
+Initially, the OCaml seed node has the other two nodes in its peer list, while
+the OCaml node and the Rust node only have the seed node.
![peer1](https://github.com/openmina/openmina/assets/60480123/bb2c8428-7e89-4748-949a-4b8aa5954205)
-
-
2. The two (OCaml and Rust) non-seed nodes connect to the OCaml seed node
-
![peer2](https://github.com/openmina/openmina/assets/60480123/480ffeb0-e7c7-4f16-bed3-76281a19e2bf)
-
3. Once connected, they gain information about each other from the seed node.
-They then make a connection between themselves. If the test is successful, then at the end of this process, each node has each other in its peer list.
+They then make a connection between themselves. If the test is successful, then
+at the end of this process, each node has the others in its peer list.
![peer3](https://github.com/openmina/openmina/assets/60480123/3ee75cd4-68cf-453c-aa7d-40c09b11d83b)
-
**Implementation Details**
-The Helm chart that is used to deploy the network also contains the script that performs the checks.
-
+The Helm chart that is used to deploy the network also contains the script that
+performs the checks.
#### OCaml Peer Discovery Tests
-We must ensure that the Rust node can utilize the Kademlia protocol (KAD) to discover and connect to OCaml nodes, and vice versa.
-
-For that purpose, we have developed a series of basic tests that check the correct peer discovery via KAD when the Rust node is connected to OCaml peers.
+We must ensure that the Rust node can utilize the Kademlia protocol (KAD) to
+discover and connect to OCaml nodes, and vice versa.
+
+For that purpose, we have developed a series of basic tests that check the
+correct peer discovery via KAD when the Rust node is connected to OCaml peers.
#### OCaml to Rust
-This test ensures that after an OCaml node connects to the Rust node, its address \
-becomes available in the Rust node’s Kademlia state. It also checks whether the OCaml \
-node has a peer with the correct `peer_id` and a port corresponding to the Rust node.
+This test ensures that after an OCaml node connects to the Rust node, its
+address becomes available in the Rust node’s Kademlia state. It also checks
+whether the OCaml node has a peer with the correct `peer_id` and a port
+corresponding to the Rust node.
**Steps**
-
-
1. Configure and launch a Rust node
2. Start an OCaml node with the Rust node as the only peer
-3. Run the Rust node until it receives an event signaling that the OCaml node is connected
-4. Wait for an event Identify that is used to identify the remote peer’s address and port
-5. Check that the Rust node has an address of the OCaml node in its Kademlia part of the state
-
+3. Run the Rust node until it receives an event signaling that the OCaml node is
+   connected
+4. Wait for an event Identify that is used to identify the remote peer’s address
+   and port
+5. Check that the Rust node has an address of the OCaml node in its Kademlia
+   part of the state
#### Rust to OCaml
-This test ensures that after the Rust node connects to an OCaml node with a known \
-address, it adds its address to its Kademlia state. It also checks that the OCaml \
-node has a peer with the correct peer_id and port corresponding to the Rust node.
+This test ensures that after the Rust node connects to an OCaml node with a
+known address, it adds its address to its Kademlia state. It also checks that
+the OCaml node has a peer with the correct `peer_id` and port corresponding to
+the Rust node.
Steps:
-
-
1. Start an OCaml node and wait for its p2p to be ready
2. Start a Rust node and initiate its connection to the OCaml node
-3. Run the Rust node until it receives an event signaling that connection is established
-4. Run the Rust node until it receives a Kademlia event signaling that the address of the OCaml node has been added
-5. Check that the Rust node has an address of the OCaml node in its Kademlia part of the state
-
+3. Run the Rust node until it receives an event signaling that connection is
+   established
+4. Run the Rust node until it receives a Kademlia event signaling that the
+   address of the OCaml node has been added
+5. Check that the Rust node has an address of the OCaml node in its Kademlia
+   part of the state
#### OCaml to Rust via seed Node
-This test ensures that an OCaml node can connect to the Rust node, the address of which can only be discovered from an OCaml seed node, and its address becomes available in the Rust node’s Kademlia state. It also checks whether the OCaml node has a peer with the correct `peer_id` and a port corresponding to the Rust node.
+This test ensures that an OCaml node can connect to the Rust node, the address
+of which can only be discovered from an OCaml seed node, and its address becomes
+available in the Rust node’s Kademlia state. It also checks whether the OCaml
+node has a peer with the correct `peer_id` and a port corresponding to the Rust
+node.
**Steps**
-
-
1. Start an OCaml node acting as a seed node and wait for its P2P to be ready
2. Start a Rust node and initiate its connection to the seed node
-3. Run the Rust node until it receives an event signaling that connection is established
+3. Run the Rust node until it receives an event signaling that connection is
+   established
4. Start an OCaml node acting with the seed node as its peer
-5. Run the Rust node until it receives an event signaling that the connection with the OCaml node has been established
-6. Check that the Rust node has an address of the OCaml node in its Kademlia part of the state
-
+5. Run the Rust node until it receives an event signaling that the connection
+   with the OCaml node has been established
+6. Check that the Rust node has an address of the OCaml node in its Kademlia
+   part of the state
#### Rust to OCaml via seed Node
-This test ensures that the Rust node can connect to an OCaml peer, the address of whom can only be discovered from an OCaml seed node, and that the Rust node adds its address to its Kademlia state. It also checks whether the OCaml node has a peer with the correct `peer_id` and port corresponding to the Rust node.
+This test ensures that the Rust node can connect to an OCaml peer whose address
+can only be discovered from an OCaml seed node, and that the Rust node adds its
+address to its Kademlia state. It also checks whether the OCaml node has a peer
+with the correct `peer_id` and port corresponding to the Rust node.
Steps:
-
-
1. Start an OCaml node acting as a seed node
-2. Start an OCaml node acting with the seed node as its peer and wait for its p2p to be ready
+2. Start an OCaml node acting with the seed node as its peer and wait for its
+   p2p to be ready
3. Start a Rust node and initiate its connection to the seed node
-4. Run the Rust node until it receives an event signaling that connection with the seed node is established
-5. Run the Rust node until it receives an event signaling that connection with the non-seed OCaml node is established
-6. Check that the Rust node has an address of the OCaml node in its Kademlia part of the state
-
+4. Run the Rust node until it receives an event signaling that connection with
+   the seed node is established
+5. Run the Rust node until it receives an event signaling that connection with
+   the non-seed OCaml node is established
+6. Check that the Rust node has an address of the OCaml node in its Kademlia
+   part of the state
#### Rust as a Seed Node
This test ensures that the Rust node can work as a seed node by running two \
-OCaml nodes that only know about the Rust node’s address. After these nodes connect \
-to the Rust node, the test makes sure that they also have each other’s addresses \
+OCaml nodes that only know about the Rust node’s address. After these nodes connect \
+to the Rust node, the test makes sure that they also have each other’s addresses \
as their peers.
**Steps**
-
1. Start a Rust node
2. Start two OCaml nodes, specifying the Rust node address as their peer
-3. Wait for events indicating that connections with both OCaml nodes are established
+3. Wait for events indicating that connections with both OCaml nodes are
+   established
4. Check that both OCaml nodes have each other’s address as their peers
-5. Check that the Rust node has addresses of both OCaml nodes in the Kademlia state
+5. Check that the Rust node has addresses of both OCaml nodes in the Kademlia
+   state
#### Test Conditions
We run these tests until:
-
-* The number of known peers is greater than or equal to the maximum number of peers.
-* The number of connected peers is greater than or equal to some threshold.
-* The test is failed if the specified number of steps occur but the conditions are not met.
-
+- The number of known peers is greater than or equal to the maximum number of
+  peers.
+- The number of connected peers is greater than or equal to some threshold.
+- The test fails if the specified number of steps occurs but the conditions are
+  not met.
### Multi node
-We also want to test a scenario in which the network consists only of Openmina nodes. If the Openmina node is using a functionality that is implemented only in the OCaml node, and it does not perform it correctly, then we will not be able to see it with solo node test.
+We also want to test a scenario in which the network consists only of Openmina
+nodes. If the Openmina node is using a functionality that is implemented only in
+the OCaml node, and it does not perform it correctly, then we will not be able
+to see it with a solo node test.
-For that purpose, we utilize a Multi node test, which involves a network of our nodes, without any third party, so that the testing is completely local and under our control.
+For that purpose, we utilize a Multi node test, which involves a network of our
+nodes, without any third party, so that the testing is completely local and
+under our control.
#### Rust connection to all initially available nodes
-This test checks whether the Rust node connects to all peers from its initial peer list
-
-
+This test checks whether the Rust node connects to all peers from its initial
+peer list.
**Test**
-
-- [multi_node_initial_joining](../node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs) (partially?)
+- [multi_node_initial_joining](../node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs)
+  (partially?)
#### Rust node connects to advertised Rust node
-This test checks whether Rust nodes connect to a Rust node that is advertised. In this test, we do not inform the OCaml node to connect to it explicitly, it should find it automatically and connect using peer discovery (performed through Kademlia).
-
-
+This test checks whether Rust nodes connect to a Rust node that is advertised.
+In this test, we do not inform the connecting Rust node to connect to it
+explicitly; it should find it automatically and connect using peer discovery
+(performed through Kademlia).
**Test** -- [multi_node_connection_discovery/OCamlToRustViaSeed](../node/testing/src/scenarios/multi_node/connection_discovery.rs#L267) +- [multi_node_connection_discovery/OCamlToRustViaSeed](../node/testing/src/scenarios/multi_node/connection_discovery.rs#L267) _The source code for this test can be found in this repo:_ [https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs#L9](https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs#L9) - - ### Adaptive Peer Management -Nodes should be able to discover and connect to new peers if their current peers become unresponsive or malicious. - -_This test evaluates the blockchain node's capacity to adapt to changing network conditions. It examines whether a node can autonomously identify unresponsive or malicious peers and replace them with trustworthy counterparts. Adaptive peer management enhances the network's resilience against potential attacks or unreliable participants._ +Nodes should be able to discover and connect to new peers if their current peers +become unresponsive or malicious. +_This test evaluates the blockchain node's capacity to adapt to changing network +conditions. It examines whether a node can autonomously identify unresponsive or +malicious peers and replace them with trustworthy counterparts. Adaptive peer +management enhances the network's resilience against potential attacks or +unreliable participants._ ## 2. Network Resistance - ### Resistance to DDoS Attacks -The network should remain operational even under targeted Denial-of-Service attacks on specific nodes or infrastructure. - -_This test focuses on the node's ability to withstand Distributed Denial-of-Service (DDoS) attacks, which can overwhelm a node's resources and render it inaccessible. 
It assesses whether the node can continue to function and serve the network even when subjected to deliberate and sustained attacks, as well as how much of these attacks it can withstand while remaining operational._ +The network should remain operational even under targeted Denial-of-Service +attacks on specific nodes or infrastructure. +_This test focuses on the node's ability to withstand Distributed +Denial-of-Service (DDoS) attacks, which can overwhelm a node's resources and +render it inaccessible. It assesses whether the node can continue to function +and serve the network even when subjected to deliberate and sustained attacks, +as well as how much of these attacks it can withstand while remaining +operational._ ### Resistance to Eclipse Attacks -Honest nodes should not be isolated by malicious nodes in a way that they only receive information from these malicious entities. - -_This test examines the blockchain node's resistance to eclipse attacks, where malicious nodes surround and isolate honest nodes, limiting their access to accurate information. It ensures that honest nodes can always access a diverse set of peers and aren't dominated by malicious actors._ +Honest nodes should not be isolated by malicious nodes in a way that they only +receive information from these malicious entities. +_This test examines the blockchain node's resistance to eclipse attacks, where +malicious nodes surround and isolate honest nodes, limiting their access to +accurate information. It ensures that honest nodes can always access a diverse +set of peers and aren't dominated by malicious actors._ ### Resistance to Sybil Attacks -The network should function even if an adversary creates a large number of pseudonymous identities. Honest nodes should still be able to connect with other honest nodes. - -_This test assesses the network's ability to mitigate Sybil attacks, wherein an adversary creates numerous fake identities to control a substantial portion of the network. 
It verifies that the network can maintain its integrity and continue to operate effectively despite the presence of these pseudonymous attackers._ +The network should function even if an adversary creates a large number of +pseudonymous identities. Honest nodes should still be able to connect with other +honest nodes. +_This test assesses the network's ability to mitigate Sybil attacks, wherein an +adversary creates numerous fake identities to control a substantial portion of +the network. It verifies that the network can maintain its integrity and +continue to operate effectively despite the presence of these pseudonymous +attackers._ ### Resistance to Censorship -The network should resist attempts by any subset of nodes to consistently censor or block certain transactions or blocks. - -_This test assesses the node's ability to resist censorship attempts by a subset of nodes. It verifies that the network's design prevents any small group from censoring specific transactions or blocks, upholding the blockchain's openness and decentralization._ +The network should resist attempts by any subset of nodes to consistently censor +or block certain transactions or blocks. +_This test assesses the node's ability to resist censorship attempts by a subset +of nodes. It verifies that the network's design prevents any small group from +censoring specific transactions or blocks, upholding the blockchain's openness +and decentralization._ ## 3. Node Bootstrapping and Data Availability - ### Node Bootstrapping -New nodes joining the network should eventually discover and connect to honest peers and synchronize the latest blockchain state. +New nodes joining the network should eventually discover and connect to honest +peers and synchronize the latest blockchain state. -_This test evaluates the node's capability to bootstrap onto the network successfully. 
It ensures that newly joined nodes can find trustworthy peers, initiate synchronization, and catch up with the current state of the blockchain, enhancing network scalability._
+_This test evaluates the node's capability to bootstrap onto the network
+successfully. It ensures that newly joined nodes can find trustworthy peers,
+initiate synchronization, and catch up with the current state of the blockchain,
+enhancing network scalability._
-This test is focused on ensuring that the latest Openmina build is able to bootstrap against Berkeleynet. It is executed on daily basis.
+This test is focused on ensuring that the latest Openmina build is able to
+bootstrap against Berkeleynet. It is executed on a daily basis.
-The node's HTTP port is accessible as [http://1.k8.openmina.com:31001](http://1.k8.openmina.com:31001/).
+The node's HTTP port is accessible at
+[http://1.k8.openmina.com:31001](http://1.k8.openmina.com:31001/).
These are the main steps and checks. First, it performs some checks on the instance deployed previously:
+- Node is in sync state
+- Node's best tip is the same as that of berkeleynet
+
+Then it deploys the new instance of Openmina and waits until it is bootstrapped
+(with a 10-minute timeout). After that, it performs the following checks:
-* Node is in sync state
-* Node's best tip is the one that of berkeleynet
-
-Then it deploys the new instance of Openmina and waits until it is bootstrapped (with 10 minutes timeout). After that it performs the following checks:
-
-
-* Node's best tip is the one that of berkeleynet
-* There were no restarts for openmina container
-
-See [Openmina Daily](https://github.com/openmina/openmina/blob/develop/.github/workflows/daily.yaml) workflow file for further details.
+- Node's best tip is the same as that of berkeleynet
+- There were no restarts of the openmina container
+
+See
+[Openmina Daily](https://github.com/openmina/openmina/blob/develop/.github/workflows/daily.yaml)
+workflow file for further details.
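The daily bootstrap checks above can be sketched as a simple predicate. This is a hedged illustration only: the `NodeStatus` shape, the `"synced"` label, and the restart counter are hypothetical placeholders, not the node's actual API.

```rust
/// Hypothetical snapshot of a node's state; the field names are
/// illustrative, not the real openmina HTTP API.
struct NodeStatus {
    sync_status: String,
    best_tip: String,
}

/// Returns the list of failed checks; an empty list means the daily
/// bootstrap test passed.
fn check_bootstrap(
    status: &NodeStatus,
    berkeleynet_best_tip: &str,
    container_restarts: u32,
) -> Vec<&'static str> {
    let mut failures = Vec::new();
    if status.sync_status != "synced" {
        failures.push("node is not in sync state");
    }
    if status.best_tip != berkeleynet_best_tip {
        failures.push("best tip differs from berkeleynet");
    }
    if container_restarts != 0 {
        failures.push("openmina container restarted");
    }
    failures
}
```

In the actual workflow these values would come from the deployed instance and from Kubernetes; here they are plain inputs so the check logic stays testable in isolation.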
### Data Availability -Any piece of data (like a block or transaction) that is part of the blockchain should be available to any node that requests it. - -_This test confirms that the blockchain node can consistently provide requested data to other nodes in the network. It guarantees that data availability is maintained, promoting transparency and trust in the blockchain's history._ +Any piece of data (like a block or transaction) that is part of the blockchain +should be available to any node that requests it. +_This test confirms that the blockchain node can consistently provide requested +data to other nodes in the network. It guarantees that data availability is +maintained, promoting transparency and trust in the blockchain's history._ ## 4. Ledger Consistency and Propagation - ### Consistent View of the Ledger -All honest nodes in the network should eventually have a consistent view of the ledger, agreeing on the order and content of blocks. - -_This test ensures that all honest nodes converge on a consistent ledger view. It validates that nodes reach consensus on the order and content of blocks, preventing forks and ensuring a single, agreed-upon version of the blockchain._ +All honest nodes in the network should eventually have a consistent view of the +ledger, agreeing on the order and content of blocks. +_This test ensures that all honest nodes converge on a consistent ledger view. +It validates that nodes reach consensus on the order and content of blocks, +preventing forks and ensuring a single, agreed-upon version of the blockchain._ ### Block Propagation -Every new block that is mined or created should eventually be received by every honest node in the network. - -_This test checks the blockchain node's efficiency in propagating newly created blocks throughout the network. 
It verifies that no node is excluded from receiving critical block updates, maintaining the blockchain's integrity._ +Every new block that is mined or created should eventually be received by every +honest node in the network. +_This test checks the blockchain node's efficiency in propagating newly created +blocks throughout the network. It verifies that no node is excluded from +receiving critical block updates, maintaining the blockchain's integrity._ ### Transaction/Snark Propagation -Every transaction/snark broadcasted by a user should eventually be received and processed by the miners or validators in the network. - -_This test examines the node's ability to promptly disseminate user-generated transactions and Snarks to the network. It ensures that these transactions are reliably processed by miners or validators, facilitating efficient transaction processing._ +Every transaction/snark broadcasted by a user should eventually be received and +processed by the miners or validators in the network. +_This test examines the node's ability to promptly disseminate user-generated +transactions and Snarks to the network. It ensures that these transactions are +reliably processed by miners or validators, facilitating efficient transaction +processing._ ## 5. Blockchain Progress and Fairness - ### Chain Progress -New blocks should be added to the blockchain at regular intervals, ensuring that the system continues to process transactions. - -_This test assesses whether the blockchain node can consistently add new blocks to the chain at regular intervals. It guarantees that the blockchain remains operational and can accommodate a continuous influx of transactions._ +New blocks should be added to the blockchain at regular intervals, ensuring that +the system continues to process transactions. +_This test assesses whether the blockchain node can consistently add new blocks +to the chain at regular intervals. 
It guarantees that the blockchain remains +operational and can accommodate a continuous influx of transactions._ ### Fairness in Transaction Processing -Transactions should not be perpetually ignored or deprioritized by the network. Honest transactions should eventually get processed. - -_This test evaluates the node's fairness in processing transactions. We want to ensure that no valid transactions are unjustly ignored or delayed, maintaining a fair and efficient transaction processing system._ +Transactions should not be perpetually ignored or deprioritized by the network. +Honest transactions should eventually get processed. +_This test evaluates the node's fairness in processing transactions. We want to +ensure that no valid transactions are unjustly ignored or delayed, maintaining a +fair and efficient transaction processing system._ ## 6. Scalability and upgradibility - ### Network Scalability -As the number of participants or the rate of transactions increases, the network should still maintain its liveness properties. - -_This test examines how well the blockchain network can handle increased traffic and participation without compromising its liveness properties, ensuring that it remains robust and responsive as it scales._ +As the number of participants or the rate of transactions increases, the network +should still maintain its liveness properties. +_This test examines how well the blockchain network can handle increased traffic +and participation without compromising its liveness properties, ensuring that it +remains robust and responsive as it scales._ ### Upgradability -The network should be able to upgrade or change protocols without halting or fragmenting, ensuring continuous operation - -_This test ensures that the blockchain network can seamlessly undergo protocol upgrades or changes without causing disruptions or fragmenting the network. 
It supports the network's adaptability and longevity._ +The network should be able to upgrade or change protocols without halting or +fragmenting, ensuring continuous operation -These expanded descriptions provide a comprehensive understanding of the key tests for assessing the functionality and security of a blockchain node. Each test contributes to the overall robustness and reliability of the blockchain network. +_This test ensures that the blockchain network can seamlessly undergo protocol +upgrades or changes without causing disruptions or fragmenting the network. It +supports the network's adaptability and longevity._ +These expanded descriptions provide a comprehensive understanding of the key +tests for assessing the functionality and security of a blockchain node. Each +test contributes to the overall robustness and reliability of the blockchain +network. ## 7. How to run tests cargo test --release --features scenario-generators - - diff --git a/docs/why-openmina.md b/docs/why-openmina.md index 6d5d14198..6844f0d02 100644 --- a/docs/why-openmina.md +++ b/docs/why-openmina.md @@ -1,18 +1,33 @@ - # Why are we developing the Open Mina node and the Mina Web node? ## Diversifying the Mina ecosystem -Just as with any blockchain, Mina benefits from increasing its diversity of nodes. It contributes to the network’s security, improves the protocol's clarity and ensures transparency across the blockchain. Additionally, it fosters an environment conducive to innovation while also safeguarding the integrity of all network participants. +Just as with any blockchain, Mina benefits from increasing its diversity of +nodes. It contributes to the network’s security, improves the protocol's clarity +and ensures transparency across the blockchain. Additionally, it fosters an +environment conducive to innovation while also safeguarding the integrity of all +network participants. 
## Choosing Rust for safety and stability -In systems responsible for safeguarding financial data, such as Mina, security and stability are of paramount importance. Hence, we have chosen Rust as our preferred language due to its exceptional levels of security, memory safety, and its ability to prevent concurrency issues, such as race conditions. +In systems responsible for safeguarding financial data, such as Mina, security +and stability are of paramount importance. Hence, we have chosen Rust as our +preferred language due to its exceptional levels of security, memory safety, and +its ability to prevent concurrency issues, such as race conditions. ## Increasing network resilience -With multiple development teams actively engaged in creating various node implementations, the identification and resolution of bugs become a smoother process, diminishing the likelihood of negative impacts on the ecosystem. Since the burden of chain validation isn't concentrated in a single implementation, any bugs that do arise are effectively isolated within a limited subset of nodes, minimizing the potential impact on the entire blockchain. +With multiple development teams actively engaged in creating various node +implementations, the identification and resolution of bugs become a smoother +process, diminishing the likelihood of negative impacts on the ecosystem. Since +the burden of chain validation isn't concentrated in a single implementation, +any bugs that do arise are effectively isolated within a limited subset of +nodes, minimizing the potential impact on the entire blockchain. ## More node options for the Mina community -Lastly, users always benefit from having more options for running a Mina node. People may have different preferences when it comes to node implementations. Each programming language brings its own unique set of features to the table. 
The availability of a wide range of nodes empowers users to make choices that align with their specific preferences and requirements.
+Lastly, users always benefit from having more options for running a Mina node.
+People may have different preferences when it comes to node implementations.
+Each programming language brings its own unique set of features to the table.
+The availability of a wide range of nodes empowers users to make choices that
+align with their specific preferences and requirements.
diff --git a/frontend/README.md b/frontend/README.md
index 7bff8ff86..139e8b1d0 100644
--- a/frontend/README.md
+++ b/frontend/README.md
@@ -1,6 +1,7 @@
# Openmina Frontend
-This is a simple Angular application that will help you to see the behaviour of your local rust based mina node.
+This is a simple Angular application that will help you see the behaviour of
+your local Rust-based Mina node.
## Prerequisites
@@ -23,7 +24,8 @@ nvm install 23.1.0
#### Windows
-Download [Node.js v23.1.0](https://nodejs.org/) from the official website, open the installer and follow the prompts to complete the installation.
+Download [Node.js v23.1.0](https://nodejs.org/) from the official website, open
+the installer and follow the prompts to complete the installation.
Now you can use your code from o1js-wrapper inside the Angular application by using `BenchmarksWalletsZkService => o1jsInterface` +4. That's it. Now you can use your code from o1js-wrapper inside the Angular + application by using `BenchmarksWalletsZkService => o1jsInterface` diff --git a/genesis_ledgers/README.md b/genesis_ledgers/README.md index 74bf75cbe..8d9091f6f 100644 --- a/genesis_ledgers/README.md +++ b/genesis_ledgers/README.md @@ -1 +1 @@ -See https://github.com/openmina/openmina-genesis-ledgers \ No newline at end of file +See https://github.com/openmina/openmina-genesis-ledgers diff --git a/helm/README.md b/helm/README.md index aa67f65fa..17848364b 100644 --- a/helm/README.md +++ b/helm/README.md @@ -2,23 +2,26 @@ ## Runing Openmina Snarker for Berkeley Testnet -This will start openmina snarker using account, chain ID and fee as specified in [openmina/values.yaml]. -``` sh +This will start the Openmina snarker using the account, chain ID, and fee +specified in [openmina/values.yaml]. + +```sh helm install openmina ./openmina ``` To use e.g.
different account and fee, run -``` sh +```sh helm install openmina ./openmina --set=openmina.snarkerPublicKey= --set=openmina.fee= ``` ## Runnig Openmina Nodes with Bootstrap Replayer -The command [bootstrap.sh] prints Helm commands that should be executed to install all needed pieces: +The command [bootstrap.sh] prints Helm commands that should be executed to +install all needed pieces: -``` sh -$ ./bootstrap.sh +```sh +$ ./bootstrap.sh helm upgrade --install bootstrap-replayer ./bootstrap-replayer helm upgrade --install openmina1 ./openmina --set=openmina.peers="/dns4/bootstrap-bootstrap-replayer/tcp/8302/p2p/12D3KooWETkiRaHCdztkbmrWQTET9HMWimQPx5sH5pLSRZNxRsjw /2axsdDAiiZee7hUsRPMtuyHt94UMrvJmMQDhDjKhdRhgqkMdy8e/http/openmina1/3000 /2bpACUcRh2u7WJ3zSBRWZZvQMTMofYr9SGQgcP2YKzwwDKanNAy/http/openmina2/3000 /2aQA3swTKVf16YgLXZS7TizU7ASgZ8LidEgyHhChpDinrvM9NMi/http/openmina3/3000" --set=openmina.secretKey=5KJKg7yAbYAQcNGWcKFf2C4ruJxwrHoQvsksU16yPzFzXHMsbMc helm upgrade --install openmina2 ./openmina --set=openmina.peers="/dns4/bootstrap-bootstrap-replayer/tcp/8302/p2p/12D3KooWETkiRaHCdztkbmrWQTET9HMWimQPx5sH5pLSRZNxRsjw /2axsdDAiiZee7hUsRPMtuyHt94UMrvJmMQDhDjKhdRhgqkMdy8e/http/openmina1/3000 /2bpACUcRh2u7WJ3zSBRWZZvQMTMofYr9SGQgcP2YKzwwDKanNAy/http/openmina2/3000 /2aQA3swTKVf16YgLXZS7TizU7ASgZ8LidEgyHhChpDinrvM9NMi/http/openmina3/3000" --set=openmina.secretKey=5JgkZGzHPC2SmQqRGxwbFjZzFMLvab5tPwkiN29HX9Vjc9rtwV4 diff --git a/ledger/README.md b/ledger/README.md index 3f24ff684..ee674f74b 100644 --- a/ledger/README.md +++ b/ledger/README.md @@ -3,15 +3,15 @@ Rust implementation of the mina ledger ## Run tests: + ```bash cargo test --release ``` ## Run tests on wasm: + ```bash export RUSTFLAGS="-C target-feature=+atomics,+bulk-memory,+mutable-globals -C link-arg=--max-memory=4294967296" wasm-pack test --release --chrome --headless -- -Z build-std=std,panic_abort # browser chrome wasm-pack test --release --firefox --headless -- -Z build-std=std,panic_abort # browser 
firefox ``` - - diff --git a/macros/src/action_event.md b/macros/src/action_event.md index 6267af32b..388b48849 100644 --- a/macros/src/action_event.md +++ b/macros/src/action_event.md @@ -4,7 +4,6 @@ Derives `[openmina_core::ActionEvent]` trait implementation for action. For action containers, it simply delegates to inner actions. - ```rust # use openmina_core::ActionEvent; # @@ -81,7 +80,6 @@ impl openmina_core::ActionEvent for Action { ### Summary field - If an action has doc-comment, its first line will be used for `summary` field of tracing events for the action. @@ -116,7 +114,8 @@ impl openmina_core::ActionEvent for Action { ### Fields -Certain fields can be added to the tracing event, using `#[action_event(fields(...))]` attribute. +Certain fields can be added to the tracing event, using +`#[action_event(fields(...))]` attribute. ```rust #[derive(openmina_core::ActionEvent)] diff --git a/macros/src/serde_yojson_enum.md b/macros/src/serde_yojson_enum.md index 8ac7730a4..3239d5443 100644 --- a/macros/src/serde_yojson_enum.md +++ b/macros/src/serde_yojson_enum.md @@ -4,18 +4,22 @@ This module provides a custom derive macro `SerdeYojsonEnum` that generates implementations for `serde::Serialize` and `serde::Deserialize` for enums. The macro is designed to serialize and deserialize enums in a format compatible -with the OCaml `yojson` format. Each enum variant is serialized as a tuple where -the first element is a string representing the variant name converted to snake_case -with the first letter capitalized (UpperCamelCase to Snake_case), and the following -elements are the variant's fields. +with the OCaml `yojson` format. Each enum variant is serialized as a tuple where +the first element is a string representing the variant name converted to +snake_case with the first letter capitalized (UpperCamelCase to Snake_case), and +the following elements are the variant's fields. 
## Supported Enum Variants -- **Unit variants**: These are serialized as a single-element tuple, with just the variant name in Snake_case (capitalized first letter). -- **Struct-like variants**: These are serialized as a two-element tuple, with the variant name in Snake_case and - a JSON object representing the named fields. -- **Tuple-like variants with a single tuple element**: These are serialized as a tuple with the variant name in Snake_case - followed by the serialized fields of the contained tuple. The macro can handle single tuple elements of arbitrary length. +- **Unit variants**: These are serialized as a single-element tuple, with just + the variant name in Snake_case (capitalized first letter). +- **Struct-like variants**: These are serialized as a two-element tuple, with + the variant name in Snake_case and a JSON object representing the named + fields. +- **Tuple-like variants with a single tuple element**: These are serialized as a + tuple with the variant name in Snake_case followed by the serialized fields of + the contained tuple. The macro can handle single tuple elements of arbitrary + length. ## Example @@ -26,3 +30,4 @@ enum ExampleEnum { NamedFieldsVariant { field1: String, field2: i32 }, TupleVariant((String, i32, bool)), } +``` diff --git a/mina-p2p-messages/README.md b/mina-p2p-messages/README.md index fc3bb5acc..e92b902e7 100644 --- a/mina-p2p-messages/README.md +++ b/mina-p2p-messages/README.md @@ -8,6 +8,7 @@ This types are generated from `bin_prot` shapes. ## Decoding `bin_prot` Stream To decode a binary gossip message, one of the following types should be used: + - `mina_p2p_messages::p2p::MinaBlockExternalTransitionRawVersionedStable` for an External Transition message - `mina_p2p_messages::p2p::NetworkPoolSnarkPoolDiffVersionedStable` for a Snark @@ -22,7 +23,7 @@ three kinds of messages to be used over the wire. Each message implement `binprot::BinProtRead` trait, so e.g. 
for reading an external transition, use the following: -``` rust +```rust let external_transition = MinaBlockExternalTransitionRawVersionedStable::binprot_read(&mut ptr)?; ``` @@ -30,7 +31,7 @@ external transition, use the following: All types implement `serde` serialization, so they can be easily turned into JSON: -``` rust +```rust let external_transition_json = serde_json::to_string(&external_transition)?; ``` @@ -44,17 +45,18 @@ listed in the files [types-v1.txt](types-v1.txt) and To generate Mina V2 types, use the following command: -``` sh +```sh mina-types shapes/berkeley-b1facec.txt gen \ --config default-v2.toml \ --out src/v2/generated.rs $(cat types-v2.txt) ``` -Note that still some additional manual work is needed, like reverting missing derives. +Note that some additional manual work is still needed, such as reverting +missing derives. The `mina-types` executable can be built using the following command: -``` sh +```sh cargo install --git https://github.com/openmina/bin-prot-rs --bin mina-types ``` diff --git a/node/README.md b/node/README.md index 0f4ec2a28..56189d4d9 100644 --- a/node/README.md +++ b/node/README.md @@ -1,13 +1,14 @@ ## `openmina-node` -Combines all state machines of the node into one state machine, which -has all the logic of the node, except services. -Services are abstracted away, so the node's core logic can be run with -any arbitrary service which implements [Service](src/service.rs) trait. That way we avoid -core logic being platform-dependant and enable better/easier testing. +Combines all state machines of the node into one state machine, which has all +the logic of the node, except services. -**NOTE:** Services are mostly just IO or computationally heavy tasks that -we want to run in another thread. +Services are abstracted away, so the node's core logic can be run with any +arbitrary service that implements the [Service](src/service.rs) trait. That way +we avoid core logic being platform-dependent and enable better/easier testing.
+ +**NOTE:** Services are mostly just IO or computationally heavy tasks that we +want to run in another thread. --- diff --git a/node/common/README.md b/node/common/README.md index 0b94cf12d..8f9dd26e9 100644 --- a/node/common/README.md +++ b/node/common/README.md @@ -1,4 +1,5 @@ ## `openmina-node-common` -Exports partial implementation of [Service](src/service.rs), which will be reused between -different platform dependant implementations. E.g. this logic will be -reused in wasm and native builds. + +Exports a partial implementation of [Service](src/service.rs), which will be +reused between different platform-dependent implementations. E.g. this logic +will be reused in wasm and native builds. diff --git a/node/invariants/README.md b/node/invariants/README.md index 50db9cb20..26705fd7c 100644 --- a/node/invariants/README.md +++ b/node/invariants/README.md @@ -2,20 +2,19 @@ Defines node invariants that must always hold true. -For performance reasons, invariants won't be checked when running the node, -but they will be checked when using node replayer or when running testing +For performance reasons, invariants won't be checked when running the node, but +they will be checked when using the node replayer or when running testing scenarios/simulations. ## Creating a new invariant -1. Add a new struct with an unique name, ideally name of which makes - it clear what it's about. +1. Add a new struct with a unique name, ideally one whose name makes it clear + what it's about. 2. Derive macros: ` #[derive(documented::Documented, Default, Clone, Copy)]`. 3. Add doc comment to the struct further describing what invariant checks for. 4. Implement an `Invariant` trait for it. 5. Add an invariant in the [invariants definition list](src/lib.rs#L64). - ## Invariant internal state ```rust @@ -25,24 +24,23 @@ trait Invariant { } ``` - -Internal state of the invariant can be used to preserve state across -this invariant checks.
+The internal state of the invariant can be used to preserve state across this +invariant's checks. With this an `Invariant` can be used to represent either safety or liveness conditions (concepts from TLA+). -- If state doesn't need to be preserved across invariant checks, then we -are working with stateless safety condition, which can be represented -by an `Invariant` where `Invariant::InternalState = ()`. -- If state does need to be preserved across invariant checks, then we -are working with liveness condition, which can be represented -by an `Invariant` where `Invariant::InternalState = TypeRepresentingNecessaryData`. +- If state doesn't need to be preserved across invariant checks, then we are + working with a stateless safety condition, which can be represented by an + `Invariant` where `Invariant::InternalState = ()`. +- If state does need to be preserved across invariant checks, then we are + working with a liveness condition, which can be represented by an `Invariant` + where `Invariant::InternalState = TypeRepresentingNecessaryData`. Can be any type satisfying bounds: `InternalState: 'static + Send + Default`. -Storing and loading of the internal state is fully taken care of by -the framework. +Storing and loading of the internal state is fully taken care of by the +framework. ## Invariant triggers diff --git a/node/native/README.md b/node/native/README.md index 95efe9f32..249559499 100644 --- a/node/native/README.md +++ b/node/native/README.md @@ -1,3 +1,4 @@ ## `openmina-node-native` + Exports default [Service](src/service.rs) to be used in the natively (Linux/Mac/Windows) running node.
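As an editor's aside to the invariants README changes above: the stateless-vs-stateful split can be sketched with a minimal, hypothetical trait. This is illustration only; the crate's real `Invariant` trait (whose full signature is elided above) differs, and the `check` method, struct names, and `u64` observations below are invented for the example. Only the `InternalState: 'static + Send + Default` bounds are taken verbatim from the README.

```rust
// Hypothetical, simplified trait for illustration only. The real
// `Invariant` trait in node/invariants has a fuller signature; only the
// `InternalState` bounds below mirror the ones stated in the README.
trait Invariant {
    type InternalState: 'static + Send + Default;
    fn check(&self, internal: &mut Self::InternalState, observed: u64) -> bool;
}

// Stateless safety condition: must hold on every check independently,
// so no state is carried across checks (`InternalState = ()`).
struct NonZero;
impl Invariant for NonZero {
    type InternalState = ();
    fn check(&self, _internal: &mut (), observed: u64) -> bool {
        observed > 0
    }
}

// Liveness-style condition: internal state is preserved across checks,
// here requiring the observed value to strictly increase over time.
struct StrictlyIncreasing;
impl Invariant for StrictlyIncreasing {
    type InternalState = u64; // largest value seen so far
    fn check(&self, max_seen: &mut u64, observed: u64) -> bool {
        let ok = observed > *max_seen;
        if ok {
            *max_seen = observed;
        }
        ok
    }
}
```

The `Default` bound is what lets a framework create the initial internal state on first use, which matches the README's note that storing and loading of the internal state is handled by the framework.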
diff --git a/node/src/transition_frontier/sync/ledger/staged/README.md b/node/src/transition_frontier/sync/ledger/staged/README.md index 4b261af4e..fa04c5d47 100644 --- a/node/src/transition_frontier/sync/ledger/staged/README.md +++ b/node/src/transition_frontier/sync/ledger/staged/README.md @@ -1,9 +1,10 @@ ## Sync Staged Ledger -At this point, we already have synced snarked ledger that the staged -ledger builds on top of. Now we need to: -1. Fetch additional parts (scan state, pending coinbases, etc...) - necessary for reconstructing staged ledger. +At this point, we already have a synced snarked ledger that the staged ledger +builds on top of. Now we need to: + +1. Fetch additional parts (scan state, pending coinbases, etc...) necessary for + reconstructing the staged ledger. 2. Use fetched parts along with already synced snarked ledger in order to reconstruct staged ledger. @@ -43,3 +44,4 @@ LedgerService-- send result --> event_source event_source-.->StagedReconstructError--->wait event_source--->StagedReconstructSuccess-->cont end +``` diff --git a/node/testing/peer-discovery-test.md b/node/testing/peer-discovery-test.md index a05125320..c060eebdb 100644 --- a/node/testing/peer-discovery-test.md +++ b/node/testing/peer-discovery-test.md @@ -1,47 +1,49 @@ - # Peer discovery with OCaml nodes -A diverse blockchain network consisting of at least two different node implementations is more decentralized, robust, and resilient to external as well as internal threats. However, with two different node implementations, we must also develop cross-compatibility between them. - -Peer discovery between the two node implementations is a good starting point. We want to ensure that native Mina nodes written in OCaml can discover and connect to the Rust-based Openmina node. +A diverse blockchain network consisting of at least two different node +implementations is more decentralized, robust, and resilient to external as well +as internal threats.
However, with two different node implementations, we must +also develop cross-compatibility between them. -We have developed a global test to ensure that any OCaml node can discover and connect to the Openmina node. +Peer discovery between the two node implementations is a good starting point. We +want to ensure that native Mina nodes written in OCaml can discover and connect +to the Rust-based Openmina node. +We have developed a global test to ensure that any OCaml node can discover and +connect to the Openmina node. ### Steps -In these diagrams, we describe three different types of connections between peers: +In these diagrams, we describe three different types of connections between +peers: legend - - - -1. We launch an OCaml node as a seed node. We run three additional non-seed OCaml nodes, connecting only to the seed node. - +1. We launch an OCaml node as a seed node. We run three additional non-seed + OCaml nodes, connecting only to the seed node. ![PeerDiscovery-step1](https://github.com/openmina/openmina/assets/60480123/48944999-602c-473f-856e-dcdeac584746) - 2. Wait 3 minutes for the OCaml nodes to start and connect to the seed node. ![PeerDiscovery-step2](https://github.com/openmina/openmina/assets/60480123/25a75d51-6e27-4623-84ef-74084810d96e) -3. Run the Openmina node (application under test). +3. Run the Openmina node (application under test). - Wait for the Openmina node to complete peer discovery and connect to all four OCaml nodes. This step ensures that the Openmina node can discover OCaml nodes. + Wait for the Openmina node to complete peer discovery and connect to all four + OCaml nodes. This step ensures that the Openmina node can discover OCaml + nodes. ![PeerDiscovery-step3](https://github.com/openmina/openmina/assets/60480123/806fc07c-e4d8-4495-b4ff-d68738406353) - -4. Run another OCaml node that only knows the address of the OCaml seed node, and will only connect to the seed node. +4. 
Run another OCaml node that only knows the address of the OCaml seed node, + and will only connect to the seed node. ![PeerDiscovery-step4](https://github.com/openmina/openmina/assets/60480123/0e1b12ac-3de6-4d68-84bb-99eeca9a107a) - -5. Wait for the additional OCaml node to initiate a connection to the Openmina node. (This step ensures that the OCaml node can discover the Openmina node). +5. Wait for the additional OCaml node to initiate a connection to the Openmina + node. (This step ensures that the OCaml node can discover the Openmina node). ![PeerDiscovery-step5](https://github.com/openmina/openmina/assets/60480123/8dcd2cd5-8926-4502-b9c6-972f1dee9aae) - 6. Fail the test on the timeout. diff --git a/node/testing/src/scenarios/multi_node/connection_discovery.md b/node/testing/src/scenarios/multi_node/connection_discovery.md index aa835fe62..82af9b8a6 100644 --- a/node/testing/src/scenarios/multi_node/connection_discovery.md +++ b/node/testing/src/scenarios/multi_node/connection_discovery.md @@ -1,90 +1,107 @@ - # **OCaml Peers Discovery Tests** -As we develop the Openmina node, a new Rust-based node implementation for the Mina network, we must ensure that the Rust node can utilize the Kademlia protocol (KAD) to discover and connect to OCaml nodes, and vice versa. - -For that purpose, we have developed a series of basic tests that check the correct peer discovery via KAD when the Rust node is connected to OCaml peers. +As we develop the Openmina node, a new Rust-based node implementation for the +Mina network, we must ensure that the Rust node can utilize the Kademlia +protocol (KAD) to discover and connect to OCaml nodes, and vice versa. +For that purpose, we have developed a series of basic tests that check the +correct peer discovery via KAD when the Rust node is connected to OCaml peers. ## **OCaml to Rust** -This test ensures that after an OCaml node connects to the Rust node, its address \ -becomes available in the Rust node’s Kademlia state. 
It also checks whether the OCaml \ -node has a peer with the correct peer_id and a port corresponding to the Rust node. +This test ensures that after an OCaml node connects to the Rust node, its +address +becomes available in the Rust node’s Kademlia state. It also checks whether the +OCaml +node has a peer with the correct peer_id and a port corresponding to the Rust +node. **Steps:** - - 1. Configure and launch a Rust node 2. Start an OCaml node with the Rust node as the only peer -3. Run the Rust node until it receives an event signaling that the OCaml node is connected -4. Wait for an event Identify that is used to identify the remote peer’s address and port -5. Check that the Rust node has an address of the OCaml node in its Kademlia part of the state - +3. Run the Rust node until it receives an event signaling that the OCaml node is + connected +4. Wait for an `Identify` event that is used to identify the remote peer’s address + and port +5. Check that the Rust node has an address of the OCaml node in its Kademlia + part of the state ## **Rust to OCaml** -This test ensures that after the Rust node connects to an OCaml node with a known \ -address, it adds its address to its Kademlia state. It also checks that the OCaml \ -node has a peer with the correct peer_id and port corresponding to the Rust node. +This test ensures that after the Rust node connects to an OCaml node with a +known +address, it adds its address to its Kademlia state. It also checks that the +OCaml +node has a peer with the correct peer_id and port corresponding to the Rust +node. **Steps:** - - 1. Start an OCaml node and wait for its p2p to be ready 2. Start a Rust node and initiate its connection to the OCaml node -3. Run the Rust node until it receives an event signaling that connection is established -4. Run the Rust node until it receives a Kademlia event signaling that the address of the OCaml node has been added -5.
Check that the Rust node has an address of the OCaml node in its Kademlia part of the state - +3. Run the Rust node until it receives an event signaling that connection is + established +4. Run the Rust node until it receives a Kademlia event signaling that the + address of the OCaml node has been added +5. Check that the Rust node has an address of the OCaml node in its Kademlia + part of the state ## **OCaml to Rust via seed Node** -This test ensures that an OCaml node can connect to the Rust node, the address of which can only be discovered from an OCaml seed node, and its address becomes available in the Rust node’s Kademlia state. It also checks whether the OCaml node has a peer with the correct peer_id and a port corresponding to the Rust node. +This test ensures that an OCaml node can connect to the Rust node, the address +of which can only be discovered from an OCaml seed node, and its address becomes +available in the Rust node’s Kademlia state. It also checks whether the OCaml +node has a peer with the correct peer_id and a port corresponding to the Rust +node. **Steps:** - - 1. Start an OCaml node acting as a seed node and wait for its P2P to be ready 2. Start a Rust node and initiate its connection to the seed node -3. Run the Rust node until it receives an event signaling that connection is established +3. Run the Rust node until it receives an event signaling that connection is + established 4. Start an OCaml node acting with the seed node as its peer -5. Run the Rust node until it receives an event signaling that the connection with the OCaml node has been established -6. Check that the Rust node has an address of the OCaml node in its Kademlia part of the state - +5. Run the Rust node until it receives an event signaling that the connection + with the OCaml node has been established +6. 
Check that the Rust node has an address of the OCaml node in its Kademlia + part of the state ## **Rust to OCaml via seed Node** -This test ensures that the Rust node can connect to an OCaml peer, the address of whom can only be discovered from an OCaml seed node, and that the Rust node adds its address to its Kademlia state. It also checks whether the OCaml node has a peer with the correct peer_id and port corresponding to the Rust node. +This test ensures that the Rust node can connect to an OCaml peer, the address +of which can only be discovered from an OCaml seed node, and that the Rust node +adds its address to its Kademlia state. It also checks whether the OCaml node +has a peer with the correct peer_id and port corresponding to the Rust node. **Steps:** - - 1. Start an OCaml node acting as a seed node -2. Start an OCaml node acting with the seed node as its peer and wait for its p2p to be ready +2. Start an OCaml node acting with the seed node as its peer and wait for its + p2p to be ready 3. Start a Rust node and initiate its connection to the seed node -4. Run the Rust node until it receives an event signaling that connection with the seed node is established -5. Run the Rust node until it receives an event signaling that connection with the non-seed OCaml node is established -6. Check that the Rust node has an address of the OCaml node in its Kademlia part of the state - +4. Run the Rust node until it receives an event signaling that connection with + the seed node is established +5. Run the Rust node until it receives an event signaling that connection with + the non-seed OCaml node is established +6.
After these nodes connect \ -to the Rust node, the test makes sure that they also have each other’s addresses \ +OCaml nodes that only know about the Rust node’s address. After these nodes +connect to the Rust node, the test makes sure that they also +have each other’s addresses +as +their peers. **Steps:** - - 1. Start a Rust node 2. Start two OCaml nodes, specifying the Rust node address as their peer -3. Wait for events indicating that connections with both OCaml nodes are established +3. Wait for events indicating that connections with both OCaml nodes are + established 4. Check that both OCaml nodes have each other’s address as their peers -5. Check that the Rust node has addresses of both OCaml nodes in the Kademlia state +5. Check that the Rust node has addresses of both OCaml nodes in the Kademlia + state diff --git a/node/testing/src/scenarios/readme.md b/node/testing/src/scenarios/readme.md index e5273c696..3f1e1fbc0 100644 --- a/node/testing/src/scenarios/readme.md +++ b/node/testing/src/scenarios/readme.md @@ -1,142 +1,214 @@ - # Network Connectivity and Peer Management - ## Network Connectivity -Nodes that get disconnected should eventually be able to reconnect and synchronize with the network. +Nodes that get disconnected should eventually be able to reconnect and +synchronize with the network. -_This test assesses the blockchain node's ability to maintain consistent network connectivity. It evaluates whether a node can gracefully handle temporary disconnections from the network and subsequently reestablish connections. _ +_This test assesses the blockchain node's ability to maintain consistent network +connectivity. It evaluates whether a node can gracefully handle temporary +disconnections from the network and subsequently reestablish connections.
_ -We want to ensure that new nodes can join the network and handle being overwhelmed with connections or data requests, including various resilience and stability conditions (e.g., handling reconnections, latency, intermittent connections, and dynamic IP handling). +We want to ensure that new nodes can join the network and handle being +overwhelmed with connections or data requests, including various resilience and +stability conditions (e.g., handling reconnections, latency, intermittent +connections, and dynamic IP handling). -This is crucial for ensuring that no node is permanently isolated and can always participate in the blockchain's consensus process. +This is crucial for ensuring that no node is permanently isolated and can always +participate in the blockchain's consensus process. We are testing two versions of the node: - ### Solo node -We want to be able to test whether the Rust node is compatible with the OCaml node. We achieve this by attempting to connect the Openmina node to the existing OCaml testnet. +We want to be able to test whether the Rust node is compatible with the OCaml +node. We achieve this by attempting to connect the Openmina node to the existing +OCaml testnet. -For that purpose, we are utilizing a _solo node_, which is a single Open Mina node connected to a network of OCaml nodes. Currently, we are using the public testnet, but later on we want to use our own network of OCaml nodes on our cluster. +For that purpose, we are utilizing a _solo node_, which is a single Open Mina +node connected to a network of OCaml nodes. Currently, we are using the public +testnet, but later on we want to use our own network of OCaml nodes on our +cluster. -This test is performed by launching an Openmina node and connecting it to seed nodes of the public (or private) OCaml testnet. +This test is performed by launching an Openmina node and connecting it to seed +nodes of the public (or private) OCaml testnet. 
_The source code for this test can be found in this repo:_ -[https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/solo_node/basic_connectivity_initial_joining.rs](https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/solo_node/basic_connectivity_initial_joining.rs) - - +[https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/solo_node/basic_connectivity_initial_joining.rs](https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/solo_node/basic_connectivity_initial_joining.rs) We are testing these scenarios: +1. Whether the Openmina node can accept an incoming connection from an OCaml node. + This test will prove our Openmina node is listening to incoming connections + and can accept them. +2. Whether the OCaml node can discover and connect to an Openmina node that is + advertising itself. This is done by advertising the Openmina node so that the + OCaml node can discover it and connect to the node. + This test is the same as the previous one, except we do not inform the OCaml + node to connect to it explicitly; it should find it automatically and connect + using peer discovery (performed through Kademlia). This test will ensure the + Openmina node uses Kademlia in a way that is compatible with the OCaml node. -1. Whether the Openmina node can accept an incoming connection from OCaml node. This test will prove our Openmina node is listening to incoming connections and can accept them. -2. Whether the OCaml node can discover and connect to an Openmina node that is advertising itself. This is done by advertising the Openmina node so that the OCaml node can discover it and connect to the node. - - This test is the same as the previous one, except we do not inform the OCaml node to connect to it explicitly, it should find it automatically and connect using peer discovery (performed through Kademlia).
This test will ensure the Openmina node uses Kademlia in a way that is compatible with the OCaml node. - -However, with this test, we are currently experiencing problems that may be caused by OCaml nodes not being currently able to "see" the Openmina nodes, because our implementation of the p2p layer is incomplete. +However, with this test, we are currently experiencing problems that may be +caused by OCaml nodes currently not being able to "see" the Openmina nodes, +because our implementation of the p2p layer is incomplete. +We have implemented the missing protocol (Kademlia) into the p2p layer to make +OCaml nodes see our node. Despite being successfully implemented, the main test +is not working. One possible reason is that our implementation of Kademlia is +slightly incompatible with the OCaml implementation of Kademlia. -We have implemented the missing protocol (Kademlia) into the p2p layer to make OCaml nodes see our node. Despite being successfully implemented, the main test is not working. One possible reason is that our implementation of Kademlia is slightly incompatible with the OCaml implementation of Kademlia. - -We are also missing certain p2p protocols like `/mina/peer-exchange`, `/mina/bitswap-exchange`, `/mina/node-status`, `/ipfs/id/1.0.0` - - -While these p2p protocol may not be relevant, it is possible OCaml nodes do not recognize the Openmina node because we are missing some of them. +We are also missing certain p2p protocols like `/mina/peer-exchange`, +`/mina/bitswap-exchange`, `/mina/node-status`, and `/ipfs/id/1.0.0`. +While these p2p protocols may not be relevant, it is possible OCaml nodes do not +recognize the Openmina node because we are missing some of them. We run these tests until: - - -* The number of known peers is greater than or equal to the maximum number of peers. -* The number of connected peers is greater than or equal to some threshold. -* The test is failed if the specified number of steps occur but the conditions are not met.
- +- The number of known peers is greater than or equal to the maximum number of + peers. +- The number of connected peers is greater than or equal to some threshold. +- The test fails if the specified number of steps occurs but the conditions + are not met. #### Kademlia peer discovery -We want the Open Mina node to be able to connect to peers, both other Open Mina nodes (that are written in Rust) as well as native Mina nodes (written in OCaml). - -Native Mina nodes use Kademlia (KAD), a distributed hash table (DHT) for peer-to-peer computer networks. Hash tables are data structures that map _keys_ to _values_. Think of a hash table as a dictionary, where a word (i.e. dog) is mapped to a definition (furry, four-legged animal that barks). +We want the Open Mina node to be able to connect to peers, both other Open Mina +nodes (that are written in Rust) as well as native Mina nodes (written in +OCaml). -In Mina nodes, KAD specifies the structure of the network and the exchange of information through node lookups, which makes it efficient for locating nodes in the network. +Native Mina nodes use Kademlia (KAD), a distributed hash table (DHT) for +peer-to-peer computer networks. Hash tables are data structures that map _keys_ +to _values_. Think of a hash table as a dictionary, where a word (e.g., dog) is +mapped to a definition (furry, four-legged animal that barks). -Since we initially focused on other parts of the node, we used the RPC get_initial_peers as a sort-of workaround to connect our nodes between themselves. Now, to ensure compatibility with the native Mina node, we’ve implemented KAD for peer discovery for the Open Mina node. +In Mina nodes, KAD specifies the structure of the network and the exchange of +information through node lookups, which makes it efficient for locating nodes in +the network. +Since we initially focused on other parts of the node, we used the RPC +`get_initial_peers` as a workaround to connect our nodes among +themselves.
Now, to ensure compatibility with the native Mina node, we’ve +implemented KAD for peer discovery for the Open Mina node. #### How does Mina utilize Kademlia? -Kademlia has two main parts - the routing table and the peer store. - - +Kademlia has two main parts - the routing table and the peer store. -1. The routing table is used to store information about network paths, enabling efficient data packet routing. It maintains peer information (peer id and network addresses) -2. The peer store is a database for storing and retrieving network peer information (peer IDs and their network addresses), forming part of the provider store. Providers are nodes that possess specific data and are willing to share or provide this data to other nodes in the network. +1. The routing table is used to store information about network paths, enabling + efficient data packet routing. It maintains peer information (peer id and + network addresses) +2. The peer store is a database for storing and retrieving network peer + information (peer IDs and their network addresses), forming part of the + provider store. Providers are nodes that possess specific data and are + willing to share or provide this data to other nodes in the network. -Peers are added to the routing table if they can communicate, support the correct protocol, and send/respond to valid queries. They are removed if unresponsive. Peers are added to the peer store when handling AddProvider messages. +Peers are added to the routing table if they can communicate, support the +correct protocol, and send/respond to valid queries. They are removed if +unresponsive. Peers are added to the peer store when handling AddProvider +messages. -A provider in Kademlia announces possession of specific data (identified by a unique key) and shares it with others. In MINA's case, all providers use the same key, which is the SHA256 hash of a specific string pattern. 
In MINA, every node acts as a “provider,” making the advertisement as providers redundant. Non-network nodes are filtered at the PNet layer.
-
-If there are no peers, KAD will automatically search for new ones. KAD will also search for new peers whenever the node is restarted. If a connection is already made, it will search for more peers every hour.
+A provider in Kademlia announces possession of specific data (identified by a
+unique key) and shares it with others. In MINA's case, all providers use the
+same key, which is the SHA256 hash of a specific string pattern. In MINA, every
+node acts as a “provider,” making provider advertisement redundant.
+Non-network nodes are filtered at the PNet layer.

+If there are no peers, KAD will automatically search for new ones. KAD will also
+search for new peers whenever the node is restarted. If a connection is already
+made, it will search for more peers every hour.

#### Message types
-
-
-* AddProvider - informs the peer that you can provide the information described by the specified key.
-* GetProviders - a query for nodes that have already performed AddProvider.
-* FindNode - is used for different purposes. It can find a place in the network where your node should be. Or it may find a node that you need to send an AddProvider (or GetProviders) message to.
-
+- AddProvider - informs the peer that you can provide the information described
+  by the specified key.
+- GetProviders - a query for nodes that have already performed AddProvider.
+- FindNode - is used for different purposes. It can find a place in the network
+  where your node should be. Or it may find a node that you need to send an
+  AddProvider (or GetProviders) message to.

#### Potential issues identified

-* An earlier issue in the Open Mina (Rust node) with incorrect provider key advertising is now fixed.
-* The protocol's use in OCaml nodes might be a potential security risk; an adversary could exploit this to DoS the network.
One possible solution is to treat all Kademlia peers as providers.
-* The peer can choose its peer_id (not arbitrarily, the peer chooses a secret key, and then the peer_id is derived from the secret key). The peer can repeat this process until its peer_id is close to a key that identifies some desired information. Thus, the peer will be responsible for storing providers of this information.
-
-* The malicious peer can deliberately choose the peer_id and deny access to the information, just always say there are no providers.
-This problem has been inherited from the OCaml implementation of the node. We have mitigated it by making the Openmina node not rely on GetProviders, instead, we only do AddProviders to advertise ourselves, but treat any peer of the Kademlia network as a valid Mina peer, no matter if it is a provider or not, so a malicious peer can prevent OCaml nodes from discovering us, but it will not prevent us from discovering OCaml nodes.
-
-* Kademlia exposes both internal and external addresses, which may be unnecessary in the MINA network and could be a security risk.
-
-  If we test on any local network that belongs to these ranges because they will get filtered unless we manually disable (in code) these checks.
-* There's also an issue with the handling of private IP ranges.
-* We need more testing on support for IPv6. libp2p_helper code can't handle IPv6 for the IP range filtering.
+- An earlier issue in the Open Mina (Rust node) with incorrect provider key
+  advertising is now fixed.
+- The protocol's use in OCaml nodes might pose a security risk; an
+  adversary could exploit this to DoS the network. One possible solution is to
+  treat all Kademlia peers as providers.
+- The peer can choose its peer_id (not arbitrarily, the peer chooses a secret
+  key, and then the peer_id is derived from the secret key). The peer can repeat
+  this process until its peer_id is close to a key that identifies some desired
+  information.
Thus, the peer will be responsible for storing providers of this
+  information.
+
+- The malicious peer can deliberately choose the peer_id and deny access to the
+  information, just always say there are no providers. This problem has been
+  inherited from the OCaml implementation of the node. We have mitigated it by
+  making the Openmina node not rely on GetProviders; instead, we only do
+  AddProviders to advertise ourselves, but treat any peer of the Kademlia
+  network as a valid Mina peer, no matter if it is a provider or not, so a
+  malicious peer can prevent OCaml nodes from discovering us, but it will not
+  prevent us from discovering OCaml nodes.
+
+- Kademlia exposes both internal and external addresses, which may be
+  unnecessary in the MINA network and could be a security risk.
+
+  If we test on any local network that belongs to these ranges, connections
+  will get filtered unless we manually disable (in code) these checks.
+
+- There's also an issue with the handling of private IP ranges.
+- We need more testing on support for IPv6. libp2p_helper code can't handle IPv6
+  for the IP range filtering.

### Multi node

-We also want to test a scenario in which the network consists only of Openmina nodes. If the Openmina node is using a functionality that is implemented only in the OCaml node, and it does not perform it correctly, then we will not be able to see it with solo node test.
+We also want to test a scenario in which the network consists only of Openmina
+nodes. If the Openmina node is using a functionality that is implemented only in
+the OCaml node, and it does not perform it correctly, then we will not be able
+to see it with the solo node test.

-For that purpose, we utilize a Multi node test, which involves a network of our nodes, without any third party, so that the testing is completely local and under our control.
+For that purpose, we utilize a Multi node test, which involves a network of our +nodes, without any third party, so that the testing is completely local and +under our control. _The source code for this test can be found in this repo:_ -[https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs#L9](https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs#L9) - +[https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs#L9](https://github.com/openmina/openmina/blob/develop/node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs#L9) #### How it's tested -**Node cluster**: We use a `ClusterRunner` utility to manage the setup and execution of test scenarios on a cluster of nodes. - -**Scenarios Enumeration**: `Scenarios` is an enum with derived traits to support iterating over the scenarios, converting them to strings, etc. It lists different test scenarios such as `SoloNodeSyncRootSnarkedLedger`, `SoloNodeBasicConnectivityInitialJoining`, and `MultiNodeBasicConnectivityInitialJoining`. +**Node cluster**: We use a `ClusterRunner` utility to manage the setup and +execution of test scenarios on a cluster of nodes. -Each scenario has a related module (e.g., `multi_node::basic_connectivity_initial_joining::MultiNodeBasicConnectivityInitialJoining`) which contains the logic for the test. +**Scenarios Enumeration**: `Scenarios` is an enum with derived traits to support +iterating over the scenarios, converting them to strings, etc. It lists +different test scenarios such as `SoloNodeSyncRootSnarkedLedger`, +`SoloNodeBasicConnectivityInitialJoining`, and +`MultiNodeBasicConnectivityInitialJoining`. -**Scenario Implementation**: The `Scenarios` enum has methods for executing tests such as `run`, `run_and_save`, and `run_only`. 
These methods use the `ClusterRunner` to run the scenarios and potentially save the results. +Each scenario has a related module (e.g., +`multi_node::basic_connectivity_initial_joining::MultiNodeBasicConnectivityInitialJoining`) +which contains the logic for the test. -**Dynamic Scenario Building**: There's logic (`blank_scenario`) to dynamically build a scenario's configuration, potentially from a JSON representation, which then gets executed in a test run. +**Scenario Implementation**: The `Scenarios` enum has methods for executing +tests such as `run`, `run_and_save`, and `run_only`. These methods use the +`ClusterRunner` to run the scenarios and potentially save the results. -**Async/Await**: The methods within the `Scenarios` are asynchronous (`async`), indicating that the tests are run in an asynchronous context, which is common when dealing with network operations to allow for non-blocking I/O operations. +**Dynamic Scenario Building**: There's logic (`blank_scenario`) to dynamically +build a scenario's configuration, potentially from a JSON representation, which +then gets executed in a test run. -**Parent Scenarios**: The `parent` and `parent_id` methods suggest that some scenarios may depend on others. The code constructs a hierarchy of test scenarios, ensuring parent scenarios are run before their children. +**Async/Await**: The methods within the `Scenarios` are asynchronous (`async`), +indicating that the tests are run in an asynchronous context, which is common +when dealing with network operations to allow for non-blocking I/O operations. -**Cluster Configuration and Execution**: `build_cluster_and_run_parents` is an asynchronous method for setting up a cluster according to a specified configuration and running all parent scenarios to prepare the environment for a specific test. +**Parent Scenarios**: The `parent` and `parent_id` methods suggest that some +scenarios may depend on others. 
The code constructs a hierarchy of test +scenarios, ensuring parent scenarios are run before their children. +**Cluster Configuration and Execution**: `build_cluster_and_run_parents` is an +asynchronous method for setting up a cluster according to a specified +configuration and running all parent scenarios to prepare the environment for a +specific test. diff --git a/node/web/README.md b/node/web/README.md index b0cfbf1ed..50b13b81b 100644 --- a/node/web/README.md +++ b/node/web/README.md @@ -1,3 +1,4 @@ ## `openmina-node-web` -Exports default [Service](src/service.rs) to be used in the web -(wasm) running node. + +Exports default [Service](src/service.rs) to be used in the web (wasm) running +node. diff --git a/p2p/libp2p.md b/p2p/libp2p.md index 27e49ebdd..118215bc8 100644 --- a/p2p/libp2p.md +++ b/p2p/libp2p.md @@ -1,192 +1,257 @@ -# Implementation of the LibP2P networking stack +# Implementation of the LibP2P networking stack -A peer-to-peer (P2P) network serves as the backbone of decentralized communication and data sharing among blockchain nodes. It enables the propagation of transaction and block information across the network, facilitating the consensus process crucial for maintaining the blockchain's integrity. Without a P2P network, nodes in the Mina blockchain would be isolated and unable to exchange vital information, leading to fragmentation and compromising the blockchain's trustless nature. +A peer-to-peer (P2P) network serves as the backbone of decentralized +communication and data sharing among blockchain nodes. It enables the +propagation of transaction and block information across the network, +facilitating the consensus process crucial for maintaining the blockchain's +integrity. Without a P2P network, nodes in the Mina blockchain would be isolated +and unable to exchange vital information, leading to fragmentation and +compromising the blockchain's trustless nature. 
-
To begin with, we need a P2P _networking stack_, a set of protocols and layers that define how P2P communication occurs between devices over our network. Think of it as a set of rules and conventions for data transmission, addressing, and error handling, enabling different devices and applications to exchange data effectively in a networked environment. We want our networking stack to have the following features:
+To begin with, we need a P2P _networking stack_, a set of protocols and layers
+that define how P2P communication occurs between devices over our network. Think
+of it as a set of rules and conventions for data transmission, addressing, and
+error handling, enabling different devices and applications to exchange data
+effectively in a networked environment. We want our networking stack to have the
+following features:

-For our networking stack, we are utilizing LibP2P, a modular networking stack that provides a unified framework for building decentralized P2P network applications.
+For our networking stack, we are utilizing LibP2P, a modular networking stack
+that provides a unified framework for building decentralized P2P network
+applications.

-
-## LibP2P
+## LibP2P

LibP2P has the following features:

### Modularity

-Being modular means that we can customize the stacks for various types of devices, i.e. a smartphone may use a different set of modules than a server.
-
+Being modular means that we can customize the stacks for various types of
+devices, e.g. a smartphone may use a different set of modules than a server.

### Cohesion

-Modules in the stack can communicate between each other despite differences in what each module should do according to its specification.
-
+Modules in the stack can communicate with each other despite differences in
+what each module should do according to its specification.

### Layers

-LibP2P provides vertical complexity in the form of layers.
Each layer serves a specific purpose, which lets us neatly organize the various functions of the P2P network. It allows us to separate concerns, making the network architecture easier to manage and debug. - +LibP2P provides vertical complexity in the form of layers. Each layer serves a +specific purpose, which lets us neatly organize the various functions of the P2P +network. It allows us to separate concerns, making the network architecture +easier to manage and debug. JustLayers (1) +_Above: A simplified overview of the Open Mina LibP2P networking stack. The +abstraction is in an ascending order, i.e. the layers at the top have more +abstraction than the layers at the bottom._ -_Above: A simplified overview of the Open Mina LibP2P networking stack. The abstraction is in an ascending order, i.e. the layers at the top have more abstraction than the layers at the bottom._ - - -Now we describe each layer of the P2P networking stack in descending order of abstraction. - +Now we describe each layer of the P2P networking stack in descending order of +abstraction. ## RPCs -A node needs to continuously receive and send information across the P2P network. +A node needs to continuously receive and send information across the P2P +network. -For certain types of information, such as new transitions (blocks), the best tips or ban notifications, Mina nodes utilize remote procedure calls (RPCs). +For certain types of information, such as new transitions (blocks), the best +tips or ban notifications, Mina nodes utilize remote procedure calls (RPCs). -An RPC is a query for a particular type of information that is sent to a peer over the P2P network. After an RPC is made, the node expects a response from it. +An RPC is a query for a particular type of information that is sent to a peer +over the P2P network. After an RPC is made, the node expects a response from it. Mina nodes use the following RPCs. 
- - -* `get_staged_ledger_aux_and_pending_coinbases_at_hash` -* `answer_sync_ledger_query` -* `get_transition_chain` -* `get_transition_chain_proof` -* `Get_transition_knowledge` (note the initial capital) -* `get_ancestry` -* `ban_notify` -* `get_best_tip` -* `get_node_status` (v1 and v2) -* `Get_epoch_ledger` - - +- `get_staged_ledger_aux_and_pending_coinbases_at_hash` +- `answer_sync_ledger_query` +- `get_transition_chain` +- `get_transition_chain_proof` +- `Get_transition_knowledge` (note the initial capital) +- `get_ancestry` +- `ban_notify` +- `get_best_tip` +- `get_node_status` (v1 and v2) +- `Get_epoch_ledger` ### Kademlia for peer discovery -The P2P layer enables nodes in the Mina network to discover and connect with each other. We want the Open Mina node to be able to connect to peers, both other Open Mina nodes (that are written in Rust) as well as native Mina nodes (written in OCaml). - -To achieve that, we need to implement peer discovery via Kademlia as part of our LibP2P networking stack. Previously, we used the RPC `get_initial_peers` as a sort of workaround to connect our nodes between themselves. Now, to ensure compatibility with the native Mina node, we’ve implemented KAD for peer discovery for the Openmina node. - -Kademlia, or KAD, is a distributed hash table (DHT) for peer-to-peer computer networks. Hash tables are a type of data structure that maps _keys_ to _values_. In very broad and simplistic terms, think of a hash table as a dictionary, where a word (i.e. dog) is mapped to a definition (furry, four-legged animal that barks). In more practical terms, each key is passed through a hash function, which computes an index based on the key's content. - -KAD specifically works as a distributed hash table by storing key-value pairs across the network, where keys are mapped to nodes using the so-called XOR metric, ensuring that data can be efficiently located and retrieved by querying nodes that are closest to the key's hash. 
- +The P2P layer enables nodes in the Mina network to discover and connect with +each other. We want the Open Mina node to be able to connect to peers, both +other Open Mina nodes (that are written in Rust) as well as native Mina nodes +(written in OCaml). + +To achieve that, we need to implement peer discovery via Kademlia as part of our +LibP2P networking stack. Previously, we used the RPC `get_initial_peers` as a +sort of workaround to connect our nodes between themselves. Now, to ensure +compatibility with the native Mina node, we’ve implemented KAD for peer +discovery for the Openmina node. + +Kademlia, or KAD, is a distributed hash table (DHT) for peer-to-peer computer +networks. Hash tables are a type of data structure that maps _keys_ to _values_. +In very broad and simplistic terms, think of a hash table as a dictionary, where +a word (i.e. dog) is mapped to a definition (furry, four-legged animal that +barks). In more practical terms, each key is passed through a hash function, +which computes an index based on the key's content. + +KAD specifically works as a distributed hash table by storing key-value pairs +across the network, where keys are mapped to nodes using the so-called XOR +metric, ensuring that data can be efficiently located and retrieved by querying +nodes that are closest to the key's hash. #### Measuring distance via XOR -XOR is a unique feature of how KAD measures the distance between peers - it is defined as the XOR metric between two node IDs or between a node ID and a key, providing a way to measure closeness in the network's address space for efficient routing and data lookup. - -The term "XOR" stands for "exclusive or," which is a logical operation that outputs true only when the inputs differ (one is true, the other is false). 
See the diagram below for a visual explanation:
+The term "XOR" stands for "exclusive or," which is a logical operation that
+outputs true only when the inputs differ (one is true, the other is false). See
+the diagram below for a visual explanation:

![image6](https://github.com/openmina/openmina/assets/60480123/4e57f9b9-9e68-4400-b0ad-ff17c14766a1)

+_Above: A Kademlia binary tree organized into four distinct buckets (marked in
+orange) of varying sizes._

-_Above: A Kademlia binary tree organized into four distinct buckets (marked in orange) of varying sizes. _

-The XOR metric used by Kademlia for measuring distance ensures uniformity and symmetry in distance calculations, allowing for predictable and decentralized routing without the need for hierarchical or centralized structures, which allows for better scalability and fault tolerance in our P2P network.

-LibP2P leverages Kademlia for peer discovery and DHT functionalities, ensuring efficient routing and data location in the network. In Mina nodes, KAD specifies the structure of the network and the exchange of information through node lookups, which makes it efficient for locating nodes in the network.

+The XOR metric used by Kademlia for measuring distance ensures uniformity and
+symmetry in distance calculations, allowing for predictable and decentralized
+routing without the need for hierarchical or centralized structures, which
+allows for better scalability and fault tolerance in our P2P network.

+LibP2P leverages Kademlia for peer discovery and DHT functionalities, ensuring
+efficient routing and data location in the network.
In Mina nodes, KAD specifies +the structure of the network and the exchange of information through node +lookups, which makes it efficient for locating nodes in the network. ## Multiplexing via Yamux -In a P2P network, connections are a key resource. Establishing multiple connections between peers can be costly and impractical, particularly in a network consisting of devices with limited resources. To make the most of a single connection, we employ _multiplexing_, which means having multiple data streams transmitted over a single network connection concurrently. - +In a P2P network, connections are a key resource. Establishing multiple +connections between peers can be costly and impractical, particularly in a +network consisting of devices with limited resources. To make the most of a +single connection, we employ _multiplexing_, which means having multiple data +streams transmitted over a single network connection concurrently. ![image3](https://github.com/openmina/openmina/assets/60480123/5f6a48c7-bbae-4ca2-9189-badae2369f3d) - -For multiplexing, we utilize [_Yamux_](https://github.com/hashicorp/yamux), a multiplexer that provides efficient, concurrent handling of multiple data streams over a single connection, aligning well with the needs of modern, scalable, and efficient network protocols and applications. - +For multiplexing, we utilize [_Yamux_](https://github.com/hashicorp/yamux), a +multiplexer that provides efficient, concurrent handling of multiple data +streams over a single connection, aligning well with the needs of modern, +scalable, and efficient network protocols and applications. ## Noise encryption -We want to ensure that data exchanged between nodes remains confidential, authenticated, and resistant to tampering. For that purpose, we utilize Noise, a cryptographic protocol featuring ephemeral keys and forward secrecy, used to secure the connection. 
+We want to ensure that data exchanged between nodes remains confidential,
+authenticated, and resistant to tampering. For that purpose, we utilize Noise, a
+cryptographic protocol featuring ephemeral keys and forward secrecy, used to
+secure the connection.

Noise provides the following capabilities:
-

### Async

-Noise supports asynchronous communication, allowing nodes to communicate without both being online simultaneously can efficiently handle the kind of non-blocking I/O operations that are typical in P2P networks, where nodes may not be continuously connected, even in the asynchronous and unpredictable environments that are characteristic of blockchain P2P networks.
-
+Noise supports asynchronous communication, allowing nodes to communicate without
+both being online simultaneously, and it can efficiently handle the kind of
+non-blocking I/O operations that are typical in P2P networks, where nodes may
+not be continuously connected, even in the asynchronous and unpredictable
+environments that are characteristic of blockchain P2P networks.

### Forward secrecy

-Noise utilizes _ephemeral keys_, which are random keys generated for each new connection that must be destroyed after use. The use of ephemeral keys forward secrecy. This means that decrypting a segment of the data does not provide any additional ability to decrypt the rest of the data. Simply put, forward secrecy means that if an adversary gains knowledge of the secret key, they will be able to participate in the network on the behalf of the peer, but they will not be able to decrypt past nor future messages.
-
+Noise utilizes _ephemeral keys_, which are random keys generated for each new
+connection that must be destroyed after use. The use of ephemeral keys provides
+forward secrecy. This means that decrypting a segment of the data does not
+provide any additional ability to decrypt the rest of the data.
Simply put, forward secrecy
+means that if an adversary gains knowledge of the secret key, they will be able
+to participate in the network on the behalf of the peer, but they will not be
+able to decrypt past nor future messages.

### How Noise works

-The Noise protocol implemented by libp2p uses the [XX](http://www.noiseprotocol.org/noise.html#interactive-handshake-patterns-fundamental) handshake pattern, which happens in the following stages:
-
+The Noise protocol implemented by libp2p uses the
+[XX](http://www.noiseprotocol.org/noise.html#interactive-handshake-patterns-fundamental)
+handshake pattern, which happens in the following stages:

![image4](https://github.com/openmina/openmina/assets/60480123/a1b2b2bf-980e-459c-8375-9e8b6162b6d1)
-
-
 1. Alice sends Bob her ephemeral public key (32 bytes).
-
 ![image2](https://github.com/openmina/openmina/assets/60480123/721103dd-0bb9-4f0b-8998-97b0cc19f6fc)
-
 2. Bob responds to Alice with a message that contains:
-* Bob’s ephemeral public key (32 bytes).
-* Bob's static public key (32 bytes).
-* The tag (MAC) of the static public key (16 bytes).
-As well as a payload of extra data that includes the peer’s `identity_key`, an `identity_sig`, Noise's static public key and the tag (MAC) of the payload (16 bytes).
+- Bob’s ephemeral public key (32 bytes).
+- Bob's static public key (32 bytes).
+- The tag (MAC) of the static public key (16 bytes).

+As well as a payload of extra data that includes the peer’s `identity_key`, an
+`identity_sig`, Noise's static public key and the tag (MAC) of the payload (16
+bytes).

![image5](https://github.com/openmina/openmina/assets/60480123/b7ed062d-2204-4b94-87af-abc6eecd7013)
-
 3. Alice responds to Bob with her own message that contains:
-* Alice's static public key (32 bytes).
-* The tag (MAC) of Alice’s static public key (16 bytes).
-* The payload, in the same fashion as Bob does in the second step, but with Alice's information instead.
-* The tag (MAC) of the payload (16 bytes).
-
After the messages are exchanged (two sent by Alice, the _initiator,_ and one sent by Bob, the _responder_), both parties can derive a pair of symmetric keys that can be used to cipher and decipher messages.
+- Alice's static public key (32 bytes).
+- The tag (MAC) of Alice's static public key (16 bytes).
+- The payload, in the same fashion as Bob does in the second step, but with
+  Alice's information instead.
+- The tag (MAC) of the payload (16 bytes).

+After the messages are exchanged (two sent by Alice, the _initiator,_ and one
+sent by Bob, the _responder_), both parties can derive a pair of symmetric keys
+that can be used to cipher and decipher messages.

## Pnet layer

-We want to be able to determine whether the peer to whom we want to connect to is running the same network as our node. For instance, a node running on the Mina mainnet will connect to other mainnet nodes and avoid connecting to peers running on Mina’s testnet.
+We want to be able to determine whether the peer to whom we want to connect
+is running the same network as our node. For instance, a node running on the
+Mina mainnet will connect to other mainnet nodes and avoid connecting to peers
+running on Mina’s testnet.

-For that purpose, Mina utilizes pnet, an encryption transport layer that constitutes the lowest layer of libp2p. Please note that while the network (IP) and transport (TCP) layers are lower than pnet, they are not unique to LiP2P.
+For that purpose, Mina utilizes pnet, an encryption transport layer that
+constitutes the lowest layer of libp2p. Please note that while the network (IP)
+and transport (TCP) layers are lower than pnet, they are not unique to LibP2P.

-In Mina, the pnet _secret key_ refers to the chain on which the node is running,
+In Mina, the pnet _secret key_ refers to the chain on which the node is running,

-for instance `mina/mainnet` or `mina/testnet`. This prevents nodes from attempting connections with the incorrect chain.
-
-Although pnet utilizes a type of secret key known as a pre-shared key (PSK), every peer in the network knows this key. This is why, despite being encrypted, the pnet channel itself isn’t secure - that is achieved via the aforementioned Noise protocol.
+for instance `mina/mainnet` or `mina/testnet`. This prevents nodes from
+attempting connections with the incorrect chain.

+Although pnet utilizes a type of secret key known as a pre-shared key (PSK),
+every peer in the network knows this key. This is why, despite being encrypted,
+the pnet channel itself isn’t secure - that is achieved via the aforementioned
+Noise protocol.

## Transport

-At the lowest level of abstraction, we want our P2P network to have a reliable, ordered, and error-checked method of transporting data between peers. crucial for maintaining the integrity and consistency of the blockchain.

-Libp2p connections are established by _dialing_ the peer address across a transport layer. Currently, Mina uses TCP, but it can also utilize UDP, which can be useful if we implement a node based on WebRTC.
+At the lowest level of abstraction, we want our P2P network to have a reliable,
+ordered, and error-checked method of transporting data between peers, which is
+crucial for maintaining the integrity and consistency of the blockchain.

-Peer addresses are written in a convention known as _Multiaddress_, which is a universal method of specifying various kinds of addresses.
+Libp2p connections are established by _dialing_ the peer address across a
+transport layer. Currently, Mina uses TCP, but it can also utilize UDP, which
+can be useful if we implement a node based on WebRTC.

+Peer addresses are written in a convention known as _Multiaddress_, which is a
+universal method of specifying various kinds of addresses.
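Structurally, a multiaddress is an alternating sequence of protocol names and values, so it can be decomposed mechanically. The sketch below is illustrative only: the `parse_multiaddr` helper is hypothetical, not the node's actual parser (real code would use a proper multiaddr library), and the naive splitting ignores protocols that carry no value.

```rust
// Hypothetical helper: split a multiaddress into (protocol, value) pairs.
// Illustrative only; a real implementation would use a multiaddr library,
// which also handles value-less protocols correctly.
fn parse_multiaddr(addr: &str) -> Vec<(String, String)> {
    let parts: Vec<&str> = addr.trim_start_matches('/').split('/').collect();
    parts
        .chunks(2) // pair each protocol name with the value that follows it
        .filter(|c| c.len() == 2)
        .map(|c| (c[0].to_string(), c[1].to_string()))
        .collect()
}

fn main() {
    let addr = "/dns4/seed-1.mainnet.o1test.net/tcp/10000/p2p/12D3KooWCa1d7G3SkRxy846qTvdAFX69NnoYZ32orWVLqJcDVGHW";
    for (proto, value) in parse_multiaddr(addr) {
        println!("{} -> {}", proto, value);
    }
}
```

Running this over the seed address yields the three components (`dns4`, `tcp`, `p2p`) discussed next.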
+For example, let’s look at one of the addresses from the
+[Mina Protocol peer list](https://storage.googleapis.com/mina-seed-lists/mainnet_seeds.txt).

```
/dns4/seed-1.mainnet.o1test.net/tcp/10000/p2p/12D3KooWCa1d7G3SkRxy846qTvdAFX69NnoYZ32orWVLqJcDVGHW
```
-
-
-* `/dns4/seed-1.mainnet.o1test.net/ `States that the domain name is resolvable only to IPv4 addresses
-* `tcp/10000 `tells us we want to send TCP packets to port 10000.
-* `p2p/12D3KooWCa1d7G3SkRxy846qTvdAFX69NnoYZ32orWVLqJcDVGHW` informs us of the hash of the peer’s public key, which allows us to encrypt communication with said peer.
-
-An address written under the _Multiaddress_ convention is ‘future-proof’ in the sense that it is backwards-compatible. For example, since multiple transports are supported, we can change `tcp `to `udp`, and the address will still be readable and valid.
-
+- `/dns4/seed-1.mainnet.o1test.net/` states that the domain name is resolvable
+  only to IPv4 addresses.
+- `tcp/10000` tells us we want to send TCP packets to port 10000.
+- `p2p/12D3KooWCa1d7G3SkRxy846qTvdAFX69NnoYZ32orWVLqJcDVGHW` informs us of the
+  hash of the peer’s public key, which allows us to encrypt communication with
+  said peer.
+
+An address written under the _Multiaddress_ convention is ‘future-proof’ in the
+sense that it is backwards-compatible. For example, since multiple transports
+are supported, we can change `tcp` to `udp`, and the address will still be
+readable and valid.
diff --git a/p2p/readme.md b/p2p/readme.md
index dfb817305..7a9a535da 100644
--- a/p2p/readme.md
+++ b/p2p/readme.md
@@ -1,189 +1,221 @@
# WebRTC Based P2P

## Design goals
+
In blockchain and especially in Mina, **security**, **decentralization**,
-**scalability** and **eventual consistency** (in that order), is crucial.
-Our design tries to achieve those, while building on top of
-Mina Protocol's existing design (outside of p2p).
+**scalability** and **eventual consistency** (in that order), are crucial. 
Our
+design tries to achieve those, while building on top of Mina Protocol's existing
+design (outside of p2p).

#### Security
-By security, we are mostly talking about **DDOS Resilience**, as that is
-the primary concern in p2p layer.
+
+By security, we are mostly talking about **DDOS Resilience**, as that is the
+primary concern in the p2p layer.

Main ways to achieve that:
-1. Protocol design needs to enable an ability to identify malicious actors
-   as soon as possible, so that they can be punished (disconnected, blacklisted)
-   with minimal resource inverstment. This means, individual messages
-   should be small and verifiable, so we don't have to allocate bunch
-   of resources(CPU + RAM + NET) before we are able to process them.
-2. Even if the peer isn't breaking any protocol rules, single peer
-   (or a group of them) shouldn't be able to consume big chunk of our
-   resources, so there needs to be **fairness** enabled by the protocol itself.
+
+1. Protocol design needs to enable an ability to identify malicious actors as
+   soon as possible, so that they can be punished (disconnected, blacklisted)
+   with minimal resource investment. This means individual messages should be
+   small and verifiable, so we don't have to allocate a bunch of resources
+   (CPU + RAM + NET) before we are able to process them.
+2. Even if the peer isn't breaking any protocol rules, a single peer (or a group
+   of them) shouldn't be able to consume a big chunk of our resources, so there
+   needs to be **fairness** enabled by the protocol itself.
3. Malicious peers shouldn't be able to flood us with incoming connections,
   blocking us from receiving connections from genuine peers.

#### Decentralization and Scalability

-Mina Protocol, with it's consensus mechanism and recursive zk-snarks,
-enables light-weight full clients. So anyone can run the full node
-(as demonstrated with the WebNode). While this is great for decentralization,
-it puts higher load on p2p network and raises it's requirements. 
-
-So we needed to come up with the design that can support hundreeds of
-active connections in order to increase fault tolerance and not sacrifice
+Mina Protocol, with its consensus mechanism and recursive zk-snarks, enables
+light-weight full clients. So anyone can run the full node (as demonstrated with
+the WebNode). While this is great for decentralization, it puts a higher load on
+the p2p network and raises its requirements.
+
+So we needed to come up with a design that can support hundreds of active
+connections in order to increase fault tolerance and not sacrifice
**scalability**, since if we have more nodes in the network but few connections
-between them, [diameter](https://mathworld.wolfram.com/GraphDiameter.html)
-of the network increases, so each message has to go through more hops
-(each adding latency) in order to reach all nodes in the network.
+between them, the [diameter](https://mathworld.wolfram.com/GraphDiameter.html)
+of the network increases, so each message has to go through more hops (each
+adding latency) in order to reach all nodes in the network.

#### Eventual Consistency
-Nodes in the network should eventually come to the same state
-(same best tip, transaction/snark pool), without the use of crude rebroadcasts.
+
+Nodes in the network should eventually come to the same state (same best tip,
+transaction/snark pool), without the use of crude rebroadcasts.

## Transport Layer
-**(TODO: explain why WebRTC is the best option for security and decentralization)**
+
+**(TODO: explain why WebRTC is the best option for security and
+decentralization)**

## Poll-Based P2P
-It's practically impossbile to achieve [above written design goals](#design-goals)
-with a push-based approach, where messages are just sent to the peers,
-without them ever requesting them, like it's done with libp2p GossipSub. 
-
-Since when you have a push based approach, most likely you can't process
-all messages faster than they can be received, so you have to maintain a
-queue for messages, which might even "expire" (be no longer relevant)
-once we reach them. Also you can't grow queue infinitely, so you have to
-drop some messages, which will break [eventual consistency](#eventual-consistency).
-Also it's very bad for security as you are potentially letting peers
-allocate significant amount of data.
-
-So instead, we decided to go with poll based approach, or more accurately,
-with something resembling the [long polling](https://www.pubnub.com/guides/long-polling/).
-
-In a nutshell, instead of a peer flooding us with messages, we have to
-request from them (send a sort of permit) for them to send us a message.
-This way, the recipient controls the flow, so it can:
+
+It's practically impossible to achieve the
+[above written design goals](#design-goals) with a push-based approach, where
+messages are just sent to the peers, without them ever requesting them, like
+it's done with libp2p GossipSub. With a push-based approach, most likely you
+can't process all messages faster than they can be received, so you have to
+maintain a queue for messages, which might even "expire" (be no longer
+relevant) once we reach them. Also you can't grow the queue infinitely, so you
+have to drop some messages, which will break
+[eventual consistency](#eventual-consistency). Also it's very bad for security
+as you are potentially letting peers allocate a significant amount of data.
+
+So instead, we decided to go with a poll-based approach, or more accurately,
+with something resembling
+[long polling](https://www.pubnub.com/guides/long-polling/).
+
+In a nutshell, instead of a peer flooding us with messages, we have to request
+from them (send a sort of permit) for them to send us a message. This way, the
+recipient controls the flow, so it can:
+
1. 
Enforce **fairness**, that we mentioned in the scalability design goals.
-2. Prevent peer from overwhelming the system, because previous message
-   needs to be processed, until the next is requested.
+2. Prevent peer from overwhelming the system, because the previous message needs
+   to be processed before the next is requested.

-This removes whole lot of complexity from the implementation as well.
-We no longer have to worry about the message queue, what to do if we
-can process messages slower than we are receiving them, if we drop them,
-how do we recover them if they were relevant, etc...
+This removes a whole lot of complexity from the implementation as well. We no
+longer have to worry about the message queue, what to do if we can process
+messages slower than we are receiving them, if we drop them, how do we recover
+them if they were relevant, etc...

Also this unlocks whole lot of possibilities for **eventual consistency**.
-Because it's not just about recipient. Now the sender has the guarantee that
-the sent messages have been processed by the recipient, if they were followed
-by a request for the next message. This way the sender can reason about what
-the peer already has and what they lack, so it can adjust the messages
-that it sends based on that.
+Because it's not just about the recipient. Now the sender has the guarantee that
+the sent messages have been processed by the recipient, if they were followed by
+a request for the next message. This way the sender can reason about what the
+peer already has and what they lack, so it can adjust the messages that it sends
+based on that.

## Implementation

### Connection
-In order for two peers to connect to each other via WebRTC, they need
-to exchange **Offer** and **Answer** messages between each other. Process of
-exchanging those messages is called **Signaling**. 
There are many ways
-to exchange them, our implementation supports two ways:
-- **HTTP API** - Dialer sends an http request containing the **offer** and receives
-  an **answer** if peer is ok with connecting, or otherwise an error.
+
+In order for two peers to connect to each other via WebRTC, they need to
+exchange **Offer** and **Answer** messages between each other. The process of
+exchanging those messages is called **Signaling**. There are many ways to
+exchange them; our implementation supports two:
+
+- **HTTP API** - Dialer sends an HTTP request containing the **offer** and
+  receives an **answer** if the peer is OK with connecting, or otherwise an
+  error.
- **Relay** - Dialer will discover the listener peer via relay peer. The relay
  peer needs to be connected to both dialer and listener, so that it can
-  facilitate exchange of those messages, if both parties agree to connect.
-  After those messages are exchanged, relay peer is no longer needed and
-  they establish a direct connection.
+  facilitate exchange of those messages, if both parties agree to connect. After
+  those messages are exchanged, the relay peer is no longer needed and they
+  establish a direct connection.

For security, it's best to use just the **relay** option, since it doesn't
-require a node to open port accessible publicly and so it can't be flooded
-with incoming connections.
-Seed nodes will have to support **HTTP Api** though, so that initial connection
-can be formed by new clients.
+require a node to open a publicly accessible port and so it can't be flooded
+with incoming connections. Seed nodes will have to support **HTTP API** though,
+so that the initial connection can be formed by new clients.

### Channels
-We are using different [WebRTC DataChannel](https://developer.mozilla.org/en-US/docs/Web/API/RTCDataChannel)
+
+We are using different
+[WebRTC DataChannel](https://developer.mozilla.org/en-US/docs/Web/API/RTCDataChannel)
per protocol. So far, we have 8 protocols:
+
1. 
**SignalingDiscovery** - used for discovering new peers via existing ones.
-2. **SignalingExchange** - used for exchanging signaling messages via **relay** peer.
+2. **SignalingExchange** - used for exchanging signaling messages via **relay**
+   peer.
3. **BestTipPropagation** - used for best tip propagation. Instead of the whole
-   block, only consensus state + block hash (things we need for consensus)
-   is propagated and the rest can be fetched via **Rpc** channel.
-4. **TransactionPropagation** - used for transaction propagation. Only info
-   is sent which is necessary to determine if we want that transaction
-   based on the current transaction pool state. Full transaction can be
-   fetched with hash from **Rpc** channel.
-5. **SnarkPropagation** - used for snark work propagation. Only info
-   is sent which is necessary to determine if we want that snark
-   based on the current snark pool state. Full snark can be
-   fetched with job id from **Rpc** channel.
+   block, only consensus state + block hash (things we need for consensus) is
+   propagated and the rest can be fetched via the **Rpc** channel.
+4. **TransactionPropagation** - used for transaction propagation. Only the info
+   necessary to determine if we want that transaction, based on the current
+   transaction pool state, is sent. The full transaction can be fetched by hash
+   via the **Rpc** channel.
+5. **SnarkPropagation** - used for snark work propagation. Only the info
+   necessary to determine if we want that snark, based on the current snark
+   pool state, is sent. The full snark can be fetched by job id via the **Rpc**
+   channel.
6. **SnarkJobCommitmentPropagation** - implemented but not used at the moment.
-   It's for the decentralized snark work coordination to minimize wasted resources.
+   It's for decentralized snark work coordination to minimize wasted
+   resources.
7. **Rpc** - used for requesting specific data from peer.
-8. 
**StreamingRpc** - used to fetch from peer big data in small verifiable chunks,
-   like fetching data necessary to reconstruct staged ledger at the root.
+8. **StreamingRpc** - used to fetch big data from a peer in small verifiable
+   chunks, like data necessary to reconstruct the staged ledger at the
+   root.

For each channel, we have to receive a request before we can send a response.
E.g. in case of **BestTipPropagation** channel, block won't be propagated until
peer sends us the request.

### Efficient Pool Propagation with Eventual Consistency
+
To achieve scalable, eventually consistent and efficient (transaction/snark)
pool propagation, we need to utilize benefits of the poll-based approach.
-Since we can track which messages were processed by the peer, in order
-to achieve eventual consistency, we just need to:
-1. Make sure we only send pool messages if peer's best tip is same
-   or higher than our own, so that we make sure peer doesn't reject our
-   messages because it is out of sync.
-2. After the connection is established, make sure all transactions/snarks
-   (just info) in the pool is sent to the connected peer.
+Since we can track which messages were processed by the peer, in order to
+achieve eventual consistency, we just need to:
+
+1. Make sure we only send pool messages if the peer's best tip is the same or
+   higher than our own, so that the peer doesn't reject our messages because it
+   is out of sync.
+2. After the connection is established, make sure all transactions/snarks (just
+   info) in the pool are sent to the connected peer.
3. Keep track of what we have already sent to the peer.
4. **(TODO eventual consistency with limited transaction pool size)**

-For 2nd and 3rd points, to efficiently keep track of what messages we
-have sent to which peer a special [data structure](../core/src/distributed_pool.rs) is used. 
-
Basically it's an append only log, where each entry is indexed by a number
-and if we want to update an entry, we have to remove it and append it at
-the end. As new transactions/snarks are added to the pool, they are appended
-to this log.
+For 2nd and 3rd points, to efficiently keep track of what messages we have sent
+to which peer a special [data structure](../core/src/distributed_pool.rs) is
+used. Basically it's an append-only log, where each entry is indexed by a number
+and if we want to update an entry, we have to remove it and append it at the
+end. As new transactions/snarks are added to the pool, they are appended to this
+log.

-With each peer we just keep an index (initially 0) of the next message
-to propagate and we keep sending the next (+ jumping to the next index)
-until we reach the end of the pool. This way we have to keep minimal data
-with each peer and it efficiently avoids sending the same data twice.
+With each peer we just keep an index (initially 0) of the next message to
+propagate and we keep sending the next (+ jumping to the next index) until we
+reach the end of the pool. This way we have to keep minimal data with each peer
+and it efficiently avoids sending the same data twice.

## Appendix: Future Ideas

### Leveraging Local Pools for Smaller Blocks
-Nodes already maintain local pools of transactions and snarks. Many of these stored items later appear in blocks. By using data the node already has, we reduce the amount of information needed for each new block.
+Nodes already maintain local pools of transactions and snarks. Many of these
+stored items later appear in blocks. By using data the node already has, we
+reduce the amount of information needed for each new block.

#### Motivation

-As nodes interact with the network, they receive, verify, and store transactions and snarks in local pools. When a new block arrives, it often includes some of these items. Because the node already has them, the sender need not retransmit the data. 
This approach offers: -1. **Reduced Bandwidth Usage:** - Eliminating redundant transmissions of known snarks and transactions reduces block size and prevents wasted data exchange. +As nodes interact with the network, they receive, verify, and store transactions +and snarks in local pools. When a new block arrives, it often includes some of +these items. Because the node already has them, the sender need not retransmit +the data. This approach offers: -2. **Decreased Parsing and Validation Overhead:** - With fewer embedded items, nodes spend less time parsing and validating large blocks and their contents, and can more quickly integrate them into local state. +1. **Reduced Bandwidth Usage:** Eliminating redundant transmissions of known + snarks and transactions reduces block size and prevents wasted data exchange. -3. **Memory Footprint Optimization:** - By avoiding duplicate data, nodes can maintain more stable memory usage. +2. **Decreased Parsing and Validation Overhead:** With fewer embedded items, + nodes spend less time parsing and validating large blocks and their contents, + and can more quickly integrate them into local state. + +3. **Memory Footprint Optimization:** By avoiding duplicate data, nodes can + maintain more stable memory usage. #### Practical Considerations -- **Snarks:** - Snarks, being large, benefit most from this approach. Skipping their retransmission saves significant bandwidth. -- **Ensuring Synchronization:** - This approach assumes nodes maintain consistent local pools. The poll-based model and eventual consistency ensure nodes receive needed items before they appear in a block, making it likely that a node has them on hand. +- **Snarks:** Snarks, being large, benefit most from this approach. Skipping + their retransmission saves significant bandwidth. + +- **Ensuring Synchronization:** This approach assumes nodes maintain consistent + local pools. 
The poll-based model and eventual consistency ensure nodes + receive needed items before they appear in a block, making it likely that a + node has them on hand. -- **Adjusting the Block Format:** - This idea may require altering the protocol so the block references, rather than embeds, items nodes probably have. The node would fetch only missing pieces if a reference does not match its local data. +- **Adjusting the Block Format:** This idea may require altering the protocol so + the block references, rather than embeds, items nodes probably have. The node + would fetch only missing pieces if a reference does not match its local data. #### Outcome -By using local data, the network can propagate smaller blocks, improving scalability, reducing resource usage, and speeding propagation. + +By using local data, the network can propagate smaller blocks, improving +scalability, reducing resource usage, and speeding propagation. # OCaml node compatibility -In order to be compatible with the current OCaml node p2p implementation, -we have [libp2p implementation](./libp2p.md) as well. So communication between OCaml -and Rust nodes is done via LibP2P, while Rust nodes will use WebRTC -to converse with each other. + +In order to be compatible with the current OCaml node p2p implementation, we +have [libp2p implementation](./libp2p.md) as well. So communication between +OCaml and Rust nodes is done via LibP2P, while Rust nodes will use WebRTC to +converse with each other. diff --git a/p2p/src/service_impl/peer-discovery.md b/p2p/src/service_impl/peer-discovery.md index 251801658..334fb73ce 100644 --- a/p2p/src/service_impl/peer-discovery.md +++ b/p2p/src/service_impl/peer-discovery.md @@ -2,25 +2,36 @@ ## Objectives -The Openmina node must maintain connections with peers. The list of peers must meet the requirements: +The Openmina node must maintain connections with peers. 
The list of peers must
+meet the requirements:

-* The number of peers must not exceed an upper limit and should not fall below a lower limit.
-* Peers must be as good as possible. Node must evaluate each peer by uptime, correctness of information provided, and ping. Node must find a balance point to dynamically maximize all of these values.
-* Node must choose peers in a way that allows global consistency of the network, provides security and avoids the network centralization.
+- The number of peers must not exceed an upper limit and should not fall below a
+  lower limit.
+- Peers must be as good as possible. The node must evaluate each peer by uptime,
+  correctness of information provided, and ping. The node must find a balance
+  point to dynamically maximize all of these values.
+- The node must choose peers in a way that allows global consistency of the
+  network, provides security and avoids network centralization.

-This specification describes initial peer discovery, peer selecting and peer evaluation algorithms.
+This specification describes initial peer discovery, peer selection and peer
+evaluation algorithms.

### Initial peer discovery

-There are so-called seed peers. These peers are normal nodes except for a few features:
+There are so-called seed peers. These peers are normal nodes except for a few
+features:

-* The seed peer's address must be static.
-* The seed peer must not do block production or snark work.
-* The seed peer should have high uptime and support many connections.
+- The seed peer's address must be static.
+- The seed peer must not do block production or snark work.
+- The seed peer should have high uptime and support many connections.

-The node should have a list of seed peer addresses at startup. To do a peer discovery, the node must connect to seed peers and call the `get_some_initial_peers` RPC. It will return a list of addresses of peers to start with.
+The node should have a list of seed peer addresses at startup. 
To do a peer +discovery, the node must connect to seed peers and call the +`get_some_initial_peers` RPC. It will return a list of addresses of peers to +start with. -The call `get_some_initial_peers` doesn't have parameters, the response is the list of structures that represents a peer: +The call `get_some_initial_peers` doesn't have parameters, the response is the +list of structures that represents a peer: ```Rust type Response = Vec; @@ -34,19 +45,28 @@ struct InitialPeerAddress { The same RPC call can be made to any peer, not just the seed peer. -The node must distinguish between a temporary connection, which is made only for calling the `get_some_initial_peers` RPC, and a normal connection, which is used for all other tasks. +The node must distinguish between a temporary connection, which is made only for +calling the `get_some_initial_peers` RPC, and a normal connection, which is used +for all other tasks. ### Peer selecting -After the node receives initial peers from seed nodes, the total number of known peers may already exceed the maximum number of peers. However, using these peers is not optimal because it leads to centralization. Imagine there are three seed nodes, each connected to 100 peers, they can provide a total of 100-300 unique peers. And thousands of fresh users connect to those same peers. These 100-300 peers will be congested and the network will be unreliable and centralized. +After the node receives initial peers from seed nodes, the total number of known +peers may already exceed the maximum number of peers. However, using these peers +is not optimal because it leads to centralization. Imagine there are three seed +nodes, each connected to 100 peers, they can provide a total of 100-300 unique +peers. And thousands of fresh users connect to those same peers. These 100-300 +peers will be congested and the network will be unreliable and centralized. To avoid this situation, the node must select peers. 
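The initial-discovery flow described above (a temporary connection to each seed, one `get_some_initial_peers` call, deduplicated results) can be sketched as follows. The `rpc` closure is a hypothetical stand-in for the real RPC over a temporary connection:

```rust
use std::collections::HashSet;

// Sketch of initial peer discovery. `rpc` stands in for a temporary
// connection that calls `get_some_initial_peers`; it is hypothetical.
fn discover_initial_peers<F>(seeds: &[&str], rpc: F) -> Vec<String>
where
    F: Fn(&str) -> Vec<String>,
{
    let mut known = HashSet::new();
    for seed in seeds {
        // One temporary connection per seed, used only for this RPC.
        for addr in rpc(seed) {
            known.insert(addr);
        }
    }
    known.into_iter().collect()
}

fn main() {
    // Mocked responses instead of real network calls.
    let rpc = |seed: &str| match seed {
        "seed-a" => vec!["peer-1".to_string(), "peer-2".to_string()],
        _ => vec!["peer-2".to_string(), "peer-3".to_string()],
    };
    let mut peers = discover_initial_peers(&["seed-a", "seed-b"], rpc);
    peers.sort();
    assert_eq!(peers, vec!["peer-1", "peer-2", "peer-3"]);
}
```

The deduplication matters because, as noted next, seeds tend to return overlapping peer sets.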
#### Create a reference graph -Node should keep a database of known nodes and their references. Let call the node $A$ references the node $B$ (denoted $A \to B$) if and only if the node $A$ return address of node $B$ in the response for `get_some_initial_peers` call. +Node should keep a database of known nodes and their references. Let call the +node $A$ references the node $B$ (denoted $A \to B$) if and only if the node $A$ +return address of node $B$ in the response for `get_some_initial_peers` call. -Having this relation we can create a peer graph. +Having this relation we can create a peer graph. ```Rust struct PeerGraphNode { @@ -60,15 +80,24 @@ struct PeerGraphNode { struct PeerGraphEdge; ``` -The node must keep this graph current and complete. To do this, the node must make a temporary connection to the peer that was updated a long time ago and re-run the `get_some_initial_peers` RPC. +The node must keep this graph current and complete. To do this, the node must +make a temporary connection to the peer that was updated a long time ago and +re-run the `get_some_initial_peers` RPC. #### Security -Having the peer graph allows us to assign to each peer the one more evaluation: the number of unique paths between a given seed peer and the peer under evaluation. The peer that has more unique paths to reach it is better because it is less likely to be part of the malicious network. Malicious actor can create arbitrary many peers, but it is difficult to create arbitrary many links and it is even more difficult to create arbitrary many paths between honest and malicious nodes. +Having the peer graph allows us to assign to each peer the one more evaluation: +the number of unique paths between a given seed peer and the peer under +evaluation. The peer that has more unique paths to reach it is better because it +is less likely to be part of the malicious network. 
A malicious actor can create
+arbitrarily many peers, but it is difficult to create arbitrarily many links
+and it is even more difficult to create arbitrarily many paths between honest
+and malicious nodes.

#### Decentralization

-To maintain decentralization the node must select peers uniformly across the graph, so more distant nodes has the same chance to be selected.
+To maintain decentralization, the node must select peers uniformly across the
+graph, so more distant nodes have the same chance to be selected.

### Peer evaluation

@@ -78,11 +107,16 @@ To maintain decentralization the node must select peers uniformly across the gra

High level algorithm overview:

-* Maintain the peer graph using temporal connections and the `get_some_initial_peers` RPC.
-* Define a "base probability to select". Compute the probability equal to the maximum allowed number of connections divided by the number of nodes in the peer graph.
-* Compute an integrated score of the peer, a single normalized number that includes "security", "uptime", "correctness", and "ping" scores.
-* Vary the base probability to account for the score.
-* Randomly select peers with probability.
-* Truncate the smallest (by the score) peers if the number of peers selected exceeds the maximum.
+- Maintain the peer graph using temporary connections and the
+  `get_some_initial_peers` RPC.
+- Define a "base probability to select". Compute the probability equal to the
+  maximum allowed number of connections divided by the number of nodes in the
+  peer graph.
+- Compute an integrated score of the peer, a single normalized number that
+  includes "security", "uptime", "correctness", and "ping" scores.
+- Vary the base probability to account for the score.
+- Randomly select peers with this probability.
+- Truncate the lowest-scoring peers if the number of peers selected
+  exceeds the maximum.
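The overview above can be sketched as follows, assuming each peer already has a single normalized score in [0, 1]; the scoring and the randomness source (`roll`) are placeholders, not the real implementation:

```rust
// Sketch of probabilistic peer selection. Scores are assumed normalized to
// [0.0, 1.0]; `roll` is a placeholder randomness source.
fn select_peers(
    scores: &[(&str, f64)],
    max_connections: usize,
    mut roll: impl FnMut() -> f64,
) -> Vec<String> {
    // Base probability: max allowed connections over graph size.
    let base = max_connections as f64 / scores.len() as f64;
    let mut selected: Vec<(&str, f64)> = scores
        .iter()
        // Vary the base probability by the peer's integrated score.
        .filter(|&&(_, score)| roll() < base * score)
        .cloned()
        .collect();
    // If too many were selected, keep only the highest-scoring ones.
    selected.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    selected.truncate(max_connections);
    selected.into_iter().map(|(id, _)| id.to_string()).collect()
}

fn main() {
    let scores = [("a", 0.9), ("b", 0.5), ("c", 0.1), ("d", 0.8)];
    // Deterministic "randomness" for the example.
    let mut rolls = vec![0.1, 0.9, 0.2, 0.3].into_iter();
    let peers = select_peers(&scores, 2, || rolls.next().unwrap());
    assert_eq!(peers, vec!["a", "d"]);
}
```

Multiplying the base probability by the score keeps the expected connection count near the configured maximum while still favoring better peers.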
diff --git a/p2p/testing.md b/p2p/testing.md index a64eae85c..92175443f 100644 --- a/p2p/testing.md +++ b/p2p/testing.md @@ -13,21 +13,21 @@ connected, so each one has exactly one connection. - [`p2p_basic_connections(connection_stbility)`](../node/testing/src/scenarios/p2p/basic_connection_handling.rs#L342) - ### All connections should be tracked by the state machine Connections that are initiated outside of the state machine (e.g. by Kademlia) should be present in the state machine. **Tests:** + - [`p2p_basic_connections(all_nodes_connections_are_symmetric)`](../node/testing/src/scenarios/p2p/basic_connection_handling.rs#L98) - [`p2p_basic_connections(seed_connections_are_symmetric)`](../node/testing/src/scenarios/p2p/basic_connection_handling.rs#L165) ### Number of active peers should not exceed configured maximum number **Tests:** -- [`p2p_basic_connections(max_number_of_peers)`](../node/testing/src/scenarios/p2p/basic_connection_handling.rs#L248) +- [`p2p_basic_connections(max_number_of_peers)`](../node/testing/src/scenarios/p2p/basic_connection_handling.rs#L248) ## Incoming Connections @@ -36,16 +36,18 @@ should be present in the state machine. We should accept an incoming connection from an arbitrary node. 
**Tests:** + - [p2p_basic_incoming(accept_connection)](../node/testing/src/scenarios/p2p/basic_incoming_connections.rs#L16) - [p2p_basic_incoming(accept_multiple_connections)](../node/testing/src/scenarios/p2p/basic_incoming_connections.rs#L62) - [solo_node_accept_incoming](../node/testing/src/scenarios/solo_node/basic_connectivity_accept_incoming.rs) -- [multi_node_connection_discovery/OCamlToRust](../node/testing/src/scenarios/multi_node/connection_discovery.rs#L127) (should be replaced with one with non-OCaml peer) +- [multi_node_connection_discovery/OCamlToRust](../node/testing/src/scenarios/multi_node/connection_discovery.rs#L127) + (should be replaced with one with non-OCaml peer) - TODO: fast-running short test ### Node shouldn't accept duplicate incoming connections -The Rust node should reject a connection from a peer if there is one with the same -peer ID already. +The Rust node should reject a connection from a peer if there is one with the +same peer ID already. **Tests:** TODO @@ -56,8 +58,8 @@ this one. This is either a program error (see above), network setup error, or a malicious node that uses the same peer ID. **Tests:** -- [`p2p_basic_incoming(does_not_accept_self_connection)`](../node/testing/src/scenarios/p2p/basic_incoming_connections.rs#L120) +- [`p2p_basic_incoming(does_not_accept_self_connection)`](../node/testing/src/scenarios/p2p/basic_incoming_connections.rs#L120) ## Outgoing connections @@ -68,9 +70,11 @@ malicious node that uses the same peer ID. ### Node shouldn't try to make outgoing connection using its own peer_id -The node can obtain its address from other peers. It shouldn't use it when connecting to new peers. +The node can obtain its address from other peers. It shouldn't use it when +connecting to new peers. 
**Tests:** + - [`p2p_basic_outgoing(dont_connect_to_node_same_id)`](node/testing/src/scenarios/p2p/basic_outgoing_connections.rs#L134) - [`p2p_basic_outgoing(dont_connect_to_initial_peer_same_id)`](node/testing/src/scenarios/p2p/basic_outgoing_connections.rs#L187) - [`p2p_basic_outgoing(dont_connect_to_self_initial_peer)`](node/testing/src/scenarios/p2p/basic_outgoing_connections.rs#L226) @@ -80,7 +84,8 @@ The node can obtain its address from other peers. It shouldn't use it when conne TODO: what if the number of initial peers exceeds the max number of peers? - [`p2p_basic_outgoing(connect_to_all_initial_peers)`](../node/testing/src/scenarios/p2p/basic_outgoing_connections.rs#L293) -- [multi_node_initial_joining](../node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs) (partially?) +- [multi_node_initial_joining](../node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs) + (partially?) ### Node should retry connecting to unavailable initial peers @@ -88,19 +93,20 @@ TODO: what if the number of initial peers exceeds the max number of peers? ### Node should be able to connect to initial peers eventually, even if initially they are not available. -If, for some reason, the node can't connect to enough peers (e.g. it is the first -node in the network), it should keep retrying to those with failures (see also -below). +If, for some reason, the node can't connect to enough peers (e.g. it is the +first node in the network), it should keep retrying to those with failures (see +also below). TODO: Use cases where this is important. **Tests:** + - [`p2p_basic_outgoing(connect_to_all_initial_peers_become_ready)`](../node/testing/src/scenarios/p2p/basic_outgoing_connections.rs#L362) ### Node should have a reasonable retry rate for reconnection -We should consider different reasons why the outgoing connection failed. The Rust -node shouldn't reconnect too soon to a node that dropped the connection. 
+We should consider different reasons why the outgoing connection failed. The +Rust node shouldn't reconnect too soon to a node that dropped the connection. **Tests:** TODO @@ -108,11 +114,13 @@ node shouldn't reconnect too soon to a node that dropped the connection. ### Node advertises itself through Kademlia -- [solo_node_accept_incoming](../node/testing/src/scenarios/solo_node/basic_connectivity_accept_incoming.rs) (TODO: should be replaced by one with Rust-only peer) +- [solo_node_accept_incoming](../node/testing/src/scenarios/solo_node/basic_connectivity_accept_incoming.rs) + (TODO: should be replaced by one with Rust-only peer) ### Node should be able to perform initial peer selection (Kademlia "bootstrap") -During this stage, the node queries its existing peers for more peers, thus getting more peer addresses. +During this stage, the node queries its existing peers for more peers, thus +getting more peer addresses. See #148. @@ -122,13 +130,15 @@ See #148. See #148. -To obtain a set of random peers, the Rust node performs a Kademlia query -that returns a list of peers that are "close" to some random peer. +To obtain a set of random peers, the Rust node performs a Kademlia query that +returns a list of peers that are "close" to some random peer. This step starts after Kademlia initialization is complete. 
-- [multi_node_peer_discovery](../node/testing/src/scenarios/multi_node/basic_connectivity_peer_discovery.rs) (partially, should be replaced with one with a non-OCaml peer)
-- [multi_node_connection_discovery/OCamlToRust](../node/testing/src/scenarios/multi_node/connection_discovery.rs#L127) (indirectly, should be replaced with one with a non-OCaml peer)
+- [multi_node_peer_discovery](../node/testing/src/scenarios/multi_node/basic_connectivity_peer_discovery.rs)
+  (partially, should be replaced with one with a non-OCaml peer)
+- [multi_node_connection_discovery/OCamlToRust](../node/testing/src/scenarios/multi_node/connection_discovery.rs#L127)
+  (indirectly, should be replaced with one with a non-OCaml peer)
 - TODO: fast-running Rust-only test
 
 ### Node should only advertise its "real" address
 
@@ -179,7 +189,8 @@ able to also discover and connect to each other.
 
 ### Initial peer connection
 
-The node should connect to as many peers as it is configured to (between min and max number).
+The node should connect to as many peers as it is configured to (between min and
+max number).
 
 - [multi_node_initial_joining](../node/testing/src/scenarios/multi_node/basic_connectivity_initial_joining.rs)
 
@@ -192,8 +203,7 @@ from its existing peers.
 
 # Attacks Resistance
 
-
-## DDoS 
+## DDoS
 
 **Tests:** TODO
 
diff --git a/p2p/testing/README.md b/p2p/testing/README.md
index 4ba6cef93..1641d3973 100644
--- a/p2p/testing/README.md
+++ b/p2p/testing/README.md
@@ -4,11 +4,13 @@
 This is a lighter version of the [Testing Framework](../../node/testing/) that
 is focused on peer-to-peer functionality of the Rust node.
 
 It allows to set up, run and observe clusters of peers that implement Mina p2p
-protocols. Such peers are Rust-node based and `rust-libp2p` based (to have reference implementatios).
+protocols. Such peers are Rust-node based and `rust-libp2p` based (to have
+reference implementations).
-Running and observing a test cluster is done by reading it as a `Stream`, emitting different events as its components progress. +Running and observing a test cluster is done by reading it as a `Stream`, +emitting different events as its components progress. -#### Cluster +#### Cluster #### Rust Node diff --git a/package-lock.json b/package-lock.json new file mode 100644 index 000000000..1405b75bf --- /dev/null +++ b/package-lock.json @@ -0,0 +1,31 @@ +{ + "name": "openmina", + "version": "1.0.0", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "name": "openmina", + "version": "1.0.0", + "devDependencies": { + "prettier": "^3.0.0" + } + }, + "node_modules/prettier": { + "version": "3.6.2", + "resolved": "https://registry.npmjs.org/prettier/-/prettier-3.6.2.tgz", + "integrity": "sha512-I7AIg5boAr5R0FFtJ6rCfD+LFsWHp81dolrFD8S79U9tb8Az2nGrJncnMSnys+bpQJfRUzqs9hnA81OAA3hCuQ==", + "dev": true, + "license": "MIT", + "bin": { + "prettier": "bin/prettier.cjs" + }, + "engines": { + "node": ">=14" + }, + "funding": { + "url": "https://github.com/prettier/prettier?sponsor=1" + } + } + } +} diff --git a/package.json b/package.json new file mode 100644 index 000000000..d6204f923 --- /dev/null +++ b/package.json @@ -0,0 +1,23 @@ +{ + "name": "openmina", + "version": "1.0.0", + "description": "Prettier configuration for markdown formatting", + "scripts": { + "format:md": "prettier --write \"**/*.md\"", + "check:md": "prettier --check \"**/*.md\"" + }, + "devDependencies": { + "prettier": "^3.0.0" + }, + "prettier": { + "printWidth": 80, + "tabWidth": 2, + "useTabs": false, + "semi": true, + "singleQuote": false, + "quoteProps": "as-needed", + "bracketSpacing": true, + "arrowParens": "avoid", + "proseWrap": "always" + } +} diff --git a/status.md b/status.md index 29f61cee3..19ea1ae3d 100644 --- a/status.md +++ b/status.md @@ -1,15 +1,15 @@ # Current status of the Rust node -* [High Level Functionality Overview](#overview) -* [VRF Evaluator](#vrf-evaluator) -* 
[Block Producer](#block-producer) -* [Ledger](#ledger) -* [Proofs](#proofs) -* [P2P Implementation (State Machine Version)](#state-machine-p2p) -* [P2P Related Tests](#p2p-tests) -* [Frontend](#frontend) -* [Documentation](#documentation) -* [Experimental State Machine Architecture](#experimental-state-machine-architecture) +- [High Level Functionality Overview](#overview) +- [VRF Evaluator](#vrf-evaluator) +- [Block Producer](#block-producer) +- [Ledger](#ledger) +- [Proofs](#proofs) +- [P2P Implementation (State Machine Version)](#state-machine-p2p) +- [P2P Related Tests](#p2p-tests) +- [Frontend](#frontend) +- [Documentation](#documentation) +- [Experimental State Machine Architecture](#experimental-state-machine-architecture) ## High Level Functionality Overview @@ -20,94 +20,123 @@ - [x] Full block with proof - [x] Blocks with transactions. - Networking layer - - [x] P2P layer in general along with serialization/deserialization of all messages - - RPCs support - - [x] `Get_some_initial_peers`(this is not used by the OCaml node) - - [x] `Get_staged_ledger_aux_and_pending_coinbases_at_hash` - - [x] `Answer_sync_ledger_query` - - [x] `Get_transition_chain` - - `Get_transition_knowledge` (I don't think this one is used at all, `Get_transition_chain_proof` is used instead) - - [x] `Get_transition_chain_proof` - - [x] `Get_ancestry` - - `Ban_notify` - - [x] `Get_best_tip` - - `Get_node_status` - - Peer discovery/advertising - - [x] Peer discovery through kademlia - - [x] Advertising the node through kademlia so that OCaml nodes can see us - - Publish subscribe - - [x] Floodsub-like broadcasting of produced block - - [x] Floodsub-like resending of blocks, txs and snarks -- [ ] Trust system (to punish/ban peers): **not implemented (and no equivalent)** + - [x] P2P layer in general along with serialization/deserialization of all + messages + - RPCs support + - [x] `Get_some_initial_peers`(this is not used by the OCaml node) + - [x] 
`Get_staged_ledger_aux_and_pending_coinbases_at_hash` + - [x] `Answer_sync_ledger_query` + - [x] `Get_transition_chain` + - `Get_transition_knowledge` (I don't think this one is used at + all, `Get_transition_chain_proof` is used instead) + - [x] `Get_transition_chain_proof` + - [x] `Get_ancestry` + - `Ban_notify` + - [x] `Get_best_tip` + - `Get_node_status` + - Peer discovery/advertising + - [x] Peer discovery through kademlia + - [x] Advertising the node through kademlia so that OCaml nodes can see us + - Publish subscribe + - [x] Floodsub-like broadcasting of produced block + - [x] Floodsub-like resending of blocks, txs and snarks +- [ ] Trust system (to punish/ban peers): **not implemented (and no + equivalent)** - Pools - - Transaction pool: **in progress** - - [x] Receiving, validating and integrating transactions - - [x] Payments - - [x] zkApp transactions (with proofs too) - - [x] Broadcasting transactions to peers. - - [x] Updating and revalidating the txn pool when new blocks are applied (by removing transactions already in the block) - - [x] Updating and revalidating the txn pool when there are chain reorgs (by restoring transactions from discarded chains) - - [ ] Error handling - - [ ] Testing - - SNARK pool - - [x] SNARK Verification - - [x] Pool is implemented - - [x] SNARK work production and broadcasting. - - [ ] Testing + - Transaction pool: **in progress** + - [x] Receiving, validating and integrating transactions + - [x] Payments + - [x] zkApp transactions (with proofs too) + - [x] Broadcasting transactions to peers. + - [x] Updating and revalidating the txn pool when new blocks are applied (by + removing transactions already in the block) + - [x] Updating and revalidating the txn pool when there are chain reorgs (by + restoring transactions from discarded chains) + - [ ] Error handling + - [ ] Testing + - SNARK pool + - [x] SNARK Verification + - [x] Pool is implemented + - [x] SNARK work production and broadcasting. 
+ - [ ] Testing - [x] Compatible ledger implementation - [x] Transition frontier - [x] Support for loading arbitrary genesis ledgers at startup - Bootstrap/Catchup process - - [x] Ledger synchronization - - [x] Snarked ledgers (staking and next epoch ledgers + transition frontier root) - - [x] Handling of peer disconnections, timeouts or cases when the peer doesn't have the data - - [x] Detecting ledger hash mismatches for the downloaded chunk - - [x] Handling ledger hash mismatches gracefully, without crashing the node - - [x] Optimized snarked ledgers synchronization (reusing previous ledgers when constructing the next during (re)synchronization) - - [x] Staged ledgers (transition frontier root) - - [x] Handling of peer disconnections, timeouts or cases when the peer doesn't have the data - - [x] Detection and handling of validation errors - - [x] Handling of the rpc requests from other nodes to sync them up - - [x] Moving root of the transition frontier - - [x] Maintaining ledgers for transition frontier root, staking and next epoch ledgers - - [x] When scan state tree gets committed, snarked ledger of the block is updated. 
When that happens for the root block in the transition frontier, reconstruct the new root snarked ledger - - [x] At the end of an epoch make the "next epoch" ledger the new "staking" ledger, discard the old "staking" ledger and make the snarked ledger of the best tip the new "next epoch" ledger - - [x] Best chain synchronization - - [x] Download missing blocks from peers - - [x] Handling of peer disconnections, timeouts or cases when the peer doesn't have the data - - [x] Downloaded block header integrity validation by checking it's hash and handling the mismatch - - [ ] Downloaded block body integrity validation by checking it's hash and handling the mismatch - - [x] Missing blocks application - - [ ] Graceful handling of block application error without crashing the node - - [x] Handling of reorgs (short/long range forks) or best chain extension after or even mid-synchronization, by adjusting synchronization target and reusing what we can from the previous synchronization attempt + - [x] Ledger synchronization + - [x] Snarked ledgers (staking and next epoch ledgers + transition frontier + root) + - [x] Handling of peer disconnections, timeouts or cases when the peer + doesn't have the data + - [x] Detecting ledger hash mismatches for the downloaded chunk + - [x] Handling ledger hash mismatches gracefully, without crashing the + node + - [x] Optimized snarked ledgers synchronization (reusing previous ledgers + when constructing the next during (re)synchronization) + - [x] Staged ledgers (transition frontier root) + - [x] Handling of peer disconnections, timeouts or cases when the peer + doesn't have the data + - [x] Detection and handling of validation errors + - [x] Handling of the rpc requests from other nodes to sync them up + - [x] Moving root of the transition frontier + - [x] Maintaining ledgers for transition frontier root, staking and next epoch + ledgers + - [x] When scan state tree gets committed, snarked ledger of the block is + updated. 
When that happens for the root block in the transition
+        frontier, reconstruct the new root snarked ledger
+    - [x] At the end of an epoch make the "next epoch" ledger the new "staking"
+      ledger, discard the old "staking" ledger and make the snarked ledger
+      of the best tip the new "next epoch" ledger
+  - [x] Best chain synchronization
+    - [x] Download missing blocks from peers
+      - [x] Handling of peer disconnections, timeouts or cases when the peer
+        doesn't have the data
+      - [x] Downloaded block header integrity validation by checking its hash
+        and handling the mismatch
+      - [ ] Downloaded block body integrity validation by checking its hash and
+        handling the mismatch
+    - [x] Missing blocks application
+      - [ ] Graceful handling of block application error without crashing the
+        node
+    - [x] Handling of reorgs (short/long range forks) or best chain extension
+      after or even mid-synchronization, by adjusting synchronization target
+      and reusing what we can from the previous synchronization attempt
 - Block application
-  - [x] Transaction application logic
-  - [x] Block application logic
-  - Proof verification:
-    - [x] Block proof verification
-    - [x] Transaction proof verification (same as above)
-    - [x] Zkapp proof verification (same as above)
-- [ ] Client API (currently the node has a very partial support, not planned at the moment)
-- [ ] Support for the archive node sidecar process (sending updates through RPC calls).
+  - [x] Transaction application logic
+  - [x] Block application logic
+  - Proof verification:
+    - [x] Block proof verification
+    - [x] Transaction proof verification (same as above)
+    - [x] Zkapp proof verification (same as above)
+- [ ] Client API (currently the node has very partial support, not planned at
+  the moment)
+- [ ] Support for the archive node sidecar process (sending updates through RPC
+  calls).
 - [x] Devnet support
-  - [x] Raw data for gates used to produced files updated for devnet compatibility
+  - [x] Raw data for gates used to produce files updated for devnet
+    compatibility
   - [x] Non-circuit logic updated for devnet compatibility
   - [x] Circuit logic updated for devnet compatibility
   - [x] Genesis ledger file loadable by openmina for connecting to devnet
   - [x] Updated to handle fork proof and new genesis state
 - [x] Mainnet support
-  - [x] Raw data for gates used to produced files updated for mainnet compatibility
+  - [x] Raw data for gates used to produce files updated for mainnet
+    compatibility
   - [x] Non-circuit logic updated for mainnet compatibility
   - [x] Circuit logic updated for mainnet compatibility
   - [x] Genesis ledger file loadable by openmina for connecting to mainnet
   - [x] Updated to handle fork proof and new genesis state
 - Block replayer using precomputed blocks from Google Cloud Storage
-  - [x] Basic replayer that applies blocks with openmina and verifies the results.
-  - [ ] Enable proofs verification (for performance reasons, that is skipped right now)
-  - [x] OCaml node counterpart to replay failed block applications (for debugging an testing)
+  - [x] Basic replayer that applies blocks with openmina and verifies the
+    results.
+  - [ ] Enable proofs verification (for performance reasons, that is skipped
+    right now)
+  - [x] OCaml node counterpart to replay failed block applications (for
+    debugging and testing)
 - [ ] CI pipeline to regularly test application of mainnet blocks
 - [ ] Support for applying all blocks, not just the cannonical chain
-  - [ ] Produce tracing receipts from both the OCaml and Rust implementations that can be compared (for debugging and verification purposes)
+  - [ ] Produce tracing receipts from both the OCaml and Rust implementations
+    that can be compared (for debugging and verification purposes)
 - Webnode
   - [x] WASM compilation
   - [x] WebRTC-based P2P layer
@@ -119,21 +148,26 @@
 
 ## VRF Evaluator
 
 - [x] VRF evaluator functionality:
-  - [x] Calculation of the VRF output
-  - [x] Threshold calculation determining if the slot has been won
-  - [ ] (Optional) Providing verification of the producers VRF output (Does not impact the node functionality, just provides a way for the delegates to verify their impact on winning/losing a slot)
+  - [x] Calculation of the VRF output
+  - [x] Threshold calculation determining if the slot has been won
+  - [ ] (Optional) Providing verification of the producer's VRF output (Does not
+    impact the node functionality, just provides a way for the delegates to
+    verify their impact on winning/losing a slot)
 - [x] Implement VRF evaluator state machine
 - [x] Computation service
 - [x] Collecting the delegator table for the producer
 - [x] Integrate with the block producer
-  - [x] Handling epoch changes - starting new evaluation as soon as new epoch data is available
-  - [ ] Retention logic - cleanup slot data that is in the past based on current global slot (Slight node impact - the won slot map grows indefinitely)
+  - [x] Handling epoch changes - starting new evaluation as soon as new epoch
+    data is available
+  - [ ] Retention logic - clean up slot data that is in the past based on current
+    global slot (Slight node impact - the won slot map 
grows indefinitely) - [ ] Testing - [ ] Correctness test - Selecting the correct ledgers - [x] (Edge case) In genesis epoch - [ ] In other (higher) epochs - [x] Correctness test - Computation output comparison with mina cli - - [x] Correctness test - Start a new VRF evaluation on epoch switch for the next available epoch + - [x] Correctness test - Start a new VRF evaluation on epoch switch for the + next available epoch - [ ] Correctness test - Retaining the slot data only for future blocks - [ ] Documentation @@ -157,29 +191,30 @@ - [x] Ledger/Mask implementation - [x] Staged Ledger implementation - - [x] Scan state - - [x] Pending coinbase collection - - [x] Transaction application - - [x] Regular transaction (payment, delegation, coinbase, fee transfer) - - [x] Zkapps + - [x] Scan state + - [x] Pending coinbase collection + - [x] Transaction application + - [x] Regular transaction (payment, delegation, coinbase, fee transfer) + - [x] Zkapps - [x] Ledger interactions are asynchronous and cannot stall the state machine. 
 - [x] Persistent database
-  - [x] (discarded) Drop-in replacement for RocksDB https://github.com/MinaProtocol/mina/pull/13340
-  - [ ] Design and implement a persistent ledger
-    - DRAFT design https://github.com/openmina/openmina/issues/522
-  - [ ] Design and implement a persistent block storage
-  - [ ] Design and implement a persistent proof storage
+  - [x] (discarded) Drop-in replacement for RocksDB
+    https://github.com/MinaProtocol/mina/pull/13340
+  - [ ] Design and implement a persistent ledger
+    - DRAFT design https://github.com/openmina/openmina/issues/522
+  - [ ] Design and implement a persistent block storage
+  - [ ] Design and implement a persistent proof storage
 
 ## Proofs
 
 - [x] Proof verification
-  - [x] Block proof
-  - [x] Transaction/Merge proof
-  - [x] Zkapp proof
+  - [x] Block proof
+  - [x] Transaction/Merge proof
+  - [x] Zkapp proof
 - [x] Proof/Witness generation
-  - [x] Block proof
-  - [x] Transaction/Merge proof
-  - [x] Zkapp proof
+  - [x] Block proof
+  - [x] Transaction/Merge proof
+  - [x] Zkapp proof
 - [ ] Circuit generation
 
 ## P2P Implementation (State Machine Version)
 
@@ -194,7 +229,8 @@
 
 - [x] Handle simultaneous connect case.
 - [x] Noise protocol for outgoing connections.
 - [x] Noise protocol for incoming connections.
-- [x] Forbid connections whose negotiated peer-id don't match the one in the dial-opts or routing table.
+- [x] Forbid connections whose negotiated peer-id doesn't match the one in the
+  dial-opts or routing table.
 - [x] Yamux multiplexer.
 - [ ] Yamux congestion control.
 
@@ -219,7 +255,8 @@
 
 ### Gossipsub
 
 - [x] Implement gossipsub compatible with libp2p.
-- [ ] Research how to use "expander graph" theory to make gossipsub robust and efficient.
+- [ ] Research how to use "expander graph" theory to make gossipsub robust and
+  efficient.
- [x] Implement mesh (meshsub protocol) - [x] Handle control messages - [ ] Limit received blocks, txs and snarks from the same peer @@ -231,11 +268,14 @@ - [x] Fix network debugger for the latest berkeley network. - [x] Test that the Openmina node can bootstrap from the replayer tool. - [ ] Test that the OCaml node can bootstrap from the Openmina node. -- [ ] Test that the Openmina node can bootstrap from another instance of openmina node. +- [ ] Test that the Openmina node can bootstrap from another instance of + openmina node. - [ ] Test block propagation ### Fuzzing -- [x] Mutator-based (bit-flipping/extend/shrink) fuzzing of communication between two openmina nodes + +- [x] Mutator-based (bit-flipping/extend/shrink) fuzzing of communication + between two openmina nodes - [x] PNet layer mutator. - [x] Protocol select mutator. - [x] Noise mutator. @@ -251,27 +291,35 @@ See [Testing](./docs/testing/README.md) for more details. - [ ] P2p functionality tests - [ ] p2p messages - - [ ] Binprot types (de)serialization testing/fuzzing - - [ ] Mina RPC types testing (ideally along with OCaml codecs) - - [ ] hashing testing (ideally along with OCaml hash implementations) + - [ ] Binprot types (de)serialization testing/fuzzing + - [ ] Mina RPC types testing (ideally along with OCaml codecs) + - [ ] hashing testing (ideally along with OCaml hash implementations) - [ ] Connection - - [x] Proper initial peers handling, like reconnecting if offline - - [x] Peers number maintaining, including edge cases, when we have max peers but still allow peers to connect for e.g. 
discovery, that is dropping connection strategy - - [x] Other connection constraints, like no duplicate connections to the same peer, peer_id, no self connections etc - - [ ] Connection quality metrics - - [x] Connects to OCaml node and vice versa + - [x] Proper initial peers handling, like reconnecting if offline + - [x] Peers number maintaining, including edge cases, when we have max peers + but still allow peers to connect for e.g. discovery, that is dropping + connection strategy + - [x] Other connection constraints, like no duplicate connections to the + same peer, peer_id, no self connections etc + - [ ] Connection quality metrics + - [x] Connects to OCaml node and vice versa - [ ] Kademlia - - [x] Peers discovery, according to Kademlia parameters (a new node gets 20 new peers) - - [x] Bootstraps from OCaml node and vice versa - - [ ] Kademlia routing table is up-to-date with the network (each peer status, like connected/disconnected/can_connect/cant_connect, reflects actual peer state) + - [x] Peers discovery, according to Kademlia parameters (a new node gets 20 + new peers) + - [x] Bootstraps from OCaml node and vice versa + - [ ] Kademlia routing table is up-to-date with the network (each peer + status, like connected/disconnected/can_connect/cant_connect, reflects + actual peer state) - [ ] Gossipsub - - [ ] Reacheability (all nodes get the message) - - [ ] Non-redundancy (minimal number of duplicating/unneeded messages) + - [ ] Reacheability (all nodes get the message) + - [ ] Non-redundancy (minimal number of duplicating/unneeded messages) - [ ] Interoperability with OCaml node - - [ ] Bootstrap Rust node from OCaml and vice versa - - [x] Discovery using Rust node - - [ ] Gossipsub relaying -- [ ] Public network tests. This should be the only set of tests that involve publicly available networks, and should be executed if we're sure we don't ruin them. 
+ - [ ] Bootstrap Rust node from OCaml and vice versa + - [x] Discovery using Rust node + - [ ] Gossipsub relaying +- [ ] Public network tests. This should be the only set of tests that involve + publicly available networks, and should be executed if we're sure we don't + ruin them. - [ ] Attack resistance testing ## Frontend @@ -297,6 +345,7 @@ See [Testing](./docs/testing/README.md) for more details. - [x] Block Production - Won Slots ### Testing + - [x] Tests for Nodes Overview - [x] Tests for Nodes Live - [ ] Tests for Nodes Bootstrap @@ -315,6 +364,7 @@ See [Testing](./docs/testing/README.md) for more details. - [ ] Tests for Block Production - Won Slots ### Other + - [x] CI Integration and Docker build & upload - [x] State management - [x] Update to Angular v16 @@ -328,10 +378,12 @@ See [Testing](./docs/testing/README.md) for more details. - [x] [Openmina Node](https://github.com/openmina/openmina#the-open-mina-node) - [x] [The Mina Web Node](https://github.com/openmina/webnode/blob/main/README.md) - P2P - - [ ] [P2P Networking Stack](https://github.com/openmina/openmina/blob/develop/p2p/readme.md) in progress + - [ ] [P2P Networking Stack](https://github.com/openmina/openmina/blob/develop/p2p/readme.md) + in progress - [x] [P2P services](https://github.com/openmina/openmina/blob/documentation/docs/p2p_service.md) - - [ ] [RPCs support](https://github.com/JanSlobodnik/pre-publishing/blob/main/RPCs.md) - in progress - - [x] [GossipSub](https://github.com/openmina/mina-wiki/blob/3ea9041e52fb2e606918f6c60bd3a32b8652f016/p2p/mina-gossip.md) + - [ ] [RPCs support](https://github.com/JanSlobodnik/pre-publishing/blob/main/RPCs.md) - + in progress + - [x] [GossipSub](https://github.com/openmina/mina-wiki/blob/3ea9041e52fb2e606918f6c60bd3a32b8652f016/p2p/mina-gossip.md) - [x] [Scan state](https://github.com/openmina/openmina/blob/main/docs/scan-state.md) - [x] [SNARKs](https://github.com/openmina/openmina/blob/main/docs/snark-work.md) - Developer tools @@ -340,23 
+392,30 @@ See [Testing](./docs/testing/README.md) for more details. - [x] [Dashboard](https://github.com/openmina/mina-frontend/blob/main/docs/MetricsTracing.md#Dashboard) - [x] [Debugger](https://github.com/openmina/mina-network-debugger) - ### By use-case - [x] [Why we are developing Open Mina](https://github.com/openmina/openmina/blob/main/docs/why-openmina.md) - [ ] Consensus logic - not documented yet - Block production logic - - [ ] [Internal transition](https://github.com/JanSlobodnik/pre-publishing/blob/main/block-production.md) - in progress + - [ ] [Internal transition](https://github.com/JanSlobodnik/pre-publishing/blob/main/block-production.md) - + in progress - [ ] External transition - not documented yet - - [ ] [VRF function](https://github.com/openmina/openmina/blob/feat/block_producer/vrf_evaluator/vrf/README.md) - in progress + - [ ] [VRF function](https://github.com/openmina/openmina/blob/feat/block_producer/vrf_evaluator/vrf/README.md) - + in progress - Peer discovery/advertising - - [ ] [Peer discovery through Kademlia](https://github.com/openmina/openmina/blob/develop/p2p/readme.md#kademlia-for-peer-discovery) - in progress -- [x] [SNARK work](https://github.com/openmina/openmina/blob/main/docs/snark-work.md) - SNARK production is implemented (through OCaml). Node can complete and broadcast SNARK work. - - [ ] [Witness folding](https://github.com/JanSlobodnik/pre-publishing/blob/main/witness-folding.md) - in progress -- [ ] [Bootstrapping process](https://github.com/JanSlobodnik/pre-publishing/blob/main/bootstrap-catchup.md) - in progress + - [ ] [Peer discovery through Kademlia](https://github.com/openmina/openmina/blob/develop/p2p/readme.md#kademlia-for-peer-discovery) - + in progress +- [x] [SNARK work](https://github.com/openmina/openmina/blob/main/docs/snark-work.md) - + SNARK production is implemented (through OCaml). Node can complete and + broadcast SNARK work. 
+  - [ ] [Witness folding](https://github.com/JanSlobodnik/pre-publishing/blob/main/witness-folding.md) -
+    in progress
+- [ ] [Bootstrapping process](https://github.com/JanSlobodnik/pre-publishing/blob/main/bootstrap-catchup.md) -
+  in progress
 - [ ] Block application - not documented yet
 - Testing
-  - [ ] [Testing framework](https://github.com/openmina/openmina/blob/main/docs/testing/testing.md) - partially complete (some tests are documented)
+  - [ ] [Testing framework](https://github.com/openmina/openmina/blob/main/docs/testing/testing.md) -
+    partially complete (some tests are documented)
 - How to run
   - [x] [Launch Openmina node](https://github.com/openmina/openmina#how-to-launch-without-docker-compose)
   - [x] [Launch Node with UI](https://github.com/openmina/openmina#how-to-launch-with-docker-compose)
@@ -367,49 +426,69 @@ See [Testing](./docs/testing/README.md) for more details.
 
 ### Core state machine
 
-- [x] Automaton implementation that separates *action* kinds in *pure* and *effectful*.
-- [x] Callback (dispatch-back) support for action composition: enable us to specify in the action itself the actions that will dispatched next.
-- [x] Fully serializable state machine state and actions (including descriptors to callbacks!).
+- [x] Automaton implementation that separates _action_ kinds in _pure_ and
+  _effectful_.
+- [x] Callback (dispatch-back) support for action composition: enables us to
+  specify in the action itself the actions that will be dispatched next.
+- [x] Fully serializable state machine state and actions (including descriptors
+  to callbacks!).
 - State machine state management
-  - [x] Partitioning of the state machine state between models sub-states (for *pure* models).
-  - [x] Forbid direct access to state machine state in *effectful* models. 
- - [x] Support for running multiple instances concurrently in the same state machine for testing scenarios: for example if the state machine represents a node, we can "run" multiple of them inside the same state machine. + - [x] Partitioning of the state machine state between models sub-states (for + _pure_ models). + - [x] Forbid direct access to state machine state in _effectful_ models. + - [x] Support for running multiple instances concurrently in the same state + machine for testing scenarios: for example if the state machine + represents a node, we can "run" multiple of them inside the same state + machine. ### Models -Each model handles a subset of actions and they are registered like a plugin system. +Each model handles a subset of actions and they are registered like a plugin +system. #### Effectful -Thin layer of abstraction between the "external world" (IO) and the state machine. +Thin layer of abstraction between the "external world" (IO) and the state +machine. -- [x] MIO model: provides the abstraction layer for the polling and TCP APIs of the MIO crate. +- [x] MIO model: provides the abstraction layer for the polling and TCP APIs of + the MIO crate. - [x] Time model: provides the abstraction layer for `SystemTime::now()` #### Pure Handle state transitions and can dispatch actions to other models. -- [x] Time model: this is the *pure* counterpart which dispatches an action to *effectful* time model to get the system time and updates the internal time in the state machine state. -- [x] TCP model: built on top of the MIO layer to provide all necessary features for handling TCP connections (it also uses the time model to provide timeout support for all actions). -- [x] TCP-client model: built on top of the TCP model, provides a high-level interface for building client applications. -- [x] TCP-server model: built on top of the TCP model, provides a high-level interface for building server applications. 
+- [x] Time model: this is the _pure_ counterpart which dispatches an action to
+  _effectful_ time model to get the system time and updates the internal
+  time in the state machine state.
+- [x] TCP model: built on top of the MIO layer to provide all necessary features
+  for handling TCP connections (it also uses the time model to provide
+  timeout support for all actions).
+- [x] TCP-client model: built on top of the TCP model, provides a high-level
+  interface for building client applications.
+- [x] TCP-server model: built on top of the TCP model, provides a high-level
+  interface for building server applications.
 - [x] PRNG model: unsafe, fast, pure RNG for testing purposes.
 - PNET models: implements the private network transport used in libp2p.
-  - [x] Server
-  - [x] Client
-  - Testing models:
-    - [x] Echo client: connects to an echo server and sends random data, then checks that it receives the same data.
-    - [x] Echo server.
-    - [x] Echo client (PNET).
-    - [x] Echo server (PNET).
-    - [x] Simple PNET client: connects to berkeleynet and does a simple multistream negotiation.
+  - [x] Server
+  - [x] Client
+- Testing models:
+  - [x] Echo client: connects to an echo server and sends random data, then
+    checks that it receives the same data.
+  - [x] Echo server.
+  - [x] Echo client (PNET).
+  - [x] Echo server (PNET).
+  - [x] Simple PNET client: connects to berkeleynet and does a simple
+    multistream negotiation.
 
 ### Tests
 
 - Echo network
   - [x] State machine with a network composed of 1 client and 1 server instance.
-  - [x] State machine with a network composed of 5 clients and 1 erver instance.
-  - [x] State machine with a network composed of 50 clients and 1 erver instance.
-- [x] Echo network PNET (same tests as echo network but over the PNET transport).
+  - [x] State machine with a network composed of 5 clients and 1 server instance.
+  - [x] State machine with a network composed of 50 clients and 1 server
+    instance. 
+- [x] Echo network PNET (same tests as echo network but over the PNET + transport). - [x] Berkeley PNET test: runs the simple PNET client model. diff --git a/tools/heartbeats-processor/README.md b/tools/heartbeats-processor/README.md index 3334d2a05..cd8859298 100644 --- a/tools/heartbeats-processor/README.md +++ b/tools/heartbeats-processor/README.md @@ -1,44 +1,53 @@ # Heartbeats Processor -This application processes "heartbeat" entries from Firestore. It fetches data, groups it by time windows, and stores the results into a local SQLite database for further analysis or reporting. +This application processes "heartbeat" entries from Firestore. It fetches data, +groups it by time windows, and stores the results into a local SQLite database +for further analysis or reporting. ## Environment Variables The following environment variables control the program's behavior. -These variables can be set in your shell environment or in a `.env` file in the project root directory. +These variables can be set in your shell environment or in a `.env` file in the +project root directory. 
### Required Variables -* `DATABASE_PATH` - SQLite database path (e.g., "./data.db") -* `GOOGLE_CLOUD_PROJECT` - Google Cloud project ID -* `WINDOW_RANGE_START` - Start time for window creation in RFC3339 format -* `WINDOW_RANGE_END` - End time for window creation in RFC3339 format + +- `DATABASE_PATH` - SQLite database path (e.g., "./data.db") +- `GOOGLE_CLOUD_PROJECT` - Google Cloud project ID +- `WINDOW_RANGE_START` - Start time for window creation in RFC3339 format +- `WINDOW_RANGE_END` - End time for window creation in RFC3339 format ### Optional Variables -* `GOOGLE_APPLICATION_CREDENTIALS` - Path to Google Cloud credentials file -* `DISABLED_WINDOWS` - Comma-separated list of time ranges to disable in RFC3339 format (e.g., `2023-01-01T00:00:00Z/2023-01-02T00:00:00Z,2023-02-01T00:00:00Z/2023-02-02T00:00:00Z`) + +- `GOOGLE_APPLICATION_CREDENTIALS` - Path to Google Cloud credentials file +- `DISABLED_WINDOWS` - Comma-separated list of time ranges to disable in RFC3339 + format (e.g., + `2023-01-01T00:00:00Z/2023-01-02T00:00:00Z,2023-02-01T00:00:00Z/2023-02-02T00:00:00Z`) ## Development With Firestore Emulator To develop locally using the Firestore Emulator, do the following: 1. Set these environment variables in your shell: - + ``` FIRESTORE_EMULATOR_HOST=127.0.0.1:8080 GOOGLE_CLOUD_PROJECT=staging ``` 2. From the "frontend/firestore" directory, start the emulator by running: - + ``` npm run serve ``` -3. Authenticate on your local machine with Google Cloud to allow proper credential usage: - +3. Authenticate on your local machine with Google Cloud to allow proper + credential usage: + ``` gcloud auth application-default login ``` -Once these steps are complete, the application can connect to the local emulator to simulate production-like Firestore behavior for debugging or development. +Once these steps are complete, the application can connect to the local emulator +to simulate production-like Firestore behavior for debugging or development. 
diff --git a/vrf/README.md b/vrf/README.md index 69b156d67..9762d2617 100644 --- a/vrf/README.md +++ b/vrf/README.md @@ -1,7 +1,15 @@ ## Implementing the OCaml node’s verifiable random function (VRF) into the Openmina node’s block producer -In Proof of Stake (PoS) systems, block producers are chosen based on their stake. However, to avoid centralization and giving too much power to major stake-owners, we need to add an element of randomness in selecting new block producers. +In Proof of Stake (PoS) systems, block producers are chosen based on their +stake. However, to avoid centralization and giving too much power to major +stake-owners, we need to add an element of randomness in selecting new block +producers. -The role of a Verifiable Random Function (VRF) in this context is to add an element of randomness to the selection process, which complements the stake-based criteria. A VRF ensures that this selection process is fair and unbiased, preventing any single entity from gaining undue influence or control over the block production process. +The role of a Verifiable Random Function (VRF) in this context is to add an +element of randomness to the selection process, which complements the +stake-based criteria. A VRF ensures that this selection process is fair and +unbiased, preventing any single entity from gaining undue influence or control +over the block production process. -VRFs are cryptographic primitives that generate a random number and proof that the number was legitimately generated. +VRFs are cryptographic primitives that generate a random number and proof that +the number was legitimately generated.
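
The evaluate/verify shape that the vrf/README.md text describes can be sketched roughly as below. This is a toy, non-cryptographic stand-in using a plain hash, with hypothetical names; it is not the actual Mina/Openmina VRF, which is built on elliptic-curve operations and verifies against the producer's *public* key:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a VRF output: a pseudo-random value plus a proof
/// that the value was legitimately generated.
/// NOT cryptographically secure; a real VRF (e.g. ECVRF) is verifiable
/// with only the producer's public key.
struct VrfOutput {
    value: u64,
    proof: u64,
}

// Deterministic helper hash (DefaultHasher::new() uses fixed keys).
fn hash2(key: u64, input: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    input.hash(&mut h);
    h.finish()
}

/// Produce the "random" value for an input (e.g. an epoch seed + slot).
fn vrf_evaluate(secret_key: u64, input: &[u8]) -> VrfOutput {
    let value = hash2(secret_key, input);
    // Bind the proof to both the value and the key.
    let proof = hash2(value ^ secret_key, input);
    VrfOutput { value, proof }
}

/// Check that the output really came from this key and input.
fn vrf_verify(secret_key: u64, input: &[u8], out: &VrfOutput) -> bool {
    let expected = vrf_evaluate(secret_key, input);
    expected.value == out.value && expected.proof == out.proof
}

fn main() {
    let out = vrf_evaluate(42, b"epoch-seed/slot-1");
    assert!(vrf_verify(42, b"epoch-seed/slot-1", &out));
    assert!(!vrf_verify(7, b"epoch-seed/slot-1", &out)); // wrong key fails
    println!("ok");
}
```

In the real protocol the VRF value is then compared against a threshold derived from the producer's stake to decide whether the producer wins the slot; the proof lets everyone else check the value without learning the secret key, which this toy version deliberately does not model.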