65 changes: 53 additions & 12 deletions README.md
@@ -1,5 +1,6 @@
# Ratio1 Edge Node


Welcome to the **Ratio1 Edge Node** repository, formerly known as the **Naeural Edge Protocol Edge Node**. As a pivotal component of the Ratio1 ecosystem, this Edge Node software empowers a decentralized, privacy-preserving, and secure edge computing network. By enabling a collaborative network of edge nodes, Ratio1 facilitates the secure sharing of resources and the seamless execution of computation tasks across diverse devices.

Documentation sections:
@@ -16,12 +17,12 @@ Documentation sections:

## Introduction

The Ratio1 Edge Node is a meta Operating System designed to operate on edge devices, providing them the essential functionality required to join and thrive within the Ratio1 network. Each Edge Node manages the device’s resources, executes computation tasks efficiently, and communicates securely with other nodes in the network. Leveraging the powerful Ratio1 core libraries (formely known as Naeural Edge Protocol libraries) `naeural_core` and `ratio1` the Ratio1 Edge Node offers out-of-the-box usability starting in 2025. Users can deploy the Edge Node and SDK (`ratio1`) effortlessly without the need for intricate configurations, local subscriptions, tenants, user accounts, passwords, or broker setups.
The Ratio1 Edge Node is a meta Operating System designed to operate on edge devices, providing them the essential functionality required to join and thrive within the Ratio1 network. Each Edge Node manages the device’s resources, executes computation tasks efficiently, and communicates securely with other nodes in the network. Leveraging the powerful Ratio1 core libraries (formerly known as Naeural Edge Protocol libraries) `naeural_core` and the Ratio1 SDK (`ratio1_sdk`, published on PyPI as `ratio1`), the Ratio1 Edge Node offers out-of-the-box usability starting in 2025 without intricate configurations, local subscriptions, tenants, user accounts, passwords, or broker setups.

## Related Repositories

- [ratio1/naeural_core](https://github.com/ratio1/naeural_core) provides the modular pipeline engine that powers data ingestion, processing, and serving inside this node. Extend or troubleshoot runtime behavior by mirroring the folder layout in `extensions/` against the upstream modules.
- [Ratio1/ratio1_sdk](https://github.com/Ratio1/ratio1_sdk) is the client toolkit for building and dispatching jobs to Ratio1 nodes. Its tutorials pair with the workflows in `plugins/business/tutorials/` and are the best place to validate end-to-end scenarios.
- [Ratio1/ratio1_sdk](https://github.com/Ratio1/ratio1_sdk) is the client toolkit for building and dispatching jobs to Ratio1 nodes (published on PyPI as `ratio1`). Its tutorials pair with the workflows in `plugins/business/tutorials/` and are the best place to validate end-to-end scenarios.

When developing custom logic, install the three repositories in the same virtual environment (`pip install -e . ../naeural_core ../ratio1_sdk`) so interface changes remain consistent across the stack.

@@ -33,28 +34,30 @@ When developing custom logic, install the three repositories in the same virtual
Deploying a Ratio1 Edge Node within a development network is straightforward. Execute the following Docker command to launch the node, making sure you mount a persistent volume to the container so that node data is preserved between restarts:

```bash
docker run -d --rm --name r1node --pull=always -v r1vol:/edge_node/_local_cache/ ratio1/edge_node:develop
docker run -d --rm --name r1node --pull=always -v r1vol:/edge_node/_local_cache/ ratio1/edge_node:devnet
```

- `-d`: Runs the container in the background.
- `--rm`: Removes the container upon stopping.
- `--name r1node`: Assigns the name `r1node` to the container.
- `--pull=always`: Ensures the latest image version is always pulled.
- `ratio1/edge_node:develop`: Specifies the Docker image to run.
- `ratio1/edge_node:devnet`: Specifies the devnet image; use `:mainnet` or `:testnet` for those networks.
- `-v r1vol:/edge_node/_local_cache/`: Mounts the `r1vol` volume to the `/edge_node/_local_cache/` directory within the container.

Architecture-specific variants (for example `:devnet-arm64`, `:devnet-tegra`, `:devnet-amd64-cpu`) will follow; pick the tag that matches your hardware once available.

This command initializes the Ratio1 Edge Node in development mode, automatically connecting it to the Ratio1 development network and preparing it to receive computation tasks while ensuring that all node data is stored in `r1vol`, preserving it between container restarts.


If for some reason you encounter issues when running the Edge Node, try running the container with the `--platform linux/amd64` flag to ensure it runs on the correct platform.

```bash
docker run -d --rm --name r1node --platform linux/amd64 --pull=always -v r1vol:/edge_node/_local_cache/ ratio1/edge_node:develop
docker run -d --rm --name r1node --platform linux/amd64 --pull=always -v r1vol:/edge_node/_local_cache/ ratio1/edge_node:devnet
```
Also, if you have GPU(s) on your machine, you can enable GPU support by adding the `--gpus all` flag to the Docker command. This flag allows the Edge Node to utilize the GPU(s) for computation tasks.

```bash
docker run -d --rm --name r1node --gpus all --pull=always -v r1vol:/edge_node/_local_cache/ ratio1/edge_node:develop
docker run -d --rm --name r1node --gpus all --pull=always -v r1vol:/edge_node/_local_cache/ ratio1/edge_node:devnet
```

This will ensure that your node will be able to utilize the GPU(s) for computation tasks and will accept training and inference jobs that require GPU acceleration.
@@ -64,12 +67,12 @@ This will ensure that your node will be able to utilize the GPU(s) for computati
If you want to run multiple Edge Nodes on the same machine, specify a different name for each container and, more importantly, a different volume for each container to avoid conflicts between the nodes. Create a new volume per node and mount it as follows:

```bash
docker run -d --rm --name r1node1 --pull=always -v r1vol1:/edge_node/_local_cache/ ratio1/edge_node:develop
docker run -d --rm --name r1node2 --pull=always -v r1vol2:/edge_node/_local_cache/ ratio1/edge_node:develop
docker run -d --rm --name r1node1 --pull=always -v r1vol1:/edge_node/_local_cache/ ratio1/edge_node:devnet
docker run -d --rm --name r1node2 --pull=always -v r1vol2:/edge_node/_local_cache/ ratio1/edge_node:devnet
```

Now you can run multiple Edge Nodes on the same machine without any conflicts between them.
>NOTE: If you are running multiple nodes on the same machine it is recommended to use docker-compose to manage the nodes. You can find an example of how to run multiple nodes on the same machine using docker-compose in the [Running multiple nodes on the same machine](#running-multiple-nodes-on-the-same-machine) section.
>NOTE: If you are running multiple nodes on the same machine, it is recommended to use docker-compose to manage them. You can find a docker-compose example in the section below.


## Inspecting the Edge Node
@@ -145,6 +148,8 @@ The [Ratio1 SDK](https://github.com/Ratio1/ratio1_sdk) is the recommended way to
pip install -e ../ratio1_sdk
```

If you prefer the published package, install from PyPI via `pip install ratio1`.

- Use the `nepctl` (formerly `r1ctl`) CLI that ships with the SDK to inspect the network, configure clients, and dispatch jobs.
- Explore `ratio1_sdk/tutorials/` for end-to-end examples; most have matching runtime counterparts in `plugins/business/tutorials/` inside this repository.
- SDK releases 2.6+ perform automatic dAuth configuration. After whitelisting your client, you can submit jobs without additional secrets.
@@ -226,6 +231,7 @@ Lets suppose you have the following node data:
"whitelist": [
"0xai_AthDPWc_k3BKJLLYTQMw--Rjhe3B6_7w76jlRpT6nDeX"
]
}
}
```

@@ -250,6 +256,7 @@ docker exec r1node get_node_info
"whitelist": [
"0xai_AthDPWc_k3BKJLLYTQMw--Rjhe3B6_7w76jlRpT6nDeX"
]
}
}
```

@@ -286,7 +293,7 @@ If you want to run multiple nodes on the same machine the best option is to use
```yaml
services:
r1node1:
image: ratio1/edge_node:testnet
image: ratio1/edge_node:devnet
container_name: r1node1
platform: linux/amd64
restart: always
@@ -297,7 +304,7 @@
- "com.centurylinklabs.watchtower.stop-signal=SIGINT"

r1node2:
image: ratio1/edge_node:testnet
image: ratio1/edge_node:devnet
container_name: r1node2
platform: linux/amd64
restart: always
@@ -350,7 +357,7 @@ docker-compose down

Now, let's dissect the `docker-compose.yml` file:
- we have a variable number of nodes - in our case two nodes, `r1node1` and `r1node2`, as services (the third node is commented out for simplicity)
- each node is using the `ratio1/edge_node:testnet` image
- each node is using the `ratio1/edge_node:devnet` image (swap the tag for `:mainnet` or `:testnet` as needed; architecture-specific variants such as `-arm64`, `-tegra`, `-amd64-cpu` will follow)
- each node has its own unique volume mounted to it
- we have a watchtower service that checks for new images every minute and updates the nodes when a new image is available

@@ -375,6 +382,7 @@ For inquiries regarding the funding and its impact on this project, please conta

## Citation


If you use the Ratio1 Edge Node in your research or projects, please cite it as follows:

```bibtex
@@ -385,3 +393,36 @@ If you use the Ratio1 Edge Node in your research or projects, please cite it as
howpublished = {\url{https://github.com/Ratio1/edge_node}},
}
```


Additional publications and references:

```bibtex
@inproceedings{Damian2025CSCS,
author = {Damian, Andrei Ionut and Bleotiu, Cristian and Grigoras, Marius and
Butusina, Petrica and De Franceschi, Alessandro and Toderian, Vitalii and
Tapus, Nicolae},
title = {Ratio1 meta-{OS} -- decentralized {MLOps} and beyond},
booktitle = {2025 25th International Conference on Control Systems and Computer Science (CSCS)},
year = {2025},
pages = {258--265},
address = {Bucharest, Romania},
month = {May 27--30},
doi = {10.1109/CSCS66924.2025.00046},
isbn = {979-8-3315-7343-0},
issn = {2379-0482},
publisher = {IEEE}
}

@misc{Damian2025arXiv,
title = {Ratio1 -- AI meta-OS},
author = {Damian, Andrei and Butusina, Petrica and De Franceschi, Alessandro and
Toderian, Vitalii and Grigoras, Marius and Bleotiu, Cristian},
year = {2025},
month = {September},
eprint = {2509.12223},
archivePrefix = {arXiv},
primaryClass = {cs.OS},
doi = {10.48550/arXiv.2509.12223}
}
```
17 changes: 17 additions & 0 deletions extensions/business/cybersec/red_mesh/PROMPT.MD
@@ -0,0 +1,17 @@
We have the following new features that must be implemented in the RedMesh red-teaming pentesting framework API:
- better monitoring of worker status and job progress
- possibility to select certain types of tests instead of running all tests (we already have the option to exclude certain ports)
- per-worker port distribution strategies: "slice", where each Ratio1 worker that runs RedMesh receives a slice of the port range, or "mirror", where all workers get the same port range
- for each local worker (thread configured via the `nr_local_workers` parameter of the `_launch_job` method) we want a parameter called PORT_ORDER (sequential or shuffle) that defines whether ports are iterated sequentially or in shuffled order. In `on_init`, order `ports_to_scan` according to `port_order`.
- run mode: singlepass (one-time scan) vs continuous monitoring (scanning continuously). Don't forget to keep stats about each run.
- for continuous jobs, make it possible to configure the scheduling
- Add parameters for configuring Pacing/jitter (“Dune sand walking”) to slow/space out scans by adding random pauses

I want you to describe the architecture as minimalistically as possible. Don't add unnecessary things. It should follow the Keep It Simple, Stupid (KISS) principle.
It should be simple and functional.


Review TODO_A.md and TODO_B.md and create a final TODO_C.md that proposes an action plan with clear architectural steps to implement these features (no code, as TODO_B.md currently contains incorrect code).
Please focus on the architecture, the features, and how to plan them.

Use the maximum amount of ultrathink. Take all the time you need. It's much better if you do too much research and thinking than not enough.
61 changes: 61 additions & 0 deletions extensions/business/cybersec/red_mesh/TODO_A.md
@@ -0,0 +1,61 @@
# RedMesh feature plan (PentesterApi01 / PentestLocalWorker)

## Current state (quick map)
- API plugin `pentester_api_01.py` orchestrates jobs announced via CStore, splits target port ranges evenly across local workers, and aggregates reports when all workers finish.
- `PentestLocalWorker` auto-runs port scan → service probes (all `_service_info_*`) → web probes (all `_web_test_*`), with optional port exclusions but no per-test selection or pacing controls.
- Job spec today: `job_id`, `target`, `start_port`, `end_port`, `exceptions`, `launcher`, `workers{peer->{finished,result}}`. No concept of run mode (single/continuous), jitter, or distribution strategy choices.
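
For orientation, here is a rough sketch of how a job spec could look once the feature tracks below land. Only the fields listed in the current-state bullet above exist today; every other field name and value is a proposal, not existing code.

```python
# Hypothetical extended job spec; fields beyond the current ones are proposals
# from the feature tracks below, and all values are placeholders.
EXTENDED_JOB_SPEC = {
    # existing fields
    "job_id": "job-0001",
    "target": "203.0.113.10",
    "start_port": 1,
    "end_port": 1024,
    "exceptions": [22],                 # excluded ports
    "launcher": "0xai_launcher_addr",
    "workers": {},                      # peer -> {finished, result}
    # proposed: test selection
    "include_tests": {"service_info": [], "web_tests": []},
    "exclude_tests": {"service_info": [], "web_tests": []},
    # proposed: distribution and per-worker port ordering
    "distribution_mode": "slice",       # slice | mirror | staggered
    "port_order": "shuffle",            # sequential | shuffle
    # proposed: run mode, scheduling, lineage
    "run_mode": "continuous",           # singlepass | continuous
    "run_interval_sec": 6 * 3600,
    "max_iterations": 10,
    "window_start": 6,                  # UTC hour, inclusive
    "window_end": 22,                   # UTC hour, exclusive
    "parent_job_id": None,
    "iteration": 0,
    # proposed: pacing ("Dune sand walking")
    "pacing": {"min_steps": 5, "max_steps": 20, "min_wait_sec": 1.0, "max_wait_sec": 8.0},
}
```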

## Required feature tracks
- Service/web test selection
- Extend job spec with `include_tests` / `exclude_tests` (separate lists for `service_info` vs `web_tests`) validated against `_get_all_features()`. Default: run all.
- Add API params to `launch_test` (and validation rules) to accept comma/space lists; normalize to method names. Reject unknown tests with a helpful error.
- `PentestLocalWorker` should accept the allowed set and filter the discovered `_service_info_*` / `_web_test_*` before execution. Persist allowed/blocked lists in report metadata for auditability.
- Reporting: add per-worker fields `tests_run`, `tests_skipped`, and propagate to aggregated report.
- Tests: add unit coverage for include-only, exclude-only, and conflicting rules (exclude wins).
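
A minimal sketch of the include/exclude resolution described in this track. The `_get_all_features()` result and the `_service_info_*` / `_web_test_*` prefixes follow the plan's wording; the helper name and return shape are illustrative.

```python
SERVICE_PREFIX = "_service_info_"
WEB_PREFIX = "_web_test_"

def resolve_allowed_tests(all_features, include_tests=None, exclude_tests=None):
    """Return the test method names a worker may run, plus the skipped ones.

    `all_features` stands in for the result of `_get_all_features()`;
    unknown names are rejected and an exclude always wins over an include.
    """
    features = set(all_features)
    include = set(include_tests or [])
    exclude = set(exclude_tests or [])
    unknown = (include | exclude) - features
    if unknown:
        raise ValueError(f"Unknown tests requested: {sorted(unknown)}")
    allowed = include if include else set(features)   # default: run everything
    allowed -= exclude                                 # conflicting rules: exclude wins
    return {
        "service_info": sorted(n for n in allowed if n.startswith(SERVICE_PREFIX)),
        "web_tests": sorted(n for n in allowed if n.startswith(WEB_PREFIX)),
        "tests_skipped": sorted(features - allowed),
    }
```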

- Worker port-range distribution modes
- New job flag `distribution_mode`: `slice` (default, breadth-first coverage) vs `mirror` (every worker gets same range) vs optional `staggered` (all workers same range but randomized start offset/stride to reduce duplication).
- If `mirror`/`staggered`, mark worker reports with `coverage_strategy` and ensure aggregation dedupes `open_ports` and merges service/web results deterministically.
- Wire flag into `_launch_job` splitter; keep guardrails when requested workers > ports. In `staggered`, randomize per-worker port order and introduce optional `max_retries_per_port` to bound duplicate effort.
- Config surface: plugin default (e.g., `CFG_PORT_DISTRIBUTION_MODE`), default `slice`. Default worker count = available CPU cores (plugin runs as sole job); allow override but cap at cores.
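
A sketch of how the `_launch_job` splitter could honor the flag. Only the three mode names come from this plan; the function shape, the interleaved `slice` split, and the seed handling are illustrative.

```python
import random

def split_ports(ports, n_workers, mode="slice", seed=None):
    """Assign ports to workers according to the proposed distribution_mode.

    slice     -> disjoint subsets (interleaved here for breadth-first coverage)
    mirror    -> every worker scans the full range
    staggered -> every worker scans the full range in its own shuffled order
    """
    n_workers = max(1, min(n_workers, len(ports)))       # guardrail: workers <= ports
    if mode == "slice":
        return [ports[i::n_workers] for i in range(n_workers)]
    if mode == "mirror":
        return [list(ports) for _ in range(n_workers)]
    if mode == "staggered":
        rng = random.Random(seed)
        assignments = []
        for _ in range(n_workers):
            shuffled = list(ports)
            rng.shuffle(shuffled)
            assignments.append(shuffled)
        return assignments
    raise ValueError(f"Unknown distribution_mode: {mode!r}")
```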

- Run mode: singlepass vs continuous monitoring
- Add `run_mode` in job spec (`singlepass` default, `continuous` for chained jobs). Continuous: after `_close_job`, schedule a successor job with inherited params, new `job_id`, incremented `iteration` counter, and backoff delay.
- Persist lineage fields (`parent_job_id`, `iteration`, `last_report_at`, `next_launch_at`) to aid observability and cleanup. Add TTL/`max_iterations` or `stop_after` datetime to prevent infinite loops.
- API responses should surface next scheduled run time; allow `stop_and_delete_job` to cancel the chain (mark lineage as canceled in cstore).
- Consider a `run_interval_sec` knob; default to a conservative interval to avoid triggering rate limiting on targets.
- Add optional daily runtime windows (UTC hour-based `window_start`, `window_end`, 0–23) with at least one-hour disallowed window; continuous mode should pause/resume respecting the window. Default can be full-day minus a 1-hour break for safety.
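
One possible shape for the successor-job computation in continuous mode. The lineage and window field names mirror the bullets above, while the function itself, the id scheme, and the datetime handling are illustrative.

```python
from datetime import datetime, timedelta, timezone

def schedule_next_run(job, run_interval_sec, window_start=0, window_end=23):
    """Build the successor job for continuous mode, or return None when the chain ends.

    window_start/window_end are UTC hours; launches outside the window are pushed
    to the next window opening. Lineage fields are carried over per the plan.
    """
    iteration = job.get("iteration", 0) + 1
    max_iterations = job.get("max_iterations")
    if max_iterations is not None and iteration > max_iterations:
        return None                                        # chain exhausted

    candidate = datetime.now(timezone.utc) + timedelta(seconds=run_interval_sec)
    if not (window_start <= candidate.hour < window_end):
        # before the window: same day; after the window: next day
        day = candidate.date() if candidate.hour < window_start else candidate.date() + timedelta(days=1)
        candidate = datetime.combine(day, datetime.min.time(), tzinfo=timezone.utc) + timedelta(hours=window_start)

    successor = dict(job)
    successor.update({
        "parent_job_id": job["job_id"],
        "job_id": f"{job['job_id']}-i{iteration}",         # illustrative id scheme
        "iteration": iteration,
        "next_launch_at": candidate.isoformat(),
    })
    return successor
```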

- Steps temporization (“Dune sand walking”)
- Job-level pacing config: `min_steps`, `max_steps`, `min_wait_sec`, `max_wait_sec`. Optional for `singlepass`; enforced non-zero for `continuous` (fallback defaults if not provided).
- Implement in `PentestLocalWorker` port scanning loop and optionally between service/web methods: after a random number of actions, sleep random wait while honoring `stop_event`.
- Record pacing stats in worker status (`jitter_applied`, `total_sleep_sec`) for transparency. Make values configurable via plugin defaults and `launch_test` params.
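
A sketch of the pacing loop built on the four knobs above. The stat keys `jitter_applied` and `total_sleep_sec` follow the plan; the generator shape is illustrative.

```python
import random
import time

def paced(actions, pacing, stop_event, stats):
    """Yield actions (ports, test methods, ...) with randomized pauses in between.

    `pacing` holds min_steps/max_steps/min_wait_sec/max_wait_sec; `stop_event` is
    a threading.Event checked before every step; `stats` is the worker-status dict.
    """
    rng = random.Random()
    next_pause = rng.randint(pacing["min_steps"], pacing["max_steps"])
    for i, action in enumerate(actions, start=1):
        if stop_event.is_set():
            return
        yield action
        if i >= next_pause:
            wait = rng.uniform(pacing["min_wait_sec"], pacing["max_wait_sec"])
            stats["jitter_applied"] = stats.get("jitter_applied", 0) + 1
            stats["total_sleep_sec"] = stats.get("total_sleep_sec", 0.0) + wait
            time.sleep(wait)
            next_pause = i + rng.randint(pacing["min_steps"], pacing["max_steps"])
```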

## Additional logical enhancements (fit with API/framework)
- Target/port randomization: shuffle port order per worker (configurable) to reduce IDS signatures and distribute load.
- Safe-rate controller: per-target max RPS and concurrent sockets; auto-throttle on repeated timeouts/connection resets to mimic human scanning. See the sketch after this list.
- Fingerprinting & preflight: optional light-touch pre-scan (ICMP/TCP SYN-lite) to bail early on dead hosts; enrich reports with ASN/cloud provider to better interpret noise/blocks.
- Credential hygiene: allow injecting bearer/API keys for authenticated tests via secrets manager pointer (never store raw secrets in cstore; expect vaulted reference).
- Health/timeout guardrails: per-stage max duration; force-close jobs that exceed SLAs and flag in report to avoid runaway continuous chains.
- Observability: append `audit_log` entries (timestamps, action, module) to worker status; expose via `get_job_status` for forensic traceability.
- Extensibility hooks: plugin registry for `_service_info_*` / `_web_test_*` from `extensions/.../plugins` so users can add probes without core edits; validate names against allowlist.
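
The safe-rate controller mentioned above could be as small as a per-target throttle that widens its delay after repeated timeouts; the class name, defaults, and backoff shape are all illustrative.

```python
import time

class SafeRateController:
    """Per-target throttle: cap request rate and back off on repeated failures."""

    def __init__(self, max_rps=5.0, backoff_factor=2.0, failure_threshold=3):
        self.min_interval = 1.0 / max_rps
        self.backoff_factor = backoff_factor
        self.failure_threshold = failure_threshold
        self._consecutive_failures = 0
        self._last_request = 0.0

    def wait_turn(self):
        # Widen the base interval once consecutive failures pass the threshold.
        over = max(0, self._consecutive_failures - self.failure_threshold + 1)
        delay = self._last_request + self.min_interval * (self.backoff_factor ** over) - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        self._last_request = time.monotonic()

    def record(self, ok):
        # Timeouts / connection resets grow the penalty, successes reset it.
        self._consecutive_failures = 0 if ok else self._consecutive_failures + 1
```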

## Stealth & red-team best practices (public Ratio1 edge nodes)
- Pacing and jitter: default to non-burst scans with randomized inter-request sleeps; stagger workers across time windows to evade traffic spikes.
- Traffic shaping: rotate User-Agent/Host headers, optionally bind to egress pools or proxies per cloud region to avoid IP reputation clustering.
- Noise reduction: avoid full 1–65535 sweeps by default; prefer common/high-value ports + heuristics from previous runs; honor exclusions strictly.
- Detection-aware retries: back off or skip ports when seeing WAF/IDS fingerprints (e.g., TCP RST storms, HTTP 429/403 patterns).
- Cover traffic & blending: mix benign HEAD/OPTIONS with probes; throttle to stay below typical NIDS thresholds; optionally insert dormant intervals to simulate human behavior.
- Logging hygiene: ensure reports strip sensitive headers/body fragments; store only minimal artifacts needed for findings.
- Authorization compliance: enforce explicit allowlists/attestation per target before running (config flag) to prevent misuse of public nodes.
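
As an example of the logging-hygiene point above, report builders could pass captured headers through a small scrubber before anything is persisted; the header list and function name are illustrative.

```python
# Illustrative scrubber for captured HTTP headers before they reach a report.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key", "proxy-authorization"}

def sanitize_headers(headers):
    """Drop sensitive headers from captured responses before persisting them."""
    return {k: v for k, v in headers.items() if k.lower() not in SENSITIVE_HEADERS}
```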

## Testing & rollout
- Add unit tests covering new job spec validation, distribution modes, pacing counters, continuous chaining lifecycle, and aggregation dedupe paths.
- Provide a dry-run/simulation mode to exercise scheduling without sending network traffic for CI.
- Update documentation/README and FastAPI schema to reflect new params and defaults.
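
A minimal sketch of the dry-run/simulation idea from the list above: the same code path runs in CI, but the network call is stubbed out. The function name and shape are illustrative.

```python
import socket

def probe_port(target, port, timeout=1.0, dry_run=False):
    """Return True if the TCP port accepts a connection; send no traffic in dry-run mode."""
    if dry_run:
        return False                      # CI/simulation: exercise scheduling only
    try:
        with socket.create_connection((target, port), timeout=timeout):
            return True
    except OSError:
        return False
```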

## Open questions for the product/ops team
- What is the acceptable default pacing for continuous mode (sleep floor/ceiling, max daily test hours) given UTC windows and the mandated 1-hour daily break?
- Confirm default distribution stays `slice` and whether any cap below `cpu_count` is desired (thermal/network guardrails).
- Do we need per-target authorization tokens or signed scopes to launch tests from public edge nodes, and are certain probes (SQLi/path traversal/auth bypass) forbidden for specific tenants/environments (e.g., production vs staging, regulated sectors)?
- How should chained jobs be retained (TTL) and how much historical reporting is required for compliance?