Commit d2792ee

Merge pull request #360 from Ratio1/develop
Develop
2 parents 5658641 + 755d67e commit d2792ee

9 files changed, +300 -66 lines changed

AGENTS.md

Lines changed: 25 additions & 52 deletions
@@ -3,11 +3,18 @@
 This file is the durable operating manual for future agents working in `/edge_node`.
 It has two goals:
 1. Stable reference information that should remain useful across sessions.
-2. Append-only memory of important discoveries, decisions, and changes.
+2. Curated high-signal memory of critical/fundamental changes and horizontal project insights.

 ## Hard Rules
-- Treat this file as append-only memory for the `Memory Log` section.
-- Never delete or rewrite prior memory entries.
+- Treat the `Memory Log` as a high-signal ledger, not a full activity history.
+- Log only:
+  - Critical/fundamental changes with architectural, security, operational, or reliability impact.
+  - Horizontal insights that affect multiple subsystems, onboarding, deployment, or runbook safety.
+- Do not log:
+  - Minor docs edits, section reordering, wording tweaks, formatting, or cosmetic refactors.
+  - Narrow/local changes without material behavioral or operational impact.
+- Keep the `Memory Log` append-only for qualifying entries.
+- Cleanup/removal of ballast entries is allowed only during explicit curation requested by project owners.
 - If an older entry is wrong, add a new correction entry that references the old entry ID.
 - Use UTC timestamps in ISO-8601 format: `YYYY-MM-DDTHH:MM:SSZ`.
 - Keep shell examples copy-pasteable.
@@ -110,39 +117,35 @@ BUILDER must:
 - Refine the change if needed.
 - List verification commands run and observed results (pass/fail + short evidence).

-### Step 4: Log It
-Append a `Memory Log` entry with:
+### Step 4: Log It (Critical-Only)
+Append a `Memory Log` entry only when the change/insight is critical or fundamental.
+Each entry must include:
 - Timestamp and entry ID.
 - Summary of change and decision.
+- Why this is critical/horizontal.
 - CRITIC findings summary.
 - Verification commands and outcomes.
 - If correction: `Correction of: <entry_id>`.

-## Memory Log (append-only)
+## Memory Log (critical-only; append-only for qualifying entries)

 Entry format:
 - `ID`: `ML-YYYYMMDD-###`
 - `Timestamp`: UTC ISO-8601
 - `Type`: discovery | decision | change | correction
 - `Summary`:
+- `Criticality`:
 - `Details`:
 - `Verification`:
 - `Links`:

 ---

-- ID: `ML-20260211-001`
-- Timestamp: `2026-02-11T09:13:34Z`
-- Type: `discovery`
-- Summary: Repo-wide docs/ops audit performed to ground AGENTS/README rewrite.
-- Details: Confirmed runtime entrypoints (`device.py`, `constants.py`), operational scripts (`cmds/`), compose variants (`docker-compose/`), and deployment artifacts (`docker/`, `k8s/`).
-- Verification: `pwd && ls -la`; `find . -maxdepth 2 -type d | sort`; `find extensions -type f`; `find plugins -type f`
-- Links: `device.py`, `constants.py`, `docker-compose/debug-docker-compose.yaml`, `cmds/get_node_info`
-
 - ID: `ML-20260211-002`
 - Timestamp: `2026-02-11T09:13:34Z`
 - Type: `discovery`
 - Summary: Found operational mismatches that can break onboarding.
+- Criticality: Cross-cutting operations/onboarding risk across local dev, compose, and k8s paths.
 - Details: `debug.sh` builds `local_node` while debug compose expects `local_edge_node`; `docker-compose/debug_start.bat` references missing `Dockerfile_dev`; multiple `k8s/` naming/namespace/PVC path mismatches exist.
 - Verification: `rg -n "local_edge_node|local_node" -S`; `sed -n '1,120p' debug.sh`; `sed -n '1,160p' docker-compose/debug_start.bat`; `sed -n '1,220p' k8s/README.md`; `sed -n '1,220p' k8s/edgenode-deploy.yaml`; `sed -n '1,220p' k8s/edgenode-sa.yaml`; `sed -n '1,220p' k8s/edgenode-storage.yaml`
 - Links: `debug.sh`, `docker-compose/debug-docker-compose.yaml`, `docker-compose/debug_start.bat`, `k8s/README.md`, `k8s/edgenode-deploy.yaml`
@@ -151,46 +154,16 @@ Entry format:
 - Timestamp: `2026-02-11T09:13:34Z`
 - Type: `change`
 - Summary: Replaced prior short AGENTS guidance with durable long-term memory structure and mandatory BUILDER-CRITIC loop.
+- Criticality: Foundation process change governing agent behavior and decision quality.
 - Details: Added stable sections for run/test, repo map, conventions, pitfalls; established append-only log protocol with correction semantics.
 - Verification: `sed -n '1,260p' AGENTS.md`
 - Links: `AGENTS.md`

-- ID: `ML-20260211-004`
-- Timestamp: `2026-02-11T09:19:14Z`
-- Type: `change`
-- Summary: Rewrote `README.md` to prioritize operator usability and ordered sections as requested.
-- Details: Added explicit Need/Objective/Purpose, moved all practical usage content under `Usability & Features` (quickstart/examples/config/outputs/troubleshooting), and placed architecture/modules/deps/testing/security under `Technical Details`.
-- Verification: `git diff -- README.md`; `sed -n '1,320p' README.md`
-- Links: `README.md`
-
-- ID: `ML-20260211-005`
-- Timestamp: `2026-02-11T09:19:14Z`
-- Type: `discovery`
-- Summary: Verification run identified environment/tooling limits and current test signal.
-- Details: In this workspace, `docker` exists but neither `docker compose` plugin nor `docker-compose` binary is available; compose syntax could not be validated by execution. `python3 -m unittest discover -s plugins -p "*test*.py"` returns `Ran 0 tests`. Focused RedMesh suite runs but currently fails (`34` tests run, `1` failure, `3` errors with missing `service_info`/`web_tests_info` keys).
-- Verification: `command -v docker || true; command -v python3 || true`; `docker compose -f docker-compose/debug-docker-compose.yaml config`; `docker-compose -f docker-compose/debug-docker-compose.yaml config`; `python3 -m unittest discover -s plugins -p "*test*.py"`; `python3 -m unittest extensions.business.cybersec.red_mesh.test_redmesh`
-- Links: `README.md`, `AGENTS.md`, `extensions/business/cybersec/red_mesh/test_redmesh.py`
-
-- ID: `ML-20260211-006`
-- Timestamp: `2026-02-11T09:20:06Z`
-- Type: `change`
-- Summary: Added Compose command compatibility note (`docker-compose` and Compose v2 `docker compose`) in stable docs.
-- Details: Kept primary examples aligned to repo scripts (`docker-compose`) while explicitly documenting the Compose v2 equivalent to reduce operator ambiguity.
-- Verification: `rg -n "docker-compose|docker compose" AGENTS.md README.md`
-- Links: `AGENTS.md`, `README.md`
-
-- ID: `ML-20260211-007`
-- Timestamp: `2026-02-11T12:27:43Z`
+- ID: `ML-20260212-009`
+- Timestamp: `2026-02-12T14:32:58Z`
 - Type: `change`
-- Summary: Restored README citation section with the original BibTeX entries and improved end-of-doc discoverability.
-- Details: BUILDER Intent: restore dropped `## Citation` content exactly and place it before `## License` so researchers can find it quickly. Change scope: `README.md`, `AGENTS.md`. Assumptions: prior citations were correct and should be preserved verbatim. CRITIC findings: risk of accidental BibTeX drift or malformed markdown fence during reinsertion. BUILDER response: copied entries verbatim from provided/previous content, used explicit `bibtex` fenced blocks, and validated marker strings and line placement.
-- Verification: `rg -n "^## Citation|@misc\\{Ratio1EdgeNode|@inproceedings\\{Damian2025CSCS|@misc\\{Damian2025arXiv" README.md` (pass: section + all three entries found); `nl -ba README.md | sed -n '165,260p'` (pass: citation block rendered between Related Repositories and License)
-- Links: `README.md`, `AGENTS.md`
-
-- ID: `ML-20260211-008`
-- Timestamp: `2026-02-11T12:30:01Z`
-- Type: `change`
-- Summary: Restored the previously removed `Contact` and `Project Financing Disclaimer` sections in README.
-- Details: BUILDER Intent: reinsert the exact prior sections requested by user while keeping the recently restored citation block intact. Change scope: `README.md`, `AGENTS.md`. Assumptions: old section wording remained authoritative and should be restored verbatim. CRITIC findings: risk of introducing wording drift or placing sections in a confusing position near document end. BUILDER response: copied text directly from pre-rewrite README snapshot and inserted it immediately before `## License` to keep end-of-doc informational/legal sections grouped.
-- Verification: `rg -n "^## Contact|^## Project Financing Disclaimer|support@ratio1.ai|SMIS 143488|SMIS 156084" README.md` (pass: both headings and key markers found); `nl -ba README.md | sed -n '210,280p'` (pass: sections rendered with expected paragraphs and order)
-- Links: `README.md`, `AGENTS.md`
+- Summary: Re-scoped AGENTS memory policy to critical-only logging and pruned prior ballast entries.
+- Criticality: Fundamental governance change for long-term agent memory quality and signal-to-noise control.
+- Details: Updated Hard Rules and BUILDER-CRITIC Step 4 to enforce critical/horizontal-only logging; removed non-critical historical entries (`ML-20260211-001`, `ML-20260211-004`, `ML-20260211-005`, `ML-20260211-006`, `ML-20260211-007`, `ML-20260211-008`) per owner request.
+- Verification: `rg -n "critical-only|ballast|Criticality|ML-20260211-00[1245678]|ML-20260212-009" AGENTS.md`; `sed -n '1,260p' AGENTS.md`
+- Links: `AGENTS.md`
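
The entry format above fixes two machine-checkable conventions: IDs of the form `ML-YYYYMMDD-###` and UTC timestamps of the form `YYYY-MM-DDTHH:MM:SSZ`. A minimal sketch of how those two fields could be checked is shown below. The helper names `is_valid_entry_id` and `is_valid_timestamp` are hypothetical and are not part of the repository.

```python
import re
from datetime import datetime

# Hypothetical helpers (not in the repo): validate the two fixed-format fields
# of a Memory Log entry as described in AGENTS.md.
ENTRY_ID_RE = re.compile(r"ML-([0-9]{8})-([0-9]{3})")
TIMESTAMP_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")


def is_valid_entry_id(entry_id: str) -> bool:
    """Return True if entry_id looks like ML-YYYYMMDD-### with a real calendar date."""
    match = ENTRY_ID_RE.fullmatch(entry_id)
    if match is None:
        return False
    try:
        # Reject impossible dates such as 20260230.
        datetime.strptime(match.group(1), "%Y%m%d")
    except ValueError:
        return False
    return True


def is_valid_timestamp(timestamp: str) -> bool:
    """Return True if timestamp is a zero-padded YYYY-MM-DDTHH:MM:SSZ value."""
    if TIMESTAMP_RE.fullmatch(timestamp) is None:
        return False
    try:
        datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return False
    return True
```

The regular expression enforces zero padding, which `strptime` alone would not, and `strptime` rejects out-of-range dates and times that the regular expression alone would accept.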

extensions/business/container_apps/container_app_runner.py

Lines changed: 4 additions & 0 deletions
@@ -3081,6 +3081,10 @@ def _restart_container(self, stop_reason=None):
     self._stop_container_and_save_logs_to_disk()
     self.__reset_vars()

+    # Reset chainstore response for restart cycle
+    self.reset_chainstore_response()
+    self.set_plugin_ready(False)
+
     # Restore preserved state (reset_vars clears it)
     self._consecutive_failures = preserved_failures
     self._last_successful_start = preserved_last_success
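
The ordering in this hunk (reset everything, clear the readiness signals, then restore the counters that must survive a restart) can be illustrated in isolation. The sketch below is a simplified stand-in, not the plugin itself: `RestartableRunner` and its attributes are hypothetical, and the two plain attributes only mimic what `reset_chainstore_response()` and `set_plugin_ready(False)` are assumed to do.

```python
class RestartableRunner:
    """Hypothetical stand-in for a container runner that preserves failure counters."""

    def __init__(self):
        self._reset_vars()
        self.chainstore_response = None  # assumed: response published after startup
        self.plugin_ready = False        # assumed: readiness flag

    def _reset_vars(self):
        # Clears all per-run state, including counters we want to keep.
        self._consecutive_failures = 0
        self._last_successful_start = None

    def restart(self):
        # Capture state that must survive the reset.
        preserved_failures = self._consecutive_failures
        preserved_last_success = self._last_successful_start

        self._reset_vars()

        # Readiness signals from the previous run must not leak into the new one.
        self.chainstore_response = None
        self.plugin_ready = False

        # Restore preserved state (the reset cleared it).
        self._consecutive_failures = preserved_failures
        self._last_successful_start = preserved_last_success
```

Without the two readiness resets, a consumer polling the runner could observe a stale "ready" state from the previous cycle while the container is still restarting.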

extensions/business/deeploy/deeploy_manager_api.py

Lines changed: 51 additions & 7 deletions
@@ -37,6 +37,7 @@
   'SUPRESS_LOGS_AFTER_INTERVAL' : 300,
   'WARMUP_DELAY' : 300,
   'PIPELINES_CHECK_DELAY' : 300,
+  'MIN_ETH_BALANCE' : 0.00005,

   'VALIDATION_RULES': {
     **BasePlugin.CONFIG['VALIDATION_RULES'],
@@ -74,8 +75,47 @@ def on_init(self):
     )
     self.__warmup_start_time = self.time()
     self.__last_pipelines_check_time = 0
+    if not self.__check_eth_balance():
+      self.P(
+        f"Shutting down tunnel engine for {self.__class__.__name__} due to insufficient ETH balance "
+        f"on {my_eth_address}. Please top up and restart the node.",
+        color='r', boxed=True
+      )
+      self.maybe_stop_tunnel_engine()
     return

+  def __check_eth_balance(self):
+    """
+    Check if the oracle has enough ETH to cover gas fees for web3 transactions.
+    Returns True if balance is sufficient, False otherwise.
+    """
+    try:
+      eth_address = self.bc.eth_address
+      balances = self.bc.get_addresses_balances([eth_address])
+      eth_balance = balances.get(eth_address, {}).get("ethBalance", 0)
+      if eth_balance < self.cfg_min_eth_balance:
+        self.P(
+          f"Insufficient ETH balance for oracle {eth_address}: "
+          f"{eth_balance:.6f} ETH < {self.cfg_min_eth_balance} ETH minimum.",
+          color='r'
+        )
+        return False
+      return True
+    except Exception as e:
+      self.P(f"Failed to check ETH balance: {e}", color='r')
+      return False
+
+  def __ensure_eth_balance(self):
+    """
+    Check ETH balance and raise ValueError if insufficient for web3 transactions.
+    Called at the top of mutating endpoints.
+    """
+    if not self.__check_eth_balance():
+      raise ValueError(
+        f"{DEEPLOY_ERRORS.GENERIC}: Oracle {self.bc.eth_address} does not have enough ETH "
+        f"to cover gas fees. Please top up the address and retry."
+      )
+
   def __handle_error(self, exc, request, extra_error_code=DEEPLOY_ERRORS.GENERIC):
     """
     Handle the error and return a response.
@@ -164,6 +204,7 @@ def _process_pipeline_request(
     The response dictionary
     """
     try:
+      self.__ensure_eth_balance()
       sender, inputs = self.deeploy_verify_and_get_inputs(request)
       normalized_request = self._normalize_plugins_input(self.deepcopy(request))
       if DEEPLOY_KEYS.PLUGINS in normalized_request:
@@ -622,13 +663,14 @@ def scale_up_job_workers(self,
     A dictionary with the result of the operation
     """
     try:
-      sender, inputs = self.deeploy_verify_and_get_inputs(request)
+      self.__ensure_eth_balance()
+      sender, inputs = self.deeploy_verify_and_get_inputs(request)
       auth_result = self.deeploy_get_auth_result(inputs)
       job_id = inputs.get(DEEPLOY_KEYS.JOB_ID, None)
       if not job_id:
         msg = f"{DEEPLOY_ERRORS.REQUEST13}: Job ID is required."
         raise ValueError(msg)
-
+
       is_confirmable_job = inputs.chainstore_response

       # check payment
@@ -659,13 +701,12 @@ def scale_up_job_workers(self,
       nodes = list(cstore_response["node"] for cstore_response in dct_status.values())
       self.Pd(f"Nodes to confirm: {self.json_dumps(nodes, indent=2)}")

-      self._submit_bc_job_confirmation(str_status=str_status,
-                                       dct_status=dct_status,
-                                       nodes=nodes,
-                                       job_id=job_id,
+      self._submit_bc_job_confirmation(str_status=str_status,
+                                       dct_status=dct_status,
+                                       nodes=nodes,
+                                       job_id=job_id,
                                        is_confirmable_job=is_confirmable_job)

-
       return_request = request.get(DEEPLOY_KEYS.RETURN_REQUEST, False)
       if return_request:
         dct_request = self.deepcopy(request)
@@ -715,6 +756,7 @@ def delete_pipeline(self,
     A dictionary with the result of the operation
     """
     try:
+      self.__ensure_eth_balance()
       self.Pd(f"Called Deeploy delete_pipeline endpoint")
       sender, inputs = self.deeploy_verify_and_get_inputs(request)
       auth_result = self.deeploy_get_auth_result(inputs)
@@ -774,6 +816,7 @@ def send_instance_command(self,
     A dictionary with the result of the operation
     """
     try:
+      self.__ensure_eth_balance()
       self.Pd(f"Called Deeploy send_instance_command endpoint")
       sender, inputs = self.deeploy_verify_and_get_inputs(request)
       auth_result = self.deeploy_get_auth_result(inputs)
@@ -827,6 +870,7 @@ def send_app_command(self,
     A dictionary with the result of the operation
     """
     try:
+      self.__ensure_eth_balance()
       self.Pd(f"Called Deeploy send_app_command endpoint")
       sender, inputs = self.deeploy_verify_and_get_inputs(request)
       auth_result = self.deeploy_get_auth_result(inputs)
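
The pattern added across these endpoints is a fail-closed balance guard: a boolean check that treats lookup errors as "insufficient", wrapped by a raising variant placed at the top of each mutating call. A minimal standalone sketch follows, assuming a client object that exposes `eth_address` and `get_addresses_balances()` with the response shape used in the diff. `FakeChain`, `has_enough_eth`, and `ensure_eth_balance` are hypothetical names for illustration only.

```python
class FakeChain:
    """Hypothetical stand-in for the blockchain client (`self.bc` in the diff)."""

    def __init__(self, eth_address, balance):
        self.eth_address = eth_address
        self._balance = balance

    def get_addresses_balances(self, addresses):
        # Assumed response shape, mirroring the keys read in the diff.
        return {address: {"ethBalance": self._balance} for address in addresses}


def has_enough_eth(bc, min_eth_balance=0.00005):
    """Return True only if the balance could be read and meets the minimum."""
    try:
        balances = bc.get_addresses_balances([bc.eth_address])
        eth_balance = balances.get(bc.eth_address, {}).get("ethBalance", 0)
    except Exception:
        # Fail closed: an unreadable balance is treated as insufficient.
        return False
    return eth_balance >= min_eth_balance


def ensure_eth_balance(bc, min_eth_balance=0.00005):
    """Raise ValueError before any mutating work if gas fees cannot be covered."""
    if not has_enough_eth(bc, min_eth_balance):
        raise ValueError(
            f"Oracle {bc.eth_address} does not have enough ETH to cover gas fees."
        )
```

Failing closed means a transient RPC error also blocks mutating endpoints. That matches the behavior in the diff, where the `except` branch returns `False`.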

extensions/business/deeploy/deeploy_mixin.py

Lines changed: 5 additions & 0 deletions
@@ -2454,6 +2454,11 @@ def _get_online_apps(self, owner=None, target_nodes=None, job_id=None, project_i
           continue
         filtered_result[node][app_name] = app_data
       result = filtered_result
+
+    for node, apps in result.items():
+      node_alias = self.netmon.network_node_eeid(node)
+      for _, app_data in apps.items():
+        app_data["node_alias"] = node_alias
     return result

   # TODO: REMOVE THIS, once instance_id is coming from ui for instances that have to be updated
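
The added loop enriches every app record with the alias of the node it runs on, resolving the alias once per node instead of once per app. A small sketch of the same idea on plain dictionaries is shown below. `annotate_node_alias` and the `alias_lookup` callable are hypothetical and stand in for `self.netmon.network_node_eeid`.

```python
def annotate_node_alias(result, alias_lookup):
    """Add a "node_alias" key to every app record, in place.

    result: {node_address: {app_name: app_data_dict}}
    alias_lookup: callable mapping a node address to its alias (assumed).
    """
    for node, apps in result.items():
        node_alias = alias_lookup(node)  # resolved once per node
        for app_data in apps.values():
            app_data["node_alias"] = node_alias
    return result
```

Usage: `annotate_node_alias(apps_by_node, {"0xai_A": "edge-1"}.get)` returns the same dictionary with each app record carrying `"node_alias": "edge-1"`, or `None` for nodes missing from the mapping.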

extensions/business/edge_inference_api/base_inference_api.py

Lines changed: 8 additions & 1 deletion
@@ -85,6 +85,8 @@
   "REQUEST_TIMEOUT": 600,  # 10 minutes
   "SAVE_PERIOD": 300,  # 5 minutes

+  "LOG_REQUESTS_STATUS_EVERY_SECONDS": 5,  # log pending request status every 5 seconds
+
   "REQUEST_TTL_SECONDS": 60 * 60 * 2,  # keep historical results for 2 hours
   "RATE_LIMIT_PER_MINUTE": 5,
   "AUTH_TOKEN_ENV": "INFERENCE_API_TOKEN",
@@ -129,6 +131,7 @@ def on_init(self):
       self.P(err_msg)
       raise ValueError(err_msg)
     # endif AI_ENGINE not specified
+    self._request_last_log_time: Dict[str, float] = {}
     self._requests: Dict[str, Dict[str, Any]] = {}
     self._api_errors: Dict[str, Dict[str, Any]] = {}
     # TODO: add inference metrics tracking (latency, tokens, etc)
@@ -569,7 +572,11 @@ def solve_postponed_request(self, request_id: str):
     Request result when completed or failed, or a PostponedRequest for pending work.
     """
     if request_id in self._requests:
-      self.Pd(f"Checking status of request ID {request_id}...")
+      last_logged_status = self._request_last_log_time.get(request_id, 0)
+      if (self.time() - last_logged_status) > self.cfg_log_requests_status_every_seconds:
+        self.Pd(f"Checking status of request ID {request_id}...")
+        self._request_last_log_time[request_id] = self.time()
+      # endif logging status
       request_data = self._requests[request_id]

       self.maybe_mark_request_timeout(request_id=request_id, request_data=request_data)
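
The change above throttles a per-request debug message so that a frequently polled request logs at most once per configured interval. A minimal sketch of that per-key throttle is given below, with an injectable clock so it can be exercised without sleeping. `StatusLogThrottle` and its `forget` method are hypothetical; the diff does not show whether the plugin prunes `_request_last_log_time` when a request completes.

```python
import time
from typing import Callable, Dict, Hashable


class StatusLogThrottle:
    """Hypothetical per-key log throttle, modeled on the diff's timestamp dictionary."""

    def __init__(self, every_seconds: float = 5.0,
                 clock: Callable[[], float] = time.time):
        self.every_seconds = every_seconds
        self._clock = clock
        self._last_log_time: Dict[Hashable, float] = {}

    def should_log(self, key: Hashable) -> bool:
        """Return True (and record the time) if the interval has elapsed for key."""
        now = self._clock()
        last = self._last_log_time.get(key, 0)
        if (now - last) > self.every_seconds:
            self._last_log_time[key] = now
            return True
        return False

    def forget(self, key: Hashable) -> None:
        """Drop bookkeeping for a finished key so the dictionary does not grow."""
        self._last_log_time.pop(key, None)
```

As in the diff, an unseen key defaults to a last-logged time of `0`, so with a wall-clock source the first status check for a request always logs.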
