feat(proof): backend-assurance harness for prim__strAppend (3/4) + prim__strSubstr (4/4)#139
Merged
hyperpolymath merged 1 commit intoMay 21, 2026
Conversation
…rSubstr (2/4 + 3/4) Continues epic #87 Tier C backend-assurance campaign. Slice 1/4 (prim__eqChar) landed in #129; this PR bundles slices 3 and 4 (prim__strAppend + prim__strSubstr) per explicit owner direction — both primitives are length-arithmetic on opaque strings and share the same Chez-substring-vs-BEAM-UTF-8 trusted-extraction argument shape, so a combined slice trades 'one PR per primitive' for review ergonomics. The remaining primitive (prim__strToCharList, backing unpackLength) ships separately. Discharges external evidence for two class-(J) believe_me axioms in src/abi/Boj/SafetyLemmas.idr: :226 appendLengthSum : length (s ++ t) = length s + length t :233 substrLengthBound : LTE (length (substr start len s)) len The in-language proof does NOT change. The %unsafe / believe_me sites stay in SafetyLemmas.idr. The harness shrinks the trusted base from "we trust the backend" to "we read the backend lowering and randomly tested the operation against the property". Artefacts (matching the #129 template): - elixir/test/backend_assurance/prim_str_append_test.exs BEAM property tests via stream_data over UTF-8 binaries. Properties: length additivity, left/right identity, three-way-associativity-on-length, + boundary table covering all four UTF-8 encoding widths (ASCII / Latin-1 / CJK / astral). - elixir/test/backend_assurance/prim_str_substr_test.exs Properties: length bound, len=0 corner, start-past-end clamp, full-string slice tightness, + boundary table. - docs/backend-assurance/prim__strAppend.md Trusted-extraction prose: Chez (R6RS string-append + string-length, §6.7 + §11.12) and BEAM (Elixir <> + String.length/1 over UTF-8, prefix-free codepoint counting). - docs/backend-assurance/prim__strSubstr.md Trusted-extraction prose: Chez (R6RS substring with codegen clamp) and BEAM (String.slice/3 codepoint-counting clamp). - docs/backend-assurance/README.md Coverage table flipped from _pending_ for both rows. - PROOF-NEEDS.md Backend-assurance evidence column filled for both axioms; class column updated to "J ✓" marker indicating externally-validated class-(J) status (a bare "J" indicates harness not yet landed). Audit-table rationale updated to note the harness; remaining- axiomatic-surface justification reworded; standing-note "primitives landed" line bumped to reflect 3/4 done. The CI workflow .github/workflows/backend-assurance.yml already path-filters on elixir/test/backend_assurance/** and docs/backend-assurance/** so no workflow change is needed — the new test modules are picked up automatically. Constraints honoured (from docs/backend-assurance/README.md): - No new believe_me. External evidence only; the 2 axioms stay. - No in-language discharge. Ruled out by 2026-05-18 audit. - Two backends. Chez argued from R6RS; BEAM exercised by tests. - Honest framing. Surrogates excluded; normalisation out of scope and documented; negative len/start ruled out by upstream typing. Test plan: - [ ] CI backend-assurance workflow: BEAM property tests green - [x] elixir parse-check on both new test files (syntax-only) - [x] BEAM operation smoke on boundary strings (appendLengthSum 25 cases + substrLengthBound 80 cases) — direct String.length / String.slice/3 calls outside the test harness - [ ] CI: existing workflows unaffected (no path-filter collisions) Note: this session could not run `mix test --only backend_assurance` locally — the asdf elixir 1.18.4-otp-25 install on this machine is empty (only .mix archives, no binary), and system elixir 1.14 fails the project's `~> 1.15` requirement. Same gap pattern slice 1/4 hit; CI is the gate. Slice 2/4 (prim__strToCharList, unpackLength) is the only remaining primitive after this PR merges. After that, the 5 class-(J) axioms will all be externally validated and the backend-assurance campaign closes — modulo any future backend (e.g. a JS or native codegen) for which the trusted-extraction argument would need to be re-checked. Refs #87 (Tier C backend-assurance campaign). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merged
4 tasks
hyperpolymath
added a commit
that referenced
this pull request
May 21, 2026
Resolve backend-assurance coverage conflicts after PR #139 and keep all four primitive harnesses listed as validated. Adjust BEAM harness checks to measure codepoint counts explicitly instead of Elixir grapheme counts.
hyperpolymath
added a commit
that referenced
this pull request
May 21, 2026
…— campaign complete (#140) ## Summary **Closes the backend-assurance campaign** for epic #87 Tier C. With this slice merged alongside #129 (slice 1/4) and #139 (slices 3/4 + 4/4), all 5 class-(J) `believe_me` axioms in `src/abi/Boj/SafetyLemmas.idr` have external evidence. Discharges external evidence for the final class-(J) axiom: | Site | Axiom | Type | |------|-------|------| | `:218` | `unpackLength` | `(s : String) -> length (unpack s) = length s` | The in-language proof does **not** change. The `%unsafe` / `believe_me` site stays in source. The harness shrinks the trusted base from "we trust the backend" to "we read the lowering and randomly tested the operation against the property". ## Artefacts (matching the #129 / #139 template) - `elixir/test/backend_assurance/prim_str_to_char_list_test.exs` — BEAM property tests over the legal codepoint space (surrogates excluded). 4 properties + 1 boundary table: - length preservation: `length(String.to_charlist(s)) == String.length(s)` - empty-string corner: `String.to_charlist("") == []` - round-trip identity: `to_string(to_charlist(s)) == s` - charlist element-type sanity (legal codepoints, no surrogates) - `docs/backend-assurance/prim__strToCharList.md` — trusted-extraction prose. Chez argued from R6RS §6.7 (string model) + §11.12 (`string->list`); BEAM argued from UTF-8 prefix-free codepoint counting via Elixir `String.to_charlist/1` + `String.length/1`. - `docs/backend-assurance/README.md` — coverage-table row flipped from _pending_. - `PROOF-NEEDS.md` — backend-assurance evidence column filled for `unpackLength`. Class column updated to **J ✓** marker (convention introduced by #139). Site line number corrected from `:211` to `:218` (the file has shifted since the 2026-05-18 audit; previous PRs didn't sync this row). The existing CI workflow `.github/workflows/backend-assurance.yml` already path-filters on `elixir/test/backend_assurance/**` and `docs/backend-assurance/**`, so it picks up the new test module automatically — no workflow change in this PR. ## Constraints honoured Per `docs/backend-assurance/README.md`: - **No new `believe_me`.** External evidence only. - **No in-language discharge.** Ruled out by 2026-05-18 audit. - **Two backends.** Chez argued from R6RS; BEAM exercised by tests. - **Honest framing.** Surrogates excluded; normalisation out of scope and documented; round-trip property guards against a backend whose charlist conversion drops or duplicates codepoints in a way that happens to preserve length. ## Test plan - [x] Elixir parse-check on the new test file (syntax-only, via `Code.string_to_quoted!`) - [x] BEAM operation smoke on boundary strings — 9 strings (empty, ASCII, Latin-1, CJK, astral, max codepoint, etc.); direct `String.to_charlist/1` + `String.length/1` + `to_string` calls outside the test harness. Length preservation and round-trip identity both hold across all 9. - [ ] CI backend-assurance workflow: BEAM property tests green - [ ] CI: existing workflows unaffected (no path-filter collisions) > **Note:** this session could not run `mix test --only backend_assurance` > locally — the asdf `elixir 1.18.4-otp-25` install on the work > machine is empty (only `.mix` archives, no binary), and system > `elixir 1.14` fails the project's `~> 1.15` requirement. Same gap > pattern slices 1/4 and 3/4+4/4 followed; CI is the gate. ## Expected conflicts with #139 Both this PR and #139 branch off `origin/main` (per saved stacked-PR-avoidance feedback), so they share a base. Conflicts on merge are limited and trivial: | File | Conflict shape | Resolution | |---|---|---| | `docs/backend-assurance/README.md` | This PR flips the `prim__strToCharList` row; #139 flips the `prim__strAppend` and `prim__strSubstr` rows. | Different rows of the same table — line-level conflict only; accept both. | | `PROOF-NEEDS.md` audit table | This PR updates row 3; #139 updates rows 1/2/4/5 and adds the **J ✓** explanation paragraph. | Different rows; accept both. | | `PROOF-NEEDS.md` remaining-axiomatic-surface table | This PR updates `unpackLength` row; #139 updates the other four. | Different rows; accept both. | | `PROOF-NEEDS.md` standing note "Primitives validated so far" | #139 bumps it to 3/4; reviewer should apply 4/4 (campaign complete) once both merge. | Manual one-line edit during merge of whichever PR lands second. | The **J ✓** marker is used on row 3 here even though the explanation paragraph is contributed by #139 — post-#139-merge the explanation is in place, and the marker's meaning is self-evident from the value. ## Heads-up for the reviewer After this PR + #139 merge, the 4-slice backend-assurance harness is **complete**. All 5 class-(J) axioms in `SafetyLemmas.idr` have external evidence: - `charEqSound`, `charEqSym` via `prim__eqChar` (#129) - `unpackLength` via `prim__strToCharList` (this PR) - `appendLengthSum` via `prim__strAppend` (#139) - `substrLengthBound` via `prim__strSubstr` (#139) Future trusted-extraction work would only be needed if BoJ ships a new codegen backend (e.g. JavaScript, native ML). Refs #87 (Tier C backend-assurance campaign — closes the harness sequence). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔍 Hypatia Security ScanFindings: 30 issues detected
View findings[
{
"reason": "Stale AI session file -- delete",
"type": "stale",
"file": "GEMINI.md",
"action": "delete",
"rule_module": "root_hygiene",
"severity": "medium"
},
{
"reason": "Issue in quality.yml",
"type": "missing_workflow",
"file": "quality.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Issue in security-policy.yml",
"type": "missing_workflow",
"file": "security-policy.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
"type": "unpinned_action",
"file": "governance.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/sanctify-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/academic-workflow-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/fireflag-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/ephapax-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/bofig-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/hesiod-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
}
]Powered by Hypatia Neurosymbolic CI/CD Intelligence |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Continues the backend-assurance campaign for epic #87 Tier C. Slice
1/4 (
prim__eqChar, coveringcharEqSound+charEqSym) landed via#129. This PR bundles slices 3/4 and 4/4 per explicit owner
direction — both
prim__strAppendandprim__strSubstrarelength-arithmetic on opaque strings and share the same Chez-vs-BEAM
trusted-extraction argument shape, so a combined slice trades the
nominal "one PR per primitive" rule for review ergonomics.
Slice 2/4 (
prim__strToCharList, coveringunpackLength) remainspending; ships as a follow-on PR.
Discharges external evidence for two class-(J)
believe_meaxioms in
src/abi/Boj/SafetyLemmas.idr::226appendLengthSumlength (s ++ t) = length s + length t:233substrLengthBoundLTE (length (substr start len s)) lenThe in-language proof does not change. The
%unsafe/believe_mesites stay in source. The harness shrinks the trustedbase from "we trust the backend" to "we read the lowering and
randomly tested the operation against the property".
Artefacts (matching the #129 template)
elixir/test/backend_assurance/prim_str_append_test.exs— BEAMproperty tests over the legal codepoint space (surrogates
excluded). 4 properties + 1 boundary table covering all four UTF-8
encoding widths (ASCII / Latin-1 / CJK / astral).
elixir/test/backend_assurance/prim_str_substr_test.exs— BEAMproperty tests for the length bound. 4 properties + boundary
table covering len=0, start-past-end, full-string slice tightness,
and multi-byte codepoint slices.
docs/backend-assurance/prim__strAppend.md— trusted-extractionprose. Chez argued from R6RS §6.7 (string model) + §11.12
(
string-append); BEAM argued from UTF-8 prefix-free codepointcounting via Elixir
<>+String.length/1.docs/backend-assurance/prim__strSubstr.md— trusted-extractionprose. Chez argued from R6RS §11.12 (
substring) + codegen clamp;BEAM argued from
String.slice/3codepoint-counting clamp.docs/backend-assurance/README.md— coverage-table rows flippedfrom pending for both primitives.
PROOF-NEEDS.md— backend-assurance evidence column filled.Class column updated to J ✓ marker for the externally-
validated rows (a bare J indicates harness not yet landed).
Audit-table rationales updated to note the harness; remaining-
axiomatic-surface justifications reworded; standing-note
"primitives landed" line bumped to reflect 3/4 done.
The existing CI workflow
.github/workflows/backend-assurance.ymlalready path-filters on
elixir/test/backend_assurance/**anddocs/backend-assurance/**, so it picks up the new test modulesautomatically — no workflow change in this PR.
Constraints honoured
Per
docs/backend-assurance/README.md:believe_me. External evidence only.scope and documented; negative
len/startruled out by upstreamNattyping.Test plan
Code.string_to_quoted!)String.length/String.slice/3direct, outside the test harness. 25appendLengthSumcases + 80substrLengthBoundcases all hold.Heads-up for the reviewer
PROOF-NEEDS.mdnow uses a J ✓ marker on the audit-tableClasscolumn for the three externally-validated axioms(
charEqSound,charEqSymfrom #129;appendLengthSum,substrLengthBoundfrom this PR). The fifth axiom (unpackLength)remains bare J until slice 2/4 lands. The wording change is
deliberate per owner direction — the underlying classification is
still class (J); the marker reflects external validation.
Refs #87 (Tier C backend-assurance campaign).
🤖 Generated with Claude Code