Skip to content

feat(proof): backend-assurance harness for prim__strAppend (3/4) + prim__strSubstr (4/4)#139

Merged
hyperpolymath merged 1 commit into
mainfrom
feat/backend-assurance-string-length-arithmetic
May 21, 2026
Merged

feat(proof): backend-assurance harness for prim__strAppend (3/4) + prim__strSubstr (4/4)#139
hyperpolymath merged 1 commit into
mainfrom
feat/backend-assurance-string-length-arithmetic

Conversation

@hyperpolymath
Copy link
Copy Markdown
Owner

Summary

Continues the backend-assurance campaign for epic #87 Tier C. Slice
1/4 (prim__eqChar, covering charEqSound + charEqSym) landed via
#129. This PR bundles slices 3/4 and 4/4 per explicit owner
direction — both prim__strAppend and prim__strSubstr are
length-arithmetic on opaque strings and share the same Chez-vs-BEAM
trusted-extraction argument shape, so a combined slice trades the
nominal "one PR per primitive" rule for review ergonomics.

Slice 2/4 (prim__strToCharList, covering unpackLength) remains
pending; ships as a follow-on PR.

Discharges external evidence for two class-(J) believe_me
axioms in src/abi/Boj/SafetyLemmas.idr:

Site Axiom Type
:226 appendLengthSum length (s ++ t) = length s + length t
:233 substrLengthBound LTE (length (substr start len s)) len

The in-language proof does not change. The %unsafe /
believe_me sites stay in source. The harness shrinks the trusted
base from "we trust the backend" to "we read the lowering and
randomly tested the operation against the property".

Artefacts (matching the #129 template)

  • elixir/test/backend_assurance/prim_str_append_test.exs — BEAM
    property tests over the legal codepoint space (surrogates
    excluded). 4 properties + 1 boundary table covering all four UTF-8
    encoding widths (ASCII / Latin-1 / CJK / astral).
  • elixir/test/backend_assurance/prim_str_substr_test.exs — BEAM
    property tests for the length bound. 4 properties + boundary
    table covering len=0, start-past-end, full-string slice tightness,
    and multi-byte codepoint slices.
  • docs/backend-assurance/prim__strAppend.md — trusted-extraction
    prose. Chez argued from R6RS §6.7 (string model) + §11.12
    (string-append); BEAM argued from UTF-8 prefix-free codepoint
    counting via Elixir <> + String.length/1.
  • docs/backend-assurance/prim__strSubstr.md — trusted-extraction
    prose. Chez argued from R6RS §11.12 (substring) + codegen clamp;
    BEAM argued from String.slice/3 codepoint-counting clamp.
  • docs/backend-assurance/README.md — coverage-table rows flipped
    from pending for both primitives.
  • PROOF-NEEDS.md — backend-assurance evidence column filled.
    Class column updated to J ✓ marker for the externally-
    validated rows (a bare J indicates harness not yet landed).
    Audit-table rationales updated to note the harness; remaining-
    axiomatic-surface justifications reworded; standing-note
    "primitives landed" line bumped to reflect 3/4 done.

The existing CI workflow .github/workflows/backend-assurance.yml
already path-filters on elixir/test/backend_assurance/** and
docs/backend-assurance/**, so it picks up the new test modules
automatically — no workflow change in this PR.

Constraints honoured

Per docs/backend-assurance/README.md:

  • No new believe_me. External evidence only.
  • No in-language discharge. Ruled out by 2026-05-18 audit.
  • Two backends. Chez argued from R6RS; BEAM exercised by tests.
  • Honest framing. Surrogates excluded; normalisation out of
    scope and documented; negative len/start ruled out by upstream
    Nat typing.

Test plan

  • Elixir parse-check on both new test files (syntax-only, via
    Code.string_to_quoted!)
  • BEAM operation smoke on boundary strings — String.length /
    String.slice/3 direct, outside the test harness. 25
    appendLengthSum cases + 80 substrLengthBound cases all hold.
  • CI backend-assurance workflow: BEAM property tests green
  • CI: existing workflows unaffected (no path-filter collisions)

Note: this session could not run mix test --only backend_assurance
locally — the asdf elixir 1.18.4-otp-25 install on the work
machine is empty (only .mix archives, no binary), and system
elixir 1.14 fails the project's ~> 1.15 requirement. Same gap
pattern slice 1/4 followed; CI is the gate.

Heads-up for the reviewer

PROOF-NEEDS.md now uses a J ✓ marker on the audit-table
Class column for the three externally-validated axioms
(charEqSound, charEqSym from #129; appendLengthSum,
substrLengthBound from this PR). The fifth axiom (unpackLength)
remains bare J until slice 2/4 lands. The wording change is
deliberate per owner direction — the underlying classification is
still class (J); the marker reflects external validation.

Refs #87 (Tier C backend-assurance campaign).

🤖 Generated with Claude Code

…rSubstr (2/4 + 3/4)

Continues epic #87 Tier C backend-assurance campaign. Slice 1/4
(prim__eqChar) landed in #129; this PR bundles slices 3 and 4
(prim__strAppend + prim__strSubstr) per explicit owner direction —
both primitives are length-arithmetic on opaque strings and share
the same Chez-substring-vs-BEAM-UTF-8 trusted-extraction argument
shape, so a combined slice trades 'one PR per primitive' for review
ergonomics. The remaining primitive (prim__strToCharList, backing
unpackLength) ships separately.

Discharges external evidence for two class-(J) believe_me axioms in
src/abi/Boj/SafetyLemmas.idr:

  :226  appendLengthSum   : length (s ++ t) = length s + length t
  :233  substrLengthBound : LTE (length (substr start len s)) len

The in-language proof does NOT change. The %unsafe / believe_me
sites stay in SafetyLemmas.idr. The harness shrinks the trusted base
from "we trust the backend" to "we read the backend lowering and
randomly tested the operation against the property".

Artefacts (matching the #129 template):

  - elixir/test/backend_assurance/prim_str_append_test.exs
    BEAM property tests via stream_data over UTF-8 binaries.
    Properties: length additivity, left/right identity,
    three-way-associativity-on-length, + boundary table covering all
    four UTF-8 encoding widths (ASCII / Latin-1 / CJK / astral).

  - elixir/test/backend_assurance/prim_str_substr_test.exs
    Properties: length bound, len=0 corner, start-past-end clamp,
    full-string slice tightness, + boundary table.

  - docs/backend-assurance/prim__strAppend.md
    Trusted-extraction prose: Chez (R6RS string-append + string-length,
    §6.7 + §11.12) and BEAM (Elixir <> + String.length/1 over UTF-8,
    prefix-free codepoint counting).

  - docs/backend-assurance/prim__strSubstr.md
    Trusted-extraction prose: Chez (R6RS substring with codegen clamp)
    and BEAM (String.slice/3 codepoint-counting clamp).

  - docs/backend-assurance/README.md
    Coverage table flipped from _pending_ for both rows.

  - PROOF-NEEDS.md
    Backend-assurance evidence column filled for both axioms; class
    column updated to "J ✓" marker indicating externally-validated
    class-(J) status (a bare "J" indicates harness not yet landed).
    Audit-table rationale updated to note the harness; remaining-
    axiomatic-surface justification reworded; standing-note
    "primitives landed" line bumped to reflect 3/4 done.

The CI workflow .github/workflows/backend-assurance.yml already
path-filters on elixir/test/backend_assurance/** and
docs/backend-assurance/** so no workflow change is needed — the new
test modules are picked up automatically.

Constraints honoured (from docs/backend-assurance/README.md):

  - No new believe_me. External evidence only; the 2 axioms stay.
  - No in-language discharge. Ruled out by 2026-05-18 audit.
  - Two backends. Chez argued from R6RS; BEAM exercised by tests.
  - Honest framing. Surrogates excluded; normalisation out of scope
    and documented; negative len/start ruled out by upstream typing.

Test plan:

  - [ ] CI backend-assurance workflow: BEAM property tests green
  - [x] elixir parse-check on both new test files (syntax-only)
  - [x] BEAM operation smoke on boundary strings (appendLengthSum
        25 cases + substrLengthBound 80 cases) — direct String.length
        / String.slice/3 calls outside the test harness
  - [ ] CI: existing workflows unaffected (no path-filter collisions)

Note: this session could not run `mix test --only backend_assurance`
locally — the asdf elixir 1.18.4-otp-25 install on this machine is
empty (only .mix archives, no binary), and system elixir 1.14 fails
the project's `~> 1.15` requirement. Same gap pattern slice 1/4 hit;
CI is the gate.

Slice 2/4 (prim__strToCharList, unpackLength) is the only remaining
primitive after this PR merges. After that, the 5 class-(J) axioms
will all be externally validated and the backend-assurance campaign
closes — modulo any future backend (e.g. a JS or native codegen) for
which the trusted-extraction argument would need to be re-checked.

Refs #87 (Tier C backend-assurance campaign).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hyperpolymath hyperpolymath merged commit 8638068 into main May 21, 2026
16 of 17 checks passed
@hyperpolymath hyperpolymath deleted the feat/backend-assurance-string-length-arithmetic branch May 21, 2026 13:08
hyperpolymath added a commit that referenced this pull request May 21, 2026
Resolve backend-assurance coverage conflicts after PR #139 and keep all four primitive harnesses listed as validated. Adjust BEAM harness checks to measure codepoint counts explicitly instead of Elixir grapheme counts.
hyperpolymath added a commit that referenced this pull request May 21, 2026
…— campaign complete (#140)

## Summary

**Closes the backend-assurance campaign** for epic #87 Tier C. With
this slice merged alongside #129 (slice 1/4) and #139 (slices 3/4 +
4/4), all 5 class-(J) `believe_me` axioms in
`src/abi/Boj/SafetyLemmas.idr` have external evidence.

Discharges external evidence for the final class-(J) axiom:

| Site | Axiom | Type |
|------|-------|------|
| `:218` | `unpackLength` | `(s : String) -> length (unpack s) = length
s` |

The in-language proof does **not** change. The `%unsafe` /
`believe_me` site stays in source. The harness shrinks the trusted
base from "we trust the backend" to "we read the lowering and
randomly tested the operation against the property".

## Artefacts (matching the #129 / #139 template)

- `elixir/test/backend_assurance/prim_str_to_char_list_test.exs` —
  BEAM property tests over the legal codepoint space (surrogates
  excluded). 4 properties + 1 boundary table:
- length preservation: `length(String.to_charlist(s)) ==
String.length(s)`
  - empty-string corner: `String.to_charlist("") == []`
  - round-trip identity: `to_string(to_charlist(s)) == s`
  - charlist element-type sanity (legal codepoints, no surrogates)
- `docs/backend-assurance/prim__strToCharList.md` — trusted-extraction
  prose. Chez argued from R6RS §6.7 (string model) + §11.12
  (`string->list`); BEAM argued from UTF-8 prefix-free codepoint
  counting via Elixir `String.to_charlist/1` + `String.length/1`.
- `docs/backend-assurance/README.md` — coverage-table row flipped
  from _pending_.
- `PROOF-NEEDS.md` — backend-assurance evidence column filled for
  `unpackLength`. Class column updated to **J ✓** marker (convention
  introduced by #139). Site line number corrected from `:211` to
  `:218` (the file has shifted since the 2026-05-18 audit; previous
  PRs didn't sync this row).

The existing CI workflow `.github/workflows/backend-assurance.yml`
already path-filters on `elixir/test/backend_assurance/**` and
`docs/backend-assurance/**`, so it picks up the new test module
automatically — no workflow change in this PR.

## Constraints honoured

Per `docs/backend-assurance/README.md`:
- **No new `believe_me`.** External evidence only.
- **No in-language discharge.** Ruled out by 2026-05-18 audit.
- **Two backends.** Chez argued from R6RS; BEAM exercised by tests.
- **Honest framing.** Surrogates excluded; normalisation out of
  scope and documented; round-trip property guards against a
  backend whose charlist conversion drops or duplicates codepoints
  in a way that happens to preserve length.

## Test plan

- [x] Elixir parse-check on the new test file (syntax-only, via
  `Code.string_to_quoted!`)
- [x] BEAM operation smoke on boundary strings — 9 strings
  (empty, ASCII, Latin-1, CJK, astral, max codepoint, etc.); direct
  `String.to_charlist/1` + `String.length/1` + `to_string` calls
  outside the test harness. Length preservation and round-trip
  identity both hold across all 9.
- [ ] CI backend-assurance workflow: BEAM property tests green
- [ ] CI: existing workflows unaffected (no path-filter collisions)

> **Note:** this session could not run `mix test --only
backend_assurance`
> locally — the asdf `elixir 1.18.4-otp-25` install on the work
> machine is empty (only `.mix` archives, no binary), and system
> `elixir 1.14` fails the project's `~> 1.15` requirement. Same gap
> pattern slices 1/4 and 3/4+4/4 followed; CI is the gate.

## Expected conflicts with #139

Both this PR and #139 branch off `origin/main` (per saved
stacked-PR-avoidance feedback), so they share a base. Conflicts on
merge are limited and trivial:

| File | Conflict shape | Resolution |
|---|---|---|
| `docs/backend-assurance/README.md` | This PR flips the
`prim__strToCharList` row; #139 flips the `prim__strAppend` and
`prim__strSubstr` rows. | Different rows of the same table — line-level
conflict only; accept both. |
| `PROOF-NEEDS.md` audit table | This PR updates row 3; #139 updates
rows 1/2/4/5 and adds the **J ✓** explanation paragraph. | Different
rows; accept both. |
| `PROOF-NEEDS.md` remaining-axiomatic-surface table | This PR updates
`unpackLength` row; #139 updates the other four. | Different rows;
accept both. |
| `PROOF-NEEDS.md` standing note "Primitives validated so far" | #139
bumps it to 3/4; reviewer should apply 4/4 (campaign complete) once both
merge. | Manual one-line edit during merge of whichever PR lands second.
|

The **J ✓** marker is used on row 3 here even though the
explanation paragraph is contributed by #139 — post-#139-merge the
explanation is in place, and the marker's meaning is self-evident
from the value.

## Heads-up for the reviewer

After this PR + #139 merge, the 4-slice backend-assurance harness is
**complete**. All 5 class-(J) axioms in `SafetyLemmas.idr` have
external evidence:

- `charEqSound`, `charEqSym` via `prim__eqChar` (#129)
- `unpackLength` via `prim__strToCharList` (this PR)
- `appendLengthSum` via `prim__strAppend` (#139)
- `substrLengthBound` via `prim__strSubstr` (#139)

Future trusted-extraction work would only be needed if BoJ ships a
new codegen backend (e.g. JavaScript, native ML).

Refs #87 (Tier C backend-assurance campaign — closes the harness
sequence).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🔍 Hypatia Security Scan

Findings: 30 issues detected

Severity Count
🔴 Critical 18
🟠 High 5
🟡 Medium 7

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Stale AI session file -- delete",
    "type": "stale",
    "file": "GEMINI.md",
    "action": "delete",
    "rule_module": "root_hygiene",
    "severity": "medium"
  },
  {
    "reason": "Issue in quality.yml",
    "type": "missing_workflow",
    "file": "quality.yml",
    "action": "create",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Issue in security-policy.yml",
    "type": "missing_workflow",
    "file": "security-policy.yml",
    "action": "create",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/boj-server/boj-server/cartridges/sanctify-mcp/adapter/mod.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/boj-server/boj-server/cartridges/academic-workflow-mcp/adapter/mod.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/boj-server/boj-server/cartridges/fireflag-mcp/adapter/mod.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/boj-server/boj-server/cartridges/ephapax-mcp/adapter/mod.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/boj-server/boj-server/cartridges/bofig-mcp/adapter/mod.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/boj-server/boj-server/cartridges/hesiod-mcp/adapter/mod.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant