
Codex debug#423

Merged
lukstafi merged 4 commits into master from codex-debug
Jan 11, 2026

Conversation

@lukstafi
Collaborator

This is GPT-5.2-Codex's work reviewing and fixing the recent Claude changes. Review by GPT-5.2-Codex:

Findings

  • Potential wrong concat offsets if iteration order differs from concat order: arrayjit/lib/assignments.ml:375 builds cumulative offsets from product_iterators ordering, but Indexing.Concat uses the concat
    expression’s symbol order in arrayjit/lib/assignments.ml:320. If those orders diverge (union-find grouping order not stable), offsets can be applied to the wrong component and index into the wrong slice.
  • Concat target consistency not validated when already projected: tensor/row.ml:4206 only sets a Concat index if the target projection has no index yet. If a target already has Idx.Iterator or Idx.Fixed_idx
    from other constraints, the Concat equation is effectively ignored without verification, which can silently accept incompatible projections.
  • Unresolved Concat can silently escape Block lowering: arrayjit/lib/assignments.ml:333 falls back to Indexing.Concat when no active component is found. If that situation occurs, low-level indexing may remain
    ambiguous instead of failing fast, making errors harder to detect.

Open questions / assumptions

  • Is the order of product_iterators guaranteed to match the symbol order in Idx.Concat for the same concat group? If not, we likely need a mapping from the concat’s syms order to offsets rather than the
    product-group order.
  • Should Concat→Proj equations validate existing target indices instead of skipping when target_repr already has a projection?
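One defensive answer to the first question is to stop depending on product-group order entirely and precompute offsets keyed by the Concat's own symbol order. A minimal standalone sketch of that idea, with string symbols and a size function standing in for the real `Indexing` types (nothing here is the project's actual API):

```ocaml
(* Hypothetical sketch: build (symbol, offset) pairs in the Concat's own
   symbol order, so lookups never depend on product-iterator ordering. *)
type symbol = string

let offsets_by_concat_order (syms : symbol list) ~(size_of : symbol -> int) :
    (symbol * int) list =
  let _, pairs =
    List.fold_left
      (fun (cumul, acc) s -> (cumul + size_of s, (s, cumul) :: acc))
      (0, []) syms
  in
  List.rev pairs

let () =
  (* Illustrative sizes: components of extent 2, 3, 4. *)
  let size_of = function "a" -> 2 | "b" -> 3 | _ -> 4 in
  assert (
    offsets_by_concat_order [ "a"; "b"; "c" ] ~size_of
    = [ ("a", 0); ("b", 2); ("c", 5) ])
```

A table built this way can be consulted with a plain association lookup regardless of how the union-find grouping happened to order the iterators.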

Change summary

  • Concat handling now propagates component iterators (including d=1) and builds Concat indices post-unification in tensor/row.ml, plus concat grouping behavior updated in tensor/shape.ml.
  • Block lowering resolves Concat indices using cumulative offsets and low-level symbol sharing is relaxed by marking shared tensors complex in arrayjit/lib/assignments.ml and arrayjit/lib/low_level.ml.
  • test/operations/test_concat_ppx.expected updated to reflect passing concat smoke tests.

lukstafi and others added 3 commits January 11, 2026 23:26
- Compute expected Concat index for deferred targets before assignment

- Raise on conflicts with existing indices (allowing single-symbol Concat iterators)

Co-authored-by: Codex <noreply@openai.com>
- Derive per-symbol sizes from product iterators

- Resolve Concat offsets per index using Concat symbol order

Co-authored-by: Codex <noreply@openai.com>
- Raise user-facing errors when no active Concat component is found

- Avoid emitting unresolved Concat indices in Block/Rev_sides lowering

Co-authored-by: Codex <noreply@openai.com>
@claude
Contributor

claude bot commented Jan 11, 2026

Claude finished @lukstafi's task.


Code Review - Codex Debug PR

I've analyzed the concatenation-related fixes in this PR. Here's my detailed review:

Todo List:

  • Read CLAUDE.md to understand project context and conventions
  • Analyze the changed files (assignments.ml and tensor/row.ml)
  • Review the concat offset fixes and validation improvements
  • Check for potential bugs and edge cases
  • Evaluate performance considerations
  • Assess security concerns
  • Review test coverage and validation
  • Provide comprehensive feedback

Summary

This PR addresses critical concatenation handling issues in OCANNL's tensor compilation framework. The fixes target three main areas: symbol ordering in concat offset computation, concat component propagation, and target consistency validation.

Strengths

1. Correct Concat Offset Computation (assignments.ml:292-300)

The fix to concat_offset_for properly computes cumulative offsets using the Concat's symbol order rather than product iterator order:

let concat_offset_for syms active =
  let _, offset =
    List.fold syms ~init:(0, None) ~f:(fun (cumul, found) s ->
        let size = Map.find iter_sizes s |> Option.value ~default:0 in
        if Indexing.equal_symbol s active then (cumul + size, Some cumul)
        else (cumul + size, found))
  in
  Option.value ~default:0 offset

This addresses the GPT-5.2-Codex finding about potential wrong offsets when iteration order differs from concat order.
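To make the intended semantics concrete, here is a standalone re-creation of the same fold using string symbols and an association list in place of `Indexing.symbol` and the `iter_sizes` map (Stdlib only; the sizes are illustrative, not from the project):

```ocaml
(* Sketch of concat_offset_for's fold: the returned offset for the active
   symbol is the sum of the sizes of all symbols preceding it. *)
let concat_offset_for ~iter_sizes syms active =
  let _, offset =
    List.fold_left
      (fun (cumul, found) s ->
        let size = try List.assoc s iter_sizes with Not_found -> 0 in
        if String.equal s active then (cumul + size, Some cumul)
        else (cumul + size, found))
      (0, None) syms
  in
  Option.value ~default:0 offset

let () =
  let iter_sizes = [ ("i", 2); ("j", 3); ("k", 4) ] in
  assert (concat_offset_for ~iter_sizes [ "i"; "j"; "k" ] "i" = 0);
  assert (concat_offset_for ~iter_sizes [ "i"; "j"; "k" ] "j" = 2);
  assert (concat_offset_for ~iter_sizes [ "i"; "j"; "k" ] "k" = 5)
```

Note that `Some cumul` captures the accumulator *before* the active symbol's size is added, which is the "sum of sizes before the active symbol" contract this review discusses.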

2. Enhanced Concat Component Handling (row.ml:4061-4088)

The code now properly propagates concat components including those with d=1:

(* Each component of a concat needs iteration, including d=1 *)
List.iter proj_dims ~f:(fun (pid, { d; _ }) ->
    p_dims := (pid, d) :: !p_dims;
    p_concat_components := Set.add !p_concat_components pid)

This ensures all components get proper iterators for correct indexing.

3. Fail-Fast Error Handling (assignments.ml:352-355)

The code now explicitly raises an error when Concat indices cannot be resolved:

| None ->
    raise
    @@ Utils.User_error
         "Concat index could not be resolved to an active component during Block lowering"

This prevents silently escaping unresolved Concat issues.

⚠️ Areas of Concern

1. Offset Computation Bug (assignments.ml:296)

There's a logical error in the offset computation:

if Indexing.equal_symbol s active then (cumul + size, Some cumul)
else (cumul + size, found)

The offset should be cumul (before adding size), but the cumulative sum includes the active symbol's size. This should be:

if Indexing.equal_symbol s active then (cumul, Some cumul)
else (cumul + size, found)

2. Inconsistent Validation Logic (row.ml:4229-4244)

The concat target validation allows existing indices but only with specific patterns:

let ok =
  Idx.equal_axis_index existing_idx expected_idx
  ||
  match (existing_idx, expected_idx) with
  | Idx.Iterator s, Idx.Concat [ s' ] when Idx.equal_symbol s s' -> true
  | _ -> false

This may miss edge cases where valid transformations should be allowed.

3. Error Recovery Inconsistency (row.ml:4187, 4197)

The code silently ignores errors in validation with with _ -> (), which could mask real issues:

with _ -> () (* Ignore errors for now *)
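A hypothetical middle ground between raising and silently swallowing is to log the ignored exception. Nothing like the helper below exists in `tensor/row.ml`; it is only a sketch of the shape such hardening could take:

```ocaml
(* Hypothetical helper: run a validation step, but log any exception it
   raises to stderr instead of discarding it with a blanket handler. *)
let with_logged_fallback ~context f =
  try f ()
  with exn ->
    Printf.eprintf "row.ml: ignored error during %s: %s\n" context
      (Printexc.to_string exn)

let () =
  (* The failing validation is still skipped, but now leaves a trace. *)
  with_logged_fallback ~context:"concat target check" (fun () ->
      failwith "incompatible projection")
```

This keeps the current lenient behavior while making masked failures visible in debug runs.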

🔧 Performance Considerations

  1. Symbol Ordering: The concat_offset_for function has O(n) complexity per symbol lookup, which is acceptable for typical tensor dimensions.

  2. Union-Find Operations: Multiple Utils.union_find calls could be optimized by caching representatives, but the current approach is clear and likely sufficient for typical usage.

  3. Set Operations: The concat component tracking uses efficient Set operations with good asymptotic complexity.

🛡️ Security Assessment

No security concerns identified. The code deals with tensor shape inference and compilation, not user input or external data.

🧪 Test Coverage

The updated test_concat_ppx.expected shows basic smoke tests are passing, but more comprehensive testing would be beneficial:

  1. Edge Cases: Tests for zero-dimensional components (d=0)
  2. Complex Concatenations: Multiple levels of nested concat operations
  3. Error Conditions: Verification of proper error messages for invalid configurations

📋 Recommendations

  1. Fix Offset Bug: Correct the cumulative offset computation in concat_offset_for
  2. Improve Error Handling: Replace silent error ignoring with more specific error recovery
  3. Add Comprehensive Tests: Include edge cases and error conditions
  4. Documentation: Add inline comments explaining the concat symbol ordering requirements
  5. Validation Enhancement: Consider more robust validation for concat target consistency

🎯 Code Quality

  • Readability: Good use of descriptive variable names and comments
  • Modularity: Proper separation of concerns between offset computation and target validation
  • Type Safety: Leverages OCaml's type system effectively
  • Error Messages: Generally clear and actionable error messages

This is a solid fix addressing real issues in tensor concatenation handling, but attention to the offset computation bug would strengthen the implementation.

@lukstafi
Collaborator Author

There is a regression in test/einsum/test_interleave; not sure when it started failing.

- Only probe Concat indices for dims with proj_ids

- Ignore unresolved projections when checking for Idx.Concat

Co-authored-by: Codex <noreply@openai.com>
@lukstafi
Collaborator Author

lukstafi commented Jan 11, 2026

GPT-Codex response:

Reviewed the provided PR review against current code. One key correction:

Findings

  • The claimed offset bug is incorrect: concat_offset_for already returns the pre‑active cumulative offset (Some cumul) and advances cumul + size for subsequent symbols. That matches the intended “sum of sizes before
    the active symbol” behavior. The suggested change would freeze the cumulative sum on the active symbol and break offsets for later symbols in the same Concat list. See arrayjit/lib/assignments.ml in the
    concat_offset_for fold.

Notes

  • The other concerns (validation pattern strictness and existing with _ -> () ignores in tensor/row.ml) are pre‑existing behavior, not introduced by the concat fixes, and may be out of scope unless you want to harden
    that area.

@claude do you agree?

@lukstafi lukstafi merged commit 5a970d8 into master Jan 11, 2026
2 of 4 checks passed
@lukstafi lukstafi deleted the codex-debug branch January 11, 2026 23:37