
Remove memory-aware node bindings #123

Merged
Xeratec merged 6 commits into pulp-platform:devel from lukamac:remove-memory-aware-binding on Oct 26, 2025

Conversation

@lukamac
Contributor

@lukamac lukamac commented Oct 16, 2025

Remove the memory-aware node bindings, since doing so makes the great parser refactor easier.

The memory-aware node bindings exist only so that Neureka can separate bindings that use the dedicated weight memory from those that don't.
But those bindings can simply be rewritten to check whether the weights reside in weight memory and change their behavior accordingly.

By removing the memory-aware node bindings, we remove another dependency on having hoisted buffers in the middle of parsing.
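As an illustration of the rewrite mentioned above (checking whether the weights reside in weight memory), here is a minimal self-contained sketch of the idea. It is not the actual Deeploy API; the function name, the memory-level string, and the tile representation are assumptions. The point is that a single constraint can branch on where the weights live instead of needing a separate Wmem binding variant.

def scheduleWeights(weightMemoryLevel, outputChannelTiles):
    # Weights already resident in the dedicated weight memory: only
    # precompute per-tile offsets, nothing has to be transferred.
    if weightMemoryLevel == "WeightMemory_SRAM":  # level name is an assumption
        return [("offset", cOffset) for cOffset, _ in outputChannelTiles]
    # Weights in ordinary memory: emit a load-schedule entry per tile.
    return [("load", cOffset, cSize) for cOffset, cSize in outputChannelTiles]

print(scheduleWeights("WeightMemory_SRAM", [(0, 16), (16, 16)]))
print(scheduleWeights("L2", [(0, 16), (16, 16)]))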

The RequantHelpers are a bonus fix that makes the requantization mul and add hyperrectangles keep the rank of the original tensors.

Added

  • RequantHelpers.py for Neureka's TileConstraints

Changed

  • Removed NodeMemoryLevelChecker and MemoryAwareNodeBinding
  • Removed _parseNode from MemoryNetworkDeployer since we don't need the annotations before typeChecking anymore
  • Removed the Wmem variants of bindings and tile constraints from Neureka

Fixed

  • Keep the mul/add tensor rank in the requantized Neureka tile constraints

PR Merge Checklist

  1. The PR is rebased on the latest devel commit and points to devel.
  2. Your PR is reviewed and approved.
  3. All checks are passing.
  4. The CHANGELOG.md file has been updated.
  5. If the Docker was modified, change its link back after review.

@coderabbitai
Contributor

coderabbitai bot commented Oct 16, 2025

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Fixed rank preservation for quantization-related parameters in tile constraints.
  • Changes

    • Simplified Neureka deployment bindings by removing memory-aware wrapper variants.
    • Streamlined per-node parsing flow by consolidating memory-level optimization into the main binding process.
    • Refactored tile constraint handling to improve quantization support.

Walkthrough

Removed memory-aware per-node parsing and memory-level binding wrappers, including per-node memory-level checks; eliminated Wmem binding/tiling variants across the Neureka target; added requantization helpers and refactored the Neureka tile-constraint classes to branch on weight SRAM vs non‑SRAM and produce requant-aware load schedules.

Changes

Cohort / File(s) Summary
Network deployer wrapper
Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py
Removed MemoryAwareDeployer _parseNode augmentation/delegation; trimmed typing imports (removed Tuple).
Memory-level extension core
Deeploy/MemoryLevelExtension/MemoryLevels.py, Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py
Deleted NodeMemoryLevelChecker, MemoryAwareNodeBinding, memoryAwareNodeBindingExtension; removed per-node _parseNode overrides in memory-level deployers; retained annotation optimization in bind(); removed ONNXLayer from public import list.
Neureka bindings / engine / tiler
Deeploy/Targets/Neureka/Bindings.py, Deeploy/Targets/Neureka/Engine.py, Deeploy/Targets/Neureka/Tiler.py
Removed all NeurekaWmem* binding and tiling-ready declarations/imports; simplified mapper definitions to direct parser→tiling-binding mappings; narrowed exported binding/tiling set to non‑Wmem and RQS variants.
Requant helpers
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py
Added requantAddGeometricalConstraint and requantLoadSchedule to produce requant geometrical constraints and per-output-cube mul/add load schedules.
TileConstraints — dense
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py
Introduced VariableBuffer usage and SRAM-aware weight handling in geometrical/policy constraints and serialization; removed per-cube weight-offset complexity; added NeurekaRQSDenseConv2DTileConstraint using requant helpers.
TileConstraints — depthwise
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py
Added VariableBuffer and requant helpers; changed geometrical constraints and serialization to branch on weight SRAM vs non‑SRAM; added NeurekaRQSDWConv2DTileConstraint integrating requant helpers.
TileConstraints — pointwise
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py
Consolidated PW constraints into NeurekaPWConv2DTileConstraint, added NeurekaRQSPWConv2DTileConstraint delegating to PW constraint then applying requant helpers; integrated SRAM-aware weight handling and requantLoadSchedule.
Changelog
CHANGELOG.md
Documented removals of memory-aware bindings/parse paths, addition of RequantHelpers, and related cleanup entries.

Sequence Diagram(s)

sequenceDiagram
    participant Parser as Parser
    participant Engine as Engine/Mapper
    participant Tiler as Tiler
    participant Binding as TilingBinding

    Note over Parser,Engine: Before — parser → (Wmem + non‑Wmem bindings)
    Parser->>Engine: map node to (WmemBinding + Binding)
    Engine->>Tiler: provide Wmem + non‑Wmem tiling bindings
    Tiler->>Binding: use Wmem-aware constraints

    Note over Parser,Engine: After — parser → Binding (direct)
    Parser->>Engine: map node to Binding
    Engine->>Tiler: provide non‑Wmem tiling bindings
    Tiler->>Binding: use unified / requant-aware constraints
sequenceDiagram
    participant Constraint as TileConstraint
    participant Ctxt as NetworkContext
    participant Requant as RequantHelpers
    participant Schedule as TilingSchedule

    Constraint->>Ctxt: lookup weight buffer (VariableBuffer)
    alt weight memory == SRAM
        Ctxt-->>Constraint: SRAM
        Constraint->>Schedule: compute per-cube weight_addr_offset (SRAM path)
    else weight memory != SRAM
        Ctxt-->>Constraint: non‑SRAM
        Constraint->>Requant: requantLoadSchedule(absoluteOutputCubes, ctxt, op)
        Requant-->>Schedule: mul/add load schedule per cube
        Constraint->>Schedule: merge weight info into input/load schedule
    end
    Schedule-->>Constraint: final tiling load schedule

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

Feature

Suggested reviewers

  • Victor-Jung

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title Check ✅ Passed: The PR title "Remove memory-aware node bindings" directly and clearly summarizes the primary objective of the changeset. The raw summary shows that the core changes involve removing NodeMemoryLevelChecker, MemoryAwareNodeBinding, and _parseNode methods across multiple modules, as well as eliminating all Wmem (memory-aware) variants of bindings and tile constraints from Neureka. The title is concise, specific, and accurately captures the main purpose without being vague or generic. While the PR also includes the addition of RequantHelpers.py and fixes to requantization behavior, the title appropriately focuses on the primary change rather than attempting to cover every detail.
  • Description Check ✅ Passed: The PR description is well-related to the changeset and provides meaningful context. It explains the motivation (simplifying the parser refactor and removing dependency on hoisted buffers during parsing), clearly lists what was added (RequantHelpers.py), what was changed (removal of memory-aware components and Wmem variants), and what was fixed (requantization mul/add rank preservation). All items mentioned in the description align with the actual changes documented in the raw summary, and the description is specific and informative rather than vague.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py (1)

5-64: Restore NetworkContext import to avoid runtime failure

NetworkContext is still used in the type hints (e.g., Line 51) but is no longer imported. Because these annotations are evaluated at module import time (no from __future__ import annotations safeguard), the module will now raise a NameError as soon as it loads. Please reintroduce the import (or switch to a quoted forward reference).
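A minimal sketch of the two options (the class and method below are hypothetical stand-ins, not the actual wrapper code):

# Option (a): reintroduce the import so the evaluated annotation resolves.
from Deeploy.DeeployTypes import NetworkContext

class DeployerWrapperSketch:  # hypothetical stand-in for the wrapper class
    # Option (b): quote the annotation so it is never evaluated at import time.
    def bind(self, ctxt: "NetworkContext") -> "NetworkContext":
        return ctxt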

🧹 Nitpick comments (12)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)

227-227: Add strict=True to zip() for safety.

Python 3.10+ supports strict=True to catch length mismatches between iterables. This prevents silent bugs if outputCubes and inputLoadSchedule diverge in length.

Apply this diff:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

257-261: Add strict=True to zip() for safety.

Similar to line 227, this zip() should include strict=True to catch length mismatches.

Apply this diff:

         newInputLoadSchedule = [{
             **load,
             **rqLoad
-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (1)

64-68: Add strict=True to zip() for safety.

For consistency and to catch potential length mismatches, add strict=True.

Apply this diff:

     newInputLoadSchedule = [{
         **schedule,
         "mul": mul,
         "add": add,
-    } for schedule, mul, add in zip(tilingSchedule.inputLoadSchedule, inputMulCubes, inputAddCubes)]
+    } for schedule, mul, add in zip(tilingSchedule.inputLoadSchedule, inputMulCubes, inputAddCubes, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (4)

161-162: Consider using _ prefix for unused unpacked variables.

Static analysis flags BatchOffset, HOffset, WOffset, and BatchSize as unused. While the current code works, prefixing with _ makes the intent explicit.

Apply this diff:

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_BatchOffset, _HOffset, _WOffset, COffset) = cube.offset
+            (_BatchSize, HSize, WSize, CSize) = cube.dims

212-232: Code duplication with NeurekaDepthwiseConstraint.

This weight handling logic is nearly identical to lines 210-230 in NeurekaDepthwiseConstraint.py. The only difference is the weight cube construction (different dimension ordering).

Consider extracting this pattern into a helper function in RequantHelpers.py to reduce duplication across Dense, Depthwise, and Pointwise constraint classes.
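A sketch of what such a shared helper could look like, using the depthwise-style weight cube as an example; the name and signature are hypothetical, and a real version would need to parameterize the weight-cube construction since the dense constraint uses a different dimension ordering:

from typing import Dict, List, Sequence, Tuple

from Deeploy.TilingExtension.TilingCodegen import HyperRectangle

def addWeightLoadSchedule(outputCubes: Sequence[HyperRectangle], inputLoadSchedule: List[Dict],
                          weightShape: Tuple[int, ...]) -> None:
    # Non-SRAM path shared by the constraint classes: attach a per-output-cube
    # weight cube (a channel slice of the full weight tensor) to each input load.
    for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):
        COffset, CSize = cube.offset[-1], cube.dims[-1]
        load['weight'] = HyperRectangle((COffset, 0, 0), (CSize, weightShape[-2], weightShape[-1]))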


229-229: Add strict=True to zip() for safety.

For consistency with best practices in Python 3.10+, add strict=True.

Apply this diff:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

259-263: Add strict=True to zip() for safety.

Apply this diff:

         newInputLoadSchedule = [{
             **load,
             **rqLoad
-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5)

48-52: Document why SRAM weights are constrained to Max().

Unlike the Dense and Depthwise variants, the Pointwise constraint explicitly constrains weightOutChannelVar == weightOutChannelVar.Max() for SRAM weights. This difference suggests Pointwise has specific requirements for SRAM weight tiling.

Consider adding a comment explaining why Pointwise forces full channel tiling for SRAM weights while Dense/Depthwise don't.


191-192: Consider using _ prefix for unused unpacked variables.

Static analysis flags BatchOffset, HOffset, WOffset, and BatchSize as unused.

Apply this diff:

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_BatchOffset, _HOffset, _WOffset, COffset) = cube.offset
+            (_BatchSize, HSize, WSize, CSize) = cube.dims

242-262: Code duplication with Dense and Depthwise constraints.

This weight handling logic is duplicated across NeurekaDenseConstraint.py (lines 212-232), NeurekaDepthwiseConstraint.py (lines 210-230), and here. Consider consolidating this pattern into a shared helper function.


259-259: Add strict=True to zip() for safety.

Apply this diff:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

289-293: Add strict=True to zip() for safety.

Apply this diff:

         newInputLoadSchedule = [{
             **load,
             **rqLoad
-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15c4a23 and 0bfd7fb.

📒 Files selected for processing (10)
  • Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py (1 hunks)
  • Deeploy/MemoryLevelExtension/MemoryLevels.py (1 hunks)
  • Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1 hunks)
  • Deeploy/Targets/Neureka/Bindings.py (0 hunks)
  • Deeploy/Targets/Neureka/Engine.py (1 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (7 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (6 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (1 hunks)
  • Deeploy/Targets/Neureka/Tiler.py (1 hunks)
💤 Files with no reviewable changes (1)
  • Deeploy/Targets/Neureka/Bindings.py
🧰 Additional context used
🧬 Code graph analysis (8)
Deeploy/Targets/Neureka/Engine.py (2)
Deeploy/DeeployTypes.py (1)
  • NodeMapper (1716-1872)
Deeploy/Targets/Neureka/Parsers.py (6)
  • NeurekaRQSPWConv2DParser (140-162)
  • NeurekaPWConv2DParser (125-137)
  • NeurekaRQSDWConv2DParser (99-122)
  • NeurekaDWConv2DParser (80-96)
  • NeurekaRQSDenseConv2DParser (180-202)
  • NeurekaDenseConv2DParser (165-177)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (6)
Deeploy/DeeployTypes.py (2)
  • NetworkContext (564-1076)
  • add (737-774)
Deeploy/TilingExtension/MemoryConstraints.py (1)
  • NodeMemoryConstraint (95-167)
Deeploy/TilingExtension/TilerModel.py (3)
  • TilerModel (34-402)
  • addTensorDimToModel (143-157)
  • getTensorDimVar (131-135)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • AbsoluteHyperRectangle (39-49)
  • HyperRectangle (24-35)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (2)
  • serializeTilingSolution (92-236)
  • serializeTilingSolution (247-268)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py (1)
Deeploy/DeeployTypes.py (3)
  • CodeGenVerbosity (49-55)
  • NetworkDeployer (3231-3596)
  • ONNXLayer (1875-2203)
Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • NetworkOptimizationPass (2263-2283)
  • NetworkOptimizer (2286-2309)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (5)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (14-34)
  • requantLoadSchedule (76-94)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (5)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (14-34)
  • requantLoadSchedule (76-94)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
Deeploy/Targets/Neureka/Tiler.py (3)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (2)
  • NeurekaDenseConv2DTileConstraint (21-236)
  • NeurekaRQSDenseConv2DTileConstraint (239-268)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)
  • NeurekaDWConv2DTileConstraint (21-234)
  • NeurekaRQSDWConv2DTileConstraint (237-266)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (2)
  • NeurekaPWConv2DTileConstraint (21-266)
  • NeurekaRQSPWConv2DTileConstraint (269-298)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (14-34)
  • requantLoadSchedule (76-94)
Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py (1)
  • computeInputCube (320-347)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
🪛 Ruff (0.14.0)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py

48-48: Undefined name cls

(F821)


68-68: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py

227-227: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


261-261: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py

161-161: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


162-162: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


229-229: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


263-263: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py

191-191: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


192-192: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


259-259: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


293-293: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

🔇 Additional comments (7)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)

210-230: LGTM! Clean SRAM-aware weight handling.

The conditional weight handling properly separates the SRAM path (computing offsets per tile) from the non-SRAM path (merging base offsets and adding weight to load schedule). This aligns with the PR's goal to simplify by removing memory-aware bindings while still supporting SRAM weights.


237-266: LGTM! Requantization subclass correctly delegates.

The NeurekaRQSDWConv2DTileConstraint class properly delegates to the base class and then applies requantization-specific constraints via requantAddGeometricalConstraint and requantLoadSchedule. This composition pattern is clean and maintainable.

Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)

14-34: LGTM! Clean constraint addition for requantization.

This helper correctly adds dimension variables for mul/add buffers and constrains their channel dimensions to match the output channel. The logic is straightforward and well-commented.


76-94: LGTM! Clean load schedule generation.

This helper correctly constructs HyperRectangles for mul/add buffers based on output channel offsets and sizes. The logic matches the pattern used in the tile constraint classes.

Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (2)

54-58: LGTM! Clean conditional weight constraint.

Unlike the depthwise variant, this implementation doesn't have a TODO and cleanly handles the SRAM case with a pass statement, making the intent clear.


239-268: LGTM! Requantization integration follows established pattern.

The NeurekaRQSDenseConv2DTileConstraint class correctly implements the same pattern as the depthwise variant, delegating to the base class and then applying requantization constraints.

Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (1)

269-298: LGTM! Consistent requantization pattern.

The NeurekaRQSPWConv2DTileConstraint class follows the same clean pattern as Dense and Depthwise variants.

@lukamac lukamac force-pushed the remove-memory-aware-binding branch 2 times, most recently from f0d4de7 to 28faade on October 16, 2025 at 15:58
@lukamac lukamac changed the title from "Remove memory-aware node bindings" to "DRAFT: Remove memory-aware node bindings" on Oct 16, 2025
@lukamac lukamac marked this pull request as draft October 16, 2025 16:08
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (1)

86-88: Bug: kernel dims swapped (height vs width).

Height should compare to dim_kernel_y, width to dim_kernel_x.

-        tilerModel.addConstraint(inputHeightVar >= parseDict['dim_kernel_x'])
-        tilerModel.addConstraint(inputWidthVar >= parseDict['dim_kernel_y'])
+        tilerModel.addConstraint(inputHeightVar >= parseDict['dim_kernel_y'])
+        tilerModel.addConstraint(inputWidthVar >= parseDict['dim_kernel_x'])
♻️ Duplicate comments (1)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (1)

52-58: Clarify the TODO comment or remove it.

The TODO at line 54 suggests uncertainty about whether to uncomment the constraint weightOutChannelVar == weightOutChannelVar.Max() for SRAM weights. This conditional logic is critical for handling weights in dedicated weight memory.

If skipping the constraint is intentional for SRAM weights, document why. Otherwise, either uncomment it or remove the TODO.

Could you clarify the intended behavior for SRAM weights? Should the constraint be:

  1. Skipped entirely (current behavior)
  2. Uncommented to enforce full channel loading
  3. Something else?
🧹 Nitpick comments (7)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)

210-230: Consider adding strict=True to zip() call.

The conditional weight handling logic correctly separates SRAM and non-SRAM paths:

  • SRAM weights: pre-calculate offsets for direct access
  • Non-SRAM weights: include in load schedule like other inputs

However, static analysis flags the zip() call at line 227 for missing strict= parameter. While the lists should have matching lengths by construction, adding strict=True makes this assumption explicit and will fail fast if violated.

Consider applying this change:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

237-266: LGTM: Requantized DW conv constraint correctly extends base class.

The NeurekaRQSDWConv2DTileConstraint class properly layers requantization support onto the base DW convolution constraints by:

  • Calling base addGeometricalConstraint then applying requantAddGeometricalConstraint
  • Calling base serializeTilingSolution then merging in requant load schedules

This design allows requantization to be added without duplicating the base convolution tiling logic.

Consider adding strict=True to the zip() call at line 261 for the same reason as mentioned in the previous comment:

-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (2)

190-195: Silence unused tuple-unpack vars.

Prefix unused items with underscores to satisfy linters and reduce noise.

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_BatchOffset, _HOffset, _WOffset, _COffset) = cube.offset
+            (_BatchSize, HSize, WSize, CSize) = cube.dims

259-262: Use zip(..., strict=True) to guard against length mismatches
Since the project requires Python >=3.10, update your loops as follows:

- for cube, load in zip(outputCubes, inputLoadSchedule):
+ for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):
- } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+ } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (3)

65-66: Avoid hard-coded 3x3; use parsed kernel dims (optional).

Use dim_kernel_y/x to generalize and reduce magic numbers.

-        tilerModel.addConstraint((outputHeightVar == (effectiveHeight - (3 - 1) - 1) // strides[0] + 1))
-        tilerModel.addConstraint((outputWidthVar == (effectiveWidth - (3 - 1) - 1) // strides[1] + 1))
+        kH, kW = parseDict["dim_kernel_y"], parseDict["dim_kernel_x"]
+        tilerModel.addConstraint((outputHeightVar == (effectiveHeight - (kH - 1) - 1) // strides[0] + 1))
+        tilerModel.addConstraint((outputWidthVar == (effectiveWidth - (kW - 1) - 1) // strides[1] + 1))

161-163: Silence unused tuple-unpack vars.

Prefix unused items to satisfy linters.

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_BatchOffset, _HOffset, _WOffset, _COffset) = cube.offset
+            (_BatchSize, HSize, WSize, CSize) = cube.dims

229-231: Add strict=True to zip() calls (optional defensive programming).

Since the project requires Python >= 3.10 (per pyproject.toml), use strict=True unconditionally rather than conditional logic. This ensures the iterables always have matching lengths:

  • Line 229: for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):
  • Line 263: for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bfd7fb and 28faade.

📒 Files selected for processing (11)
  • CHANGELOG.md (4 hunks)
  • Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py (1 hunks)
  • Deeploy/MemoryLevelExtension/MemoryLevels.py (1 hunks)
  • Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1 hunks)
  • Deeploy/Targets/Neureka/Bindings.py (0 hunks)
  • Deeploy/Targets/Neureka/Engine.py (1 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (7 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (6 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (1 hunks)
  • Deeploy/Targets/Neureka/Tiler.py (1 hunks)
💤 Files with no reviewable changes (1)
  • Deeploy/Targets/Neureka/Bindings.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • Deeploy/MemoryLevelExtension/MemoryLevels.py
  • Deeploy/Targets/Neureka/Engine.py
🧰 Additional context used
🧬 Code graph analysis (6)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (3)
Deeploy/DeeployTypes.py (1)
  • NetworkContext (564-1076)
Deeploy/TilingExtension/TilerModel.py (3)
  • TilerModel (34-402)
  • addTensorDimToModel (143-157)
  • getTensorDimVar (131-135)
Deeploy/TilingExtension/TilingCodegen.py (2)
  • AbsoluteHyperRectangle (39-49)
  • HyperRectangle (24-35)
Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1)
Deeploy/DeeployTypes.py (4)
  • NetworkContext (564-1076)
  • NetworkOptimizationPass (2263-2283)
  • NetworkOptimizer (2286-2309)
  • StructBuffer (489-509)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (4)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/TilingExtension/TilingCodegen.py (2)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (3)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilingCodegen.py (2)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
Deeploy/Targets/Neureka/Tiler.py (3)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (1)
  • NeurekaDenseConv2DTileConstraint (21-236)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (1)
  • NeurekaDWConv2DTileConstraint (21-234)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (1)
  • NeurekaPWConv2DTileConstraint (21-266)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (6)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py (1)
  • computeInputCube (320-347)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
🪛 Ruff (0.14.0)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py

227-227: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


261-261: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py

191-191: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


192-192: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


259-259: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


293-293: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py

161-161: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


162-162: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


229-229: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


263-263: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (150)
  • GitHub Check: mempool-models / test-runner-mempool
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (simpleRegression, 45000, 30000, 16000, 8, L3) / test-runner-siracusa-tiled (30000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (Attention, 60000, 10000, 5000, 2500, 8, L3) / test-runner-siracusa-tiled (60000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (miniMobileNet, 60000, 12000, 6000, 8, L3) / test-runner-siracusa-tiled (60000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (miniMobileNet, 60000, 12000, 6000, 8, L3) / test-runner-siracusa-tiled (6000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (Attention, 60000, 10000, 5000, 2500, 8, L3) / test-runner-siracusa-tiled (5000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (miniMobileNet, 60000, 12000, 6000, 8, L3) / test-runner-siracusa-tiled (12000)
  • GitHub Check: siracusa-kernels-tiled-doublebuffer-L2 / test-runner-siracusa-tiled
  • GitHub Check: siracusa-kernels / test-runner-siracusa
  • GitHub Check: siracusa-kernels-tiled-singlebuffer-L2 / test-runner-siracusa-tiled
  • GitHub Check: siracusa-neureka-models-tiled-singlebuffer-L3 (Transformer, 15000, 8, L3) / test-runner-siracusa-neureka-tiled (15000)
  • GitHub Check: siracusa-neureka-models-tiled-singlebuffer-L3 (microLlama/microLlama1, 10000, 8, L3) / test-runner-siracusa-neureka-tiled (10000)
  • GitHub Check: generic-kernels / test-runner-generic
  • GitHub Check: siracusa-neureka-models-tiled-singlebuffer-L3 (miniMobileNet, 2000, 8, L3) / test-runner-siracusa-neureka-tiled (2000)
  • GitHub Check: siracusa-neureka-models-tiled-doublebuffer-L3-wmem (microLlama/microLlama1, 10000, 8, true, L3, t... / test-runner-siracusa-neureka-tiled (10000)
🔇 Additional comments (5)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)

12-32: LGTM: Requantization constraint logic is correct.

The function properly adds mul and add tensor dimensions to the tiler model and enforces channel equality constraints. The implementation correctly uses the TilerModel API to retrieve dimension variables and add constraints.
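Roughly, the helper amounts to something like the following sketch (the TilerModel methods exist per the code graph above, but the argument order, dimension indices, and parse-dict keys shown here are assumptions, not the exact implementation):

def requantAddGeometricalConstraintSketch(tilerModel, ctxt, parseDict,
                                          outChannelDim = 3, rqChannelDim = -1):
    # Register the mul/add tensors' dimensions with the tiler model ...
    tilerModel.addTensorDimToModel(ctxt, parseDict['mul'])  # dict keys assumed
    tilerModel.addTensorDimToModel(ctxt, parseDict['add'])
    # ... and tie their channel dimension to the output channel dimension.
    outCh = tilerModel.getTensorDimVar(parseDict['data_out'], outChannelDim)
    tilerModel.addConstraint(tilerModel.getTensorDimVar(parseDict['mul'], rqChannelDim) == outCh)
    tilerModel.addConstraint(tilerModel.getTensorDimVar(parseDict['add'], rqChannelDim) == outCh)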


35-53: LGTM: Load schedule generation is correct.

The function correctly generates per-cube load schedules for mul and add buffers by:

  • Extracting channel offset and size from each output cube
  • Creating HyperRectangles with zeros for all dimensions except the last (channel) dimension
  • Properly preserving the rank of the mul/add tensors as mentioned in the PR objectives

This implementation aligns with the NHWC layout convention used by Neureka.
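For intuition, a sketch of the cube construction described above (HyperRectangle is Deeploy's (offset, dims) rectangle; the function name and the simplification to plain output cubes are illustrative, not the actual helper):

from Deeploy.TilingExtension.TilingCodegen import HyperRectangle

def requantLoadScheduleSketch(outputCubes, mulShape, addShape):
    loads = []
    for cube in outputCubes:
        COffset, CSize = cube.offset[-1], cube.dims[-1]
        # Keep the full rank of the mul/add tensors: zero offsets everywhere
        # except the last (channel) dimension, which follows the output tile.
        mulCube = HyperRectangle((0,) * (len(mulShape) - 1) + (COffset,),
                                 tuple(mulShape[:-1]) + (CSize,))
        addCube = HyperRectangle((0,) * (len(addShape) - 1) + (COffset,),
                                 tuple(addShape[:-1]) + (CSize,))
        loads.append({"mul": mulCube, "add": addCube})
    return loads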

Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (1)

9-9: LGTM: Required imports added.

The imports for VariableBuffer and requantization helpers (requantAddGeometricalConstraint, requantLoadSchedule) are necessary for the new memory-level checking and requantization functionality introduced in this PR.

Also applies to: 12-12

Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (1)

48-52: SRAM-aware weight constraint is correct.

Using weightBuffer._memoryLevel to switch channel constraint is the right replacement for Wmem variants.

Deeploy/Targets/Neureka/Tiler.py (1)

6-13: Wmem bindings removal verified as complete.

Ripgrep confirms zero stale references to removed Wmem variants, MemoryAwareNodeBinding, or NodeMemoryLevelChecker across the codebase. The CHANGELOG documents the intentional removal at lines 78–80, and the import cleanup in Tiler.py is consistent with the broader cleanup. Code changes are sound.

@lukamac lukamac force-pushed the remove-memory-aware-binding branch from e9f8bef to b27c5ac on October 16, 2025 at 17:36
@lukamac lukamac marked this pull request as ready for review October 16, 2025 18:53
@lukamac lukamac changed the title from "DRAFT: Remove memory-aware node bindings" to "Remove memory-aware node bindings" on Oct 16, 2025
@Xeratec Xeratec added the Feature (Addition of new features) and Refactor (Changes or improvements to existing features) labels and removed the Feature (Addition of new features) label on Oct 23, 2025
@Xeratec Xeratec added this to Deeploy Oct 23, 2025
@Xeratec Xeratec moved this to Need Reviewer in Deeploy Oct 23, 2025
@Xeratec Xeratec added this to the Release 0.2.1 milestone Oct 23, 2025
@Xeratec Xeratec moved this from Need Reviewer to Ready for Merge in Deeploy Oct 23, 2025
@Xeratec Xeratec moved this from Ready for Merge to In review in Deeploy Oct 23, 2025
Member

@Xeratec Xeratec left a comment

Please check the open TODO comment; otherwise it looks good.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)

225-227: Add strict=True to zip() for safety.

The zip on line 225 iterates over outputCubes and inputLoadSchedule, which should have matching lengths by construction. Adding strict=True will catch bugs if they ever fall out of sync.

Apply this diff:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):
                 COffset, CSize = cube.offset[-1], cube.dims[-1]
                 load['weight'] = HyperRectangle((COffset, 0, 0), (CSize, weightShape[-2], weightShape[-1]))

255-263: Add strict=True to zip() for safety.

The zip on line 259 pairs each input load with its corresponding requant load. Both collections are derived from the same output cubes and should have matching lengths. Adding strict=True will catch any mismatch bugs.

Apply this diff:

         requantSchedule = requantLoadSchedule(absoluteOutputCubes, ctxt, operatorRepresentation)
         newInputLoadSchedule = [{
             **load,
             **rqLoad
-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (3)

161-162: Consider prefixing unused unpacked variables with underscore.

Variables BatchOffset, HOffset, WOffset, and BatchSize are unpacked but never used. While explicit unpacking aids readability, prefixing unused variables with _ follows Python conventions and silences linter warnings.

Apply this diff:

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_BatchOffset, _HOffset, _WOffset, COffset) = cube.offset
+            (_BatchSize, HSize, WSize, CSize) = cube.dims

Alternatively, use direct indexing if only a few values are needed:

COffset = cube.offset[-1]
HSize, WSize, CSize = cube.dims[1:]

229-231: Add strict=True to zip() for safety.

The zip on line 229 iterates over outputCubes and inputLoadSchedule, which should have matching lengths by construction. Adding strict=True will catch bugs if they ever fall out of sync.

Apply this diff:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):
                 COffset, CSize = cube.offset[-1], cube.dims[-1]
                 load['weight'] = HyperRectangle((COffset, 0, 0), (CSize, weightShape[-2], weightShape[-1]))

259-263: Add strict=True to zip() for safety.

The zip on line 263 pairs each input load with its corresponding requant load. Both collections are derived from the same output cubes and should have matching lengths. Adding strict=True will catch any mismatch bugs.

Apply this diff:

         requantSchedule = requantLoadSchedule(absoluteOutputCubes, ctxt, operatorRepresentation)
         newInputLoadSchedule = [{
             **load,
             **rqLoad
-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b27c5ac and ad350cb.

📒 Files selected for processing (2)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (7 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (5)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (4)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilingCodegen.py (2)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
🪛 Ruff (0.14.1)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py

225-225: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


259-259: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py

161-161: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


162-162: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


229-229: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


263-263: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
CHANGELOG.md (1)

78-80: Move removal entries from "Changed" to "Removed" section.

These entries describe deletions, not modifications. Per this changelog's own structure and standard conventions, they belong in the "Removed" section (which starts at line 93).

### Changed
- ...
  Deployer workflow now uses `prepare(...)` instead of `generateFunction(...)`.
- Removed NodeMemoryLevelChecker, MemoryAwareNodeBinding
- Removed _parseNode from MemoryNetworkDeployer since we don't need the annotations before typeChecking anymore
- Removed Wmem variants of bindings and tile constraints from Neureka

### Fixed

Apply this diff to relocate them:

 ### Removed
+- NodeMemoryLevelChecker, MemoryAwareNodeBinding
+- _parseNode from MemoryNetworkDeployer since we don't need the annotations before typeChecking anymore
+- Wmem variants of bindings and tile constraints from Neureka
 - Delete outdated and unused `.gitlab-ci.yml` file
 - dory_dma.c and dory_dma.h
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad350cb and 25cd83a.

📒 Files selected for processing (1)
  • CHANGELOG.md (4 hunks)

@Xeratec Xeratec force-pushed the remove-memory-aware-binding branch from 25cd83a to 82323f8 on October 24, 2025 20:56
Member

@Xeratec Xeratec left a comment


I will merge it once the CI is through.

@Xeratec Xeratec mentioned this pull request Oct 24, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (2)

110-125: Fix mutable default arguments in constructors.

memoryLevelAnnotationPasses: List[NetworkOptimizationPass] = [] shares state across instances. Use None and create a new list per call.

-                 memoryLevelAnnotationPasses: List[NetworkOptimizationPass] = []):
+                 memoryLevelAnnotationPasses: Optional[List[NetworkOptimizationPass]] = None):
@@
-        if len(memoryLevelAnnotationPasses) == 0:
-            memoryLevelAnnotationPasses.append(AnnotateDefaultMemoryLevel(self.Platform.memoryHierarchy))
-        self.memoryLevelAnnotationOptimizer = NetworkOptimizer(memoryLevelAnnotationPasses)
+        passes = list(memoryLevelAnnotationPasses) if memoryLevelAnnotationPasses is not None else []
+        if not passes:
+            passes.append(AnnotateDefaultMemoryLevel(self.Platform.memoryHierarchy))
+        self.memoryLevelAnnotationOptimizer = NetworkOptimizer(passes)

Apply the same pattern to:

  • Lines 153-169 in MemoryLevelAwareSignPropDeployer.__init__
  • Lines 197-204 in MemoryDeployerWrapper.__init__

Also applies to: 151-169, 197-204
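
For illustration, here is a standalone example of the pitfall the diff above fixes (plain Python, no Deeploy imports; the pass name string is just a stand-in):

# Mutable default: the SAME list object is reused across every call.
def badInit(passes = []):
    passes.append("AnnotateDefaultMemoryLevel")
    return passes

badInit()
print(badInit())  # ['AnnotateDefaultMemoryLevel', 'AnnotateDefaultMemoryLevel'] -- state leaked across calls

# None default: a fresh list is created per call, as the suggested fix does.
def goodInit(passes = None):
    passes = list(passes) if passes is not None else []
    passes.append("AnnotateDefaultMemoryLevel")
    return passes

goodInit()
print(goodInit())  # ['AnnotateDefaultMemoryLevel']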


80-106: Guard against division by zero in memory summary.

memLevel.size can be 0 (default) → division by zero. Treat non-positive capacity as unknown and print N/A.

-            if memLevel is None or getattr(memLevel, "size", None) is None:
+            if memLevel is None or getattr(memLevel, "size", None) is None or memLevel.size <= 0:
                 log.info(f"  {str(level):<20} {'N/A':>10} {total:10,d} "
                          f"({staticSize:10,d} + {dynamicSize:10,d}) "
                          f"({'N/A':>5})")
             else:
                 capacity = memLevel.size
                 log.info(f"  {str(level):<20} {capacity:10,} {total:10,d} "
                          f"({staticSize:10,d} + {dynamicSize:10,d}) "
                          f"({total / capacity * 100:5.1f}%)")
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (1)

31-31: Remove unused dilation.

Variable is assigned but never used.

-        dilation = parseDict["dilations"]
🧹 Nitpick comments (14)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5)

191-193: Silence unused tuple items (Ruff RUF059).

Only COffset among the offsets, and HSize/WSize/CSize among the dims, are actually used. Replace the unused bindings with _ to avoid warnings and keep intent clear.

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_, _, _, COffset) = cube.offset
+            (_, HSize, WSize, CSize) = cube.dims

187-196: Use looked-up output buffer for computeInputCube (good); minor simplification.

You already looked up outputBuffer. Reuse it to avoid a second lookup and keep shapes consistent.

-            InCube, padding_tuple = Conv2DTileConstraint.computeInputCube((weightH, weightW), pads, strides, weightC,
-                                                                          cube, outputBuffer.shape)
+            InCube, padding_tuple = Conv2DTileConstraint.computeInputCube(
+                (weightH, weightW), pads, strides, weightC, cube, outputBuffer.shape
+            )

48-53: SRAM gating logic looks consistent with Dense/DW; consider a shared helper.

The _memoryLevel == "WeightMemory_SRAM" pattern is duplicated across tile constraints. Extract a tiny helper (e.g., force_full_weight_channels_if_in_sram(tilerModel, weightBufferName, outVar)) to centralize behavior and avoid drift.
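
A minimal sketch of such a helper, covering only the check itself: the _memoryLevel attribute and the "WeightMemory_SRAM" level name come from this PR, while the function name and the duck-typed ctxt parameter are illustrative assumptions, and applying the resulting constraint is left to the caller.

def weightsResideInWeightMemory(ctxt, weightBufferName: str) -> bool:
    """True when the weight buffer is annotated as living in the dedicated
    weight memory (the WeightMemory_SRAM level)."""
    weightBuffer = ctxt.lookup(weightBufferName)
    return getattr(weightBuffer, "_memoryLevel", None) == "WeightMemory_SRAM"

Each tile constraint could then branch on this predicate instead of repeating the attribute check inline.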


289-294: Add strict=True when zipping requant load schedules.

Prevents silent truncation if schedules diverge.

-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]

If Python < 3.10, assert equal lengths before zipping.

Also applies to: 289-293


246-262: Use strict=True on zip() to catch schedule-length mismatches.

The project supports Python 3.10, 3.11, and 3.12, so zip()'s strict parameter is available to raise ValueError when iterables have different lengths. Update line 259 to:

for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

Because the ValueError is raised as soon as either iterable runs out before the other, this catches accidental schedule-length drift at runtime without needing explicit length checks.
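
A quick standalone demonstration (Python >= 3.10; the dict keys are placeholders):

loads = [{"data_in": 0}, {"data_in": 1}]
rqLoads = [{"mul": 0}]  # deliberately one entry short

try:
    merged = [{**load, **rq} for load, rq in zip(loads, rqLoads, strict = True)]
except ValueError as exc:
    print(exc)  # zip() argument 2 is shorter than argument 1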

Deeploy/MemoryLevelExtension/MemoryLevels.py (1)

22-28: Harden __eq__ against non-MemoryLevel comparisons.

Return NotImplemented when other isn’t MemoryLevel; avoids attribute access on foreign types and follows Python equality conventions.

-    def __eq__(self, other):
-
-        ret = [neighbour_name in other.neighbourNames for neighbour_name in self.neighbourNames]
+    def __eq__(self, other):
+        if not isinstance(other, MemoryLevel):
+            return NotImplemented
+        ret = [neighbour_name in other.neighbourNames for neighbour_name in self.neighbourNames]
         ret += [neighbour_name in self.neighbourNames for neighbour_name in other.neighbourNames]
         ret += [self.name == other.name, self.size == other.size]
         return all(ret)
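
As a side note, here is a standalone toy example of why NotImplemented is the right sentinel (the Level class below is illustrative, not the real MemoryLevel): Python falls back to the other operand's comparison instead of failing with an AttributeError on foreign types.

class Level:

    def __init__(self, name: str, size: int = 0):
        self.name = name
        self.size = size

    def __eq__(self, other):
        if not isinstance(other, Level):
            return NotImplemented  # let Python try the reflected comparison
        return self.name == other.name and self.size == other.size

print(Level("L1", 128) == Level("L1", 128))  # True
print(Level("L1", 128) == "L1")              # False, no AttributeError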
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (6)

29-33: SRAM gating pattern: consistent; consider helper extraction.

Same duplication as PW/Dense. A shared helper would reduce drift across ops.

Also applies to: 52-57


73-75: Use canonical tensor name via lookup when fetching dim vars.

To avoid alias/name mismatches, resolve the buffer first and pass buffer.name to getTensorDimVar (as done elsewhere).

-        inputHeightVar = tilerModel.getTensorDimVar(tensorName = parseDict['data_in'], dimIdx = 1)
-        inputWidthVar  = tilerModel.getTensorDimVar(tensorName = parseDict['data_in'], dimIdx = 2)
+        _in = ctxt.lookup(parseDict['data_in'])
+        inputHeightVar = tilerModel.getTensorDimVar(tensorName = _in.name, dimIdx = 1)
+        inputWidthVar  = tilerModel.getTensorDimVar(tensorName = _in.name, dimIdx = 2)

152-162: Reuse looked-up outputBuffer to compute input cube (micro‑cleanup).

-        outputBuffer = ctxt.lookup(varOut)
-        assert isinstance(outputBuffer, VariableBuffer)
-        for cube in outputCubes:
+        outputBuffer = ctxt.lookup(varOut)
+        assert isinstance(outputBuffer, VariableBuffer)
+        for cube in outputCubes:
@@
-            InCube, padding_tuple = Conv2DTileConstraint.computeInputCube((weightH, weightW), pads, strides, weightC,
-                                                                          cube,
-                                                                          ctxt.lookup(varOut).shape)
+            InCube, padding_tuple = Conv2DTileConstraint.computeInputCube(
+                (weightH, weightW), pads, strides, weightC, cube, outputBuffer.shape
+            )

156-158: Silence unused tuple items (Ruff RUF059).

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_, _, _, COffset) = cube.offset
+            (_, HSize, WSize, CSize) = cube.dims

225-227: Add strict=True to zip when merging weight tiles.

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):
                 COffset, CSize = cube.offset[-1], cube.dims[-1]
                 load['weight'] = HyperRectangle((COffset, 0, 0), (CSize, weightShape[-2], weightShape[-1]))

Fallback for Python < 3.10: assert equal lengths before zipping.


255-259: Add strict=True when zipping requant schedules.

-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (2)

223-231: Add explicit strict= parameter to zip().

The zip() call should include strict=True to catch length mismatches between outputCubes and inputLoadSchedule, ensuring all cubes are paired correctly.

Apply this diff:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

259-263: Add explicit strict= parameter to zip().

The zip() call merging load schedules should include strict=True to catch any length mismatches between tilingSchedule.inputLoadSchedule and requantSchedule.

Apply this diff:

         newInputLoadSchedule = [{
             **load,
             **rqLoad
-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 25cd83a and 82323f8.

📒 Files selected for processing (11)
  • CHANGELOG.md (4 hunks)
  • Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py (1 hunks)
  • Deeploy/MemoryLevelExtension/MemoryLevels.py (1 hunks)
  • Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (4 hunks)
  • Deeploy/Targets/Neureka/Bindings.py (0 hunks)
  • Deeploy/Targets/Neureka/Engine.py (1 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (7 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (6 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (1 hunks)
  • Deeploy/Targets/Neureka/Tiler.py (1 hunks)
💤 Files with no reviewable changes (1)
  • Deeploy/Targets/Neureka/Bindings.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py
  • Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py
  • CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (6)
Deeploy/Targets/Neureka/Tiler.py (3)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (2)
  • NeurekaDenseConv2DTileConstraint (21-236)
  • NeurekaRQSDenseConv2DTileConstraint (239-268)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)
  • NeurekaDWConv2DTileConstraint (21-232)
  • NeurekaRQSDWConv2DTileConstraint (235-264)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (2)
  • NeurekaPWConv2DTileConstraint (21-266)
  • NeurekaRQSPWConv2DTileConstraint (269-298)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (4)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (508-1020)
  • VariableBuffer (232-360)
  • lookup (720-752)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (4)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (508-1020)
  • VariableBuffer (232-360)
  • lookup (720-752)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1)
Deeploy/DeeployTypes.py (5)
  • NetworkContext (508-1020)
  • NetworkOptimizationPass (2207-2227)
  • NetworkOptimizer (2230-2253)
  • optimize (2184-2204)
  • optimize (2234-2253)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (4)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (508-1020)
  • VariableBuffer (232-360)
  • lookup (720-752)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
Deeploy/Targets/Neureka/Engine.py (2)
Deeploy/DeeployTypes.py (1)
  • NodeMapper (1660-1816)
Deeploy/Targets/Neureka/Parsers.py (6)
  • NeurekaRQSPWConv2DParser (140-162)
  • NeurekaPWConv2DParser (125-137)
  • NeurekaRQSDWConv2DParser (99-122)
  • NeurekaDWConv2DParser (80-96)
  • NeurekaRQSDenseConv2DParser (180-202)
  • NeurekaDenseConv2DParser (165-177)
🪛 Ruff (0.14.1)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py

225-225: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


259-259: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py

161-161: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


162-162: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


229-229: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


263-263: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py

191-191: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


192-192: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


259-259: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


293-293: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

🔇 Additional comments (9)
Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1)

131-145: Annotation-before-bind flow: LGTM.

Running the memory-level annotation optimizer before and after bind/codeTransform is consistent with hoisting behavior. No issues.

If performance becomes a concern, consider logging per-pass timing to confirm negligible overhead.

Also applies to: 175-189, 210-224
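
If that ever becomes necessary, a lightweight, framework-agnostic sketch of per-pass timing could look like the following (the pass/optimizer call shown in the comment is assumed, not Deeploy's actual API):

import logging
import time

log = logging.getLogger(__name__)

def timePass(passName: str, applyFn, *args, **kwargs):
    """Run one optimization pass and log its wall-clock time."""
    start = time.perf_counter()
    result = applyFn(*args, **kwargs)
    elapsedMs = (time.perf_counter() - start) * 1e3
    log.info("%s took %.2f ms", passName, elapsedMs)
    return result

# timePass("AnnotateDefaultMemoryLevel", optimizer.optimize, graph)  # hypothetical call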

Deeploy/Targets/Neureka/Tiler.py (1)

6-13: Imports and tiling-ready bindings consolidation: LGTM.

Wmem variants removed; non-Wmem bindings wired to corresponding tile constraints. Consistent with Engine/Bindings changes.

Please confirm Engine.py/Bindings.py no longer reference removed Wmem symbols.

Also applies to: 16-29

Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (5)

9-9: LGTM: New imports support runtime memory-level branching.

The imports of VariableBuffer and the requant helpers properly support the refactored architecture where bindings check weight memory location at runtime rather than at parse time.

Also applies to: 12-12


54-58: Well-structured runtime branching for weight memory levels.

The refactored logic correctly differentiates SRAM-resident weights (which must use full output channels) from non-SRAM weights (which can be tiled by output channel). The hasattr check provides appropriate fallback for weights without an explicit memory-level attribute.


157-159: Good defensive type checking.

The explicit isinstance assertion ensures outputBuffer is a VariableBuffer before accessing shape and other attributes, preventing runtime errors.


212-222: Clean SRAM weight handling.

The SRAM branch correctly computes per-tile weight offsets as flat byte addresses, leveraging the fact that SRAM-resident weights are fully loaded and accessed via computed offsets rather than scheduled load operations.
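
For intuition, a pure-Python sketch of a row-major flat byte-offset computation; this illustrates the idea only, and calculateFlatOffsetInBytes in the codebase may use a different signature.

from typing import Sequence

def flatOffsetInBytes(offset: Sequence[int], shape: Sequence[int], bytesPerElement: int) -> int:
    """Row-major flat offset: sum of offset_i times its stride, scaled by element size."""
    flat = 0
    stride = 1
    # Walk dimensions from innermost to outermost, accumulating row-major strides.
    for off, dim in zip(reversed(offset), reversed(shape), strict = True):
        flat += off * stride
        stride *= dim
    return flat * bytesPerElement

# Example: offset (2, 0, 0) into an (8, 3, 3) int8 weight block -> 2 * 9 = 18 bytes.
assert flatOffsetInBytes((2, 0, 0), (8, 3, 3), 1) == 18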


239-244: Well-designed requantized tile constraint class.

The NeurekaRQSDenseConv2DTileConstraint class properly extends the base class and integrates requantization helpers by calling the base implementation and then augmenting it with requant-specific constraints.

Deeploy/Targets/Neureka/Engine.py (2)

15-15: Import correctly updated to remove Wmem variants.

The import statement now excludes Wmem tiling bindings, consistent with the PR objective of removing memory-aware binding variants.


18-25: Simplified mapper definitions align with refactoring goals.

The removal of Wmem binding concatenation successfully moves memory-level decisions from parse-time binding selection to runtime branching within the tile constraints. This simplification eliminates the dependency on hoisted buffers during parsing, as stated in the PR objectives.

@Xeratec Xeratec merged commit 5db6de3 into pulp-platform:devel Oct 26, 2025
134 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in Deeploy Oct 26, 2025
This was referenced Feb 5, 2026
Victor-Jung added a commit that referenced this pull request Feb 5, 2026
This release includes improvements to the tiling and DMA code
generation, new networks and operators, improved CI workflows, migration
to PyTest, and support for PyPi package releases.

Note: Since the release tag references the Docker container tagged with
the release tag (ghcr.io/pulp-platform/deeploy:v0.2.1), the CI will
initially fail. The Deeploy Docker image must be built after the release
PR is merged and the CI restarted.

### List of Pull Requests
- PyPi Package Deployment + Remove Banshee Dept
[#154](#154)
- PyTest Migration
[#144](#144)
- Update submodule `pulp-nn-mixed`
[#145](#145)
- Improve Profiling
[#138](#138)
- FP32 ReduceMean operator improvement
[#137](#137)
- Support for RMSNorm (Pow and Sqrt operators)
[#136](#136)
- Demo TinyViT compatibility with tiled Siracusa
[#124](#124)
- TinyViT on non-tiled Siracusa
[#117](#117)
- Support Fully Asynchronous DMAs
[#114](#114)
- Disallow shape inference
[#128](#128)
- Remove memory-aware node bindings
[#123](#123)
- Fix missing const's layout transformation and refactor NCHWtoNHWC
passes [#122](#122)
- Fix aliasing [#125](#125)
- Support for 1D Autoencoder
[#98](#98)
- Refactor Logging for Improved Debugging
[#115](#115)
- Add reuse-tool as an SPDX license header linter
[#113](#113)
- Bug fixes, API Cleanup and Reduce Compiler Warning on PULP
[#112](#112)
- Fix PULP GEMM `batch` serialization
[#109](#109)
- Split CI Workflows by Platform and Task, Improve Formatting and
Linting Reliability
[#108](#108)
- Refactor tiling code generation
[#105](#105)
- Change order of typeMatching entries
[#68](#68)
- Node Mangling to avoid duplication
[#93](#93)
- Prepare Post v0.2.0 Release
[#104](#104)
- Use Docker digests instead of arch-specific tags
[#106](#106)
- Fix `Unsqueeze` Op. when using ONNX opset 13 or higher (from attribute
to input) [#119](#119)
- Fix bias hoisting in generic GEMM with no bias
[#126](#126)