Improve Sweeping Strategy: Per-site Support and Runtime Optimization for Grouping=2 #31

@emapuljak

Description

Feature Requests

  1. Support for Per-Site Sweeping
    The current implementation supports grouped sweeping (e.g., grouping=2), but lacks per-site sweeping.
    Tasks:

    • Add support for grouping=1 (should require only a few fixes in the code).
    • Ensure compatibility with both training and non-training passes.
    • Avoid unnecessary tensor grouping/splitting when operating on single sites.
  2. Runtime Optimization for Grouping=2 with Two-Way Sweep
    The two-way sweeping approach (left-to-right and right-to-left) is currently inefficient in evaluation or inference mode when using grouping=2.
    Tasks:

    • Profile the two-way sweep to find the bottleneck.
    • Minimize redundant operations during forward passes.
    • Cache intermediate environments where feasible.
    • Benchmark against one-way sweeping to confirm speed-up.
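
For item 1, one possible shape for a sweep that handles grouping=1 without any grouping/splitting is sketched below. This is only an illustration: `sweep_windows`, `apply_update`, and the `merge`/`split`/`update` callables are hypothetical names, not functions from this repository, and skipping the turning-point window on the way back is an assumption about how the two-way sweep should behave.

```python
from typing import Any, Callable, Iterator, List, Sequence, Tuple


def sweep_windows(n_sites: int, grouping: int, two_way: bool = True) -> Iterator[Tuple[int, ...]]:
    """Yield the site indices visited at each sweep step (hypothetical helper)."""
    if not 1 <= grouping <= n_sites:
        raise ValueError("grouping must be between 1 and n_sites")
    # Left-to-right: windows start at 0, 1, ..., n_sites - grouping.
    forward = [tuple(range(s, s + grouping)) for s in range(n_sites - grouping + 1)]
    yield from forward
    if two_way:
        # Right-to-left: assumed to skip the turning point so it is not visited twice in a row.
        yield from reversed(forward[:-1])


def apply_update(
    tensors: List[Any],
    window: Sequence[int],
    update: Callable[[Any], Any],
    merge: Callable[[List[Any]], Any],
    split: Callable[[Any, int], List[Any]],
) -> None:
    """Update the tensors in `window` in place, skipping merge/split for single sites."""
    if len(window) == 1:
        # grouping=1: operate on the site tensor directly, no grouping or splitting.
        i = window[0]
        tensors[i] = update(tensors[i])
        return
    # grouping>=2: merge the window, update the merged tensor, split it back.
    merged = merge([tensors[i] for i in window])
    for i, piece in zip(window, split(update(merged), len(window))):
        tensors[i] = piece
```

The same `apply_update` call can then serve both training and non-training passes, with `update` being the optimizer step in the first case and a no-op or read-only contraction in the second.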

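For item 2, the caching idea could look roughly like the sketch below. It assumes that in evaluation/inference mode the site tensors do not change during a sweep, so all left and right environments can be built once and reused by every window in both directions. The repository's actual tensors and contraction routines are not shown in this issue, so small nested-list matrices stand in for them; `EnvironmentCache` and `naive_contraction_count` are hypothetical names.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product, standing in for a real tensor contraction."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


class EnvironmentCache:
    """Precomputed left/right environments for a chain of fixed site matrices."""

    def __init__(self, sites: Sequence[Matrix]) -> None:
        self.contractions = 0  # counts products, to compare against the naive scheme
        dim = len(sites[0])
        # left[i] = sites[0] @ ... @ sites[i-1]  (left[0] is the identity)
        self.left = [identity(dim)]
        for m in sites:
            self.left.append(self._mul(self.left[-1], m))
        # right[i] = sites[i] @ ... @ sites[n-1]  (right[n] is the identity)
        self.right = [identity(dim)]
        for m in reversed(sites):
            self.right.append(self._mul(m, self.right[-1]))
        self.right.reverse()

    def _mul(self, a: Matrix, b: Matrix) -> Matrix:
        self.contractions += 1
        return matmul(a, b)

    def environments(self, start: int, grouping: int) -> Tuple[Matrix, Matrix]:
        """Return (left, right) environments around sites[start:start + grouping]."""
        return self.left[start], self.right[start + grouping]


def naive_contraction_count(n_sites: int, grouping: int) -> int:
    """Products needed if every window of a two-way sweep rebuilds both environments."""
    starts = list(range(n_sites - grouping + 1))
    two_way = starts + starts[-2::-1]  # forward pass, then back without the turning point
    return sum(start + (n_sites - start - grouping) for start in two_way)
```

With this counting, the cache costs `2 * n_sites` products per input regardless of sweep direction, while the naive scheme grows with the number of windows. Comparing these counts, or wall-clock times from `time.perf_counter`, for one-way and two-way sweeps is one way to do the benchmark in the last task.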