Implement ANTLR4 grammar parsing and migration from regex focus (refs #147) #149

Merged
justinfx merged 17 commits into master from v3
Feb 21, 2026


@justinfx (Owner)

#147

Migrate to ANTLR4 grammar-based parsing (v3)

Summary

Replaces regex-based parsing with the shared ANTLR4 grammar that the Go and C++ implementations also use. All 181 tests pass.

Breaking Changes

Removed API:

  • FileSequence.SPLIT_RE, FileSequence.DISK_RE class variables
  • constants.SPLIT_PATTERN, constants.SPLIT_RE, constants.SPLIT_SUB_PATTERN, constants.SPLIT_SUB_RE

Removed Files:

  • setup.py → replaced with pyproject.toml
  • src/fileseq/__version__.py → automatic versioning via setuptools-scm

Behavior:

  • Auto-padding now only applies to single-frame files (foo.100.exr)
  • Explicit padding preserved (foo.1@@@@.exr keeps 4 chars)
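
As a rough illustration of the explicit-padding rule above, the sketch below maps a run of padding characters to a digit count. It assumes the common convention where "#" stands for four digits and "@" for one; the names PAD_WIDTHS, padding_width, and format_frame are hypothetical and are not part of the fileseq API.

```python
# Assumed widths: "#" = 4 digits, "@" = 1 digit (hypothetical table, not fileseq's own).
PAD_WIDTHS = {"#": 4, "@": 1}


def padding_width(pad_chars: str) -> int:
    """Return the zero-padded digit count implied by a run of pad characters."""
    return sum(PAD_WIDTHS[c] for c in pad_chars)


def format_frame(frame: int, pad_chars: str) -> str:
    """Render a frame number using the width implied by the pad characters."""
    return str(frame).zfill(padding_width(pad_chars))


# "@@@@" is explicit four-character padding, so frame 1 renders as "0001".
four_wide = format_frame(1, "@@@@")
```

Under these assumptions, "@@@@" and "#" both imply a width of four, which is consistent with foo.1@@@@.exr keeping 4 chars.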

New Features

  • Decimal frame ranges: foo.1-5x0.25#.exr
  • Subframe sequences: foo.#.#.exr, foo.1-5#.10-20@@.exr
  • Fixed hidden file parsing: .bar1000.exr → basename=.bar, frame=1000, ext=.exr
  • Cross-platform path handling (both / and \\)
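
To make the decimal range notation concrete, here is a minimal sketch of how a range such as 1-5x0.25 could expand into individual frames. It is not the library's implementation: expand_decimal_range is a hypothetical helper, and it assumes an ascending range with a positive step. Decimal is used instead of float so that repeated addition of 0.25 stays exact.

```python
from decimal import Decimal


def expand_decimal_range(start: str, end: str, step: str) -> list[Decimal]:
    """Expand an inclusive ascending range with a decimal step (hypothetical helper)."""
    lo, hi, inc = Decimal(start), Decimal(end), Decimal(step)
    if inc <= 0:
        raise ValueError("step must be positive")
    frames = []
    value = lo
    while value <= hi:
        frames.append(value)
        value += inc
    return frames


# The range portion of foo.1-5x0.25#.exr: start 1, end 5, step 0.25.
frames = expand_decimal_range("1", "5", "0.25")
```

With these inputs the list starts at 1, continues with 1.25 and 1.5, and ends at 5, for 17 frames in total.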

Implementation

  • Grammar: grammar/fileseq.g4 (shared with Go/C++)
  • Parser generator: hatch run generate or python src/fileseq/grammar/generate.py
  • Modern packaging: PEP 517/518 with pyproject.toml
  • CI: Grammar validation + version verification on deploy

Performance

No performance regression relative to v2.x regex parsing:

  • Simple patterns: ~240 μs
  • Complex patterns: ~445 μs
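
Timings like these could be reproduced with a small timeit harness along the following lines. This is only a sketch: mean_microseconds is a hypothetical helper, and stand_in_parse is a placeholder so the example runs without fileseq installed. To measure the real parser, pass the library's own constructor or parse function instead.

```python
import timeit


def mean_microseconds(parse, pattern: str, number: int = 1000) -> float:
    """Average wall-clock time of parse(pattern) in microseconds over `number` calls."""
    total_seconds = timeit.timeit(lambda: parse(pattern), number=number)
    return total_seconds / number * 1e6


def stand_in_parse(pattern: str):
    """Placeholder parser (an assumption); swap in the real one to benchmark it."""
    return pattern.rsplit(".", 2)


elapsed = mean_microseconds(stand_in_parse, "foo.1-5#.exr", number=200)
```

Absolute numbers depend on the machine and Python version, so a comparison is only meaningful when both parsers are timed in the same environment.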

@justinfx justinfx added this to the v3 milestone Feb 11, 2026
@justinfx justinfx self-assigned this Feb 11, 2026
@justinfx justinfx added the v3 label Feb 11, 2026
Replace fully-expanded _items (frozenset) and _order (tuple) with a
compact list of Range objects. Memory reduction of 99.9%+ for typical
ranges (100k frames: 7.8MB -> ~536 bytes).
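
The idea behind the compact representation can be sketched with Python's built-in range objects. CompactFrames below is a hypothetical stand-in, not the actual FrameSet class, and it assumes the stored ranges do not overlap. Length and membership are computed from the range bounds, so no frame is ever materialized.

```python
import sys


class CompactFrames:
    """Hypothetical sketch: store frames as a short list of range objects."""

    def __init__(self, ranges):
        # Drop empty ranges so they never affect length or iteration.
        self._ranges = [r for r in ranges if len(r) > 0]

    def __len__(self):
        # Each range reports its length arithmetically, without expansion.
        return sum(len(r) for r in self._ranges)

    def __contains__(self, frame):
        # Integer membership in a range is a constant-time arithmetic check.
        return any(frame in r for r in self._ranges)

    def __iter__(self):
        for r in self._ranges:
            yield from r


compact = CompactFrames([range(1, 100_001)])
expanded = frozenset(range(1, 100_001))

# Container overhead only; the frozenset figure excludes the int objects it references.
compact_bytes = sys.getsizeof(compact._ranges)
expanded_bytes = sys.getsizeof(expanded)
```

Exact byte counts vary by interpreter, but the list of ranges stays the same size however many frames each range covers, while the expanded set grows with the frame count.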

Bug fixes:
- isConsecutive(): rewrite as O(n) range-based algorithm; fixes incorrect
  True for interleaved ranges and IndexError on empty FrameSet
- hasSubFrames(): now correctly returns True for decimal notation such as
  1.0-5.0, even though normalizeFrame collapses those values to integers
  before storage
- Stagger modifier: deduplicate frames across stagger iterations
- MAX_FRAME_SIZE check: calculate size mathematically for x and plain
  ranges instead of materializing all frames

API compatibility: no breaking changes. .items and .order remain public
with DeprecationWarning. All 181 existing tests pass.
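
One way to keep an attribute public while steering callers away from it is a property that warns and materializes the value on demand. The sketch below shows that pattern with the standard warnings module. FrameSetSketch and the warning text are hypothetical and do not reproduce the actual fileseq code.

```python
import warnings


class FrameSetSketch:
    """Hypothetical class that stores ranges but still exposes a legacy .items."""

    def __init__(self, ranges):
        self._ranges = list(ranges)

    @property
    def items(self):
        # stacklevel=2 attributes the warning to the caller's line, not this one.
        warnings.warn(
            "items is deprecated; iterate the frame set instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Expand only when the legacy attribute is actually requested.
        return frozenset(f for r in self._ranges for f in r)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_items = FrameSetSketch([range(1, 4)]).items
```

Existing callers keep receiving the same frozenset, and the recorded warning gives them a signal to migrate.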
@justinfx justinfx merged commit 7d5a8c7 into master Feb 21, 2026
10 checks passed
@justinfx justinfx deleted the v3 branch February 21, 2026 21:01