Commit d265e14

Merge pull request #150 from justinfx/v3_ranges
Refactor FrameSet to use range-based storage (fixes #148)
2 parents ceac963 + e26f7b3 commit d265e14

4 files changed, +569 -176 lines changed

CHANGES.md

Lines changed: 53 additions & 38 deletions
@@ -2,70 +2,70 @@

 ## v3.0.0 (TBD)

-### Major Changes - ANTLR4 Grammar-Based Parsing
+### Major Changes

-This is a **major version** with breaking changes. FileSeq v3 migrates from regex-based parsing to ANTLR4 grammar-based
-parsing, aligning the Python implementation with the Go and C++ implementations for consistency and maintainability.
+This is a **major version** with breaking changes. FileSeq v3 includes two significant architectural improvements:

-#### Breaking Changes
+1. **ANTLR4 Grammar-Based Parsing** (#149) - Migrates from regex-based parsing to ANTLR4 grammar-based parsing
+2. **Range-Based FrameSet Storage** (#150) - Replaces fully-expanded frame storage with memory-efficient ranges

-**Removed API:**
+#### ANTLR4 Grammar-Based Parsing (#149)
+
+Aligns the Python implementation with the Go and C++ implementations for consistency and maintainability.
+
+**Breaking Changes:**
 - `FileSequence.SPLIT_RE` class variable (custom regex pattern override)
 - `FileSequence.DISK_RE` class variable (custom disk scanning override)
 - `constants.SPLIT_PATTERN`, `constants.SPLIT_RE` (use grammar-based parsing API)
 - `constants.SPLIT_SUB_PATTERN`, `constants.SPLIT_SUB_RE` (use grammar-based parsing API)
-
-**Removed Files:**
 - `setup.py` (replaced with modern `pyproject.toml`)
 - `src/fileseq/__version__.py` (version now managed by `setuptools-scm` from git tags)

-**Behavioral Changes:**
-- **Auto-padding:** Now only applies to single-frame files without explicit padding
-- `foo.100.exr` → gets auto-padding based on frame width (backward compatible)
-- `foo.1@@@@.exr` → preserves explicit padding (previously would auto-pad)
-- **Pattern parsing:** Uses shared ANTLR4 grammar instead of regex
-- More consistent behavior across languages
-- Better error messages for invalid patterns
-
-#### New Features
-
+**New Features:**
 - **Decimal frame ranges:** Support for decimal step values
 - `foo.1-5x0.25#.exr` → frames 1, 1.25, 1.5, 1.75, 2, 2.25...
-- Single token parsing for better performance
-
 - **Subframe sequences (Python-specific):**
 - Dual range: `foo.1-5#.10-20@@.exr` (main frames + subframes)
 - Composite padding: `foo.1-5@.#.exr` (frame padding + subframe padding)
 - Pattern-only: `foo.#.#.exr` (wildcard for both components)
-
 - **Better hidden file support:**
 - `.bar1000.exr` now correctly parses as basename=`.bar`, frame=`1000`, ext=`.exr`
-- Previously treated `.bar1000` as single extension
-
 - **Cross-platform path handling:**
 - Correctly handles both Unix (`/`) and Windows (`\`) path separators
-- Mixed separators normalized properly

-#### Implementation Details
+**Behavioral Changes:**
+- **Auto-padding:** Now only applies to single-frame files without explicit padding
+- `foo.100.exr` → gets auto-padding based on frame width (backward compatible)
+- `foo.1@@@@.exr` → preserves explicit padding (previously would auto-pad)
+- **Pattern parsing:** Uses shared ANTLR4 grammar instead of regex
+- More consistent behavior across languages
+- Better error messages for invalid patterns

-- **Grammar-based parsing:** Shared ANTLR4 grammar (`grammar/fileseq.g4`) with Go and C++ implementations
-- **Parser generator:** `src/fileseq/grammar/generate.py` tool for regenerating parser from grammar
-- **Build system:** Modern `pyproject.toml` with PEP 517/518 support
-- **Version management:** Automatic versioning via `setuptools-scm` from git tags
-- **CI/CD:** Grammar validation in CI to ensure consistency across languages
-- **Documentation:** Comprehensive migration guide and benchmarks included
+#### Range-Based FrameSet Storage (#150)

-#### Performance
+Migrates `FrameSet` from fully-expanded storage to range-based storage for memory efficiency.

-Grammar-based parsing provides comparable performance to v2.x regex parsing:
-- Simple patterns: ~240 μs per parse
-- Complex patterns: ~445 μs per parse
-- FrameSet operations: ~13 μs for simple ranges
-- Disk scanning: <1 ms for typical directories
+**Breaking Changes:**
+- `.items` and `.order` properties now deprecated with `DeprecationWarning`
+- Still functional but expand lazily and warn on access
+- Use `set(frameset)` and `list(frameset)` instead
+
+**New Features:**
+- **Memory-efficient storage:** 99.9%+ memory reduction for large ranges
+- 100k frames: 7.8MB → 536 bytes
+- Stores ranges instead of fully-expanded frames
+- **Performance improvements:** Range-based algorithms for operations like `isConsecutive()`
+
+**Bug Fixes:**
+- `isConsecutive()` now correctly handles interleaved ranges and empty framesets
+- `hasSubFrames()` correctly detects decimal notation like `"1.0-5.0"`
+- Stagger modifier (`:`) now properly deduplicates frames

 #### Migration Guide

-**If you were using custom regex patterns:**
+**ANTLR4 Grammar Changes:**
+
+If you were using custom regex patterns:
 ```python
 # v2.x - REMOVED in v3
 class MySequence(FileSequence):
@@ -75,7 +75,7 @@ class MySequence(FileSequence):
 seq = FileSequence(pattern) # Grammar handles parsing
 ```

-**If you relied on auto-padding for explicit padding patterns:**
+If you relied on auto-padding for explicit padding patterns:
 ```python
 # v2.x behavior
 seq = FileSequence("foo.1@@@@.exr")
@@ -86,6 +86,21 @@ seq = FileSequence("foo.1@@@@.exr")
 # Preserves @@@@ as specified (4 chars)
 ```

+**Range-Based FrameSet Changes:**
+
+If you accessed FrameSet internals:
+```python
+# v2.x/v3 - Now deprecated with warnings
+fs = FrameSet("1-1000")
+frames = fs.items # ⚠️ DeprecationWarning, expands all frames
+ordered = fs.order # ⚠️ DeprecationWarning, expands all frames
+
+# v3 - Use public iteration API
+frames = set(fs) # ✅ Lazy iteration
+ordered = list(fs) # ✅ Lazy iteration
+contains = 500 in fs # ✅ Efficient range-based lookup
+```
+
 ---

 ## v2.3.1 (2025-02-21)
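
The changelog entry for #150 describes `FrameSet` keeping ranges instead of every expanded frame, with `.items` and `.order` retained as deprecated properties that expand on access. A minimal sketch of that idea follows. It is not fileseq's implementation: the class `RangeFrameSet`, its `(start, end, step)` constructor, and the warning messages are hypothetical, and it assumes integer ranges with a positive step.

```python
import warnings
from typing import Iterator, List, Tuple


class RangeFrameSet:
    """Hypothetical illustration only; not the fileseq FrameSet class."""

    def __init__(self, ranges: List[Tuple[int, int, int]]) -> None:
        # Each entry is (start, end, step); end is inclusive, step is positive.
        # Only the range objects are stored, never the expanded frames.
        self._ranges = [range(start, end + 1, step) for start, end, step in ranges]

    def __iter__(self) -> Iterator[int]:
        # Frames are produced on demand, one range after another.
        for r in self._ranges:
            yield from r

    def __len__(self) -> int:
        return sum(len(r) for r in self._ranges)

    def __contains__(self, frame: object) -> bool:
        # For ints, range membership is computed arithmetically,
        # so no frames are expanded here.
        return any(frame in r for r in self._ranges)

    @property
    def items(self) -> frozenset:
        # Deprecated accessor: still works, but expands every frame and warns.
        warnings.warn("items is deprecated; use set(frameset)",
                      DeprecationWarning, stacklevel=2)
        return frozenset(self)

    @property
    def order(self) -> tuple:
        # Deprecated accessor: still works, but expands every frame and warns.
        warnings.warn("order is deprecated; use list(frameset)",
                      DeprecationWarning, stacklevel=2)
        return tuple(self)


fs = RangeFrameSet([(1, 100_000, 1)])
print(len(fs))    # 100000
print(500 in fs)  # True
```

The storage cost here depends on the number of ranges rather than the number of frames, which is the property the changelog attributes to the new `FrameSet`.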

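The changelog lists decimal step values such as `foo.1-5x0.25#.exr`, giving frames 1, 1.25, 1.5, 1.75, 2, 2.25 and so on. A rough sketch of how such a range could be expanded without floating-point drift is shown here. The helper `expand_decimal_range` is hypothetical and not part of fileseq, and treating the end of the range as inclusive is an assumption.

```python
from decimal import Decimal
from typing import List


def expand_decimal_range(start: str, end: str, step: str) -> List[Decimal]:
    """Expand an inclusive range with a decimal step (hypothetical helper)."""
    first, last, inc = Decimal(start), Decimal(end), Decimal(step)
    if inc <= 0:
        raise ValueError("step must be positive")
    frames = []
    current = first
    # Decimal arithmetic keeps 0.25 exact, so repeated addition
    # does not accumulate binary floating-point error.
    while current <= last:
        frames.append(current)
        current += inc
    return frames


frames = expand_decimal_range("1", "5", "0.25")
print(len(frames))  # 17
```

Values are passed as strings so that `Decimal` sees the literal text of the pattern rather than a float approximation.
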
src/fileseq/filesequence.py

Lines changed: 1 addition & 1 deletion
@@ -844,7 +844,7 @@ def __getitem__(self, idx: typing.Any) -> T | Self:
         frames = self._frameSet[idx]

         if not hasattr(idx, 'start'):
-            return self.frame(frames)
+            return self.frame(frames)  # type: ignore[arg-type]

         fset = FrameSet(frames)
         if fset.is_null:
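
The hunk above uses `hasattr(idx, 'start')` inside `__getitem__` to tell a slice from a plain index. A minimal sketch of that dispatch pattern is given here on a hypothetical `Frames` container, which is not fileseq code. The added `# type: ignore[arg-type]` presumably exists because a static checker cannot narrow the type of the indexed result from a `hasattr` test; that reading is an assumption.

```python
from typing import List, Union


class Frames:
    """Hypothetical container used only to illustrate the dispatch."""

    def __init__(self, frames: List[int]) -> None:
        self._frames = list(frames)

    def __len__(self) -> int:
        return len(self._frames)

    def __getitem__(self, idx: Union[int, slice]) -> Union[int, "Frames"]:
        picked = self._frames[idx]
        # slice objects expose .start/.stop/.step; plain ints do not.
        if not hasattr(idx, "start"):
            return picked  # a single frame
        return Frames(picked)  # a sub-sequence wrapping the sliced frames


seq = Frames([10, 20, 30, 40])
print(seq[1])         # 20
print(len(seq[1:3]))  # 2
```

An `isinstance(idx, slice)` check would let a type checker narrow the branches; the `hasattr` form instead accepts any slice-like object that has a `start` attribute.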
