Commit 6b5a584
fix(zephyr): route block_size/cache_type to file opener, not FS constructor (#3121)
Fixes #3117. Sole remaining CW canary blocker.
## Problem
`fsspec.open()` routes all `**kwargs` to the filesystem constructor, not
to `fs.open()`. `S3FileSystem.__init__` has `default_block_size` (not
`block_size`), so the kwarg leaks into `**kwargs` →
`AioSession.__init__()`, which `aiobotocore 2.26.0` rejects:
```
AioSession.__init__() got an unexpected keyword argument 'block_size'
```
This means `block_size`, `cache_type`, and `maxblocks` were **never
controlling S3 buffering** — they silently leaked to the session
constructor, and older aiobotocore ignored them.
<details>
<summary>fsspec 2025.3.0 source trace</summary>
1. **`open()` (L491-500)**: passes `**kwargs` to `open_files()`.
2. **`open_files()` (L300)**: `get_fs_token_paths(urlpath, mode,
storage_options=kwargs)` — all kwargs become filesystem constructor
args.
3. **`open_files()` (L313-322)**: constructs `OpenFile(fs, path, mode,
compression, ...)` — **no kwargs forwarded**.
4. **`OpenFile.__enter__()` (L105)**: `f = self.fs.open(self.path,
mode=mode)` — **only path and mode** reach `fs.open()`.
</details>
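The routing in the trace above can be reproduced with a toy model. The sketch below does not use fsspec or aiobotocore; `ToySession`, `ToyFileSystem`, and `toy_open` are hypothetical stand-ins that only mimic the shape of the call chain (constructor swallows `**kwargs`, opener receives only path and mode).

```python
class ToySession:
    """Hypothetical stand-in for a strict session constructor."""

    def __init__(self, region=None):
        self.region = region


class ToyFileSystem:
    """Hypothetical stand-in for a filesystem that forwards extras to its session."""

    def __init__(self, default_block_size=5 * 2**20, **kwargs):
        self.default_block_size = default_block_size
        # Any name the constructor does not declare ends up here.
        self.session = ToySession(**kwargs)

    def open(self, path, mode="rb", block_size=None, **kwargs):
        # The opener is the place where block_size would actually take effect.
        return {"path": path, "mode": mode,
                "block_size": block_size or self.default_block_size}


def toy_open(path, mode="rb", **kwargs):
    """Mimics the traced routing: kwargs go to the constructor only."""
    fs = ToyFileSystem(**kwargs)
    return fs.open(path, mode)  # only path and mode reach open()


leak_error = None
try:
    toy_open("bucket/key", "rb", block_size=16_000_000)
except TypeError as exc:
    # block_size fell through **kwargs into ToySession.__init__
    leak_error = str(exc)

# Calling the opener directly is what lets the setting take effect.
direct = ToyFileSystem().open("bucket/key", "rb", block_size=16_000_000)
```

In this model the failure comes from the constructor path, while the direct `open()` call honors the requested block size, which is the same split the fix relies on.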
## Solution
Replace `fsspec.open()` with `url_to_fs()` + `fs.open()` so that
`block_size`, `cache_type`, and `cache_options` reach the file opener
(`AbstractBufferedFile`) instead of the filesystem constructor.
`AbstractFileSystem.open()` (spec.py L1310-1316) passes `block_size`,
`cache_options`, and `**kwargs` directly to `_open()`, and handles
`compression` at L1318-1324.
**`readers.py`** — `open_file()`:
```python
# Before: kwargs go to FS constructor → leak to AioSession
with fsspec.open(file_path, mode, compression=compression,
                 block_size=16_000_000, cache_type="background", maxblocks=2) as f:
    ...

# After: kwargs go to fs.open() → reach AbstractBufferedFile/S3File
fs, resolved_path = fsspec.core.url_to_fs(file_path)
with fs.open(resolved_path, mode,
             block_size=_READ_BLOCK_SIZE, cache_type=_READ_CACHE_TYPE,
             cache_options={"maxblocks": _READ_MAX_BLOCKS}, compression=compression) as f:
    ...
```
**`writers.py`** — 4 call sites:
```python
# Before: block_size goes to FS constructor → AioSession crash
with fsspec.open(temp_path, "wb", block_size=64 * 1024 * 1024) as f:
    ...

# After: block_size goes to fs.open() → controls multipart upload part size
fs, resolved_temp = fsspec.core.url_to_fs(temp_path)
with fs.open(resolved_temp, "wb", block_size=_WRITE_BLOCK_SIZE) as f:
    ...
```
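Since the same two-step pattern repeats at every call site, it could be captured in one helper. This is a sketch only and not part of this PR: `open_with_file_kwargs` and its `url_to_fs` override parameter are hypothetical names, and the only fsspec calls assumed are `fsspec.core.url_to_fs()` and `fs.open()` as used in the snippets above.

```python
def open_with_file_kwargs(url, mode="rb", *, url_to_fs=None, **open_kwargs):
    """Resolve the filesystem first, then pass every kwarg to fs.open().

    Hypothetical helper: nothing here is forwarded to the filesystem
    constructor, so block_size / cache_type / cache_options reach the
    file opener instead of the session.
    """
    if url_to_fs is None:
        # Imported lazily so a fake resolver can be injected in tests.
        from fsspec.core import url_to_fs
    fs, resolved_path = url_to_fs(url)
    return fs.open(resolved_path, mode, **open_kwargs)
```

A call site would then read `with open_with_file_kwargs(temp_path, "wb", block_size=_WRITE_BLOCK_SIZE) as f:`. Filesystem-level options (credentials, endpoints) would need a separate, explicit parameter in this design, which keeps the two kinds of kwargs from being mixed again.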
<details>
<summary>Backend routing</summary>
**S3**: `S3FileSystem._open(block_size=16M, cache_type="background",
cache_options={"maxblocks": 2})` → `S3File` →
`BackgroundBlockCache(blocksize=16M, maxblocks=2)`.
**Local**: `LocalFileSystem._open(block_size=16M, **kwargs)` —
`block_size` is a named param (silently ignored), remaining kwargs
absorbed by `LocalFileOpener(**kwargs)`. No crash, no effect.
</details>
<details>
<summary>Note: compresslevel=1 was also misrouted
(pre-existing)</summary>
The `.gz` writer branch had `compresslevel=1` in `fsspec.open()`. This
also leaked to the FS constructor — fsspec's compression wrapper calls
`compress(f, mode=mode[0])` with no extra kwargs. Dropped in this PR;
fixing gzip compression level is a separate concern.
</details>
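For the follow-up, one possible way to make the compression level take effect is to open the raw file without `compression=` and wrap it with the standard library's `gzip.GzipFile`. This is a sketch under that assumption, not what this PR does; `write_gzip` is a hypothetical helper and `io.BytesIO` stands in for the handle returned by `fs.open(path, "wb")`.

```python
import gzip
import io


def write_gzip(raw_file, payload: bytes, compresslevel: int = 1) -> None:
    """Compress payload into an already-open binary file object.

    GzipFile does not close the underlying fileobj on exit, so the caller
    stays responsible for closing raw_file.
    """
    with gzip.GzipFile(fileobj=raw_file, mode="wb",
                       compresslevel=compresslevel) as gz:
        gz.write(payload)


# BytesIO stands in for a file handle obtained from fs.open(..., "wb").
buffer = io.BytesIO()
original = b"hello world\n" * 1000
write_gzip(buffer, original, compresslevel=1)
roundtrip = gzip.decompress(buffer.getvalue())
```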
## Safety
This change makes the buffering settings actually work for the first
time. Previously, S3 reads used fsspec defaults (5MB blocks, "readahead"
cache); they now use 16MB blocks with background prefetch. S3 writes go
from 50MB to 64MB multipart parts. Local file IO is unaffected.
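A quick back-of-the-envelope check of what the new settings imply. The constant values below are assumptions taken from the old call sites (`16_000_000`, `maxblocks=2`, `64 * 1024 * 1024`); the actual values of `_READ_BLOCK_SIZE`, `_READ_MAX_BLOCKS`, and `_WRITE_BLOCK_SIZE` are not shown here. The 10,000-part figure is the S3 multipart upload limit, and real part sizes may differ somewhat from the configured block size.

```python
READ_BLOCK_SIZE = 16_000_000          # assumed, from the old readers.py call
READ_MAX_BLOCKS = 2                   # assumed, from the old readers.py call
WRITE_BLOCK_SIZE = 64 * 1024 * 1024   # assumed, from the old writers.py call
S3_MAX_PARTS = 10_000                 # S3 multipart upload part limit


def multipart_parts(object_size: int, part_size: int = WRITE_BLOCK_SIZE) -> int:
    """Approximate number of parts if every part is exactly part_size bytes."""
    return max(1, -(-object_size // part_size))  # ceiling division


# Rough read-cache footprint per open file (cached blocks only).
cached_bytes_per_reader = READ_BLOCK_SIZE * READ_MAX_BLOCKS

# A 1 GiB object split into 64 MiB parts.
parts_for_1_gib = multipart_parts(1024**3)

# Largest object that fits within the part limit at this part size.
max_object_bytes = WRITE_BLOCK_SIZE * S3_MAX_PARTS
```

Under these assumptions each reader holds about 32 MB of cached blocks, so the memory cost scales with the number of files open concurrently.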
## Testing
- [x] `test_backends.py` (9/9), full zephyr suite (351/351)
- [x] Manual R2 test: reproduced `AioSession` crash with old code,
confirmed fix with new code
- [ ] Post-merge: re-run CW canary ferry

1 parent 8d752a7, commit 6b5a584
3 files changed (+30, -10) under `lib/iris/src/iris/cluster/k8s` and `lib/zephyr/src/zephyr`.