- Add RangedS3Reader for efficient byte range requests with adaptive buffering
- Update Rust components to support range parameters in mountpoint client
- Introduce S3ReaderConstructor to configure reader types and parameters,
maintaining the sequential reader as the default for backward compatibility
- Expose reader_constructor in S3Client, datasets, and DCP interfaces
- Update user agent to include dataset and reader types
- Extend test coverage with parametrized tests for both reader implementations
- Add configurable S3Reader to s3torchbenchmarking module
- Update documentation with usage examples and performance considerations
For example, assuming the directory bucket name `my-test-bucket--usw2-az1--x-s3` in Availability Zone usw2-az1, the URI used will look like `s3://my-test-bucket--usw2-az1--x-s3/<PREFIX>` (**please note that the prefix for Amazon S3 Express One Zone should end with '/'**), paired with region us-west-2.

## Distributed checkpoints

### Overview

Amazon S3 Connector for PyTorch provides robust support for PyTorch distributed checkpoints. This feature includes:

- `S3StorageWriter`: Implementation of PyTorch's StorageWriter interface.
- `S3StorageReader`: Implementation of PyTorch's StorageReader interface. Supports configurable reading strategies via the `reader_constructor` parameter (see [Reader Configurations](#reader-configurations)).
- `S3FileSystem`: An implementation of PyTorch's FileSystemBase.

These tools enable seamless integration of Amazon S3 with PyTorch's distributed checkpointing.
Prefix strategies help distribute the load across multiple S3 partitions.

#### 1. RoundRobinPrefixStrategy

Distributes checkpoints across specified prefixes in a round-robin fashion, ideal for balancing data across multiple storage locations.

```py
from s3torchconnector.dcp import RoundRobinPrefixStrategy, S3StorageWriter

model = torchvision.models.resnet18()
# ...
```
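The round-robin assignment described above can be pictured in plain Python. This is only an illustration of the distribution pattern (function and prefix names are made up here), not the library's implementation:

```python
from itertools import cycle

def round_robin_prefixes(prefixes, num_shards):
    """Assign each checkpoint shard a prefix in round-robin order (illustrative)."""
    assignment = cycle(prefixes)
    return [next(assignment) for _ in range(num_shards)]

# Five shards spread across two prefixes simply alternate between them
print(round_robin_prefixes(["epoch_0/", "epoch_1/"], 5))
# → ['epoch_0/', 'epoch_1/', 'epoch_0/', 'epoch_1/', 'epoch_0/']
```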
#### 2. BinaryPrefixStrategy

Generates binary (base-2) prefixes for optimal partitioning in distributed environments.

```py
from s3torchconnector.dcp import BinaryPrefixStrategy

strategy = BinaryPrefixStrategy(
    # ...
)
```
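The base-2 prefix idea can be sketched in plain Python: spreading keys across many short, distinct prefixes helps S3 partition the keyspace. The helper below is an illustration of the concept, not the scheme `BinaryPrefixStrategy` actually uses:

```python
def binary_prefixes(count, width):
    """Generate fixed-width base-2 prefix strings (illustrative sketch)."""
    return [format(i, f"0{width}b") + "/" for i in range(count)]

# Four distinct 3-bit prefixes to fan checkpoint shards out under
print(binary_prefixes(4, 3))
# → ['000/', '001/', '010/', '011/']
```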
#### 3. HexPrefixStrategy

Uses hexadecimal (base-16) prefixes for a balance of efficiency and readability.

```py
from s3torchconnector.dcp import HexPrefixStrategy

strategy = HexPrefixStrategy(
    # ...
)
```
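The hex variant trades alphabet size for prefix length: base-16 reaches the same fan-out as base-2 with far fewer characters. Again, an illustrative sketch rather than the library's code:

```python
def hex_prefixes(count, width=2):
    """Generate fixed-width base-16 prefix strings (illustrative sketch)."""
    return [format(i, f"0{width}x") + "/" for i in range(count)]

# Two hex characters cover 256 distinct prefixes; three suffice here
print(hex_prefixes(3))
# → ['00/', '01/', '02/']
```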
### Creating Custom Strategies

You can implement custom prefix strategies by extending the S3PrefixStrategyBase class:

```py
from s3torchconnector.dcp import S3PrefixStrategyBase

class CustomPrefixStrategy(S3PrefixStrategyBase):
    # Illustrative override; see S3PrefixStrategyBase for the exact interface
    def generate_prefix(self, rank: int) -> str:
        return f"custom_{rank}/"
```
The S3IterableDataset can be directly passed to PyTorch's DataLoader for parallel data loading. By default, all worker processes will share the same list of training objects. However, if you need each worker to have access to a unique portion of the dataset for better parallelization, you can enable dataset sharding using the `enable_sharding` parameter. Each worker, regardless of its host, will load and process a distinct subset of the dataset.

For the S3MapDataset, you need to pass it to DataLoader along with a [DistributedSampler](https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler) wrapped around it. The DistributedSampler ensures that each worker or node receives a unique subset of the dataset, enabling efficient parallel and distributed training.
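Sharding can be pictured as a strided split of the object list: worker `i` of `n` handles items `i, i+n, i+2n, …`. The sketch below illustrates this partitioning in plain Python; the connector's exact assignment may differ:

```python
def shard_keys(keys, worker_id, num_workers):
    """Illustrative strided sharding: each worker gets a disjoint slice."""
    return keys[worker_id::num_workers]

keys = [f"s3://bucket/obj_{i}" for i in range(7)]
print(shard_keys(keys, 0, 3))  # worker 0's share
print(shard_keys(keys, 1, 3))  # worker 1's share
```

Note that the shards are disjoint and together cover the full list, which is what makes the per-worker work non-overlapping.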
## Reader Configurations

Amazon S3 Connector for PyTorch supports two types of readers, configurable through `S3ReaderConstructor`.

### Reader Types

#### 1. Sequential Reader (Default)

- Downloads and buffers the entire S3 object in memory.
- Prioritizes performance over memory usage by buffering entire objects.

#### 2. Range-based Reader

- Performs byte-range requests to read specific portions of S3 objects without downloading the entire file.
- Prioritizes memory efficiency, with performance gains only for sparse partial reads.
- Features adaptive buffering with forward overlap handling:
  - **Small reads** (< `buffer_size`): use the internal buffer to reduce S3 API calls.
  - **Large reads** (≥ `buffer_size`): bypass the buffer for direct transfer.
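The adaptive buffering rule above amounts to a simple threshold decision. The following is a toy model of that behavior for intuition, not the reader's actual code:

```python
def read_path(request_size, buffer_size):
    """Small requests go through the internal buffer; large ones bypass it."""
    return "buffered" if request_size < buffer_size else "direct"

MiB = 1024 * 1024
print(read_path(4 * 1024, 8 * MiB))   # small read served from the buffer
print(read_path(16 * MiB, 8 * MiB))   # large read transferred directly
```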
### When to Use Each Reader

- **Sequential Reader**: for processing entire files, and when repeated access to the data is required. Best for most general use cases.
- **Range-based Reader**: for larger objects (100 MB+) that require sparse partial reads, and in memory-constrained environments.

**Note**: S3Reader instances are not thread-safe and should not be shared across threads. For multiprocessing with DataLoader, each worker process creates its own S3Reader instance automatically.
### Examples

Direct method: `S3Client` usage with a range-based reader and no buffer:

```py
# Direct S3Client usage for zero-copy partial reads into pre-allocated buffers,
# for memory efficiency and fast data transfer.
# Lines below the comment are an illustrative completion; consult the
# S3ReaderConstructor documentation linked below for exact names and parameters.
from s3torchconnector import S3Client, S3ReaderConstructor

client = S3Client(
    region="us-east-1",
    reader_constructor=S3ReaderConstructor.range_based(buffer_size=0),
)
s3reader = client.get_object("my-bucket", "my-key")
s3reader.seek(100 * 1024 * 1024)        # jump to the byte range of interest
buffer = bytearray(16 * 1024 * 1024)    # pre-allocated destination buffer
bytes_read = s3reader.readinto(buffer)  # reads directly into `buffer`
```

For `S3ReaderConstructor` usage details, please refer to the [`S3ReaderConstructor` documentation](https://awslabs.github.io/s3-connector-for-pytorch/autoapi/s3torchconnector/s3reader/constructor/index.html).
## Contributing
We welcome contributions to Amazon S3 Connector for PyTorch. Please see [CONTRIBUTING](CONTRIBUTING.md) for more