Commit ec97567

README.md updates for contributions (#130)
Add section on compatibility with other storage services. Align content to fit 120 characters line.
1 parent 17ac40d commit ec97567

1 file changed: +47 -23 lines changed

README.md

Lines changed: 47 additions & 23 deletions
@@ -1,12 +1,18 @@
11
# Amazon S3 Connector for PyTorch
2-
The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access or store data in Amazon S3. Using the S3 Connector for PyTorch
3-
automatically optimizes performance when downloading training data from and writing checkpoints to Amazon S3, eliminating the need to write your own code to list S3 buckets and manage concurrent requests.
4-
5-
6-
Amazon S3 Connector for PyTorch provides implementations of PyTorch's [dataset primitives](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html) that you can use to load training data from Amazon S3.
7-
It supports both [map-style datasets](https://pytorch.org/docs/stable/data.html#map-style-datasets) for random data access patterns and
8-
[iterable-style datasets](https://pytorch.org/docs/stable/data.html#iterable-style-datasets) for streaming sequential data access patterns.
9-
The S3 Connector for PyTorch also includes a checkpointing interface to save and load checkpoints directly to Amazon S3, without first saving to local storage.
2+
The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access or store data in
3+
Amazon S3. Using the S3 Connector for PyTorch
4+
automatically optimizes performance when downloading training data from and writing checkpoints to Amazon S3,
5+
eliminating the need to write your own code to list S3 buckets and manage concurrent requests.
6+
7+
8+
Amazon S3 Connector for PyTorch provides implementations of PyTorch's
9+
[dataset primitives](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html) that you can use to load
10+
training data from Amazon S3.
11+
It supports both [map-style datasets](https://pytorch.org/docs/stable/data.html#map-style-datasets) for random data
12+
access patterns and [iterable-style datasets](https://pytorch.org/docs/stable/data.html#iterable-style-datasets) for
13+
streaming sequential data access patterns.
14+
The S3 Connector for PyTorch also includes a checkpointing interface to save and load checkpoints directly to
15+
Amazon S3, without first saving to local storage.
1016

1117

1218
## Getting Started
@@ -22,25 +28,30 @@ automatically optimizes performance when downloading training data from and writ
2228
pip install s3torchconnector
2329
```
2430

25-
Amazon S3 Connector for PyTorch supports only Linux via Pip for now. For other platforms, see [DEVELOPMENT](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/doc/DEVELOPMENT.md) for build instructions.
31+
Amazon S3 Connector for PyTorch supports only Linux via Pip for now. For other platforms, see
32+
[DEVELOPMENT](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/doc/DEVELOPMENT.md) for build instructions.
2633

2734
### Configuration
2835

2936
To use `s3torchconnector`, AWS credentials must be provided through one of the following methods:
3037

31-
- If you are using this library on an EC2 instance, specify an IAM role and then give the EC2 instance access to that role.
38+
- If you are using this library on an EC2 instance, specify an IAM role and then give the EC2 instance access to
39+
that role.
3240
- Install and configure [`awscli`](https://aws.amazon.com/cli/) and run `aws configure`.
33-
- Set credentials in the AWS credentials profile file on the local system, located at: `~/.aws/credentials` on Unix or macOS.
41+
- Set credentials in the AWS credentials profile file on the local system, located at: `~/.aws/credentials`
42+
on Unix or macOS.
3443
- Set the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables.
3544

3645
### Examples
3746

3847
[API docs](http://awslabs.github.io/s3-connector-for-pytorch) are showing API of the public components.
39-
End to end example of how to use `s3torchconnector` can be found under the [examples](https://github.com/awslabs/s3-connector-for-pytorch/tree/main/examples) directory.
48+
End to end example of how to use `s3torchconnector` can be found under the
49+
[examples](https://github.com/awslabs/s3-connector-for-pytorch/tree/main/examples) directory.
4050

4151
#### Sample Examples
4252

43-
The simplest way to use the S3 Connector for PyTorch is to construct a dataset, either a map-style or iterable-style dataset, by specifying an S3 URI (a bucket and optional prefix) and the region the bucket is located in:
53+
The simplest way to use the S3 Connector for PyTorch is to construct a dataset, either a map-style or iterable-style
54+
dataset, by specifying an S3 URI (a bucket and optional prefix) and the region the bucket is located in:
4455
```shell
4556
from s3torchconnector import S3MapDataset, S3IterableDataset
4657

@@ -67,7 +78,8 @@ for object in iterable_dataset:
6778

6879
```
6980
70-
In addition to data loading primitives, the S3 Connector for PyTorch also provides an interface for saving and loading model checkpoints directly to and from an S3 bucket.
81+
In addition to data loading primitives, the S3 Connector for PyTorch also provides an interface for saving and loading
82+
model checkpoints directly to and from an S3 bucket.
7183
7284
```shell
7385
from s3torchconnector import S3Checkpoint
@@ -92,28 +104,40 @@ with checkpoint.reader(CHECKPOINT_URI + "epoch0.ckpt") as reader:
92104
model.load_state_dict(state_dict)
93105
```
94106
95-
Using datasets or checkpoints with [Amazon S3 Express One Zone](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-express-one-zone.html)
107+
Using datasets or checkpoints with
108+
[Amazon S3 Express One Zone](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-express-one-zone.html)
96109
directory buckets requires only to update the URI, following `base-name--azid--x-s3` bucket name format.
97110
For example, assuming the following directory bucket name `my-test-bucket--usw2-az1--x-s3` with the Availability Zone ID
98-
usw2-az1, then the URI used will look like: `s3://my-test-bucket--usw2-az1--x-s3/<PREFIX>` (**please note that the prefix
99-
for Amazon S3 Express One Zone should end with '/'**), paired with region us-west-2.
111+
usw2-az1, then the URI used will look like: `s3://my-test-bucket--usw2-az1--x-s3/<PREFIX>` (**please note that the
112+
prefix for Amazon S3 Express One Zone should end with '/'**), paired with region us-west-2.
100113
101114
## Contributing
102-
We welcome contributions to Amazon S3 Connector for PyTorch. Please see [CONTRIBUTING](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/doc/CONTRIBUTING.md) For more information on how to report bugs or submit pull requests.
115+
We welcome contributions to Amazon S3 Connector for PyTorch. Please
116+
see [CONTRIBUTING](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/doc/CONTRIBUTING.md)
117+
For more information on how to report bugs or submit pull requests.
103118
104119
### Development
105-
See [DEVELOPMENT](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/doc/DEVELOPMENT.md) for information about code style,
106-
development process, and guidelines.
120+
See [DEVELOPMENT](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/doc/DEVELOPMENT.md) for information
121+
about code style, development process, and guidelines.
107122
123+
### Compatibility with other storage services
124+
S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access or store data in Amazon S3.
125+
While it may be functional against other storage services that use S3-like APIs, they may inadvertently break when we
126+
make changes to better support Amazon S3. We welcome contributions of minor compatibility fixes or performance
127+
improvements for these services if the changes can be tested against Amazon S3.
108128
109129
### Security issue notifications
110-
If you discover a potential security issue in this project we ask that you notify AWS Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/).
130+
If you discover a potential security issue in this project we ask that you notify AWS Security via our
131+
[vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/).
111132
112133
### Code of conduct
113134
114-
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). See [CODE_OF_CONDUCT.md](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/doc/CODE_OF_CONDUCT.md) for more details.
135+
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
136+
See [CODE_OF_CONDUCT.md](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/doc/CODE_OF_CONDUCT.md) for
137+
more details.
115138
116139
## License
117140
118-
Amazon S3 Connector for PyTorch has a BSD 3-Clause License, as found in the [LICENSE](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/LICENSE) file.
141+
Amazon S3 Connector for PyTorch has a BSD 3-Clause License, as found in the
142+
[LICENSE](https://github.com/awslabs/s3-connector-for-pytorch/blob/main/LICENSE) file.
119143
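
The S3 Express One Zone note in this README describes a bucket name format (`base-name--azid--x-s3`) and a prefix that should end with '/'. A minimal sketch of how a caller might assemble such a URI before handing it to a dataset or checkpoint is shown below. The helper `express_uri` is hypothetical and not part of `s3torchconnector`, and the regular expression for the Availability Zone ID is an assumption generalized from the single `usw2-az1` example in the text.

```python
import re

# Hypothetical helper for illustration only; it is not part of s3torchconnector.
# Assumption: an Availability Zone ID looks like "usw2-az1", generalized from
# the one example in the README.
_DIRECTORY_BUCKET = re.compile(r"^.+--[a-z0-9]+-az\d+--x-s3$")


def express_uri(bucket: str, prefix: str = "") -> str:
    """Build an s3:// URI for a directory bucket, ensuring the prefix ends with '/'."""
    if not _DIRECTORY_BUCKET.match(bucket):
        raise ValueError(
            f"{bucket!r} does not follow the base-name--azid--x-s3 format"
        )
    # The README asks that the prefix for S3 Express One Zone end with '/'.
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    return f"s3://{bucket}/{prefix}"


if __name__ == "__main__":
    print(express_uri("my-test-bucket--usw2-az1--x-s3", "training-data"))
```

The resulting string would then be passed, together with the region (us-west-2 in the README's example), wherever the README's own examples use an S3 URI.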
