Commit 1869891

Upgrade to mountpoint-s3-client-0.8.0 and update logging (#176)
mountpoint-s3-client-0.8.0 contains changes incompatible with the logging setup. Given the previous issues we had with this, we are separating these as follows:
- Logging in the main Python component remains handled by Python logger.
- Logging in Rust components is handled by tracing_subscriber, set up through S3_TORCH_CONNECTOR_DEBUG_LOGS and S3_TORCH_CONNECTOR_LOGS_DIR_PATH environment variables.
1 parent cbf5c4c commit 1869891

8 files changed: 293 additions, 131 deletions
CHANGELOG.md

Lines changed: 11 additions & 4 deletions
@@ -1,3 +1,12 @@
+## Unreleased
+
+### New features
+
+### Breaking changes
+* Separate completely Rust logs and Python logs. Logs from Rust components, used for debugging purposes
+are configured through the following environment variables: S3_TORCH_CONNECTOR_DEBUG_LOGS,
+S3_TORCH_CONNECTOR_LOGS_DIR_PATH.
+
 ## v1.2.0 (March 13, 2024)
 
 ### New features
@@ -11,6 +20,7 @@
 ### Breaking changes
 * No breaking changes.
 
+
 ## v1.1.4 (February 26, 2024)
 
 ### New features
@@ -37,7 +47,6 @@
 * No breaking changes.
 
 
-
 ## v1.1.2 (January 19, 2024)
 
 ### New features
@@ -48,7 +57,6 @@
 * No breaking changes.
 
 
-
 ## v1.1.1 (December 11, 2023)
 
 ### New features
@@ -69,12 +77,11 @@
 * No breaking changes.
 
 
-
 ## v1.0.0 (November 22, 2023)
 * The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access and store data in Amazon S3.
 
 ### New features
-* S3IterableDataset and S3MapDataset, which allow building either an iterable-style or map-style dataset, using your S3
+* S3IterableDataset and S3MapDataset, which allow building either an iterable-style or map-style dataset, using your S3
 stored data, by specifying an S3 URI (a bucket and optional prefix) and the region the bucket is in.
 * Support for multiprocess data loading for the above datasets.
 * S3Checkpoint, an interface for saving and loading model checkpoints directly to and from an S3 bucket.

doc/DEVELOPMENT.md

Lines changed: 46 additions & 0 deletions
@@ -92,3 +92,49 @@ Fill in the path of the Python executable in your virtual environment (`venv/bin
 as the program argument.
 Then put a breakpoint in the Rust/C code and try running it.
 
+#### Enabling Debug Logging
+The [Python logger](https://docs.python.org/3/library/logging.html) handles logging messages from the Python-side
+of our implementation.
+For debug purposes, you can also enable the logs for our Rust components, which are off by default.
+These are handled by [tracing_subscriber](https://docs.rs/tracing-subscriber/latest/tracing_subscriber/) and can be
+configured through the following environment variables:
+- `S3_TORCH_CONNECTOR_DEBUG_LOGS` - Configured similarly to the
+[RUST_LOG](https://docs.rs/env_logger/latest/env_logger/#enabling-logging) variable for
+filtering logs from our Rust components. This includes finer granularity logs from
+[AWS Common Runtime (CRT)](https://docs.aws.amazon.com/sdkref/latest/guide/common-runtime.html).
+**Please note that the AWS CRT logs are very noisy. We recommend to filter them out by appending `"awscrt=off"` to
+your S3_TORCH_CONNECTOR_DEBUG_LOGS setup.**
+- `S3_TORCH_CONNECTOR_LOGS_DIR_PATH` - The path to a local directory where you have write permissions.
+When configured, the logs from the Rust components will be appended to a file at this location.
+This will result in a log file located at
+`${S3_TORCH_CONNECTOR_LOGS_DIR_PATH}/s3torchconnectorclient.log.yyyy-MM-dd-HH`, rolled on an hourly basis.
+The log messages of the latest run are appended to the end of the most recent log file.
+
+**Examples**
+- Configure INFO level logs to be written to STDOUT:
+```sh
+export S3_TORCH_CONNECTOR_DEBUG_LOGS=info
+```
+
+- Enable TRACE level logs (most verbose) to be written at `/tmp/s3torchconnector-logs`:
+```sh
+export S3_TORCH_CONNECTOR_DEBUG_LOGS=trace
+export S3_TORCH_CONNECTOR_LOGS_DIR_PATH="/tmp/s3torchconnector-logs"
+```
+After running your script, you will find the logs under `/tmp/s3torchconnector-logs`.
+The file will include AWS CRT logs.
+
+- Enable TRACE level logs with AWS CRT logs filtered out, written at `/tmp/s3torchconnector-logs`:
+```sh
+export S3_TORCH_CONNECTOR_DEBUG_LOGS=trace,awscrt=off
+export S3_TORCH_CONNECTOR_LOGS_DIR_PATH="/tmp/s3torchconnector-logs"
+```
+
+- Set up different levels for inner components:
+```sh
+export S3_TORCH_CONNECTOR_DEBUG_LOGS=trace,mountpoint_s3_client=debug,awscrt=error
+```
+This will set the log level to TRACE by default, DEBUG for mountpoint-s3-client and ERROR for AWS CRT.
+
+For more examples please check the
+[env_logger documentation](https://docs.rs/env_logger/latest/env_logger/#enabling-logging).
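
Reviewer note: the same configuration can be driven from Python rather than the shell. The sketch below is illustrative only and not part of this commit. The helper names (`build_filter`, `configure_rust_logs`, `latest_log_file`) are hypothetical, and it assumes the Rust components read the two environment variables when they are initialized, so the variables would need to be set before `s3torchconnector` is imported. It relies only on the variable names and the `s3torchconnectorclient.log.yyyy-MM-dd-HH` file naming described in the documentation above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def build_filter(default_level: str,
                 overrides: Optional[Mapping[str, str]] = None) -> str:
    """Join a default level and per-target levels into a RUST_LOG-style spec."""
    parts = [default_level]
    for target, level in (overrides or {}).items():
        parts.append(f"{target}={level}")
    return ",".join(parts)


def configure_rust_logs(filter_spec: str, logs_dir: Optional[str] = None) -> None:
    """Set the two environment variables documented above.

    Assumption: they are read when the Rust extension initializes, so call
    this before importing s3torchconnector.
    """
    os.environ["S3_TORCH_CONNECTOR_DEBUG_LOGS"] = filter_spec
    if logs_dir is not None:
        # The directory must be writable; create it if it does not exist yet.
        Path(logs_dir).mkdir(parents=True, exist_ok=True)
        os.environ["S3_TORCH_CONNECTOR_LOGS_DIR_PATH"] = logs_dir


def latest_log_file(logs_dir: str) -> Optional[Path]:
    """Return the most recent hourly log file, or None if there is none.

    The yyyy-MM-dd-HH suffix is zero-padded, so sorting by name is also
    sorting by time.
    """
    files = sorted(Path(logs_dir).glob("s3torchconnectorclient.log.*"))
    return files[-1] if files else None


if __name__ == "__main__":
    spec = build_filter("trace", {"mountpoint_s3_client": "debug", "awscrt": "error"})
    configure_rust_logs(spec, "/tmp/s3torchconnector-logs")
    print(spec)
```

Building the spec from a mapping keeps the per-component levels in one place; passing a plain string such as `"trace,awscrt=off"` to `configure_rust_logs` works equally well.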

s3torchconnectorclient/Cargo.lock

Lines changed: 39 additions & 32 deletions
Some generated files are not rendered by default.

s3torchconnectorclient/Cargo.toml

Lines changed: 2 additions & 2 deletions
@@ -19,8 +19,8 @@ built = "0.7"
 pyo3 = { version = "0.19.2" }
 pyo3-log = "0.8.3"
 futures = "0.3.28"
-mountpoint-s3-client = { version = "0.7.0", features = ["mock"] }
-mountpoint-s3-crt = "0.6.1"
+mountpoint-s3-client = { version = "0.8.0", features = ["mock"] }
+mountpoint-s3-crt = "0.6.2"
 log = "0.4.20"
 tracing = { version = "0.1.40", default-features = false, features = ["std", "log"] }
 tracing-subscriber = { version = "0.3.18", features = ["fmt", "env-filter"]}