Commit 7a5c6ba

Add documentation about AWS region (#213)
1 parent 67e60bf commit 7a5c6ba

2 files changed

+86
-0
lines changed

docs/troubleshooting/aws.md

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
# Troubleshooting Amazon S3

## Region required

All requests to S3 must include the bucket's region. Requests that don't pass the correct region will fail.

For example, trying to list the [`sentinel-cogs`](https://registry.opendata.aws/sentinel-2-l2a-cogs/) open bucket without passing a region fails:

```py
import obstore as obs
from obstore.store import S3Store

# No region is passed here, so the list request below fails.
store = S3Store("sentinel-cogs", skip_signature=True)
next(obs.list(store))
```

This raises the following error (note that the error text may change in the future):

```
GenericError: Generic {
    store: "S3",
    source: ListRequest {
        source: BareRedirect,
    },
}
```

We can fix this by passing the correct region:

```py
import obstore as obs
from obstore.store import S3Store

store = S3Store("sentinel-cogs", skip_signature=True, region="us-west-2")
next(obs.list(store))
```

This returns a list of object metadata:

```py
[{'path': 'sentinel-s2-l2a-cogs/1/C/CV/2018/10/S2B_1CCV_20181004_0_L2A/AOT.tif',
  'last_modified': datetime.datetime(2020, 9, 30, 20, 25, 56, tzinfo=datetime.timezone.utc),
  'size': 50510,
  'e_tag': '"2e24c2ee324ea478f2f272dbd3f5ce69"',
  'version': None},
...
```

### Inferring the bucket region

The region of an S3 bucket can be inferred from an arbitrary `HEAD` request to the bucket: the response carries the region in its `x-amz-bucket-region` header.

The example below uses `requests` to find the bucket region, but any HTTP client works:

```py
import requests


def find_bucket_region(bucket_name: str) -> str:
    # The region is reported in a response header of the HEAD request.
    resp = requests.head(f"https://{bucket_name}.s3.amazonaws.com")
    return resp.headers["x-amz-bucket-region"]
```
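
If a response does not carry the `x-amz-bucket-region` header, the dictionary lookup above raises a bare `KeyError`. The following is a minimal sketch of a more explicit lookup. `region_from_headers` is a hypothetical helper (not part of obstore or `requests`) that accepts any header mapping, so it can be used with whichever HTTP client you prefer:

```python
from typing import Mapping


def region_from_headers(headers: Mapping[str, str]) -> str:
    """Return the bucket region found in HTTP response headers.

    Hypothetical helper for illustration. Header names are compared
    case-insensitively so that a plain dict works as well as the
    case-insensitive header mapping returned by `requests`.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return lowered["x-amz-bucket-region"]
    except KeyError:
        # Raise a descriptive error instead of a bare KeyError.
        raise ValueError("response has no x-amz-bucket-region header") from None


print(region_from_headers({"X-Amz-Bucket-Region": "us-west-2"}))  # us-west-2
```

With `requests`, this would be called as `region_from_headers(resp.headers)`.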

Applying this to the previous example, we can find the region of the `sentinel-cogs` bucket:

```py
find_bucket_region("sentinel-cogs")
# 'us-west-2'
```

Or we can pass the result directly as the `region` argument:

```py
from obstore.store import S3Store

# Uses find_bucket_region as defined above.
bucket_name = "sentinel-cogs"
store = S3Store(
    bucket_name, skip_signature=True, region=find_bucket_region(bucket_name)
)
```

Finding the bucket region this way works for **both public and non-public buckets**.

This `HEAD` request can also tell you whether the bucket is public, based on the [HTTP response code](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) (accessible in `requests` via [`resp.status_code`](https://requests.readthedocs.io/en/latest/api/#requests.Response.status_code)):

- [`200`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/200): public bucket.
- [`403`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403): private bucket.
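
The status-code check can be folded into a small helper. The sketch below is illustrative only: `describe_bucket_access` is a hypothetical name, and it interprets just the two codes listed above, reporting anything else as unknown rather than guessing.

```python
def describe_bucket_access(status_code: int) -> str:
    """Map the status code of a bucket HEAD request to an access label.

    Hypothetical helper: only the two documented codes are interpreted;
    every other code is reported as "unknown".
    """
    if status_code == 200:
        return "public"
    if status_code == 403:
        return "private"
    return "unknown"


# With `requests`, the argument would be `resp.status_code`.
print(describe_bucket_access(200))  # public
print(describe_bucket_access(403))  # private
```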

mkdocs.yml

Lines changed: 2 additions & 0 deletions
@@ -48,6 +48,8 @@ nav:
 - api/exceptions.md
 - api/file.md
 - fsspec Integration: api/fsspec.md
+- Troubleshooting:
+- AWS: troubleshooting/aws.md
 - Advanced:
 - advanced/pickle.md
 - Developer Docs:
