Commit 86b178b

Merge pull request #110 from vcon-dev/vigorous-kare
Fix S3 storage to honor region configuration
2 parents c4c87c5 + 16ed828 commit 86b178b

6 files changed: 595 additions and 57 deletions

example_config.yml

Lines changed: 3 additions & 0 deletions
@@ -134,6 +134,9 @@ storages:
       aws_access_key_id: some_key
       aws_secret_access_key: some_secret
       aws_bucket: some_bucket
+      aws_region: us-east-1 # AWS region where the bucket is located
+      # endpoint_url: null # Optional: custom endpoint for S3-compatible services
+      # s3_path: vcons # Optional: prefix for S3 keys
   milvus:
     module: storage.milvus
     options:

server/storage/s3/README.md

Lines changed: 54 additions & 35 deletions
@@ -8,31 +8,51 @@ S3 storage provides scalable, durable object storage capabilities, making it ide
 
 ## Configuration
 
-Required configuration options:
+Configuration options:
 
 ```yaml
 storages:
   s3:
     module: storage.s3
     options:
-      bucket: your-bucket-name # S3 bucket name
-      region: us-west-2 # AWS region
-      access_key: your-access-key # AWS access key
-      secret_key: your-secret-key # AWS secret key
-      prefix: vcons/ # Optional: key prefix
-      endpoint_url: null # Optional: custom endpoint
+      # Required options
+      aws_access_key_id: your-access-key # AWS access key ID
+      aws_secret_access_key: your-secret-key # AWS secret access key
+      aws_bucket: your-bucket-name # S3 bucket name
+
+      # Optional options
+      aws_region: us-east-1 # AWS region (recommended to avoid cross-region errors)
+      endpoint_url: null # Custom endpoint for S3-compatible services (e.g., MinIO)
+      s3_path: vcons # Prefix for S3 keys (optional)
+```
+
+### Configuration Options
+
+| Option | Required | Description |
+|--------|----------|-------------|
+| `aws_access_key_id` | Yes | AWS access key ID for authentication |
+| `aws_secret_access_key` | Yes | AWS secret access key for authentication |
+| `aws_bucket` | Yes | Name of the S3 bucket to store vCons |
+| `aws_region` | No | AWS region where the bucket is located (e.g., `us-east-1`, `us-west-2`, `eu-west-1`). **Recommended** to avoid "AuthorizationHeaderMalformed" errors when the bucket is in a different region than the default. |
+| `endpoint_url` | No | Custom endpoint URL for S3-compatible services like MinIO, LocalStack, or other providers |
+| `s3_path` | No | Prefix path for organizing vCon objects within the bucket |
+
+### Region Configuration
+
+**Important:** If your S3 bucket is in a region other than `us-east-1`, you should explicitly set the `aws_region` option. Without this, you may encounter errors like:
+
+```
+AuthorizationHeaderMalformed: The authorization header is malformed;
+the region 'us-east-1' is wrong; expecting 'us-east-2'
 ```
 
 ## Features
 
-- Object storage
-- High availability
-- Durability
-- Versioning support
-- Lifecycle management
-- Automatic metrics logging
-- Encryption support
-- Access control
+- Object storage with automatic date-based key organization (`YYYY/MM/DD/uuid.vcon`)
+- High availability and durability
+- Support for custom S3-compatible endpoints (MinIO, LocalStack, etc.)
+- Configurable key prefix for organizing objects
+- Automatic error logging
 
 ## Usage
 
@@ -42,21 +62,24 @@ from storage import Storage
 # Initialize S3 storage
 s3_storage = Storage("s3")
 
-# Save vCon data
+# Save vCon data (retrieves from Redis and stores in S3)
 s3_storage.save(vcon_id)
 
 # Retrieve vCon data
 vcon_data = s3_storage.get(vcon_id)
 ```
 
-## Implementation Details
+## Key Structure
+
+vCons are stored with keys following this pattern:
+```
+[s3_path/]YYYY/MM/DD/uuid.vcon
+```
 
-The S3 storage implementation:
-- Uses boto3 for AWS S3 operations
-- Implements retry logic
-- Supports multipart uploads
-- Provides encryption
-- Includes automatic metrics logging
+For example, a vCon created on January 15, 2024 with UUID `abc123` and `s3_path: vcons` would be stored at:
+```
+vcons/2024/01/15/abc123.vcon
+```
 
 ## Dependencies
 
@@ -65,15 +88,11 @@ The S3 storage implementation:
 
 ## Best Practices
 
-1. Secure credential management
-2. Implement proper access control
-3. Use appropriate storage classes
-4. Enable versioning
-5. Configure lifecycle rules
-6. Implement proper error handling
-7. Use appropriate encryption
-8. Monitor costs
-9. Implement retry logic
-10. Use appropriate regions
-11. Enable logging
-12. Regular backup verification
+1. Always configure `aws_region` to match your bucket's region
+2. Use IAM roles with least-privilege access
+3. Enable bucket versioning for data protection
+4. Configure lifecycle rules for cost optimization
+5. Enable server-side encryption
+6. Use VPC endpoints for private connectivity
+7. Monitor with CloudWatch metrics
+8. Enable access logging for auditing
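
The key layout described in the README can be sketched in Python. The first function mirrors the `_build_s3_key` helper added in `server/storage/s3/__init__.py` in this commit, which, as shown in the diff, builds `[s3_path/]uuid.vcon` without a date segment. The second, `build_dated_s3_key`, is a hypothetical variant (not part of the commit) that would yield the documented `[s3_path/]YYYY/MM/DD/uuid.vcon` form, assuming `created_at` is an ISO 8601 string as the removed `save` code did.

```python
from datetime import datetime
from typing import Optional


def build_s3_key(vcon_uuid: str, s3_path: Optional[str] = None) -> str:
    """Mirror of the commit's `_build_s3_key`: optional prefix plus `<uuid>.vcon`."""
    key = f"{vcon_uuid}.vcon"
    if not s3_path:
        return key
    # Strip a trailing slash so "vcons" and "vcons/" give the same key.
    return f"{s3_path.rstrip('/')}/{key}"


def build_dated_s3_key(
    vcon_uuid: str, created_at: str, s3_path: Optional[str] = None
) -> str:
    """Hypothetical helper: adds the YYYY/MM/DD segment the README documents.

    Assumption: `created_at` is an ISO 8601 timestamp string.
    """
    date_part = datetime.fromisoformat(created_at).strftime("%Y/%m/%d")
    return build_s3_key(f"{date_part}/{vcon_uuid}", s3_path)
```

With `s3_path: vcons`, `build_dated_s3_key("abc123", "2024-01-15T10:30:00", "vcons")` gives the README's example key, while `build_s3_key("abc123", "vcons")` gives the undated form.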

server/storage/s3/__init__.py

Lines changed: 39 additions & 22 deletions
@@ -3,14 +3,46 @@
 from lib.logging_utils import init_logger
 from server.lib.vcon_redis import VconRedis
 import boto3
-from datetime import datetime
 
 logger = init_logger(__name__)
 
 
 default_options = {}
 
 
+def _create_s3_client(opts: dict):
+    """Create an S3 client with the provided options.
+
+    Required options:
+        aws_access_key_id: AWS access key ID
+        aws_secret_access_key: AWS secret access key
+
+    Optional options:
+        aws_region: AWS region (e.g., 'us-east-1', 'us-west-2')
+        endpoint_url: Custom endpoint URL for S3-compatible services
+    """
+    client_kwargs = {
+        "aws_access_key_id": opts["aws_access_key_id"],
+        "aws_secret_access_key": opts["aws_secret_access_key"],
+    }
+
+    if opts.get("aws_region"):
+        client_kwargs["region_name"] = opts["aws_region"]
+
+    if opts.get("endpoint_url"):
+        client_kwargs["endpoint_url"] = opts["endpoint_url"]
+
+    return boto3.client("s3", **client_kwargs)
+
+
+def _build_s3_key(vcon_uuid: str, s3_path: Optional[str] = None) -> str:
+    """Build the S3 object key for a vCon."""
+    key = f"{vcon_uuid}.vcon"
+    if not s3_path:
+        return key
+    return f"{s3_path.rstrip('/')}/{key}"
+
+
 def save(
     vcon_uuid,
     opts=default_options,
@@ -19,19 +51,9 @@ def save(
     try:
         vcon_redis = VconRedis()
         vcon = vcon_redis.get_vcon(vcon_uuid)
-        s3 = boto3.client(
-            "s3",
-            aws_access_key_id=opts["aws_access_key_id"],
-            aws_secret_access_key=opts["aws_secret_access_key"],
-        )
+        s3 = _create_s3_client(opts)
 
-        s3_path = opts.get("s3_path")
-        created_at = datetime.fromisoformat(vcon.created_at)
-        timestamp = created_at.strftime("%Y/%m/%d")
-        key = vcon_uuid + ".vcon"
-        destination_directory = f"{timestamp}/{key}"
-        if s3_path:
-            destination_directory = s3_path + "/" + destination_directory
+        destination_directory = _build_s3_key(vcon_uuid, opts.get("s3_path"))
         s3.put_object(
             Bucket=opts["aws_bucket"], Key=destination_directory, Body=vcon.dumps()
         )
@@ -45,15 +67,10 @@ def save(
 def get(vcon_uuid: str, opts=default_options) -> Optional[dict]:
     """Get a vCon from S3 by UUID."""
     try:
-        s3 = boto3.client(
-            "s3",
-            aws_access_key_id=opts["aws_access_key_id"],
-            aws_secret_access_key=opts["aws_secret_access_key"],
-        )
-
-        s3_path = opts.get("s3_path", "")
-        key = f"{s3_path}/{vcon_uuid}.vcon" if s3_path else f"{vcon_uuid}.vcon"
-
+        s3 = _create_s3_client(opts)
+
+        key = _build_s3_key(vcon_uuid, opts.get("s3_path"))
+
         response = s3.get_object(Bucket=opts["aws_bucket"], Key=key)
         return json.loads(response['Body'].read().decode('utf-8'))
 
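
The option handling in `_create_s3_client` can be tried without AWS access by separating the keyword-argument construction from the `boto3.client` call. The following is a minimal sketch under that assumption; `build_client_kwargs` is a hypothetical name that does not appear in the commit, and the credentials are placeholders. It shows that `region_name` and `endpoint_url` are only passed when the corresponding option is set to a non-empty value.

```python
from typing import Any, Dict


def build_client_kwargs(opts: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: build the kwargs the commit passes to boto3.client("s3", ...)."""
    # Credentials are required; a missing key raises KeyError, as in the commit.
    kwargs: Dict[str, Any] = {
        "aws_access_key_id": opts["aws_access_key_id"],
        "aws_secret_access_key": opts["aws_secret_access_key"],
    }
    # Optional settings are added only when present and truthy, so an unset
    # or null value leaves the client on its default region and endpoint.
    if opts.get("aws_region"):
        kwargs["region_name"] = opts["aws_region"]
    if opts.get("endpoint_url"):
        kwargs["endpoint_url"] = opts["endpoint_url"]
    return kwargs


# Placeholder options resembling the `options` block in example_config.yml.
example_opts = {
    "aws_access_key_id": "some_key",
    "aws_secret_access_key": "some_secret",
    "aws_bucket": "some_bucket",
    "aws_region": "us-east-2",
    "endpoint_url": None,
}
client_kwargs = build_client_kwargs(example_opts)
```

The resulting dictionary would then be passed as `boto3.client("s3", **client_kwargs)`. Note that `aws_bucket` is not a client argument; the commit supplies it per request as `Bucket=opts["aws_bucket"]`.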

tests/__init__.py

Whitespace-only changes.

tests/storage/__init__.py

Whitespace-only changes.
