Commit c91f490

add redirects script, docs and a test file with a single redirect
1 parent d23c6f5 commit c91f490

4 files changed, +323 -0 lines changed

.circleci/config.yml

Lines changed: 14 additions & 0 deletions
@@ -206,6 +206,20 @@ jobs:
             set -e
             echo "[INFO] Deploying production site..."
             aws s3 sync "$BUILD_DIRECTORY" "s3://$AWS_S3_BUCKET/"
+      - run:
+          name: Install Python Dependencies for Redirects
+          command: |
+            set -e
+            echo "[INFO] Installing PyYAML for redirect processing..."
+            sudo pip3 install PyYAML
+      - run:
+          name: Deploy Redirects to S3
+          command: |
+            AWS_S3_BUCKET=<< parameters.bucket_name >>
+
+            set -e
+            echo "[INFO] Deploying redirects with batch processing..."
+            bash scripts/deploy-redirects-batch.sh "$AWS_S3_BUCKET" "scripts/test-single-redirect.yml"
       - notify_error:
           message: "Production deployment job failed for branch ${CIRCLE_BRANCH}"

scripts/README-redirects.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
# S3 Redirects Deployment System

This documentation explains how to deploy URL redirects for the CircleCI documentation site using AWS S3's website redirect functionality.

## Overview

The redirect system uses AWS S3's built-in website redirect feature by creating empty objects with redirect metadata. When a user visits an old URL, S3 automatically redirects them to the new URL with a 301 (permanent redirect) status.

## Files

- `scripts/redirects_v2.yml` - YAML file containing all redirect mappings
- `scripts/deploy-redirects-batch.sh` - Optimized batch deployment script

## How It Works

1. **Redirect File Format**: The `redirects_v2.yml` file contains redirect mappings:
   ```yaml
   - old: /about-circleci/
     new: /guides/about-circleci/about-circleci/index.html
   ```

2. **Deployment Process**:
   - Parse the YAML file
   - For each redirect, create an S3 object at the old path
   - Set the `x-amz-website-redirect-location` metadata to the new path
   - S3 automatically handles the redirect

3. **URL Mapping**:
   - Old paths like `/about-circleci/` become S3 objects at `about-circleci/index.html`
   - When accessed, S3 returns a 301 redirect to the new location
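
The URL mapping described above can be sketched in a few lines of Python. This is a minimal illustration of the normalization the batch script performs, not the script itself: the function names `old_path_to_s3_key` and `normalize_target` are hypothetical, and mapping the site root to `index.html` is an assumption made here for completeness.

```python
def old_path_to_s3_key(old_path: str) -> str:
    """Map an old URL path to the S3 object key that carries the redirect."""
    # S3 keys have no leading "/", and trailing slashes are normalized away.
    key = old_path.strip("/")
    if not key:
        # Assumption: the site root maps to the bucket's index document.
        return "index.html"
    if key.endswith(".html"):
        # Paths that already name an HTML file are used as-is.
        return key
    # Everything else is treated as a directory-style URL.
    return key + "/index.html"


def normalize_target(new_path: str) -> str:
    """Ensure the redirect target starts with "/"."""
    return new_path if new_path.startswith("/") else "/" + new_path
```

For example, `old_path_to_s3_key("/about-circleci/")` yields `about-circleci/index.html`, which matches the mapping listed above.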

## Usage

### Deploy Redirects

The redirect deployment is integrated into the CircleCI pipeline and runs automatically after the main site deployment.

Manual deployment:
```bash
bash scripts/deploy-redirects-batch.sh "bucket-name" "scripts/redirects_v2.yml"
```

## Performance Considerations

### Batch Processing
The `deploy-redirects-batch.sh` script is optimized for handling redirects efficiently:
- Processes redirects in batches of 50
- Uses concurrent uploads (up to 10 parallel requests per batch)
- Applies a 30-second timeout to each request and records failures per redirect; failed uploads are reported but not retried
- Exits with an error only when more than 10% of redirects fail
- Is intended to handle hundreds of redirects in a single run

### Rate Limiting
- The script has no explicit rate limiter; request volume is bounded by the 10-worker thread pool
- Batches run one after another, so at most 10 requests are in flight at a time
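
The batching and concurrency pattern can be illustrated with a short, self-contained sketch. It assumes a hypothetical `worker` callable that returns `True` on success and stands in for the real S3 upload; the names `chunked` and `run_in_batches` are illustrative and do not appear in the script.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(items, worker, batch_size=50, max_workers=10):
    """Apply `worker` to every item, one batch at a time, and tally results.

    `worker` is a hypothetical callable that returns True on success.
    """
    succeeded, failed = 0, []
    for batch in chunked(items, batch_size):
        # A fresh pool per batch caps in-flight work at `max_workers`.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {pool.submit(worker, item): item for item in batch}
            for future in as_completed(futures):
                if future.result():
                    succeeded += 1
                else:
                    failed.append(futures[future])
    return succeeded, failed
```

Because each batch finishes before the next one starts, the thread pool size is the only limit on concurrency; there is no delay between requests.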

## CircleCI Integration

The redirect deployment is integrated into the `deploy-production` job:

1. **Deploy Main Site**: Syncs the Antora build to S3
2. **Install Dependencies**: Installs PyYAML for YAML parsing
3. **Deploy Redirects**: Creates redirect objects using the batch script

The job currently passes `scripts/test-single-redirect.yml` to the batch script, so only the single test redirect is deployed until the step is pointed at `scripts/redirects_v2.yml`.

## Adding New Redirects

1. Edit `scripts/redirects_v2.yml`
2. Add new redirect mapping:
   ```yaml
   - old: /old-path/
     new: /new-path/index.html
   ```
3. Commit and push - redirects are deployed automatically once the pipeline references this file

## Redirect Format Guidelines

- **Old paths**: Should start with `/` and can end with or without `/` (a missing leading `/` is added)
- **New paths**: Should be the full path to the new location
- **Index files**: Old paths that do not end in `.html` get `/index.html` appended
- **Trailing slashes**: Old paths are normalized (trailing slashes removed)
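
These guidelines could be checked before deployment with a small validator. The sketch below is an illustration only: `validate_redirects` is a hypothetical helper that works on already-parsed entries (a list of dicts, as `yaml.safe_load` would return), and the duplicate check is an extra assumption that the guidelines do not state.

```python
def validate_redirects(entries):
    """Return a list of human-readable problems found in parsed redirect entries."""
    problems = []
    seen = set()
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict) or "old" not in entry or "new" not in entry:
            problems.append(f"entry {index}: expected a mapping with 'old' and 'new'")
            continue
        old, new = str(entry["old"]), str(entry["new"])
        if not old.startswith("/"):
            problems.append(f"entry {index}: old path {old!r} should start with '/'")
        if not new.startswith("/"):
            problems.append(f"entry {index}: new path {new!r} should start with '/'")
        # Trailing slashes are ignored when comparing old paths.
        normalized = old.rstrip("/")
        if normalized in seen:
            # Assumption: two entries for the same old path are a mistake.
            problems.append(f"entry {index}: duplicate old path {old!r}")
        seen.add(normalized)
    return problems
```

An empty result means every entry has both keys, uses absolute paths, and does not repeat an old path.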

## Troubleshooting

### Common Issues

1. **Permission Errors**: Ensure AWS credentials have S3 write permissions
2. **YAML Parse Errors**: Validate YAML syntax in redirects file
3. **S3 Bucket Errors**: Verify bucket name and region settings

### Checking Redirect Status

Use curl to test individual redirects:
```bash
curl -I "https://circleci.com/docs/about-circleci/"
```

Should return:
```
HTTP/1.1 301 Moved Permanently
Location: /guides/about-circleci/about-circleci/index.html
```
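
The same check can be scripted when several redirects need verifying. The following is a minimal sketch using only the Python standard library; `fetch_redirect` is a hypothetical helper name, and it issues a `HEAD` request without following the redirect, much as `curl -I` does.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(url, timeout=10):
    """Send a HEAD request without following redirects.

    Returns a (status_code, location_header) tuple; the header is None
    when the response carries no Location.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `fetch_redirect("https://circleci.com/docs/about-circleci/")` would return the status code and `Location` header to compare against the expected values above.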

### Debugging

1. Check CircleCI logs for deployment errors
2. Test redirects manually with curl or browser
3. Manually inspect S3 objects and their metadata

## Best Practices

1. **Test Manually**: Use curl or browser to spot-check redirects when needed
2. **Batch Operations**: Use the batch script for large numbers of redirects
3. **Monitor Performance**: Keep an eye on deployment times and error rates
4. **Clean URLs**: Ensure redirect paths are clean and consistent
5. **Backup**: Keep a backup of working redirect files before major changes

## AWS S3 Website Configuration

Ensure your S3 bucket is configured for static website hosting:
```bash
aws s3 website s3://your-bucket-name --index-document index.html --error-document 404.html
```

The redirect functionality requires website hosting to be enabled on the bucket.

scripts/deploy-redirects-batch.sh

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
#!/bin/bash

set -e

# Check required parameters
if [ -z "$1" ] || [ -z "$2" ]; then
  echo "Usage: $0 <bucket_name> <redirects_file>"
  echo "Example: $0 circleci-docs-platform-assets/docs-preview scripts/redirects_v2.yml"
  exit 1
fi

BUCKET_NAME="$1"
REDIRECTS_FILE="$2"
TEMP_DIR=$(mktemp -d)
BATCH_SIZE=50

echo "[INFO] Processing redirects from $REDIRECTS_FILE to bucket s3://$BUCKET_NAME"
echo "[INFO] Using temporary directory: $TEMP_DIR"

# Cleanup function
cleanup() {
  echo "[INFO] Cleaning up temporary files..."
  rm -rf "$TEMP_DIR"
}
trap cleanup EXIT

# Check if redirects file exists
if [ ! -f "$REDIRECTS_FILE" ]; then
  echo "[ERROR] Redirects file not found: $REDIRECTS_FILE"
  exit 1
fi

# Parse YAML and create redirect objects in batches.
# The arguments must come before the heredoc: "-" tells python3 to read the
# program from stdin, and the remaining words become sys.argv[1:]. The closing
# EOF has to stand alone on its line, otherwise the heredoc never terminates.
python3 - "$BUCKET_NAME" "$REDIRECTS_FILE" "$TEMP_DIR" "$BATCH_SIZE" << 'EOF'
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed

import yaml

bucket_name = sys.argv[1]
redirects_file = sys.argv[2]
temp_dir = sys.argv[3]  # passed by the wrapper; not used by the Python code
batch_size = int(sys.argv[4])

print(f"[INFO] Loading redirects from {redirects_file}")

def create_redirect_object(bucket, key, redirect_location):
    """Create a single redirect object using aws s3api put-object"""
    try:
        cmd = [
            'aws', 's3api', 'put-object',
            '--bucket', bucket,
            '--key', key,
            '--website-redirect-location', redirect_location,
            '--content-type', 'text/html',
            '--content-length', '0'
        ]

        subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=30)
        return True, key, None
    except subprocess.CalledProcessError as e:
        return False, key, f"S3 API error: {e.stderr.strip()}"
    except subprocess.TimeoutExpired:
        return False, key, "Request timeout"
    except Exception as e:
        return False, key, f"Unexpected error: {str(e)}"

def process_batch(batch_redirects):
    """Process a batch of redirects with thread pool"""
    success_count = 0
    error_count = 0
    errors = []

    # Use ThreadPoolExecutor for concurrent uploads
    with ThreadPoolExecutor(max_workers=10) as executor:
        future_to_redirect = {}

        for redirect in batch_redirects:
            old_path = redirect['old'].rstrip('/')
            new_path = redirect['new']

            # Ensure paths start with /
            if not old_path.startswith('/'):
                old_path = '/' + old_path
            if not new_path.startswith('/'):
                new_path = '/' + new_path

            # S3 object key (remove leading slash for S3)
            s3_key = old_path.lstrip('/')

            # Keys that don't already end in .html get index.html appended.
            # Trailing slashes were stripped above, so the only special case
            # is the site root ("/"), which maps to index.html itself.
            if not s3_key:
                s3_key = 'index.html'
            elif not s3_key.endswith('.html'):
                s3_key += '/index.html'

            future = executor.submit(create_redirect_object, bucket_name, s3_key, new_path)
            future_to_redirect[future] = (old_path, new_path, s3_key)

        # Collect results
        for future in as_completed(future_to_redirect):
            old_path, new_path, s3_key = future_to_redirect[future]
            success, key, error = future.result()

            if success:
                success_count += 1
                print(f"[SUCCESS] {old_path} -> {new_path}")
            else:
                error_count += 1
                error_msg = f"{old_path} -> {new_path}: {error}"
                errors.append(error_msg)
                print(f"[ERROR] {error_msg}")

    return success_count, error_count, errors

try:
    with open(redirects_file, 'r') as f:
        redirects = yaml.safe_load(f)

    if not redirects:
        print("[WARN] No redirects found in file")
        sys.exit(0)

    total_redirects = len(redirects)
    print(f"[INFO] Found {total_redirects} redirects to process")
    print(f"[INFO] Processing in batches of {batch_size} with concurrent uploads")

    total_success = 0
    total_errors = 0
    all_errors = []

    # Process redirects in batches
    for i in range(0, total_redirects, batch_size):
        batch = redirects[i:i + batch_size]
        batch_num = (i // batch_size) + 1
        total_batches = (total_redirects + batch_size - 1) // batch_size

        print(f"[INFO] Processing batch {batch_num}/{total_batches} ({len(batch)} redirects)")

        success_count, error_count, errors = process_batch(batch)
        total_success += success_count
        total_errors += error_count
        all_errors.extend(errors)

        print(f"[INFO] Batch {batch_num} complete: {success_count} success, {error_count} errors")

    print(f"[INFO] Redirect processing complete:")
    print(f"[INFO] - Total redirects processed: {total_redirects}")
    print(f"[INFO] - Successfully created: {total_success}")
    print(f"[INFO] - Errors: {total_errors}")

    if all_errors:
        print(f"[ERROR] Failed redirects:")
        for error in all_errors[:10]:  # Show first 10 errors
            print(f"[ERROR] - {error}")
        if len(all_errors) > 10:
            print(f"[ERROR] ... and {len(all_errors) - 10} more errors")

    # Exit with error if too many failed
    if total_errors > total_redirects * 0.1:  # More than 10% failed
        print(f"[ERROR] Too many redirects failed ({total_errors}/{total_redirects})")
        sys.exit(1)

    if total_errors > 0:
        print(f"[WARN] {total_errors} redirects failed, but continuing since error rate is acceptable")

except Exception as e:
    print(f"[ERROR] Failed to process redirects: {e}")
    import traceback
    traceback.print_exc()
    sys.exit(1)

EOF

echo "[INFO] Redirects deployment completed successfully"

scripts/test-single-redirect.yml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
# Test redirect file with a single redirect
# Format: old (Jekyll) -> new (Antora)

- old: /about-circleci/
  new: /guides/about-circleci/about-circleci/index.html
