Commit 46ab7c7

sragss and claude committed
docs: add EC2 infrastructure reference and resize recommendations
Documents current instance specs, volume history (25 GB → 100 GB expansion), disk usage breakdown, and recommended next resize path (t3.xlarge) with rationale and procedure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent c2394ad commit 46ab7c7

1 file changed: +67 -0 lines changed

.claude/infrastructure.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
# EC2 Infrastructure

## Current Setup
- **Instance:** `i-0e2fb16c6e0fb83d0` (t3.large, 2 vCPU, 8 GB RAM)
- **Volume:** `vol-0ce6db6bfe7959e24` (gp3, 100 GB)
- **Elastic IP:** `100.31.229.192` (eipalloc-0155d2c49a8667e13)
- **Security group:** `sg-083530ad37dce5694` (craigclaw-sg)
- **Region:** us-east-1
- **SSH:** `ssh -i ~/.ssh/craigclaw-key.pem ubuntu@100.31.229.192`

## Volume History
- Originally 25 GB (default), expanded to 100 GB on 2026-02-13
- Snapshot `snap-02aef7fb07c1e218a` taken pre-resize as backup
- Expansion was in-place (growpart + resize2fs), no reboot needed

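For reference, the on-box half of an in-place expansion like the one above can be sketched as follows. This is a dry-run sketch, not a record of the commands actually used: the device name `/dev/nvme0n1`, the partition number, and the ext4 root filesystem are assumptions (check `lsblk` first), and the EBS volume itself must already have been enlarged (for example with `aws ec2 modify-volume`) before these steps do anything.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumptions (not taken from the notes above): NVMe device naming and an
# ext4 root filesystem on partition 1. Confirm with `lsblk` before running.
DEVICE="${DEVICE:-/dev/nvme0n1}"
PARTITION="${PARTITION:-1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

expand_root_fs() {
  run sudo growpart "$DEVICE" "$PARTITION"      # grow the partition to fill the volume
  run sudo resize2fs "${DEVICE}p${PARTITION}"   # grow the ext4 filesystem to fill the partition
  run df -h /                                   # confirm the new size
}

expand_root_fs
```

Run it once as-is to review the printed commands, then with `DRY_RUN=0` on the instance to apply them.
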
## Instance History
- Started as t3.small (2 GB RAM). OOM incident on 2026-02-11: the OpenClaw
  gateway hit ~866 MB RSS, the OOM killer looped, and SSH was starved.
  Resized to t3.large (8 GB).

## Disk Usage (as of 2026-02-13, post-expansion)
Major consumers: pnpm store (~6 GB); the poncho, StableStudio, and x402email
repos with their node_modules (~7.4 GB); and the npm cache (~1.3 GB). These are
left over from other projects and can be cleaned up if space gets tight again.

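To re-check these numbers later, a small helper along these lines may be useful. It is a sketch: the paths in the usage comment are the usual Linux defaults for the pnpm store and the npm cache, not locations confirmed for this box, and paths containing spaces are not handled.

```bash
#!/usr/bin/env bash
set -euo pipefail

# report_usage DIR...
# Print the size of each directory that exists, largest first.
# Directories that do not exist are skipped silently.
report_usage() {
  local dir
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      du -sk "$dir"
    fi
  done | sort -rn | awk '{ printf "%.1f MB\t%s\n", $1 / 1024, $2 }'
}

# Example (assumed default locations, adjust to the actual box):
#   report_usage "$HOME/.local/share/pnpm/store" "$HOME/.npm"
```

If space does get tight, `pnpm store prune` and `npm cache clean --force` are the standard ways to reclaim the store and cache.
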
## Recommended Next Resize: t3.xlarge

When Craig needs more power, the recommended upgrade is **t3.xlarge**
(4 vCPU, 16 GB RAM, ~$121/mo vs current ~$61/mo).

**Why t3.xlarge:**
- Craig's workload is bursty (agent gets task, builds, idles). Burstable
  instances are designed for this: they accumulate CPU credits while idle and
  spend them during builds.
- 4 vCPUs double the parallel build capacity of the current 2.
- 16 GB RAM gives headroom for the OpenClaw gateway (~900 MB peak), builds, and
  pnpm installs running simultaneously, which makes a repeat of the OOM
  incident much less likely.
- Zero-migration resize: stop the instance, change the type, start it again.
  ~1 min downtime. The Elastic IP stays attached.

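A rough worked example of the credit model behind the first point. The figures are assumptions taken from AWS's published T3 table rather than from these notes: a t3.xlarge earns 96 credits/hour, banks up to 2304, and one credit pays for one vCPU at 100% for one minute. `burst_minutes` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
set -euo pipefail

# burst_minutes BALANCE VCPUS EARN_PER_HOUR
# Minutes the instance can run every vCPU at 100% before the credit
# balance is exhausted, allowing for credits still earned while bursting.
burst_minutes() {
  awk -v bal="$1" -v vcpus="$2" -v earn="$3" 'BEGIN {
    drain = vcpus - earn / 60   # net credits spent per minute
    printf "%.0f\n", bal / drain
  }'
}

# Assumed t3.xlarge figures: full balance of 2304, 4 vCPUs, 96 credits/hour.
burst_minutes 2304 4 96   # prints 960
```

Under those assumptions a full balance covers about 16 hours of all-core builds, far more than an agent task typically needs. T3 instances launch in unlimited mode by default, so an empty balance would most likely show up as surplus-credit charges rather than throttling.
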
**Why not dedicated CPU (c7i/m7i):**
- Dedicated cores matter for sustained CPU (CI servers, databases). Craig is
  idle most of the time, so paying for dedicated cores he won't use is waste.
- c7i.xlarge is ~$130/mo for the same 4 vCPU but only 8 GB RAM (half of
  t3.xlarge), and it has no burst credits.

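The monthly figures quoted above can be reproduced from hourly rates. A sketch, assuming on-demand Linux pricing in us-east-1; the hourly rates below are not stated in these notes and should be checked against the current AWS price list.

```bash
#!/usr/bin/env bash
set -euo pipefail

HOURS_PER_MONTH=730   # 8760 hours per year / 12

# monthly_cost HOURLY_RATE
# Whole-dollar monthly cost for an instance running 24/7.
monthly_cost() {
  awk -v rate="$1" -v hours="$HOURS_PER_MONTH" 'BEGIN { printf "%.0f\n", rate * hours }'
}

# Assumed hourly rates (USD), verify before relying on them.
echo "t3.large   \$$(monthly_cost 0.0832)/mo"   # current
echo "t3.xlarge  \$$(monthly_cost 0.1664)/mo"   # recommended
echo "c7i.xlarge \$$(monthly_cost 0.1785)/mo"   # dedicated-CPU alternative
```

With those rates this prints $61, $121, and $130 per month, matching the figures above.
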
**Why not ARM/Graviton (c7g/m7g):**
- ~20% cheaper per core, but requires a full instance migration (new AMI,
  rebuilt environment, transferred secrets, re-tested deploy pipeline). That is
  hours of work to save ~$15-20/mo.
- Revisit if we ever rebuild the box from scratch.

**Resize procedure:**
```bash
# 1. Stop instance
aws ec2 stop-instances --instance-ids i-0e2fb16c6e0fb83d0 --region us-east-1
aws ec2 wait instance-stopped --instance-ids i-0e2fb16c6e0fb83d0 --region us-east-1

# 2. Change type (only possible while the instance is stopped)
aws ec2 modify-instance-attribute --instance-id i-0e2fb16c6e0fb83d0 \
  --instance-type '{"Value":"t3.xlarge"}' --region us-east-1

# 3. Start instance (the Elastic IP stays associated across stop/start)
aws ec2 start-instances --instance-ids i-0e2fb16c6e0fb83d0 --region us-east-1
aws ec2 wait instance-running --instance-ids i-0e2fb16c6e0fb83d0 --region us-east-1

# 4. Verify
ssh -i ~/.ssh/craigclaw-key.pem ubuntu@100.31.229.192 "nproc && free -h"
```
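
The verify step can be turned into a pass/fail check instead of eyeballing `free -h`. A minimal sketch: `check_specs` is a hypothetical helper, and the 90% memory threshold is an assumption that allows for the RAM the kernel reserves.

```bash
#!/usr/bin/env bash
set -euo pipefail

# check_specs ACTUAL_CPUS ACTUAL_MEM_KB EXPECTED_CPUS EXPECTED_GB
# Succeeds when the CPU count matches and MemTotal is at least 90% of the
# nominal size (MemTotal is always somewhat below the advertised RAM).
check_specs() {
  local cpus="$1" mem_kb="$2" want_cpus="$3" want_gb="$4"
  local min_kb=$(( want_gb * 1024 * 1024 * 90 / 100 ))
  if [ "$cpus" -ne "$want_cpus" ]; then
    echo "FAIL: expected $want_cpus vCPUs, found $cpus"
    return 1
  fi
  if [ "$mem_kb" -lt "$min_kb" ]; then
    echo "FAIL: expected at least ${min_kb} kB RAM, found ${mem_kb} kB"
    return 1
  fi
  echo "OK: $cpus vCPUs, ${mem_kb} kB RAM"
}

# On the instance, after the resize to t3.xlarge:
#   check_specs "$(nproc)" "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)" 4 16
```
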
