Terraform Infrastructure for SLEAP-RTC Signaling Server

This directory contains Terraform configuration for deploying the SLEAP-RTC signaling server infrastructure on AWS.

Overview

The Terraform configuration provides:

Elastic IP for stable DNS across instance replacements
EC2 instance running the signaling server in Docker
Security groups for controlled network access
IAM roles for CloudWatch and ECR permissions
Automated startup via user-data script
Health checks with automatic container restart
Multi-environment support (dev, staging, production)

Prerequisites

Required Tools

Terraform >= 1.5

# macOS
brew install terraform

# Linux
wget https://releases.hashicorp.com/terraform/1.6.0/terraform_1.6.0_linux_amd64.zip
unzip terraform_1.6.0_linux_amd64.zip
sudo mv terraform /usr/local/bin/
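
Since the configuration requires Terraform 1.5 or newer, it can help to confirm the installed version before going further. The following is a minimal sketch, not part of this repository: `version_ge` is a hypothetical helper name, and it only compares numeric dotted versions.

```shell
# Return 0 when version $1 is greater than or equal to version $2.
# Components are compared numerically, so 1.10 counts as newer than 1.5.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0   # missing components count as 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Usage (requires terraform on PATH):
#   installed=$(terraform version | head -n 1 | sed 's/^Terraform v//')
#   version_ge "$installed" 1.5 || echo "Terraform >= 1.5 required" >&2
```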

AWS CLI

# macOS
brew install awscli

# Linux
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

Required AWS Permissions

Your AWS user/role needs permissions for:

EC2 (instances, security groups, elastic IPs)
IAM (roles, policies, instance profiles)
VPC (describe VPCs)

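Example IAM policy (a broad starting point: ec2:* on all resources is convenient but wide, and Terraform may also need IAM read actions such as iam:GetInstanceProfile and iam:GetRolePolicy):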

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:*",
        "iam:CreateRole",
        "iam:DeleteRole",
        "iam:PutRolePolicy",
        "iam:DeleteRolePolicy",
        "iam:GetRole",
        "iam:PassRole",
        "iam:CreateInstanceProfile",
        "iam:DeleteInstanceProfile",
        "iam:AddRoleToInstanceProfile",
        "iam:RemoveRoleFromInstanceProfile"
      ],
      "Resource": "*"
    }
  ]
}

AWS Credentials Setup

Configure AWS credentials using the AWS CLI:

aws configure

You'll be prompted for:

AWS Access Key ID
AWS Secret Access Key
Default region (e.g., us-west-1)
Output format (e.g., json)

Verify configuration:

aws sts get-caller-identity

Directory Structure

terraform/
├── modules/
│   └── signaling-server/       # Reusable server module
│       ├── main.tf             # EC2, EIP, security groups, IAM
│       ├── variables.tf        # Configurable inputs
│       ├── outputs.tf          # EIP, URLs, instance ID
│       └── user-data.sh        # Automated startup script
└── environments/
    ├── dev/                    # Development environment
    │   ├── main.tf
    │   ├── variables.tf
    │   └── terraform.tfvars.example
    └── production/             # Production environment
        ├── main.tf
        ├── variables.tf
        └── terraform.tfvars.example

First-Time Deployment

1. Choose Environment

cd terraform/environments/dev  # or production

2. Create Configuration File

Copy the example and customize:

cp terraform.tfvars.example terraform.tfvars

Edit terraform.tfvars with your values:

Docker image
Cognito configuration
Network CIDR blocks (restrict in production!)
Your admin IP for SSH access

IMPORTANT: Never commit terraform.tfvars if it contains sensitive data!
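
One way to guard against committing terraform.tfvars by accident is to list it in .gitignore. The sketch below assumes the environment directory sits inside a Git repository. `ignore_pattern` is a hypothetical helper, and the state-file patterns in the usage comment are an optional addition (state files can also hold sensitive values).

```shell
# Append a pattern to a .gitignore file only if that exact line is not
# already present, so the function is safe to run repeatedly.
ignore_pattern() {
    pattern=$1
    file=${2:-.gitignore}
    touch "$file"
    if ! grep -qxF -- "$pattern" "$file"; then
        printf '%s\n' "$pattern" >> "$file"
    fi
}

# Usage, run from the environment directory:
#   ignore_pattern 'terraform.tfvars'
#   ignore_pattern '*.tfstate'
#   ignore_pattern '*.tfstate.*'
#   ignore_pattern '.terraform/'
```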

3. Initialize Terraform

terraform init

This downloads the AWS provider and initializes the backend.

4. Review Plan

terraform plan

Review which resources will be created. Look for:

1 EC2 instance
1 Elastic IP
1 Security group
1 IAM role + instance profile
2 IAM policies
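
To compare the plan against the list above without reading the full output, you can pull the count from the summary line that terraform plan prints ("Plan: N to add, ..."). This is a rough sketch. `planned_adds` is a hypothetical helper, and the exact resource count depends on how the module defines its resources, so treat the number as a sanity check only.

```shell
# Read `terraform plan -no-color` output on stdin and print how many
# resources Terraform intends to add. Prints 0 when no "Plan:" line exists
# (for example when there are no changes).
planned_adds() {
    n=$(sed -n 's/^Plan:.*[^0-9]\([0-9][0-9]*\) to add.*/\1/p' | head -n 1)
    printf '%s\n' "${n:-0}"
}

# Usage:
#   terraform plan -no-color | planned_adds
```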

5. Apply Configuration

terraform apply

Type yes to confirm. Deployment takes ~5 minutes.

6. Save Outputs

terraform output

Example output:

signaling_server_ip = "54.176.92.10"
websocket_url = "ws://54.176.92.10:8080"
http_url = "http://54.176.92.10:8001"
instance_id = "i-0123456789abcdef0"

Save the Elastic IP: it is your stable address and persists across instance replacements until you run terraform destroy.

Updating Infrastructure

Updating Instance Size

Edit terraform.tfvars:

instance_type = "t3.medium"  # was t3.small

Apply changes:
```
terraform apply
```

Terraform will:

Create new instance with new size
Move Elastic IP to new instance (same IP!)
Destroy old instance

Downtime: ~30 seconds (while EIP moves)

Updating Docker Image Version

Edit terraform.tfvars:

docker_image = "ghcr.io/talmolab/webrtc-server:new-version"

Apply changes:
```
terraform apply
```

The instance will be recreated with the new image.

Updating Security Rules

Edit terraform.tfvars:

allowed_cidr_blocks = ["192.168.1.0/24"]  # More restrictive

Apply changes:
```
terraform apply
```

Security group rules update immediately (no instance recreation).

Destroying Infrastructure

To tear down all resources:

terraform destroy

Type yes to confirm.

WARNING: This deletes:

EC2 instance
Elastic IP (will be released)
Security group
IAM roles

All data on the instance is lost. Redeploying allocates a new Elastic IP, which will generally differ from the released one.

Verification and Testing

Check Instance Status

# Get instance ID from terraform output
INSTANCE_ID=$(terraform output -raw instance_id)

# Check instance status
aws ec2 describe-instances --instance-ids "$INSTANCE_ID"
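
The full describe-instances response is long. If you only need the lifecycle state, the AWS CLI's global --query and --output options can narrow it down. A small sketch follows, where `instance_state` is a hypothetical wrapper name:

```shell
# Print only the lifecycle state (e.g. "running", "stopped") of one instance.
instance_state() {
    aws ec2 describe-instances \
        --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].State.Name' \
        --output text
}

# Usage:
#   instance_state "$(terraform output -raw instance_id)"
```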

SSH to Instance (for debugging)

# Get Elastic IP from terraform output
EIP=$(terraform output -raw signaling_server_ip)

# SSH as ubuntu user
ssh "ubuntu@$EIP"

Once connected:

# Check Docker container status
docker ps

# View container logs
docker logs sleap-rtc-signaling

# Check health check logs
cat /var/log/healthcheck.log

Test Connectivity

# Get URLs from terraform output
WEBSOCKET_URL=$(terraform output -raw websocket_url)
HTTP_URL=$(terraform output -raw http_url)

# Test HTTP API (if health endpoint exists)
curl "$HTTP_URL/health"

# Test WebSocket (requires WebSocket client)
# Use your SLEAP-RTC client or wscat:
# wscat -c $WEBSOCKET_URL
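
Right after terraform apply the container may still be starting, so a single curl can fail even when the deployment is healthy. A generic retry helper is one way to wait for it. This is a sketch: `retry` is a hypothetical function, and the /health path is only valid if the server exposes such an endpoint.

```shell
# Run a command until it succeeds, up to $1 attempts, sleeping $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt is still to come.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage: poll for up to about 5 minutes (30 tries, 10 seconds apart).
#   retry 30 10 curl --silent --fail --output /dev/null "$HTTP_URL/health"
```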

Cost Estimates

Per Environment (Monthly, us-west-1)

Development (t3.small):

EC2 instance: ~$15/month
Elastic IP: ~$3.60/month (public IPv4 charge, billed even while attached)
Data transfer out: ~$0.09/GB
Total: ~$15-20/month

Production (t3.medium):

EC2 instance: ~$30/month
Elastic IP: ~$3.60/month (public IPv4 charge, billed even while attached)
Data transfer out: ~$0.09/GB
Total: ~$30-40/month

Cost Optimizations:

Dev: Stop instance overnight (aws ec2 stop-instances) - saves ~60% of instance cost
Staging: Destroy when not testing (terraform destroy)
Production: Run 24/7

Note: Since February 2024, AWS bills every public IPv4 address, including Elastic IPs, at $0.005/hour (about $3.60 for a 30-day month), whether or not the address is attached to a running instance.
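
The $3.60 Elastic IP figure above is $0.005 per hour over a 30-day (720-hour) month, and the ~60% saving from stopping dev overnight corresponds to roughly 14 stopped hours per day. The sketch below reproduces both numbers. The function names and the 14-hour schedule are illustrative assumptions, and a stopped instance still incurs EBS storage charges.

```shell
# Monthly cost in dollars for an hourly rate, assuming a 720-hour month.
monthly_cost() {
    awk -v rate="$1" 'BEGIN { printf "%.2f\n", rate * 720 }'
}

# Percentage of instance-hours saved by stopping for $1 hours out of every 24.
stop_savings_pct() {
    awk -v off="$1" 'BEGIN { printf "%.0f\n", off / 24 * 100 }'
}

# Usage:
#   monthly_cost 0.005     # Elastic IP at $0.005/hour
#   stop_savings_pct 14    # stopped 14 hours per day
```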

Troubleshooting

Container Not Starting

SSH to instance:
```
ssh ubuntu@<elastic-ip>
```
Check Docker status:
```
systemctl status docker
```

Check user-data execution:

cat /var/log/user-data-complete.log
tail -50 /var/log/cloud-init-output.log

Try starting container manually:

docker start sleap-rtc-signaling
docker logs sleap-rtc-signaling

Terraform Apply Fails

Error: VPC not found

Solution: Ensure a default VPC exists in the target region of your AWS account (aws ec2 create-default-vpc recreates one)

Error: IAM permissions denied

Solution: Check that your AWS user has the required permissions (see Prerequisites)

Error: Instance type not available

Solution: Try a different instance type or a different AWS region

Cannot Connect to Server

Check security group:

# Verify your IP is in allowed_cidr_blocks
curl ifconfig.me  # Get your IP

Test from a different network to rule out local firewall issues

Check container is running:

ssh ubuntu@<elastic-ip>
docker ps | grep sleap-rtc-signaling
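
To check whether the address reported by ifconfig.me actually falls inside one of your allowed_cidr_blocks entries, you can do the comparison in plain shell arithmetic. This is a sketch for IPv4 only: `ip_to_int` and `ip_in_cidr` are hypothetical helpers and do not validate their input.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if IPv4 address $1 lies inside CIDR block $2 (e.g. 192.168.1.0/24).
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    base=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Netmask with the top $bits bits set and the rest cleared.
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( base & mask )) ]
}

# Usage:
#   ip_in_cidr "$(curl -s ifconfig.me)" 192.168.1.0/24 && echo allowed
```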

Elastic IP Changed

The Elastic IP should persist across terraform apply runs. If it changed:

Check terraform state: terraform show | grep aws_eip
Possible cause: Used terraform destroy (releases EIP)
Next terraform apply allocates a new EIP

To keep the same EIP, don't destroy; update in place with terraform apply.

Advanced: State Management

Local State (Current Setup)

Terraform state is stored locally in terraform.tfstate.

Pros: Simple
Cons: Can't collaborate, no locking

Remote State (Recommended for Teams)

Use S3 + DynamoDB for shared state:

Create S3 bucket and DynamoDB table (S3 bucket names are globally unique, so you may need to pick a different name):

aws s3 mb s3://sleap-rtc-terraform-state
aws dynamodb create-table \
  --table-name terraform-locks \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST

Add backend to main.tf:

terraform {
  backend "s3" {
    bucket         = "sleap-rtc-terraform-state"
    key            = "dev/terraform.tfstate"
    region         = "us-west-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

Migrate state:
```
terraform init -migrate-state
```
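
Because each environment keeps its own state, each one presumably needs a backend block with a distinct key. The sketch below generates that block, assuming the key follows the `<environment>/terraform.tfstate` pattern used for dev above. `backend_block` and the backend.tf file name are hypothetical.

```shell
# Print an S3 backend block for one environment; only the state key differs.
backend_block() {
    env_name=$1
    cat <<EOF
terraform {
  backend "s3" {
    bucket         = "sleap-rtc-terraform-state"
    key            = "${env_name}/terraform.tfstate"
    region         = "us-west-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}
EOF
}

# Usage (any .tf file in the environment directory works):
#   backend_block production > environments/production/backend.tf
```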

Multi-Environment Deployment

Deploy to Multiple Environments

# Deploy dev (run from the terraform/ directory)
cd environments/dev
terraform apply

# Deploy production (isolated state)
cd ../production
terraform apply

Each environment:

Has separate Elastic IP
Has independent infrastructure
Can be deployed/destroyed independently

Staging Environment (Optional)

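# Create staging from the dev template (run from the terraform/ directory)
cp -r environments/dev environments/staging
cd environments/staging

# Remove dev's local state and provider cache so staging starts clean;
# otherwise staging would manage (and could destroy) the dev resources
rm -rf .terraform terraform.tfstate terraform.tfstate.backup

# Edit terraform.tfvars for staging-specific values
# Edit main.tf to change environment = "staging"

terraform init
terraform apply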

Next Steps

Test the deployment: Deploy to dev, verify connectivity
Update documentation: Save your Elastic IPs in project README
Set up monitoring: Consider CloudWatch alarms for instance health
Enable SSL (future): Add SSL certificates for wss:// instead of ws://
DNS setup (future): Point a friendly domain to your Elastic IP

Support

For issues with:

Terraform configuration: Check this README
Signaling server: See main repository README
AWS permissions: Consult AWS documentation

Report infrastructure bugs at https://github.com/talmolab/webRTC-connect/issues
