Infrastructure-as-Code setup for a powerful AI development environment using Google Cloud Vertex AI Workbench, optimized for coding AI models like Codestral, Mistral, and compatible with Continue.dev.
This project creates a cloud-based AI development environment featuring:
- 🔧 Vertex AI Workbench: Managed Jupyter environment with GPU support
- 🤖 Ollama: Self-hosted LLM server for coding AI models
- 💻 code-server: Web-based VS Code accessible from any device
- 🔧 Continue.dev: Pre-configured AI coding assistant
- ☁️ Cloud Storage: Model storage and backup
- 🛡️ Security: VPC isolation and IAM controls
```
┌─────────────────────────────────────────────────┐
│                   GCP Project                   │
│  ┌───────────────────────────────────────────┐  │
│  │                VPC Network                │  │
│  │  ┌─────────────────────────────────────┐  │  │
│  │  │         Vertex AI Workbench         │  │  │
│  │  │                                     │  │  │
│  │  │ JupyterLab  (:8888)                 │  │  │
│  │  │ code-server (:8080) ◄──┐            │  │  │
│  │  │ Ollama API  (:11434)◄──┤            │  │  │
│  │  └────────────────────────┼────────────┘  │  │
│  └───────────────────────────┼───────────────┘  │
└──────────────────────────────┼──────────────────┘
                               │ SSH Tunnel
               ┌───────────────┴───────────────┐
               │                               │
      Your Local VS Code             Your Mobile Browser
      with Continue.dev
```
- Google Cloud SDK (`gcloud`) installed and authenticated
- Terraform >= 1.0 installed
- A GCP project with billing enabled
```bash
# Clone and set up
git clone <this-repo>
cd ai-development-server

# Create your config
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your GCP project details

# Run the interactive setup
./scripts/quick-start.sh
```
Or set up manually:
```bash
# Choose your configuration based on needs and budget
make config-cpu   # CPU-only: ~$35-70/month for 3-6 hours/day
make config-t4    # T4 GPU: ~$67-135/month for 3-6 hours/day
make config-l4    # L4 GPU: ~$125-250/month for 3-6 hours/day

# Deploy
make init
make apply

# Check status
make status

# Create an SSH tunnel for local development
make tunnel

# Or SSH directly into the workbench
make ssh
```
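Under the hood, `make tunnel` most likely wraps a `gcloud compute ssh` port forward. A manual equivalent is sketched below; the instance name and zone are placeholders, not values taken from this repo — yours come from `terraform.tfvars`:

```shell
# Manual tunnel sketch (assumption: `make tunnel` does something equivalent)
INSTANCE=ai-dev-workbench   # hypothetical instance name
ZONE=us-central1-a          # hypothetical zone

# Forward code-server (8080) and the Ollama API (11434) to localhost
gcloud compute ssh "$INSTANCE" --zone "$ZONE" -- \
  -N \
  -L 8080:localhost:8080 \
  -L 11434:localhost:11434
```

`-N` keeps the SSH session open for forwarding only, without starting a remote shell.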
Choose based on your needs:
| Machine Type | vCPUs | RAM | Best For |
|---|---|---|---|
| n1-standard-8 | 8 | 30GB | General coding models |
| n1-standard-16 | 16 | 60GB | Larger models |
| n1-highmem-8 | 8 | 52GB | Memory-intensive models |
| c2-standard-16 | 16 | 64GB | High-performance CPU |
Recommended for faster inference:
| GPU Type | Memory | Best For | Cost Level |
|---|---|---|---|
| NVIDIA_TESLA_T4 | 16GB | Cost-effective inference | $ |
| NVIDIA_L4 | 24GB | Latest gen, great performance | $$ |
| NVIDIA_TESLA_V100 | 16GB | High-end training/inference | $$$ |
The workbench comes with coding-optimized models:
- Codestral 22B: Mistral's specialized coding model
- Mistral Nemo 12B: Latest general-purpose model
- Llama 3.1 8B: Meta's efficient model
- DeepSeek Coder 6.7B: Specialized for code generation
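Once Ollama is running, you can also talk to it directly over its standard HTTP API (through the tunnel, assuming the default port 11434). For example, to request a completion from one of the installed models:

```shell
# Ask a model for code via Ollama's /api/generate endpoint
# (assumes `make tunnel` is active, or run this on the instance itself)
curl -s http://localhost:11434/api/generate -d '{
  "model": "deepseek-coder:6.7b",
  "prompt": "Write a Python function that reverses a string.",
  "stream": false
}'
```

With `"stream": false`, Ollama returns a single JSON object whose `response` field holds the generated text.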
1. **Via SSH Tunnel (Recommended):**

   ```bash
   make tunnel   # Creates local tunnel
   ```

   - Access code-server at http://localhost:8080
   - Configure Continue.dev with endpoint: http://localhost:11434

2. **Get the Continue.dev Configuration:**

   ```bash
   make continue-config   # Shows JSON config to copy
   ```
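The emitted config should resemble the sketch below — Continue.dev `config.json` model entries pointing at an Ollama backend. The titles and model tags here are illustrative; the authoritative output comes from `make continue-config`:

```json
{
  "models": [
    {
      "title": "Codestral (remote Ollama)",
      "provider": "ollama",
      "model": "codestral:latest",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b",
    "apiBase": "http://localhost:11434"
  }
}
```

`apiBase` points at the tunneled port, so Continue.dev works the same whether you are local or remote.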
```bash
# List installed models
make models

# Install a new model
make install-model MODEL=llama3.1:70b

# Check service status
make service-status
```
1. **Local VS Code + SSH Tunnel:**
   - Run `make tunnel`
   - Connect VS Code to the tunnel
   - Use Continue.dev with the local Ollama endpoint

2. **Browser-based Development:**
   - Access JupyterLab directly via the GCP Console
   - Use the built-in code-server at `:8080`

3. **Mobile Development:**
   - Access via a mobile browser through the SSH tunnel
   - Full VS Code experience on phone/tablet
```bash
make help             # Show all commands
make status           # Instance status
make logs             # Setup logs
make setup-status     # Check if setup is complete
make restart-services # Restart Ollama/code-server
make backup           # Back up data
make cost-estimate    # Cost estimation
```
Perfect for intermittent use (a few hours of work spread through the day, with the 15-minute auto-shutdown enabled):
| Configuration | Hourly Rate | 3 hrs/day Cost | 6 hrs/day Cost | 12 hrs/day Cost |
|---|---|---|---|---|
| CPU-only (n1-standard-8) | ~$0.38 | ~$35/month | ~$70/month | ~$140/month |
| T4 GPU (n1-standard-8 + T4) | ~$0.73 | ~$67/month | ~$135/month | ~$270/month |
| L4 GPU (n1-standard-16 + L4) | ~$1.36 | ~$125/month | ~$250/month | ~$500/month |
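The monthly figures are essentially hourly rate × hours per day × ~30 days (the table presumably rounds up a little to cover storage and other overhead). You can sanity-check them yourself:

```shell
# Rough monthly cost: hourly rate x hours/day x 30 days.
# Raw arithmetic only -- the table above rounds up slightly for overhead.
monthly() { awk -v rate="$1" -v hours="$2" 'BEGIN { printf "~$%.0f/month\n", rate * hours * 30 }'; }

monthly 0.38 3    # CPU-only, 3 hrs/day
monthly 0.73 6    # T4 GPU,   6 hrs/day
monthly 1.36 12   # L4 GPU,   12 hrs/day
```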
```bash
make config-cpu   # Switch to CPU-only for light work
make config-t4    # Switch to T4 GPU for moderate AI tasks
make config-l4    # Switch to L4 GPU for heavy workloads
make apply        # Apply the new configuration
```
Cost-saving features built-in:
- ⏱️ Auto-shutdown after 15 minutes idle (configurable)
- 🔄 Easy config switching without data loss
- 💾 Persistent storage - models and data survive config changes
- 📊 Usage tracking via GCP billing
- Network Isolation: Resources in dedicated VPC
- Firewall Rules: Restricted access by IP ranges
- IAM Controls: Least-privilege service accounts
- Private Access: Option to disable public IPs
Production Security:
```hcl
# In terraform.tfvars
no_public_ip      = true
allowed_ip_ranges = ["YOUR.OFFICE.IP/32"]
```
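After applying `no_public_ip = true`, you can confirm the instance really has no external address. This is a sketch; the instance name and zone below are placeholders for your own values:

```shell
# Expect empty output when the workbench has no external IP
gcloud compute instances describe ai-dev-workbench \
  --zone us-central1-a \
  --format="value(networkInterfaces[0].accessConfigs[0].natIP)"
```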
Access your full AI development environment from your phone:
- Set up an SSH tunnel from your phone using an app like Termius
- Access code-server in a mobile browser
- Use Continue.dev for AI assistance on mobile
Perfect for:
- Code reviews on the go
- Quick bug fixes
- Learning and experimentation
```bash
# Create a backup
make backup

# Backup includes:
# - Ollama models and configs
# - code-server settings
# - workspace files
# - Continue.dev configuration
```
```bash
make setup-status  # Check setup completion
make logs          # View setup logs
make ssh           # Direct access to troubleshoot

make service-status    # Check Ollama/code-server
make restart-services  # Restart services

make status                    # Check instance status
gcloud compute instances list  # Direct GCP check
```
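If a service looks unhealthy, a couple of direct probes from inside the instance (`make ssh`) usually narrow things down. These assume the default ports and Ollama's standard HTTP API:

```shell
# Is Ollama answering? /api/tags lists installed models as JSON
curl -s http://localhost:11434/api/tags

# Is code-server answering? Expect an HTTP status line back
curl -sI http://localhost:8080 | head -1
```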
- Fork the repository
- Create your feature branch
- Test with a dev environment
- Submit a pull request
MIT License - see LICENSE file for details.
Ready to supercharge your AI development? 🚀

```bash
make setup   # Get started in minutes!
```