
Commit 2c60529

Add GitHub CI Bootstrap Terraform Module (#2)
## Summary:
This PR introduces a new Terraform module `github-ci-bootstrap` that automates the setup of GitHub Actions CI/CD infrastructure for GCP projects. The module creates the necessary service accounts, IAM permissions, and Workload Identity Federation configuration to enable secure, keyless authentication between GitHub Actions and Google Cloud Platform.

Key features include:
- **Service Account Creation**: Creates dedicated service accounts for GitHub Actions with descriptive naming
- **Workload Identity Federation**: Implements modern, keyless authentication eliminating the need for service account keys
- **Least Privilege Security**: Grants only the minimum permissions required for specified GCP services
- **Modular Service Support**: Configurable permissions for Cloud Functions, Cloud Run, Cloud Storage, Compute Engine, and more
- **Secret Management**: Optional integration with Google Secret Manager for secure credential handling
- **Repository Scoping**: Restricts access to specific GitHub repositories for enhanced security

The module includes a complete example (`bootstrap-with-module`) demonstrating how to use it in practice, along with comprehensive documentation covering all configuration options and use cases.

This addition enhances our infrastructure-as-code capabilities by providing a standardized, secure way to set up CI/CD pipelines for GCP projects. Note, we will need to add support for additional GCP resources to support broader adoption. This changeset includes only what's required for setting up cronjobs in cloud run.

Issue: INFRA-10721

## Test plan:
- Run the bootstrap setup for culture-cron:

```bash
❯ ../.venv/bin/terraform apply
module.github_ci_bootstrap.data.google_project.current: Reading...
module.github_ci_bootstrap.google_service_account.github_ci: Refreshing state... [id=projects/khan-internal-services/serviceAccounts/[email protected]]
module.github_ci_bootstrap.google_project_iam_member.ci_storage_admin[0]: Refreshing state... [id=khan-internal-services/roles/storage.admin/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_project_iam_member.ci_sa_user[0]: Refreshing state... [id=khan-internal-services/roles/iam.serviceAccountUser/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_storage_bucket_iam_member.ci_state_bucket_access: Refreshing state... [id=b/terraform-khan-academy/roles/storage.objectAdmin/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_storage_bucket_iam_member.ci_state_bucket_reader: Refreshing state... [id=b/terraform-khan-academy/roles/storage.legacyBucketReader/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_storage_bucket_iam_member.ci_storage_legacy_bucket_owner: Refreshing state... [id=b/terraform-khan-academy/roles/storage.legacyBucketOwner/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_project_iam_member.ci_sa_admin: Refreshing state... [id=khan-internal-services/roles/iam.serviceAccountAdmin/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_project_iam_member.ci_iam_admin: Refreshing state... [id=khan-internal-services/roles/resourcemanager.projectIamAdmin/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_project_iam_member.ci_cloud_scheduler_admin[0]: Refreshing state... [id=khan-internal-services/roles/cloudscheduler.admin/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_project_iam_member.ci_cloudfunctions_admin[0]: Refreshing state... [id=khan-internal-services/roles/cloudfunctions.admin/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_secret_manager_secret_iam_member.ci_secret_access["projects/khan-academy/secrets/google_api_service_account__for_alertlib_"]: Refreshing state... [id=projects/khan-academy/secrets/google_api_service_account__for_alertlib_/roles/secretmanager.secretAccessor/serviceAccount:[email protected]]
module.github_ci_bootstrap.data.google_project.current: Read complete after 0s [id=projects/khan-internal-services]
module.github_ci_bootstrap.google_secret_manager_secret_iam_member.ci_secret_access["projects/khan-academy/secrets/Slack__API_token_for_alertlib"]: Refreshing state... [id=projects/khan-academy/secrets/Slack__API_token_for_alertlib/roles/secretmanager.secretAccessor/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_project_iam_member.ci_pubsub_admin[0]: Refreshing state... [id=khan-internal-services/roles/pubsub.admin/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_project_iam_member.ci_secretmanager_viewer[0]: Refreshing state... [id=khan-academy/roles/secretmanager.viewer/serviceAccount:[email protected]]
module.github_ci_bootstrap.google_iam_workload_identity_pool.github_ci_pool: Refreshing state... [id=projects/526011289882/locations/global/workloadIdentityPools/culture-cron-github-ci-pool]
module.github_ci_bootstrap.google_service_account_iam_member.github_ci_identity_binding: Refreshing state... [id=projects/khan-internal-services/serviceAccounts/[email protected]/roles/iam.workloadIdentityUser/principalSet://iam.googleapis.com/projects/526011289882/locations/global/workloadIdentityPools/culture-cron-github-ci-pool/attribute.repository/Khan/culture-cron]
module.github_ci_bootstrap.google_iam_workload_identity_pool_provider.github_ci_provider: Refreshing state... [id=projects/526011289882/locations/global/workloadIdentityPools/culture-cron-github-ci-pool/providers/culture-cron-github-ci-provider]

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

github_repository = "Khan/culture-cron"
project_id = "khan-internal-services"
terraform_service_account_email = "[email protected]"
workload_identity_provider = "projects/526011289882/locations/global/workloadIdentityPools/culture-cron-github-ci-pool/providers/culture-cron-github-ci-provider"

❯ git status
On branch INFRA-10713
Your branch is up to date with 'origin/INFRA-10713'.

nothing to commit, working tree clean

❯ git push
Everything up-to-date
```

Author: jwbron
Reviewers: csilvers, jwbron
Required Reviewers:
Approved By: csilvers
Checks: ✅ 1 check was successful
Pull Request URL: #2
1 parent aea8405 commit 2c60529

9 files changed

+766
-6
lines changed

Lines changed: 289 additions & 0 deletions
@@ -0,0 +1,289 @@
# GitHub Terraform CI Bootstrap Module

This module creates the necessary infrastructure for **GitHub Terraform CI** - managing Terraform infrastructure through GitHub Actions CI/CD pipelines. It provisions service accounts with appropriate GCP permissions and uses Workload Identity Federation for keyless authentication to run `terraform plan` and `terraform apply` operations.

## Purpose

Each module invocation creates a dedicated service account for a complete Terraform configuration managed in CI. This enables:

- **Isolated Terraform CI**: Each Terraform setup gets its own service account and state bucket for CI operations
- **Secure GitHub Actions**: Run `terraform plan` and `terraform apply` in GitHub Actions without storing keys
- **Cross-Project Deployments**: Single service account can manage Terraform resources across multiple GCP projects
- **Environment Separation**: Separate CI service accounts for prod, staging, dev, etc.

## Features

- **Shared Infrastructure**: Uses a single Workload Identity Pool in khan-internal-services for all GitHub Terraform CI
- **Dedicated Service Accounts**: Creates unique service accounts for each Terraform configuration managed in CI
- **Workload Identity Federation**: Uses modern, keyless authentication for GitHub Actions
- **Cross-Project Support**: Service accounts can deploy Terraform resources across multiple GCP projects
- **Least Privilege**: Only grants permissions for specified GCP services in target projects
- **Terraform State Management**: Automatic permissions for GCS-based Terraform state buckets
- **Secret Management**: Optional access to Google Secret Manager secrets needed by Terraform
- **Configurable Services**: Enable only the GCP services your Terraform configuration manages
- **Repository Scoped**: Restricts access to a specific GitHub repository containing Terraform code

## Architecture

All GitHub Terraform CI infrastructure is centralized in the `khan-internal-services` project:
- **Single Pool**: `khan-internal-services-github-ci` pool shared by all Terraform configurations managed in CI
- **Unique Providers**: Each Terraform configuration gets its own provider within the shared pool
- **Cross-Project Permissions**: Service accounts get permissions in target projects for Terraform resource management
- **State Bucket Access**: Service accounts get appropriate permissions for Terraform state storage in CI

## Usage

```hcl
# Bootstrap GitHub Terraform CI for the culture-cron production configuration
module "culture_cron_terraform_ci" {
  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/github-ci-bootstrap?ref=v1.0.0"

  # Terraform configuration managed in CI
  service_name      = "culture-cron-prod" # YOU choose this name: project + environment
  github_repository = "Khan/culture-cron" # GitHub repo containing the Terraform code

  # Target projects where this Terraform configuration deploys resources via CI
  target_projects = {
    "khan-academy" = {
      required_services = ["cloudfunctions", "storage", "pubsub", "scheduler"]
    }
  }

  # Terraform state bucket (optional - defaults to terraform-{org}-{repo}-{service_name})
  # terraform_state_bucket = "custom-bucket-name"

  # Secrets that the Terraform configuration needs access to (optional)
  secret_ids = [
    "projects/khan-academy/secrets/slack-token"
  ]
}
```

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| `service_name` | User-defined unique identifier for this Terraform configuration and environment (e.g., 'culture-cron-prod', 'webapp-staging') | `string` | n/a | yes |
| `github_repository` | GitHub repository containing the Terraform configuration in format 'org/repo' | `string` | n/a | yes |
| `target_projects` | Map of GCP projects where this Terraform configuration will deploy resources. Keys are project IDs. | `map(object)` | `{}` | no |
| `terraform_state_bucket` | GCS bucket name for storing Terraform state for this configuration | `string` | `terraform-{org}-{repo}-{service}` | no |
| `secrets_project_id` | Project ID where secrets needed by the Terraform configuration are stored | `string` | `"khan-academy"` | no |
| `secret_ids` | List of secret IDs that the Terraform configuration needs access to | `list(string)` | `[]` | no |

### Target Projects Structure

The `target_projects` variable accepts a map where each key is a GCP project ID:

```hcl
target_projects = {
  "khan-academy" = {
    required_services = ["storage", "pubsub"] # Services needed in this project
  }
  "khan-academy-staging" = {
    required_services = ["cloudfunctions"]
  }
}
```

### Available Services

These services correspond to GCP resources that your Terraform configuration can deploy and manage:

- `cloudfunctions` - Enables deploying and managing Cloud Functions via Terraform
- `storage` - Enables creating and managing Cloud Storage buckets via Terraform
- `pubsub` - Enables creating and managing Pub/Sub topics and subscriptions via Terraform
- `scheduler` - Enables creating and managing Cloud Scheduler jobs via Terraform

### Terraform State Bucket Default

If `terraform_state_bucket` is not specified, the module automatically generates a bucket name based on your GitHub repository and service name:

- **Pattern**: `terraform-{org}-{repo}-{service}` (normalized for GCS bucket naming rules)
- **Normalization**: Converted to lowercase, underscores replaced with hyphens
- **Example**: `Khan/culture-cron` + `culture-cron-prod` → `terraform-khan-culture-cron-culture-cron-prod`
- **Example**: `Khan/webapp` + `webapp-staging` → `terraform-khan-webapp-webapp-staging`
- **Example**: `Khan/Mobile_App` + `mobile_app_prod` → `terraform-khan-mobile-app-mobile-app-prod`

This ensures each Terraform setup gets its own isolated state bucket while maintaining consistent, predictable naming that complies with GCS bucket naming requirements.

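As a rough illustration of the normalization described above, the default name could be derived with Terraform's built-in `split`, `replace`, and `lower` functions. This is only a sketch: the local names are hypothetical, and the module's actual implementation is not shown in this README and may differ.

```hcl
locals {
  # Hypothetical inputs, taken from the third example above.
  github_repository = "Khan/Mobile_App"
  service_name      = "mobile_app_prod"

  # Split "org/repo" into its two parts.
  repo_parts = split("/", local.github_repository)

  # Join the pieces, replace underscores with hyphens, then lowercase.
  default_state_bucket = lower(replace(
    "terraform-${local.repo_parts[0]}-${local.repo_parts[1]}-${local.service_name}",
    "_",
    "-"
  ))
}

output "default_state_bucket" {
  # Evaluates to "terraform-khan-mobile-app-mobile-app-prod".
  value = local.default_state_bucket
}
```

GCS bucket names are limited to 63 characters, so a long repository name combined with a long `service_name` may call for an explicit `terraform_state_bucket` instead of the default.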
### Service Name Guidelines

The `service_name` is a **user-defined identifier** that distinguishes the Terraform configurations managed in CI. There is nothing to look up: you assign it according to your own naming conventions.

#### How to Choose a Service Name

**Choose a name that clearly identifies:**
1. **What service/application** this Terraform configuration manages
2. **Which environment** (prod, staging, dev, etc.)
3. **What scope** (if you have multiple Terraform configurations per service)

#### Recommended Patterns

- **Basic**: `{service}-{environment}` (e.g., `culture-cron-prod`, `webapp-staging`)
- **With scope**: `{service}-{scope}-{environment}` (e.g., `webapp-frontend-prod`, `webapp-backend-staging`)
- **Shared resources**: `{purpose}-{environment}` (e.g., `shared-infra-prod`, `monitoring-dev`)

#### Examples by Use Case

| Scenario | Service Name | What It Represents |
|----------|--------------|-------------------|
| Culture Cron production | `culture-cron-prod` | Production deployment of Culture Cron service |
| Webapp staging environment | `webapp-staging` | Staging environment for the main webapp |
| API development environment | `api-dev` | Development environment for API service |
| Shared infrastructure | `shared-infra-prod` | Production shared infrastructure (networking, etc.) |
| Multiple configs per service | `webapp-frontend-prod`<br/>`webapp-backend-prod` | Separate Terraform configs for frontend and backend |

#### Technical Requirements

- **Characters**: Lowercase letters, numbers, and hyphens only (no underscores)
- **Uniqueness**: Must be unique across all your Terraform CI configurations
- **Purpose**: Creates isolated CI infrastructure for each configuration
- **Usage**: Used to generate service account names, state bucket names, and provider IDs

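The character rule above could be enforced at plan time with a `validation` block. The following is a minimal sketch, not the module's confirmed variable definition. The length check is an added assumption: it relies on GCP's 30-character limit for service account IDs and on the `{service_name}-ci` naming described in this README.

```hcl
variable "service_name" {
  type        = string
  description = "User-defined unique identifier for this Terraform configuration and environment."

  # Rule stated in this README: lowercase letters, numbers, and hyphens only.
  validation {
    condition     = can(regex("^[a-z0-9-]+$", var.service_name))
    error_message = "The service_name value must contain only lowercase letters, numbers, and hyphens."
  }

  # Assumption: the service account ID is "<service_name>-ci", and GCP caps
  # service account IDs at 30 characters.
  validation {
    condition     = length("${var.service_name}-ci") <= 30
    error_message = "The service_name value must be at most 27 characters so that the derived service account ID fits in 30."
  }
}
```

With a check like this, a value such as `Mobile_App_Prod` would be rejected during `terraform plan` rather than failing later when GCP refuses the service account name.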
#### Multi-Configuration Repositories

A single GitHub repository can have multiple `service_name` values for different purposes:
- Different environments (`myapp-prod`, `myapp-staging`, `myapp-dev`)
- Different components (`myapp-frontend-prod`, `myapp-backend-prod`)
- Different deployment scopes (`myapp-us-prod`, `myapp-eu-prod`)

Each `service_name` gets its own isolated:
- Service account (`{service_name}-ci`)
- Terraform state bucket (`terraform-{org}-{repo}-{service_name}`)
- Workload Identity provider (`{service_name}-provider`)

**Note**: GitHub repository names may contain underscores, which will be automatically converted to hyphens in generated bucket names to comply with GCS naming requirements.

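Each configuration in the repository would then point its backend at its own bucket. A minimal sketch, assuming the default bucket name for `Khan/culture-cron` with `culture-cron-prod`; the `prefix` value is a hypothetical choice and is not specified by this module.

```hcl
terraform {
  backend "gcs" {
    # Default bucket name for Khan/culture-cron + culture-cron-prod,
    # following the terraform-{org}-{repo}-{service_name} pattern.
    bucket = "terraform-khan-culture-cron-culture-cron-prod"

    # Hypothetical object prefix inside the bucket.
    prefix = "terraform/state"
  }
}
```

Terraform backend blocks do not accept variables or locals, so the bucket name is written as a literal here rather than computed.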
## Outputs

| Name | Description |
|------|-------------|
| `service_account_email` | Email of the created service account |
| `workload_identity_provider` | Full resource name of the Workload Identity provider |
| `terraform_state_bucket` | The GCS bucket name used for Terraform state (computed or provided) |
| `service_name` | The unique identifier for this Terraform configuration and environment |
| `target_projects` | Map of target projects configured |

## GitHub Actions Configuration

After applying this module, configure your GitHub Actions workflow to manage Terraform in CI. The `id-token: write` permission is required so the job can request the OIDC token that Workload Identity Federation exchanges for GCP credentials:

```yaml
permissions:
  contents: read
  id-token: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Authenticate to Google Cloud
        uses: google-github-actions/auth@v2
        with:
          # Supply the values of this module's `workload_identity_provider` and
          # `service_account_email` outputs. The repository variable names used
          # here are illustrative; literal values work as well.
          workload_identity_provider: ${{ vars.GCP_WORKLOAD_IDENTITY_PROVIDER }}
          service_account: ${{ vars.GCP_SERVICE_ACCOUNT_EMAIL }}

      - name: Set up Cloud SDK
        uses: google-github-actions/setup-gcloud@v2
```

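To obtain the two values the workflow needs, the root configuration that calls the module can re-export them. This is a sketch that assumes the module block is named `culture_cron_terraform_ci`, as in the Usage example; adjust the name to match your own configuration.

```hcl
# Re-export the module outputs so they can be read after `terraform apply`.
output "workload_identity_provider" {
  description = "Full resource name of the Workload Identity provider for GitHub Actions."
  value       = module.culture_cron_terraform_ci.workload_identity_provider
}

output "service_account_email" {
  description = "Email of the CI service account that GitHub Actions impersonates."
  value       = module.culture_cron_terraform_ci.service_account_email
}
```

After applying, `terraform output -raw workload_identity_provider` and `terraform output -raw service_account_email` print the values to copy into the workflow or into repository variables.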
## Security Features

- **No Service Account Keys**: Uses Workload Identity Federation for keyless auth
- **Repository Scoped**: Access restricted to specified GitHub repository
- **Least Privilege**: Only grants permissions for enabled services in target projects
- **Secret Scoping**: Fine-grained access to specific secrets only
- **Centralized Management**: All CI infrastructure managed in khan-internal-services project

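Repository scoping with Workload Identity Federation generally works by mapping the `repository` claim of the GitHub OIDC token to a provider attribute and binding the service account to that attribute only. The sketch below shows the general technique with the Google provider's resources. It is not this module's source: the resource labels, the attribute condition, and the use of the shared pool name from the Architecture section are assumptions.

```hcl
# Look up the project number of the project that hosts the pool.
data "google_project" "current" {
  project_id = "khan-internal-services"
}

# One provider per Terraform configuration inside the shared pool (assumed names).
resource "google_iam_workload_identity_pool_provider" "github_ci_provider" {
  project                            = "khan-internal-services"
  workload_identity_pool_id          = "khan-internal-services-github-ci"
  workload_identity_pool_provider_id = "culture-cron-prod-provider"

  # Map GitHub OIDC token claims to Google attributes.
  attribute_mapping = {
    "google.subject"       = "assertion.sub"
    "attribute.repository" = "assertion.repository"
  }

  # Reject tokens issued for any other repository.
  attribute_condition = "assertion.repository == \"Khan/culture-cron\""

  oidc {
    issuer_uri = "https://token.actions.githubusercontent.com"
  }
}

# Allow only workflows from Khan/culture-cron to impersonate the CI service account.
resource "google_service_account_iam_member" "github_ci_identity_binding" {
  service_account_id = "projects/khan-internal-services/serviceAccounts/culture-cron-prod-ci@khan-internal-services.iam.gserviceaccount.com"
  role               = "roles/iam.workloadIdentityUser"
  member             = "principalSet://iam.googleapis.com/projects/${data.google_project.current.number}/locations/global/workloadIdentityPools/khan-internal-services-github-ci/attribute.repository/Khan/culture-cron"
}
```

The `principalSet://.../attribute.repository/Khan/culture-cron` member has the same shape as the identity binding in the commit's test-plan output, which is what restricts access to a single repository.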
## Examples

### Single Project Terraform Configuration (Using Default State Bucket)
```hcl
# CI for culture-cron production Terraform configuration
module "culture_cron_prod_ci" {
  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/github-ci-bootstrap?ref=v1.0.0"

  service_name      = "culture-cron-prod"
  github_repository = "Khan/culture-cron"

  # This Terraform config deploys resources to khan-academy project
  target_projects = {
    "khan-academy" = {
      required_services = ["cloudfunctions", "storage", "pubsub", "scheduler"]
    }
  }

  # Terraform state bucket defaults to: terraform-khan-culture-cron-culture-cron-prod
}
```

### Multi-Project Terraform Configuration
```hcl
# CI for webapp staging Terraform configuration that deploys across multiple projects
module "webapp_staging_ci" {
  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/github-ci-bootstrap?ref=v1.0.0"

  service_name      = "webapp-staging"
  github_repository = "Khan/webapp"

  # This Terraform config deploys resources to multiple projects
  target_projects = {
    "khan-academy-staging" = {
      required_services = ["storage", "pubsub"]
    }
    "khan-shared-services" = {
      required_services = ["storage"]
    }
  }

  # Terraform state bucket defaults to: terraform-khan-webapp-webapp-staging
}
```

### Terraform Configuration with Secrets Access (Custom State Bucket)
```hcl
# CI for API production Terraform configuration that needs access to secrets
module "api_prod_ci" {
  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/github-ci-bootstrap?ref=v1.0.0"

  service_name      = "api-prod"
  github_repository = "Khan/api"

  target_projects = {
    "khan-academy" = {
      required_services = ["cloudfunctions", "storage"]
    }
  }

  # Use custom state bucket instead of default (terraform-khan-api-api-prod)
  terraform_state_bucket = "shared-terraform-state"

  # Secrets that the Terraform configuration needs access to
  secret_ids = [
    "projects/khan-academy/secrets/api-key",
    "projects/khan-academy/secrets/database-url"
  ]
}
```

### Terraform Configuration with Storage-Only Access
```hcl
# CI for static site Terraform configuration that only manages storage buckets
module "static_site_prod_ci" {
  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/github-ci-bootstrap?ref=v1.0.0"

  service_name      = "static-site-prod"
  github_repository = "Khan/static-site"

  # This Terraform config only creates storage buckets
  target_projects = {
    "khan-academy" = {
      required_services = ["storage"]
    }
  }

  # Terraform state bucket defaults to: terraform-khan-static-site-static-site-prod
}
```
Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
# Culture Cron GitHub Terraform CI Bootstrap Example

This example demonstrates how to use the GitHub Terraform CI Bootstrap module from the shared Terraform modules repository to set up CI/CD infrastructure for managing the Culture Cron Terraform configuration in GitHub Actions.

## Overview

This configuration uses the reusable `github-ci-bootstrap` module to create:

- Service account for running Terraform operations in GitHub Actions (in khan-internal-services project)
- Workload Identity Federation using shared pool for keyless authentication
- IAM permissions for deploying Cloud Functions, Storage, Pub/Sub, and Scheduler resources via Terraform
- Access to secrets in Google Secret Manager that the Terraform configuration needs
- Permissions for Terraform state bucket management

## Architecture

- **Service Account**: `culture-cron-prod-ci` created in `khan-internal-services` project for Terraform operations
- **Shared Pool**: Uses `khan-internal-services-github-ci` pool (shared by all Terraform CI setups)
- **Unique Provider**: `culture-cron-prod-provider` within the shared pool
- **Target Project**: Terraform configuration deploys to `khan-internal-services` with permissions for specified services
- **State Isolation**: Dedicated state bucket for this Terraform configuration

## Purpose

This creates the necessary infrastructure for managing Terraform in GitHub Actions CI:
- `terraform plan` - Review infrastructure changes in CI
- `terraform apply` - Deploy infrastructure changes via CI
- `terraform destroy` - (if needed) Clean up resources via CI

Each Terraform configuration managed in CI gets its own service account to ensure:
- **Isolation**: Separate permissions and state for prod/staging/dev configurations
- **Security**: Least privilege access to only required GCP services
- **Traceability**: Clear audit trail of which Terraform CI account made which changes

## Usage

1. **Navigate to this directory:**
   ```bash
   cd terraform/examples/bootstrap-with-module
   ```

2. **Initialize Terraform:**
   ```bash
   terraform init
   ```

3. **Review the plan:**
   ```bash
   terraform plan
   ```

4. **Apply the configuration:**
   ```bash
   terraform apply
   ```

## Comparison

### Before (Direct Resources)
The original bootstrap configuration had ~150 lines of Terraform with explicit resource definitions for service accounts, IAM bindings, and Workload Identity setup.

### After (Module)
This example reduces the configuration to ~25 lines by using the reusable module, making it:
- **Easier to maintain** - Updates happen in one place
- **Less error-prone** - Tested, reusable components
- **More consistent** - Standardized Terraform CI setup across projects
- **Better documented** - Module includes comprehensive documentation
- **Shared Infrastructure** - Uses centralized Workload Identity Pool

## Configuration

The module is configured for the Culture Cron production Terraform configuration managed in CI:

- **Terraform Configuration**: `culture-cron-prod` (production environment managed in CI)
- **Repository**: `Khan/culture-cron` (GitHub repository containing Terraform code)
- **Target Project**: `khan-internal-services` with Cloud Functions, Storage, Pub/Sub, Scheduler services
- **State Bucket**: `terraform-khan-culture-cron-culture-cron-prod` (automatically computed from repository and service)
- **Secrets Project**: `khan-academy` (stores the Slack token that the Terraform configuration needs)

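Put together, the settings listed above would correspond to a module call roughly like the one below. This is a reconstruction from the list, not the example's actual `main.tf`. The module label `github_ci_bootstrap` and the secret ID are taken from the `terraform apply` output in the commit's test plan, and the `source` reuses the Git reference from the module README; the example itself may use a different source path.

```hcl
module "github_ci_bootstrap" {
  # Assumed source; the example in this repository may use a different path.
  source = "git::https://github.com/Khan/terraform-modules.git//terraform/modules/github-ci-bootstrap?ref=v1.0.0"

  service_name      = "culture-cron-prod"
  github_repository = "Khan/culture-cron"

  # Culture Cron deploys into khan-internal-services.
  target_projects = {
    "khan-internal-services" = {
      required_services = ["cloudfunctions", "storage", "pubsub", "scheduler"]
    }
  }

  # Secrets live in a different project than the deployment target.
  secrets_project_id = "khan-academy"
  secret_ids = [
    "projects/khan-academy/secrets/Slack__API_token_for_alertlib",
  ]

  # terraform_state_bucket is omitted, so the default
  # terraform-khan-culture-cron-culture-cron-prod is used.
}
```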
## Outputs

After applying, you'll get the service account email and Workload Identity provider needed for configuring GitHub Actions workflows to manage Terraform in CI.

## Migration

To migrate from an existing bootstrap setup:

1. **Backup current state:**
   ```bash
   cd ../../bootstrap
   terraform state pull > backup.tfstate
   ```

2. **Apply this example configuration**

3. **Update GitHub Actions workflows** to use the new service account for managing Terraform in CI

The new architecture uses shared infrastructure, so the first GitHub Terraform CI setup to be deployed will create the shared pool, and subsequent Terraform configurations will reuse it.
