Enterprise Air-Gap & Mirror Support - Solution Proposal
Date: 2025-11-14
Problem: rules_wasm_component cannot build in air-gapped/corporate environments
Impact: Blocks enterprise adoption
Effort: 2-3 weeks
Problem Statement
Current Situation (CRITICAL BLOCKER)
Downloads Required: ~554 MB from 5 external registries
- GitHub (10 tools): wasm-tools, wit-bindgen, wac, wkg, wasmtime, wasi-sdk, wizer, TinyGo, Binaryen, wasmsign2
- npmjs.org: jco + dependencies
- nodejs.org: Node.js runtime
- go.dev: Go SDK
- registry.wasm.io: WKG packages (runtime)
Enterprise Requirements NOT Met:
- ❌ Air-gap builds (no internet access)
- ❌ Corporate proxy with authentication
- ❌ Custom internal mirrors (JFrog Artifactory, Sonatype Nexus, Harbor)
- ❌ Security scanning before downloads
- ❌ License compliance verification
- ❌ Download audit trail for compliance
Solution Analysis: 4 Approaches
Approach 1: Environment Variable Mirror Override ⭐ RECOMMENDED
Design:
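# Existing code (Starlark has no f-strings, so str.format is used):
url = "https://github.com/{}/releases/download/v{}/{}".format(repo, version, asset)
# New code: read the mirror base from the environment, defaulting to public GitHub
mirror_base = repository_ctx.os.environ.get("BAZEL_WASM_GITHUB_MIRROR", "https://github.com")
url = "{}/{}/releases/download/v{}/{}".format(mirror_base, repo, version, asset)
Pros: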
- ✅ Minimal code changes (20 lines total)
- ✅ Backward compatible (defaults to public URLs)
- ✅ Works with any mirror that serves raw files over HTTP(S) (JFrog, Minio, S3); OCI-only registries such as Harbor need a separate file mirror
- ✅ Per-registry configuration (GitHub, npm, Go separately)
- ✅ Immediately enables air-gap with distfiles
- ✅ No Bazel API changes needed
Cons:
- ⚠️ Requires environment variables (must document)
- ⚠️ Mirror must replicate exact GitHub URL structure
- ⚠️ No automatic fallback to public if mirror fails
Implementation:
# toolchains/secure_download.bzl
def secure_download_tool(repository_ctx, tool_name, version, platform):
    # NEW: Read mirror configuration
    github_mirror = repository_ctx.os.environ.get(
        "BAZEL_WASM_GITHUB_MIRROR",
        "https://github.com",
    )
    # Load tool info and checksum from the registry (as in the existing code)
    tool_info = get_tool_info(tool_name)
    checksum = get_tool_checksum(tool_name, version, platform)
    # Construct URL with mirror
    url = construct_url(github_mirror, tool_info, version, platform)
    # Download with checksum verification (unchanged)
    repository_ctx.download_and_extract(
        url = url,
        sha256 = checksum,
        # ...
    )
Environment Variables:
export BAZEL_WASM_GITHUB_MIRROR="https://artifacts.corp.com/github-mirror"
export BAZEL_NPM_REGISTRY="https://npm.corp.com"
export BAZEL_GO_MIRROR="https://go-mirror.corp.com"
export BAZEL_NODEJS_MIRROR="https://nodejs-mirror.corp.com"
Corporate Setup Steps:
- Mirror GitHub releases to JFrog: https://jfrog.corp.com/github/{owner}/{repo}/...
- Set env var: BAZEL_WASM_GITHUB_MIRROR=https://jfrog.corp.com/github
- Build works identically, just different source
Effort: 3-5 days
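To make the "exact GitHub URL structure" constraint concrete, the following is a minimal Python sketch of the rewrite a mirror has to support. The function name to_mirror_url and the example asset name are illustrative assumptions, not part of rules_wasm_component, and the real change would be written in Starlark.

```python
from urllib.parse import urlsplit


def to_mirror_url(public_url: str, mirror_base: str) -> str:
    """Rewrite a github.com release URL so that it points at a mirror.

    Only the scheme and host are swapped. The path is kept verbatim, which
    is why the mirror must replicate GitHub's release layout exactly.
    """
    parts = urlsplit(public_url)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {public_url}")
    # Tolerate a trailing slash on the configured mirror base.
    return mirror_base.rstrip("/") + parts.path


if __name__ == "__main__":
    # Hypothetical asset name, used only to show the path mapping.
    print(to_mirror_url(
        "https://github.com/bytecodealliance/wasm-tools/releases/download/v1.0.0/asset.tar.gz",
        "https://jfrog.corp.com/github",
    ))
```

Because the path is preserved byte for byte, the existing SHA256 checksums keep working as long as the mirror serves unmodified files.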
Approach 2: Bazel Repository Cache + Distfiles
Design: Leverage Bazel's --repository_cache and --distdir
Pros:
- ✅ Uses native Bazel features
- ✅ No code changes needed
- ✅ Works for all repository downloads
Cons:
- ❌ Requires manual pre-population of cache
- ❌ Complex to distribute cache to air-gapped systems
- ❌ No built-in cache mirroring
- ❌ Doesn't solve npm/Go SDK downloads
Usage:
# Step 1: Build on internet-connected machine
bazel build --repository_cache=/tmp/bazel-cache //...
# Step 2: Copy cache to air-gapped machine
rsync -av /tmp/bazel-cache airgap-server:/opt/bazel-cache
# Step 3: Build on air-gapped machine
bazel build --repository_cache=/opt/bazel-cache //...
Issues:
- Cache isn't human-readable (content-addressed by SHA256)
- npm packages still require internet
- Go SDK still requires internet
- The cache can only be pre-populated by running bazel fetch //... or a build on an internet-connected machine first
Effort: 1 week (documentation + testing)
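The "content-addressed by SHA256" point can be illustrated with a small Python sketch that places an already-downloaded archive into a cache directory keyed by its digest. The content_addressable/sha256/<digest>/file layout is an assumption based on how current Bazel versions appear to organize the repository cache and should be checked against the Bazel version in use. seed_repository_cache is a hypothetical helper name.

```python
import hashlib
import shutil
from pathlib import Path


def seed_repository_cache(archive: Path, cache_dir: Path) -> Path:
    """Copy `archive` into a content-addressed cache directory.

    Assumed layout (verify for your Bazel version):
        <cache_dir>/content_addressable/sha256/<hex digest>/file
    """
    digest = hashlib.sha256(archive.read_bytes()).hexdigest()
    entry = cache_dir / "content_addressable" / "sha256" / digest
    entry.mkdir(parents=True, exist_ok=True)
    target = entry / "file"
    shutil.copyfile(archive, target)
    return target
```

The original file name is not part of the cache path, which is why the cache is hard to inspect or audit by hand.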
Approach 3: Bazel Module Extension with Mirror Config
Design: Add mirror configuration to Bazel module extension
Pros:
- ✅ Type-safe configuration
- ✅ Per-project customization
- ✅ Version-controlled configuration
- ✅ Better UX than environment variables
Cons:
- ❌ Requires MODULE.bazel changes (users must update)
- ❌ Not backward compatible
- ❌ More complex implementation
- ❌ Harder to apply globally across projects
Usage:
# MODULE.bazel (user configuration)
wasm_toolchain = use_extension("//wasm:extensions.bzl", "wasm_toolchain")
wasm_toolchain.configure_mirrors(
github_base = "https://artifacts.corp.com/github",
npm_registry = "https://npm.corp.com",
go_mirror = "https://go.corp.com",
nodejs_mirror = "https://nodejs.corp.com",
)
wasm_toolchain.register(name = "wasm_tools")
Implementation:
# wasm/extensions.bzl (new file)
def _configure_mirrors_impl(ctx):
    ctx.file("mirrors.bzl", content = """
GITHUB_MIRROR = "{github}"
NPM_REGISTRY = "{npm}"
GO_MIRROR = "{go}"
NODEJS_MIRROR = "{nodejs}"
""".format(
        github = ctx.attr.github_base,
        npm = ctx.attr.npm_registry,
        go = ctx.attr.go_mirror,
        nodejs = ctx.attr.nodejs_mirror,
    ))
configure_mirrors = tag_class(attrs = {
    "github_base": attr.string(default = "https://github.com"),
    "npm_registry": attr.string(default = "https://registry.npmjs.org"),
    # ...
})
Effort: 1-2 weeks
Approach 4: Vendoring Script + Offline Mode
Design: Pre-download all dependencies to third_party/ directory
Pros:
- ✅ Complete offline capability
- ✅ No runtime configuration needed
- ✅ Works identically on all machines
- ✅ Audit trail (vendored files in repo)
Cons:
- ❌ Large repo size (~554 MB)
- ❌ Complex vendoring script
- ❌ Must re-vendor for version updates
- ❌ Git doesn't handle large binaries well
Usage:
# Step 1: Vendor all toolchains (internet required)
bazel run //tools:vendor_toolchains
# Step 2: Commit vendored files
git add third_party/toolchains/
git commit -m "vendor: toolchain binaries for v1.0.0"
# Step 3: Build offline
bazel build --config=offline //...
Implementation:
# tools/vendor_toolchains.py
def vendor_all_toolchains(output_dir):
    registry = load_registry()
    for tool in registry.tools:
        for version in tool.versions:
            for platform in tool.platforms:
                url = construct_url(tool, version, platform)
                checksum = get_checksum(tool, version, platform)
                # Download to third_party/
                download_file(
                    url=url,
                    output=f"{output_dir}/{tool}/{version}/{platform}",
                    verify_sha256=checksum
                )
Effort: 2 weeks
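The download_file helper called by the vendoring script is not defined in this proposal. One possible shape for it, as a sketch using only the Python standard library, is shown below. The signature mirrors the call in the script; the streaming chunk size and the temporary ".part" suffix are assumptions.

```python
import hashlib
import urllib.request
from pathlib import Path


def download_file(url: str, output: Path, verify_sha256: str) -> Path:
    """Download `url` to `output`, keeping the file only if its SHA256 matches."""
    output.parent.mkdir(parents=True, exist_ok=True)
    tmp = output.with_name(output.name + ".part")
    hasher = hashlib.sha256()
    # Stream in 1 MiB chunks so large toolchain archives are not held in memory.
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as fh:
        while chunk := response.read(1 << 20):
            hasher.update(chunk)
            fh.write(chunk)
    actual = hasher.hexdigest()
    if actual != verify_sha256.lower():
        tmp.unlink()
        raise ValueError(
            f"checksum mismatch for {url}: expected {verify_sha256}, got {actual}"
        )
    # Move into place only after verification succeeded.
    tmp.replace(output)
    return output
```

Writing to a temporary file first means a failed or tampered download never leaves a plausible-looking binary in third_party/.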
Recommended Solution: Hybrid Approach
Combine Approach 1 (env var mirrors) + Approach 4 (vendoring script) for maximum flexibility.
Architecture
┌─────────────────────────────────────────────────────────┐
│ Build Environment Detection │
├─────────────────────────────────────────────────────────┤
│ │
│ 1. Check BAZEL_WASM_OFFLINE=1 │
│ ├─ YES → Use vendored files in third_party/ │
│ └─ NO → Continue to step 2 │
│ │
│ 2. Check BAZEL_WASM_GITHUB_MIRROR set? │
│ ├─ YES → Download from corporate mirror │
│ └─ NO → Download from public GitHub │
│ │
│ 3. Download with SHA256 verification │
│ ├─ SUCCESS → Cache in Bazel repository cache │
│ └─ FAIL → Error with troubleshooting hints │
│ │
└─────────────────────────────────────────────────────────┘
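The detection flow in the diagram can be sketched as a small pure function. This is a Python illustration of the decision order only, not the Starlark that would ship. resolve_source and the vendored_available flag are hypothetical names, and the "prefer" value is taken from Scenario 4 of this proposal.

```python
from typing import Mapping

PUBLIC_GITHUB = "https://github.com"


def resolve_source(env: Mapping[str, str], vendored_available: bool) -> str:
    """Return "vendored", "mirror", or "public" for a single tool download."""
    offline = env.get("BAZEL_WASM_OFFLINE", "")
    if offline == "1":
        # Strict offline mode: never touch the network.
        if not vendored_available:
            raise FileNotFoundError(
                "BAZEL_WASM_OFFLINE=1 but no vendored copy was found in third_party/"
            )
        return "vendored"
    if offline == "prefer" and vendored_available:
        # Partial air-gap: use the vendored copy when present, else fall through.
        return "vendored"
    mirror = env.get("BAZEL_WASM_GITHUB_MIRROR", "")
    if mirror and mirror.rstrip("/") != PUBLIC_GITHUB:
        return "mirror"
    return "public"
```

Keeping the decision separate from the download itself makes each branch easy to test without network access.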
Usage Scenarios
Scenario 1: Public Internet (Default)
# No configuration needed
bazel build //examples/basic:hello_component
# Downloads from github.com, npmjs.org, etc.
Scenario 2: Corporate Mirror
# CI/CD environment (in .bazelrc, use --repo_env instead of export)
export BAZEL_WASM_GITHUB_MIRROR=https://jfrog.corp.com/github
export BAZEL_NPM_REGISTRY=https://npm.corp.com
bazel build //examples/basic:hello_component
# Downloads from corporate mirrors
Scenario 3: Air-Gap (Vendored)
# Step 1: Vendor on internet-connected machine
bazel run //tools:vendor_toolchains -- --platform=linux_amd64,darwin_arm64
# Step 2: Transfer repo to air-gapped machine
# Step 3: Build offline
export BAZEL_WASM_OFFLINE=1
bazel build //examples/basic:hello_component
# Uses third_party/toolchains/ (no internet required)
Scenario 4: Mixed (Partial Air-Gap)
# Use vendored files + corporate mirror for new tools
export BAZEL_WASM_OFFLINE=prefer # Try vendored first, fallback to mirror
export BAZEL_WASM_GITHUB_MIRROR=https://jfrog.corp.com/github
bazel build //examples/basic:hello_component
Implementation Plan
Phase 1: Environment Variable Mirrors (Week 1)
- Add mirror URL environment variable support to secure_download.bzl
- Add npm registry configuration to jco_toolchain.bzl
- Add Go mirror configuration to tinygo_toolchain.bzl
- Add Node.js mirror configuration to jco_toolchain.bzl
- Update all toolchain files to use configurable mirrors
- Add retry logic with exponential backoff
- Document mirror setup for JFrog, Nexus, Harbor
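The "retry logic with exponential backoff" item could follow a policy like the one sketched below in Python. The real implementation would have to be expressed in Starlark inside the repository rule, so this only illustrates the timing policy. retry_with_backoff, the default of four attempts, and the one-second base delay are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return action()
        except OSError:
            # Network errors (urllib.error.URLError, timeouts) derive from OSError.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Checksum mismatches should not be retried this way: they indicate a mirror that serves altered content, not a transient failure.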
Phase 2: Vendoring Support (Week 2)
- Create tools/vendor_toolchains.py script
- Add offline mode detection to secure_download.bzl
- Support file:// URLs in download infrastructure
- Add third_party/toolchains/.gitignore (optional vendoring)
- Test complete offline build workflow
- Document vendoring process
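For the file:// URL item, a vendored artifact path can be turned into a URL as in the Python sketch below. It assumes the {output_dir}/{tool}/{version}/{platform} layout used by the vendoring script in this proposal. vendored_file_url is a hypothetical helper, and the version in the usage line is illustrative.

```python
from pathlib import Path


def vendored_file_url(root: Path, tool: str, version: str, platform: str) -> str:
    """Build a file:// URL for a vendored artifact under `root`."""
    # as_uri() requires an absolute path, so resolve relative roots first.
    path = (root / tool / version / platform).resolve()
    return path.as_uri()


if __name__ == "__main__":
    print(vendored_file_url(Path("third_party/toolchains"), "wasm-tools", "1.0.0", "linux_amd64"))
```

Using the same URL-based code path for vendored and downloaded files keeps the SHA256 verification step identical in both modes.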
Phase 3: Testing & Documentation (Week 3)
- Test with JFrog Artifactory setup
- Test with air-gap environment
- Test with corporate proxy
- Write enterprise deployment guide
- Create mirror setup scripts
- Add troubleshooting documentation
Proof of Concept: Environment Variable Mirrors
Code Changes Required
File 1: toolchains/secure_download.bzl (20 lines changed)
def secure_download_tool(repository_ctx, tool_name, version, platform):
    """Download and verify tool with configurable mirror support."""
    # NEW: Read mirror configuration from environment
    github_mirror = repository_ctx.os.environ.get(
        "BAZEL_WASM_GITHUB_MIRROR",
        "https://github.com",  # Default to public GitHub
    )
    # Load tool info from registry
    tool_info = get_tool_info(tool_name)
    checksum = get_tool_checksum(tool_name, version, platform)
    # Construct URL with configurable mirror
    if github_mirror != "https://github.com":
        # Corporate mirror: replace github.com with mirror
        url = construct_mirror_url(github_mirror, tool_info, version, platform)
    else:
        # Public GitHub: use standard URL construction
        url = construct_github_url(tool_info, version, platform)
    # Download with verification (unchanged)
    # (archive_type is assumed to be set elsewhere in this function; it is not shown in this excerpt)
    repository_ctx.download_and_extract(
        url = url,
        sha256 = checksum,
        type = archive_type,
    )
File 2: toolchains/jco_toolchain.bzl (15 lines changed)
def _jco_toolchain_impl(repository_ctx):
    # NEW: Read NPM registry from environment
    npm_registry = repository_ctx.os.environ.get(
        "BAZEL_NPM_REGISTRY",
        "https://registry.npmjs.org",
    )
    # NEW: Read Node.js mirror from environment
    nodejs_mirror = repository_ctx.os.environ.get(
        "BAZEL_NODEJS_MIRROR",
        "https://nodejs.org",
    )
    # Download Node.js from configurable mirror
    # (Starlark has no f-strings, so str.format is used)
    node_url = "{mirror}/dist/v{version}/node-v{version}-{platform}.tar.gz".format(
        mirror = nodejs_mirror,
        version = node_version,
        platform = platform,
    )
    # Configure npm to use corporate registry
    npm_config = "registry={}\n".format(npm_registry)
    repository_ctx.file(".npmrc", content = npm_config)
File 3: .bazelrc (documentation)
# Corporate mirror configuration (optional)
# Uncomment and customize for your environment:
# build --repo_env=BAZEL_WASM_GITHUB_MIRROR=https://artifacts.corp.com/github
# build --repo_env=BAZEL_NPM_REGISTRY=https://npm.corp.com
# build --repo_env=BAZEL_GO_MIRROR=https://go-mirror.corp.com
# build --repo_env=BAZEL_NODEJS_MIRROR=https://nodejs-mirror.corp.com
# Air-gap mode (use vendored files)
# build:offline --repo_env=BAZEL_WASM_OFFLINE=1
Testing the POC
Test 1: Mirror URL Construction
# Set mirror
export BAZEL_WASM_GITHUB_MIRROR=https://jfrog.corp.com/github-mirror
# Verify URL construction
bazel build --repository_cache=/tmp/test-cache //examples/basic:hello_component 2>&1 | grep "Downloading"
# Expected: https://jfrog.corp.com/github-mirror/bytecodealliance/wasm-tools/...
# NOT: https://github.com/bytecodealliance/wasm-tools/...
Test 2: Fallback to Default
# No mirror set
unset BAZEL_WASM_GITHUB_MIRROR
# Should use public GitHub
bazel build //examples/basic:hello_component 2>&1 | grep "Downloading"
# Expected: https://github.com/...
Test 3: NPM Registry Override
export BAZEL_NPM_REGISTRY=https://npm.corp.com
# Check npm configuration
bazel build //toolchains/jco:jco_toolchain --repository_cache=/tmp/test
cat $(bazel info output_base)/external/jco_toolchain/.npmrc
# Expected: registry=https://npm.corp.com
Corporate Mirror Setup Guide
JFrog Artifactory
Step 1: Create Remote Repository
# Artifactory → Repositories → New Remote Repository
Repository Type: Generic
Repository Key: github-releases
URL: https://github.com
Step 2: Configure URL Rewriting
// Artifactory → Remote Repositories → github-releases → Advanced
Remote Repository URL: https://github.com
Path Pattern: **/*
Step 3: Set Environment Variable
export BAZEL_WASM_GITHUB_MIRROR=https://artifactory.corp.com/artifactory/github-releases
Sonatype Nexus
Step 1: Create Raw Proxy Repository
# Nexus → Repositories → Create Repository → raw (proxy)
Name: github-proxy
Remote Storage: https://github.com
Step 2: Configure
export BAZEL_WASM_GITHUB_MIRROR=https://nexus.corp.com/repository/github-proxy
Harbor (OCI Registry)
Challenge: Harbor is OCI-only, GitHub releases are not OCI
Solution: Use Harbor for WASM components only, different mirror for binaries
export BAZEL_WASM_GITHUB_MIRROR=https://storage.corp.com/github-mirror # S3/Minio
export WKG_REGISTRY=https://harbor.corp.com # For WASM components
Risk Assessment
Technical Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Mirror URL format mismatch | Medium | High | Document exact URL structure required |
| npm registry incompatibility | Low | Medium | Test with common registries (Verdaccio, Nexus) |
| Checksum verification fails | Low | High | Mirror must preserve exact file contents |
| Environment variable not propagated | Medium | Medium | Document bazel --repo_env usage |
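Several of these risks surface as a malformed or silently ignored mirror setting. A validation step along the following lines could fail early with an actionable message. This is a Python sketch of the checks only: validate_mirror_url is a hypothetical name, and the accepted schemes are an assumption.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = ("https", "http", "file")


def validate_mirror_url(name: str, value: str) -> str:
    """Return a normalized mirror base URL or raise with a troubleshooting hint."""
    parts = urlsplit(value)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(
            f"{name}={value!r} must start with one of "
            f"{', '.join(s + '://' for s in ALLOWED_SCHEMES)}"
        )
    if parts.scheme != "file" and not parts.netloc:
        raise ValueError(f"{name}={value!r} is missing a host name")
    if parts.query or parts.fragment:
        raise ValueError(f"{name}={value!r} must not contain a query string or fragment")
    # Strip the trailing slash so later path joins do not produce "//".
    return value.rstrip("/")
```

Including the variable name in the message tells the user which setting to fix, and whether it reached the repository rule at all.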
Organizational Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Corporate IT blocks setup | Low | High | Provide security/compliance docs |
| Mirror maintenance burden | Medium | Medium | Document automated mirroring |
| Users don't read docs | High | Low | Fail with helpful error messages |
Success Criteria
Must Have:
- ✅ Builds succeed with BAZEL_WASM_GITHUB_MIRROR set to test mirror
- ✅ Builds succeed in complete air-gap with vendored files
- ✅ Backward compatible (no env vars = current behavior)
- ✅ Works with JFrog Artifactory
- ✅ Works with npm registries (Verdaccio/Nexus)
Should Have:
- ✅ Retry logic for transient failures
- ✅ Helpful error messages for mirror misconfiguration
- ✅ Documentation for common corporate setups
- ✅ Vendoring script for air-gap preparation
Nice to Have:
- Mirror health checking
- Automatic fallback to public if mirror fails
- Download audit logging
Estimated Effort
| Phase | Effort | Risk |
|---|---|---|
| Env var mirrors | 3-5 days | Low |
| Vendoring support | 5-7 days | Medium |
| Testing & docs | 3-5 days | Low |
| Total | 11-17 days | Low-Medium |
Recommendation
Implement Hybrid Approach (Env Var + Vendoring):
- Start with Phase 1 (env var mirrors) - delivers 80% of value in 1 week
- Add Phase 2 (vendoring) - completes air-gap story
- Polish in Phase 3 - documentation and edge cases
Why This Approach:
- ✅ Minimal code changes (proven pattern used by Bazel rules_docker, rules_oci)
- ✅ Backward compatible (zero breaking changes)
- ✅ Flexible (works with any mirror system)
- ✅ Quick to implement (2-3 weeks total)
- ✅ Addresses root enterprise blocker
Alternative If Timeline Critical:
- Implement only Phase 1 (env var mirrors) in 1 week
- Document manual vendoring workaround
- Add formal vendoring support later based on demand