
feat: TargetArch() utility and arch-aware path resolution #2258

Merged
jakubno merged 5 commits into main from
feat/target-arch-path-resolution
Mar 30, 2026


@tomassrnka
Member

Summary

  • Adds TargetArch() helper in packages/shared/pkg/utils/env.go that reads TARGET_ARCH env var with alias normalization (x86_64→amd64, aarch64→arm64), defaulting to runtime.GOARCH
  • Updates Firecracker and kernel path resolution in the orchestrator to prefer an arch-prefixed layout ({version}/{arch}/binary) with a legacy flat-directory fallback
  • Converts the OCI DefaultPlatform from a hardcoded amd64 variable to a function backed by TargetArch()
  • Adds comprehensive unit tests for all new behavior

Part of #1875 (ARM64 support split). This is the foundation PR that subsequent PRs for runtime guards and create-build depend on.
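
As an illustration of the first and third bullets, here is a minimal sketch of alias normalization feeding a platform function. It is not the code from this PR: the lowercase names normalizeArch, targetArch and defaultPlatform, the local Platform struct, and the identity entries in the alias map are assumptions made so the example is self-contained.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// archAliases maps accepted TARGET_ARCH spellings to Go's GOARCH names.
// The identity entries (amd64, arm64) are an assumption of this sketch.
var archAliases = map[string]string{
	"amd64":   "amd64",
	"x86_64":  "amd64",
	"arm64":   "arm64",
	"aarch64": "arm64",
}

// normalizeArch returns the canonical name for raw, or host when raw is
// empty or not a known alias. Taking host as a parameter keeps it testable.
func normalizeArch(raw, host string) string {
	if normalized, ok := archAliases[raw]; ok {
		return normalized
	}
	return host
}

// targetArch reads TARGET_ARCH and defaults to the host architecture.
func targetArch() string {
	return normalizeArch(os.Getenv("TARGET_ARCH"), runtime.GOARCH)
}

// Platform is a hypothetical stand-in for the OCI platform type.
type Platform struct {
	OS           string
	Architecture string
}

// defaultPlatform is a function rather than a package-level variable, so the
// architecture is resolved at call time instead of being fixed to amd64.
func defaultPlatform() Platform {
	return Platform{OS: "linux", Architecture: targetArch()}
}

func main() {
	fmt.Printf("%+v\n", defaultPlatform())
}
```

Running this with TARGET_ARCH=aarch64 set would print a platform whose Architecture is arm64; with the variable unset it reports the host architecture.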

Test plan

  • go test ./packages/shared/pkg/utils/... passes (TargetArch unit tests)
  • go test ./packages/orchestrator/pkg/sandbox/fc/... passes (config path resolution tests)
  • go test ./packages/orchestrator/pkg/template/build/core/oci/... passes (OCI DefaultPlatform tests)
  • CI green on all existing tests (no regressions — legacy flat paths still resolve)

🤖 Generated with Claude Code

- Add TargetArch() to shared/utils that reads TARGET_ARCH env var with
  alias normalization (x86_64→amd64, aarch64→arm64), defaulting to
  runtime.GOARCH
- Update Firecracker and kernel path resolution to prefer arch-prefixed
  layout ({version}/{arch}/binary) with legacy flat fallback
- Change OCI DefaultPlatform from hardcoded amd64 var to function using
  TargetArch()
- Add comprehensive tests for path resolution and TargetArch

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cursor

cursor bot commented Mar 29, 2026

PR Summary

Medium Risk
Changes platform selection and on-disk binary/kernel path resolution based on TARGET_ARCH, which can affect which Firecracker/kernel binaries and OCI images are used at runtime. Legacy fallbacks and added tests reduce risk, but misconfiguration or unexpected filesystem/layout issues could still cause launch or pull failures.

Overview
This PR introduces a utils.TargetArch() helper (driven by TARGET_ARCH with alias normalization and host-arch fallback) and uses it to make the orchestrator multi-arch aware. Firecracker and kernel path resolution now prefers an arch-prefixed layout with a safe legacy fallback, and OCI image pulls/validation now select the platform architecture dynamically instead of being hardcoded to amd64, with updated error messaging and unit tests covering the new behaviors.

Written by Cursor Bugbot for commit e7543ce.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 20a4968e92


- Use sync.Once for unrecognized TARGET_ARCH warning to prevent stderr
  flood on repeated calls
- Only fall back to legacy path on os.ErrNotExist, not on permission
  errors or other os.Stat failures

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment on lines +20 to +32

// TargetArch returns the target architecture for binary paths and OCI platform.
// If TARGET_ARCH is set, it is normalized to Go convention ("amd64" or "arm64");
// otherwise defaults to the host architecture (runtime.GOARCH).
func TargetArch() string {
    if arch := os.Getenv("TARGET_ARCH"); arch != "" {
        if normalized, ok := archAliases[arch]; ok {
            return normalized
        }

        archWarningOnce.Do(func() {
            fmt.Fprintf(os.Stderr, "WARNING: unrecognized TARGET_ARCH=%q, falling back to %s\n", arch, runtime.GOARCH)
        })

🟡 The archAliases map lookup in TargetArch() is case-sensitive, so TARGET_ARCH=AMD64 or TARGET_ARCH=ARM64 (uppercase, common in some CI/CD environments) fails the lookup, emits a misleading warning, and returns runtime.GOARCH instead of the intended architecture. On a cross-compilation host (e.g., arm64 node with TARGET_ARCH=AMD64), this silently selects the wrong architecture for kernel paths and OCI platform. Fix: normalize with strings.ToLower(arch) before the map lookup in env.go:25.

Extended reasoning...

What the bug is and how it manifests

The TargetArch() function in packages/shared/pkg/utils/env.go reads the TARGET_ARCH environment variable and normalizes common aliases (e.g., x86_64->amd64, aarch64->arm64) via the archAliases map. However, all map keys are lowercase, so any uppercase or mixed-case value (e.g., AMD64, ARM64, X86_64) will fail the lookup.

The specific code path that triggers it

In env.go around line 25:

if normalized, ok := archAliases[arch]; ok {
    return normalized
}
// falls through to warning + runtime.GOARCH fallback
fmt.Fprintf(os.Stderr, "WARNING: unrecognized TARGET_ARCH=%q, falling back to %s\n", arch, runtime.GOARCH)
return runtime.GOARCH

When arch is "AMD64", the map lookup misses (key "AMD64" does not exist; only "amd64" does), the warning is printed, and runtime.GOARCH (the host arch) is returned.

Why existing code does not prevent it

The fix needs to be applied before the map lookup: archAliases[strings.ToLower(arch)]. The strings package is already imported in this file (used by RequiredEnv/OptionalEnv), so this is a trivial one-line change.

Impact

On a cross-compilation host -- for example, an arm64 CI runner with TARGET_ARCH=AMD64 to build amd64 binaries -- the function returns "arm64" instead of "amd64". This propagates to: (1) the Firecracker binary path (FirecrackerPath builds an arch-prefixed path using TargetArch()), (2) the kernel path (HostKernelPath), and (3) the OCI DefaultPlatform(). The operator does see a warning, so this is not completely silent -- but a WARNING to stderr is easy to overlook in noisy CI logs, and the downstream behavior (wrong arch binary, wrong OCI platform) is incorrect.

Step-by-step proof

  1. CI runner is arm64; operator sets TARGET_ARCH=AMD64 (uppercase, intending amd64 cross-compile).
  2. TargetArch() calls os.Getenv("TARGET_ARCH") and gets "AMD64".
  3. archAliases["AMD64"] returns ok == false (map has only "amd64").
  4. Function emits: WARNING: unrecognized TARGET_ARCH="AMD64", falling back to arm64.
  5. Returns "arm64" (runtime.GOARCH on the arm64 runner) -- the wrong value.
  6. DefaultPlatform() returns {OS: "linux", Architecture: "arm64"} -- the wrong OCI platform.
  7. FirecrackerPath checks {version}/arm64/firecracker and falls back to the legacy flat path, potentially using the arm64 binary when amd64 was intended.

How to fix it

Change env.go line 25 from:

if normalized, ok := archAliases[arch]; ok {

to:

if normalized, ok := archAliases[strings.ToLower(arch)]; ok {

This makes TargetArch() case-insensitive and robustly handles the full set of real-world CI inputs.
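
To make the difference concrete, the sketch below compares the exact-match lookup with the proposed case-folded lookup. The map contents and the helper names lookupStrict and lookupFolded are assumptions for illustration, not code from env.go.

```go
package main

import (
	"fmt"
	"strings"
)

// archAliases uses lowercase keys only, as described in the review comment.
// The exact set of entries here is an assumption of this sketch.
var archAliases = map[string]string{
	"amd64":   "amd64",
	"x86_64":  "amd64",
	"arm64":   "arm64",
	"aarch64": "arm64",
}

// lookupStrict matches the key exactly, so "AMD64" is not recognized.
func lookupStrict(raw string) (string, bool) {
	normalized, ok := archAliases[raw]
	return normalized, ok
}

// lookupFolded lowercases the input before the lookup.
func lookupFolded(raw string) (string, bool) {
	normalized, ok := archAliases[strings.ToLower(raw)]
	return normalized, ok
}

func main() {
	for _, raw := range []string{"amd64", "AMD64", "X86_64", "arm46"} {
		strict, strictOK := lookupStrict(raw)
		folded, foldedOK := lookupFolded(raw)
		fmt.Printf("%-7s strict=(%q, %t) folded=(%q, %t)\n", raw, strict, strictOK, folded, foldedOK)
	}
}
```

In this sketch a misspelled value such as arm46 is rejected by both variants; only inputs that differ in case are treated differently.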

Member Author


Won't fix — Go/Docker/Debian arch conventions are always lowercase (amd64, arm64). Adding strings.ToLower would mask actual typos rather than surfacing them via the warning.

Member

@jakubno jakubno left a comment


IMHO there is no need for the fallback, as we have had the arch-prefixed layout long enough

tomassrnka and others added 2 commits March 30, 2026 09:25
All active Firecracker and kernel versions have been migrated to the
arch-prefixed layout ({version}/{arch}/binary). Remove the fallback to
the legacy flat path ({version}/binary) so that a missing arch directory
fails loudly instead of silently using a potentially wrong-arch binary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tomassrnka
Member Author

@jakubno agreed, removed all fallback code from the PR


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


@dobrac dobrac assigned jakubno and unassigned dobrac Mar 30, 2026
Member

@jakubno jakubno left a comment


I found out later that we're missing it for some Firecracker versions (v1.10), so we need the fallback. I'm sorry, my oversight

@jakubno jakubno merged commit d250235 into main Mar 30, 2026
36 checks passed
@jakubno jakubno deleted the feat/target-arch-path-resolution branch March 30, 2026 13:43
drarijitdas added a commit to drarijitdas/infra that referenced this pull request Mar 31, 2026
Syncs all application code from e2b-dev/infra upstream including:
- Redis PubSub for state transitions (e2b-dev#2099)
- Pluggable egress firewall (e2b-dev#2187)
- Firecracker v1.12 upgrade (e2b-dev#2245)
- Label-based sandbox scheduler (e2b-dev#2066)
- Orchestrator internal/ -> pkg/ migration
- Pre-compute cgroup CPU deltas (e2b-dev#2265)
- Arch-aware downloads (e2b-dev#2258, e2b-dev#2260)
- Customizable pre-warmed NBDs (e2b-dev#2266)
- Autoresume improvements (e2b-dev#1969, e2b-dev#2196)
- Many bug fixes for race conditions, eviction, error handling

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>