Inspect k8s sandbox does not honour concurrency parameter in exec #126

@vhong-aisi

Description

Inspect AI recently added a concurrency parameter to exec that controls whether a call should count towards the concurrency limit (commit 239386ba). This addresses an issue with the Claude Code agent in inspect_swe, where running multiple evals with the agent in parallel would cause the sandbox to get stuck after hitting the limit.

In the K8s case, I believe the limit being hit is pod-op (src). My current workaround is to set INSPECT_MAX_POD_OPS=10000.
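
For reference, the workaround can also be applied from Python before the eval is launched. This is a minimal sketch, assuming the k8s sandbox reads the variable from the process environment at start-up (exporting it in the shell should be equivalent):

```python
import os

# Workaround from above: raise the pod-op limit so high that it is
# effectively never reached. Assumption: the k8s sandbox reads this
# variable from the process environment, so it has to be set before
# the eval starts.
os.environ["INSPECT_MAX_POD_OPS"] = "10000"
```

This only hides the problem, since it disables the limit for every exec call rather than just the ones that opt out via the concurrency parameter.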

Expected behaviour

Running multiple Claude Code agents in parallel should not cause everything to pause indefinitely (i.e., until the task timeout).
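
To illustrate what I mean, here is a rough sketch of the behaviour I'd expect. It is not the actual inspect_k8s_sandbox code: `exec_in_pod`, `MAX_POD_OPS`, and `demo` are hypothetical names, and the pod-op limit is modelled as a plain asyncio semaphore. A long-running exec that passes concurrency=False skips the limiter, so a short exec can still take the only slot:

```python
import asyncio

# Hypothetical stand-in for the pod-op limit (not the real implementation).
MAX_POD_OPS = 1


async def exec_in_pod(
    limiter: asyncio.Semaphore, seconds: float, concurrency: bool = True
) -> str:
    """Simulate a pod exec that takes a limiter slot only if `concurrency` is True."""
    if not concurrency:
        # Opted out: run without holding a pod-op slot.
        await asyncio.sleep(seconds)
        return "done"
    async with limiter:
        await asyncio.sleep(seconds)
        return "done"


async def demo() -> list:
    limiter = asyncio.Semaphore(MAX_POD_OPS)
    # The long-running agent process opts out of the limit...
    agent = asyncio.create_task(exec_in_pod(limiter, 0.2, concurrency=False))
    await asyncio.sleep(0)  # let the agent exec start first
    # ...so a short tool call can still acquire the only slot in time.
    tool = await asyncio.wait_for(exec_in_pod(limiter, 0.01), timeout=0.1)
    return [tool, await agent]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If the agent exec were counted towards the limit in this sketch (concurrency=True), the short call would have to wait for the slot and would hit the timeout instead, which is roughly the stall described above.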

Reproduction

I suspect this minimal reproduction would work, though I haven't tested it:

from inspect_ai import eval_set
from inspect_evals.cybench import cybench
from inspect_swe import claude_code

eval_set(
    cybench(agent=claude_code()),
    sandbox="k8s",
    log_dir="/tmp/",
    model=["anthropic/claude-sonnet-4-20250514"],
    max_tasks=25,
    max_sandboxes=50,
    max_connections=50,
    time_limit=3600,
    token_limit=50000,
)
