Describe the bug
validate_tarfile() currently validates tar member paths, but it does not validate a symlink member's linkname. As a result, workspace hydrate paths that allow symlinks will accept entries whose targets point outside the extracted archive root.
That shows up most clearly in workspace restore flows:
- UnixLocalSandboxSession.hydrate_workspace() calls safe_extract_tarfile()
- DockerSandboxSession.hydrate_workspace() validates the tar and then streams it into tar -x
- several extension backends validate tar bytes the same way before hydrate
The regular session.extract() tar path is already using allow_symlinks=False, so this is not "all tar uploads are unsafe." The problem is narrower: restore / hydrate flows that intentionally allow symlink members also allow targets like /etc/passwd or ../../outside.
I can reproduce this on current main by creating a tar with leak -> /etc/passwd, passing it through validate_tarfile(), and then extracting it. Validation succeeds and the restored workspace ends up with a symlink that points outside the extracted root.
Debug information
- Agents SDK version: main at f2fb9ffb (latest release boundary: v0.15.1)
- Python version: Python 3.12
Repro steps
import io
import tarfile
import tempfile
from pathlib import Path

from agents.sandbox.util.tar_utils import safe_extract_tarfile, validate_tarfile

# Build an in-memory tar containing a single symlink: leak -> /etc/passwd
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tf:
    info = tarfile.TarInfo("leak")
    info.type = tarfile.SYMTYPE
    info.linkname = "/etc/passwd"
    tf.addfile(info)

# Validate, then extract while the archive is still open
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:*") as tf:
    validate_tarfile(tf)
    with tempfile.TemporaryDirectory() as td:
        safe_extract_tarfile(tf, root=Path(td))
        # Show where the extracted symlink points
        print((Path(td) / "leak").readlink())
Current result:
Expected behavior
Workspace hydrate / restore validation should reject symlink targets that escape the extracted archive root.
Keeping the general symlink support is still useful for normal workspace snapshots, but restore paths that materialize an archive into a sandbox workspace should fail loudly when a symlink target is absolute or traverses outside the archive root.
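For illustration, the kind of check described above could look roughly like the sketch below. This is not the SDK's code: symlink_target_escapes is a hypothetical helper name, and it assumes a purely lexical check on member.name and member.linkname is acceptable. It does not follow chains of symlinks already present in the archive, and it does not cover hard links.

```python
import posixpath
import tarfile


def symlink_target_escapes(member: tarfile.TarInfo) -> bool:
    """Hypothetical helper: True if a symlink member's target is absolute
    or lexically resolves to a path outside the archive root."""
    if not member.issym():
        return False
    target = member.linkname
    # Absolute targets always point outside the extracted root.
    if posixpath.isabs(target):
        return True
    # Relative targets resolve against the directory that contains the link.
    link_dir = posixpath.dirname(member.name)
    resolved = posixpath.normpath(posixpath.join(link_dir, target))
    # After normalization, any remaining leading ".." means the target
    # climbs above the archive root.
    return resolved == ".." or resolved.startswith("../")


def make_symlink(name: str, target: str) -> tarfile.TarInfo:
    """Build a symlink TarInfo for demonstration purposes."""
    info = tarfile.TarInfo(name)
    info.type = tarfile.SYMTYPE
    info.linkname = target
    return info


if __name__ == "__main__":
    for name, target in [
        ("leak", "/etc/passwd"),
        ("leak", "../../outside"),
        ("pkg/current", "../releases/v1"),
    ]:
        print(name, "->", target, symlink_target_escapes(make_symlink(name, target)))
```

A validator that allows symlinks could call something like this per member and raise when it returns True. Relative links that stay inside the archive, such as pkg/current -> ../releases/v1, would still be accepted.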