Bug: pixi-unpack sometimes unpacks archives with truncated filenames #258

@connortann

Summary

We encountered an intermittent issue with pixi-pack where wheels with long filenames were sometimes corrupted, and the tarball could not be unpacked.

The issue seems to relate to how long filenames are stored in a tarball header, given the 100-character limit of the "name" field in the ustar header entry.

The issue was fixed by updating from v0.6.6 to v0.7.5. However, I'll share a write-up of what I've found in case others encounter the issue and in case it helps you identify the root cause.

PS - thank you to the maintainers for creating a very useful library!

Examining the tarball header

We created a tarball with pixi-pack containing the regex lib from PyPI. The filepath in the tarball is 101 characters long:

pypi/regex-2025.11.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl

When extracted with pixi-pack unpack, the wheel file is extracted with the filename truncated to 100 characters, which then fails with an error that a wheel filename has an invalid suffix:

$ pixi-pack unpack environment.tar
⏳ Extracting and installing 435 packages to /tmp/.tmpqOJi8B/cache...
Error: Could not install all pypi packages: Could not find all pypi package files: The wheel filename "regex-2025.11.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.wh" is invalid: Must end with .whl
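The truncation in the error message can be checked by hand. The snippet below is only a sanity check in Python, not part of pixi-pack; the constant name USTAR_NAME_LEN is made up for illustration.

```python
# Path of the wheel inside the tarball, copied from the report above.
path = (
    "pypi/regex-2025.11.3-cp310-cp310-"
    "manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl"
)

# Size of the "name" field in a tar header (illustrative constant name).
USTAR_NAME_LEN = 100

# Cutting the path at the size of the name field drops the final "l".
truncated = path[:USTAR_NAME_LEN]

print(len(path))       # 101
print(truncated[-3:])  # .wh
```

So a reader that only looks at the 100-character name field would see exactly the ".wh" suffix reported in the error.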

There are a few possible storage mechanisms for filepaths in tarball headers:

  • Standard USTAR Header (name and prefix fields): the name field is limited to 100 characters, with an optional 155-character prefix field
  • GNU Long Name Entry (Type L)
  • PAX Extended Header (Type x or g)
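To see the three mechanisms side by side, here is a minimal sketch using Python's standard tarfile module (not pixi-pack or the Rust tar crates, so it says nothing about what pixi-pack itself writes). It stores the same 101-character path in each format and reads the type flag at byte 156 of the first 512-byte block. The helper name first_typeflag and the dummy payload are made up for illustration.

```python
import io
import tarfile

LONG_NAME = (
    "pypi/regex-2025.11.3-cp310-cp310-"
    "manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl"
)


def first_typeflag(fmt):
    """Write LONG_NAME into an in-memory tarball using the given tarfile
    format and return the type flag (byte 156) of the first header block."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as archive:
        payload = b"dummy"  # placeholder content, not a real wheel
        info = tarfile.TarInfo(LONG_NAME)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()[156:157]


flags = {
    "ustar": first_typeflag(tarfile.USTAR_FORMAT),
    "gnu": first_typeflag(tarfile.GNU_FORMAT),
    "pax": first_typeflag(tarfile.PAX_FORMAT),
}
print(flags)
```

With Python's writer, the ustar case fits this particular path by splitting it at the "/" into prefix and name, so the first block is the regular file entry; the GNU and PAX cases emit an extra entry (type L or x) before the file's own header.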

Examining the USTAR header entry, we see the "name" of the wheel is indeed truncated to 100 characters and ends in .wh:

tarball USTAR header entry
  --- Raw 512-Byte Header Block ---
    [000:099] name        : pypi/regex-2025.11.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.wh
    [100:107] mode        : 0000644.
    [108:115] uid         : 0000000.
    [116:123] gid         : 0000000.
    [124:135] size        : 00003012156.
    [136:147] mtime       : 00000000000.
    [148:155] chksum      : 0027162.
    [156:156] typeflag    : 0
    [157:256] linkname    : ....................................................................................................
    [257:262] magic       : ustar 
    [263:264] version     :  .
    [265:296] uname       : ................................
    [297:328] gname       : ................................
    [329:336] devmajor    : 0000000.
    [337:344] devminor    : 0000000.
    [345:499] prefix      : ...........................................................................................................................................................
    [500:511] padding     : ............
  --- End of Header Block ---
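For anyone wanting to produce a similar dump, the following is a rough Python sketch that slices a 512-byte block at the offsets listed above. It is not the script used for the dump in this report. The demo archive is written with Python's tarfile in GNU format as a stand-in for a pixi-pack tarball, and the names FIELDS, parse_header and build_gnu_tarball are made up for illustration.

```python
import io
import tarfile

LONG_NAME = (
    "pypi/regex-2025.11.3-cp310-cp310-"
    "manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl"
)

# (field, offset, length) triples matching the offsets in the dump above.
FIELDS = [
    ("name", 0, 100), ("mode", 100, 8), ("uid", 108, 8), ("gid", 116, 8),
    ("size", 124, 12), ("mtime", 136, 12), ("chksum", 148, 8),
    ("typeflag", 156, 1), ("linkname", 157, 100), ("magic", 257, 6),
    ("version", 263, 2), ("uname", 265, 32), ("gname", 297, 32),
    ("devmajor", 329, 8), ("devminor", 337, 8), ("prefix", 345, 155),
]


def parse_header(block):
    """Split one 512-byte header block into its fields, cutting each field
    at the first NUL byte."""
    if len(block) != 512:
        raise ValueError("a tar header block is exactly 512 bytes")
    return {
        field: block[off:off + length].split(b"\0", 1)[0].decode("ascii", "replace")
        for field, off, length in FIELDS
    }


def build_gnu_tarball():
    """Create an in-memory GNU-format tarball with one long-named member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as archive:
        payload = b"dummy"  # placeholder content, not a real wheel
        info = tarfile.TarInfo(LONG_NAME)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


raw = build_gnu_tarball()
long_name_entry = parse_header(raw[0:512])   # type L entry holding the full path
file_entry = parse_header(raw[1024:1536])    # the member's own header block

print(long_name_entry["typeflag"], long_name_entry["name"])
print(file_entry["typeflag"], file_entry["name"])
```

In this stand-in archive the member's own header sits at offset 1024, after the long name entry's header block and one block of name data, and its name field holds only the first 100 characters of the path.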

There is also a "GNU Long Name Entry" in the tarball, stored as a separate entry ahead of this header, that contains the full filename. When extracted with tar -xvf environment.tar, the wheel file is restored with the proper filename.

So, for some reason, pixi-unpack sometimes seems to miss this "long name entry" and uses the truncated name from the ustar header instead. Or perhaps the header was malformed in some way when the tarball was packed.
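To illustrate the first hypothesis, here is a small Python sketch of a block-by-block reader that either honours or ignores GNU long name entries. It is purely a model of the suspected failure mode, not a description of how tokio-tar or tar-rs actually work; the archive is generated with Python's tarfile as a stand-in, and entry_names and its honour_long_names flag are hypothetical names.

```python
import io
import tarfile

BLOCK = 512
LONG_NAME = (
    "pypi/regex-2025.11.3-cp310-cp310-"
    "manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl"
)


def entry_names(raw, honour_long_names):
    """Walk the archive block by block and return the member names.

    When honour_long_names is False, type L entries are skipped and the
    100-byte name field of the following header is used as-is.
    Only plain octal size fields are handled in this sketch.
    """
    names = []
    pending = None  # full name carried over from a preceding type L entry
    offset = 0
    while offset + BLOCK <= len(raw):
        header = raw[offset:offset + BLOCK]
        if header == bytes(BLOCK):  # all-zero block marks the end of the archive
            break
        size = int(header[124:136].rstrip(b"\0 ").decode("ascii") or "0", 8)
        data_start = offset + BLOCK
        if header[156:157] == b"L":
            if honour_long_names:
                data = raw[data_start:data_start + size]
                pending = data.rstrip(b"\0").decode("utf-8")
        else:
            short = header[0:100].split(b"\0", 1)[0].decode("utf-8")
            names.append(pending if pending is not None else short)
            pending = None
        # Member data is padded up to a whole number of 512-byte blocks.
        offset = data_start + -(-size // BLOCK) * BLOCK
    return names


buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as archive:
    payload = b"dummy"  # placeholder content, not a real wheel
    info = tarfile.TarInfo(LONG_NAME)
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
raw = buf.getvalue()

print(entry_names(raw, honour_long_names=True))
print(entry_names(raw, honour_long_names=False))
```

A reader that drops the pending long name for any reason ends up with the same truncated ".wh" name seen in the error above, which is consistent with the first hypothesis but does not rule out the second.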

Reproducing the issue

Unfortunately I haven't identified exactly under which circumstances the issue occurs.

We encountered the error with pixi-pack v0.6.6, and haven't seen the issue with pixi-pack v0.7.5. With v0.6.6, tarballs created by pixi-pack are sometimes fine and sometimes unextractable.

I see that v0.6.6 uses tokio-tar::Archive to do the extraction, whereas v0.7.5 uses tar::Archive. So, perhaps the root issue was in how tokio-tar handles the GNU Long Name Entry.

Given that this seems quite hard to reproduce, and it works on the latest version, I imagine you might wish to close this issue, unless you have further thoughts on diagnosing the root cause or can spot a bug in the parsing of the tarball header.
