[Ecosystem] safetensors #61
Description
Contact emails
Project summary
Simple, safe way to store and distribute tensors
Project description
SafeTensors is a secure, fast file format for storing machine learning tensors.
It can be used as a replacement for torch.load and binds to the underlying torch loading APIs, preventing arbitrary code execution while enabling zero-copy and lazy loading.
Files contain a JSON header with tensor metadata followed by the raw data buffers, which speeds up distributed model loading significantly.
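The on-disk layout described above (a little-endian u64 header size, the JSON header, then raw buffers) can be sketched with the standard library alone; this is an illustration of the documented format, not the official library:

```python
import json
import struct

# Layout: [8-byte little-endian u64 header size][JSON header][raw tensor bytes]
data = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)  # one 2x2 float32 tensor
header = {"weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, len(data)]}}
header_bytes = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + data

# Parsing back: read the header size, decode the JSON, then slice the buffer.
# Because offsets are in the header, a reader can load tensors lazily.
(n,) = struct.unpack_from("<Q", blob, 0)
meta = json.loads(blob[8:8 + n])
start, end = meta["weight"]["data_offsets"]
values = struct.unpack("<4f", blob[8 + n + start:8 + n + end])
print(meta["weight"]["shape"], values)  # [2, 2] (1.0, 2.0, 3.0, 4.0)
```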
Key benefits:
- no code-execution risk
- 100 MB header limit for DoS protection
- faster multi-GPU loading
- cross-language compatibility via Rust and Python implementations
It is open source under the Apache 2.0 license, developed by Hugging Face.
Are there any other projects in the PyTorch Ecosystem similar to yours? If, yes, what are they?
No comparable library that I am aware of.
Project repo URL
https://github.com/huggingface/safetensors
Additional repos in scope of the application
No response
Project license
Apache 2.0
GitHub handles of the project maintainer(s)
danieldk, McPatate
Is there a corporate or academic entity backing this project? If so, please provide the name and URL of the entity.
Hugging Face
Website URL
huggingface.co
Documentation
huggingface.co/docs/safetensors
How do you build and test the project today (continuous integration)? Please describe.
On every commit, we build the Rust crate and run the Rust tests, cargo audit, code coverage, clippy, and rustfmt on the stable toolchain for the following OSes:
[ubuntu-latest, windows-latest, macOS-latest]
A performance regression check runs on every commit to main with pytest-benchmark; we store the results of previous commits in GH actions/cache and add a comment on the PR if an issue arises.
Documentation is built on every commit to main.
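The regression gate described above (compare the current run against a cached baseline, flag if slower) can be sketched in plain Python; pytest-benchmark and actions/cache are the real pieces, and the threshold and function names here are hypothetical:

```python
import json
import timeit

def check_regression(name, func, baseline, threshold=1.5):
    """Flag a regression if `func` is more than `threshold`x slower than the baseline.

    `baseline` stands in for results restored from GH actions/cache.
    """
    current = min(timeit.repeat(func, number=1000, repeat=5))
    previous = baseline.get(name)
    baseline[name] = current  # persist for the next run, as the cache would
    if previous is not None and current > previous * threshold:
        return f"regression in {name}: {current:.6f}s vs baseline {previous:.6f}s"
    return None  # no baseline yet, or within the threshold

baseline = {}  # first run: nothing cached, so no comment is posted
msg = check_regression("json_roundtrip", lambda: json.loads(json.dumps([1, 2, 3])), baseline)
```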
For Python, one CI runs the tests, clippy, and cargo audit for the Python bindings (Python + Rust code) on the following platforms:
- os: ubuntu-latest
  version:
    torch: torch
    python: "3.13"
    numpy: numpy
  arch: "x64-freethreaded"
- os: macos-15-intel
  version:
    torch: torch==1.10
    numpy: "numpy==1.26"
    python: "3.9"
  arch: "x64"
- os: macos-latest
  version:
    torch: torch
    python: "3.12"
    numpy: numpy
  arch: "arm64"
- os: windows-11-arm
  version:
    torch: torch
    python: "3.12"
    numpy: numpy
  arch: "arm64"
And a bonus separate platform:
test_s390x_big_endian:
  runs-on: ubuntu-latest
  name: Test big-endian - s390x (latest PyTorch, I assume, for s390x)
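The dedicated s390x job matters because the format's buffers are serialized little-endian, so a big-endian host must decode with an explicit byte order rather than the native one. A minimal illustration with the stdlib (not the project's actual test):

```python
import struct
import sys

# Bytes as they would appear on disk: little-endian float32.
raw = struct.pack("<f", 3.5)

# Explicit "<" decodes correctly on both little- and big-endian hosts.
(value,) = struct.unpack("<f", raw)

# A naive native-order decode would be wrong on a big-endian host like s390x
# (it happens to agree on little-endian machines, which is why a dedicated
# big-endian CI job is needed to catch this class of bug).
native = struct.unpack("=f", raw)[0]
print(sys.byteorder, value)
```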
Lastly, the release CI builds on every platform-architecture permutation listed below and, once everything is built, pushes artifacts to PyPI. The release CI runs on every commit to main but only pushes to PyPI when the commit has a tag associated with it.
ubuntu-latest, aarch64
ubuntu-latest, armv7
ubuntu-latest, ppc64le
ubuntu-latest, s390x
ubuntu-latest, x86
ubuntu-latest, x86_64
macos-14, aarch64
macos-15-intel, x86_64
ubuntu-latest, aarch64
ubuntu-latest, armv7
ubuntu-latest, x86
ubuntu-latest, x86_64
windows-11-arm, arm64
windows-latest, x64
windows-latest, x86
Version of PyTorch
1.10 at the moment
Components of PyTorch
- from_file
- Tensor (.to, .narrow)
- UntypedStorage, ByteStorage
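These components exist to support zero-copy, lazy loading: map the file once and take typed views of byte ranges without reading everything into memory. The same idea, sketched with stdlib mmap instead of torch (assumes a little-endian host, which is what the bytes below encode):

```python
import mmap
import os
import struct
import tempfile

# Write 8 float32 values to a file, standing in for a tensor's raw buffer.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "tensor.bin")
with open(path, "wb") as f:
    f.write(struct.pack("<8f", *range(8)))

# Map the file and view a slice without copying: the stdlib analogue of
# torch.from_file + UntypedStorage + Tensor.narrow.
f = open(path, "rb")
mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
view = memoryview(mm).cast("f")   # float32 view over the mapping, no copy
narrowed = list(view[2:5])        # analogous to Tensor.narrow: elements 2..4
view.release()
mm.close()
f.close()
print(narrowed)  # [2.0, 3.0, 4.0]
```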
How long do you expect to maintain the project?
As long as machine learning is relevant! We're pouring significant resources into the project and still consider it early.
Additional information
No response