This repository lists file formats used in ML/AI systems. It is intended as a resource for tool development and vulnerability research. We aim to keep this list up to date and accurate. If you find a missing file format or an inaccuracy, or if you have more details to contribute, please open an issue or submit a pull request.

| Name | ML-specific | Framework/Organization (if applicable) | Identification Tooling | Extensions | Additional Notes |
|---|---|---|---|---|---|
| PyTorch v1.3 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file containing data.pkl (1 pickle file) |
| PyTorch v0.1.1 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: Tar file with sys_info, pickle, storages, and tensors |
| PyTorch v0.1.10 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: Stacked pickle files |
| TorchScript v1.4 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with data.pkl, constants.pkl, and version (2 pickle files and a folder) |
| TorchScript v1.3 (deprecated) | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with data.pkl and constants.pkl (2 pickle files) |
| TorchScript v1.1 (deprecated) | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with model.json and attributes.pkl (a JSON file and a pickle file) |
| TorchScript v1.0 (deprecated) | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with model.json |
| PyTorch model archive format [ZIP] | Yes | PyTorch | Fickling | .mar | Description: ZIP file that includes Python code files and pickle files |
| PyTorch model archive format [TAR] | Yes | PyTorch | - | .mar | Description: TAR file that includes Python code files and pickle files |
| PyTorch Package | Yes | PyTorch | - | .pt, .pth, .bin | Description: ZIP file that includes a pickled model, user files represented as a Python package, and framework files including serialized tensor data |
| ExecuTorch | Yes | PyTorch | - | .pte | Description: Modified binary flatbuffer file with optional data segments appended |
| torch.export | Yes | PyTorch | - | .pt2 | Description: ZIP file with JSON files and a Python code file |
| PyTorch Mobile | Yes | PyTorch | - | .ptl | Description: Modified binary flatbuffer file |
| Safetensors | Yes | - | PolyFile | .safetensors | Refer to our audit |
| ONNX | Yes | - | - | .onnx | Refer to LobotoMI |
| Keras native file format | Yes | Keras | - | .keras | Description: ZIP archive with 2 JSON files and 1 h5 file |
| TensorFlow SavedModel | Yes | TensorFlow | - | .pb | Description: Custom Protobuf format. Can result in arbitrary code execution. |
| TensorFlow Checkpoint | Yes | TensorFlow | - | .ckpt | Description: Custom Protobuf format. Can result in arbitrary code execution. |
| TFLite | Yes | TensorFlow | - | .tflite | Description: Modified binary flatbuffer file |
| TFJS | Yes | TensorFlow | - | - | Description: JSON file and binary file with weights. Technically not a singular file format. |
| TF1 Hub format (deprecated) | Yes | TensorFlow | - | - | Description: Custom Protobuf format. |
| Tensorizer | Yes | CoreWeave | - | - | Not uncommon, especially in private production systems |
| TFRecords | Yes | TensorFlow | - | .tfrecords | Description: Wrapper around a Protocol Buffer |
| NPY | Yes | NumPy | - | .npy | Can embed pickled object arrays. Older NumPy versions loaded them by default; since NumPy 1.16.3, `np.load` defaults to `allow_pickle=False`. |
| NPZ | Yes | NumPy | - | .npz | Description: ZIP file of NPY files |
| GGUF | Yes | llama.cpp/GGML | - | .gguf | - |
| GGML | Yes | llama.cpp/GGML | - | .ggml | - |
| GGMF (deprecated) | Yes | llama.cpp/GGML | - | .ggmf | - |
| GGJT (deprecated) | Yes | llama.cpp/GGML | - | .ggjt | - |
| NetCDF | Yes | - | - | .nc | - |
| PMML | Yes | - | - | - | - |
| MLeap | Yes | Spark | - | .mleap | - |
| CoreML | Yes | Apple | - | .mlmodel, .mlpackage | - |
| MLflow Format | Yes | MLflow | - | - | - |
| MLflow TensorSpec input format | Yes | MLflow | - | - | - |
| SurrealML | Yes | SurrealDB | - | .surml | - |
| Llamafile | Yes | - | - | .llamafile | - |
| .prompt | Yes | HumanLoop | - | .prompt | - |
| Pickle | No | Python | PolyFile | .pkl | Refer to Fickling |
| Joblib | No | - | PolyFile | - | - |
| NeMo | Yes | NVIDIA | - | - | - |
| Riva | Yes | NVIDIA | - | - | - |
| Avro | No | - | - | - | - |
| Parquet | No | - | - | - | - |
| ORC | No | - | - | - | - |
| JSON | No | - | PolyFile | - | - |
| CSV | No | - | - | - | - |
| Protocol Buffers | No | - | - | - | Usually an underlying file format |
| HDF5 | No | - | - | .h5 | - |
| Caffe | Yes | Caffe | - | .caffemodel, .prototxt | Description: Protobuf-based file format |
| ArmNN Flatbuffers | Yes | ArmNN | - | - | - |
| Cambricon | Yes | - | - | - | - |
| Circle | Yes | - | - | - | - |
| ZIP | No | - | PolyFile | - | Usually an underlying file format |
| CNTK v1 (deprecated) | Yes | Microsoft Cognitive Toolkit | - | - | - |
| CNTK v2 | Yes | Microsoft Cognitive Toolkit | - | - | Description: Protobuf-based file format |
| Darknet | Yes | Hank.ai Darknet | - | - | - |
| DL4J | Yes | DL4J | - | - | Description: ZIP-based file format |
| Deep Learning Container (DLC) | Yes | Qualcomm Neural Processing SDK | - | .dlc | - |
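
Because many rows share extensions (`.pt`, `.pth`, `.bin`), the extension alone rarely identifies a format. The following is a minimal sketch, not a replacement for Fickling or PolyFile, of how the descriptions in the table could be turned into a first-pass check: it looks at leading magic bytes and, for ZIP containers, at member names. The function names (`sniff_container`, `classify_zip`, `make_zip`) are hypothetical, and the ZIP rules only mirror the table's descriptions; real files may contain additional members, and other ZIP-based formats (NPZ, `.keras`, `.mar`) are left unclassified here.

```python
import io
import zipfile

# Well-known leading bytes for a few containers named in the table.
MAGIC = [
    (b"PK\x03\x04", "ZIP"),           # ZIP local file header
    (b"\x93NUMPY", "NPY"),            # NumPy .npy
    (b"GGUF", "GGUF"),                # GGUF model files
    (b"\x89HDF\r\n\x1a\n", "HDF5"),   # HDF5 signature
]


def sniff_container(data: bytes) -> str:
    """Guess the outer container from the first bytes of a file."""
    for magic, name in MAGIC:
        if data.startswith(magic):
            return name
    # Pickle protocol 2+ starts with the PROTO opcode (0x80) and a version byte.
    if len(data) >= 2 and data[0] == 0x80 and 2 <= data[1] <= 5:
        return "pickle"
    return "unknown"


def classify_zip(data: bytes) -> str:
    """Map ZIP member names to a table row (heuristic, based on the table only)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # PyTorch archives nest members under a top-level folder, so compare basenames.
        names = {n.rsplit("/", 1)[-1] for n in zf.namelist()}
    if {"data.pkl", "constants.pkl", "version"} <= names:
        return "TorchScript v1.4"
    if {"data.pkl", "constants.pkl"} <= names:
        return "TorchScript v1.3"
    if {"model.json", "attributes.pkl"} <= names:
        return "TorchScript v1.1"
    if "model.json" in names:
        return "TorchScript v1.0"
    if "data.pkl" in names:
        return "PyTorch v1.3"
    return "ZIP (unclassified)"


def make_zip(names):
    """Build an in-memory ZIP with empty members (for demonstration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for n in names:
            zf.writestr(n, b"")
    return buf.getvalue()


if __name__ == "__main__":
    sample = make_zip(["archive/data.pkl", "archive/constants.pkl", "archive/version"])
    print(sniff_container(sample), classify_zip(sample))
```

A check like this only narrows down the candidates. Any file that contains a pickle should still be treated as untrusted and inspected with a dedicated tool before loading.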