This repository contains the pseudo quantization workflow for M2XFP and provides a lightweight way to evaluate accuracy (e.g., perplexity) on LLaMA-3 and other LLMs.
- Create and activate the conda environment:

```bash
conda create -n mxq python=3.10
conda activate mxq
```

- Install the package in development mode:

```bash
pip install vllm==0.7.0 --extra-index-url https://download.pytorch.org/whl/cu128
pip install -e .
```

Run the main quantization workflow:
```bash
# Perplexity evaluation on WikiText for LLaMA-3
bash llama3_run.sh wikitext

# Reasoning benchmarks
bash reasoning.sh
```

- `entry.py` - Main entry point for quantization
- `llama3_run.sh` - An example script to run LLaMA-3 quantization
- `quantize/` - Core quantization modules
  - `quant_func.py` - Quantization configuration and functions
  - `quantizer.py` - Main quantization logic
  - `linear.py` - Quantized linear layer implementation
  - `pre_quant.py` - Pre-quantization utilities
- `utils/` - Utility modules
  - `module.py` - Module manipulation utilities
  - `dataload_utils.py` - Data loading utilities
  - `parallel.py` - Parallel processing utilities
  - `calib_data.py` - Calibration data handling
  - `utils.py` - General utilities
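The `quantize/` modules above implement the M2XFP format itself. As background on what pseudo (fake) quantization means, here is a minimal NumPy sketch of generic MX-style block quantization: values are quantized to a low-bit grid and immediately dequantized, so model accuracy can be evaluated entirely in floating point. The function name, block size, FP4 (E2M1) element grid, and power-of-two shared scale follow the generic OCP Microscaling convention, not this repository's API, and the metadata augmentation that defines M2XFP is omitted.

```python
import numpy as np

# Representable magnitudes of an FP4 (E2M1) element.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def pseudo_quantize(x, block_size=32):
    """Fake-quantize a 1-D array block-wise with a shared power-of-two scale.

    Illustrative MX-style sketch only; not the M2XFP format from this repo.
    """
    x = np.asarray(x, dtype=np.float64)
    pad = (-len(x)) % block_size           # pad so the array splits into blocks
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    out = np.empty_like(blocks)
    for i, blk in enumerate(blocks):
        amax = np.max(np.abs(blk))
        if amax == 0:
            out[i] = 0.0
            continue
        # Shared power-of-two scale; emax of the E2M1 element format is 2.
        scale = 2.0 ** (np.floor(np.log2(amax)) - 2)
        scaled = blk / scale
        # Round each element to the nearest representable E2M1 magnitude
        # (values beyond 6.0 saturate to the largest grid point).
        idx = np.abs(np.abs(scaled)[:, None] - E2M1_GRID).argmin(axis=1)
        out[i] = np.sign(scaled) * E2M1_GRID[idx] * scale
    return out.reshape(-1)[:len(x)]

w = np.array([0.11, -0.52, 0.98, 3.7, -0.03, 0.25, 1.4, -2.1])
wq = pseudo_quantize(w, block_size=8)
print(wq)  # dequantized weights now lie on the scaled E2M1 grid
```

Running the real workflow applies the same fake-quantize-then-dequantize idea to model weights, which is why perplexity can be measured with standard full-precision inference.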
If you find this repository useful in your research or project, please cite:
```bibtex
@misc{hu2026m2xfpmetadataaugmentedmicroscalingdata,
      title={M2XFP: A Metadata-Augmented Microscaling Data Format for Efficient Low-bit Quantization},
      author={Weiming Hu and Zihan Zhang and Haoyan Zhang and Chen Zhang and Cong Guo and Yu Feng and Tianchi Hu and Guanglin Li and Guipeng Hu and Junsong Wang and Jingwen Leng},
      year={2026},
      eprint={2601.19213},
      archivePrefix={arXiv},
      primaryClass={cs.AR},
      url={https://arxiv.org/abs/2601.19213},
}
```
We sincerely thank the authors and contributors of the following open-source projects. Our implementation builds upon their excellent codebases: