This project identifies and analyzes super weights in Large Language Models (LLMs). Super weights are individual weights that have a disproportionately large impact on the model's performance. The project provides a Jupyter notebook so that users can run the analysis on their own models. It is an unofficial implementation of the paper "The Super Weight in Large Language Models".
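The idea of a single weight with outsized impact can be illustrated with a toy, dependency-free sketch. It assumes the detection heuristic described in the paper: look for a spike in both the input and the output of an MLP down-projection, then take the weight where the two spiking channels intersect. The matrix, the input vector, and the helper names `matvec`, `argmax_abs`, and `find_candidate` are hypothetical and are not taken from the notebook.

```python
# Toy illustration only: all values below are made up.
# W stands in for a down-projection with 4 output and 3 input channels.
W = [
    [0.1, -0.2, 0.05],
    [0.0, 0.3, -0.1],
    [0.2, 8.0, 0.1],   # W[2][1] plays the role of a super weight
    [-0.1, 0.1, 0.2],
]
x = [0.5, 6.0, -0.4]   # input activation with a spike in channel 1


def matvec(weight, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def argmax_abs(vec):
    """Index of the entry with the largest absolute value."""
    return max(range(len(vec)), key=lambda i: abs(vec[i]))


def find_candidate(weight, vec):
    """Return (row, col) of the candidate super weight.

    row: output channel with the largest activation magnitude
    col: input channel with the largest activation magnitude
    """
    out = matvec(weight, vec)
    return argmax_abs(out), argmax_abs(vec)


row, col = find_candidate(W, x)
before = matvec(W, x)

# Zero out the candidate and recompute the output to see its effect.
pruned = [r[:] for r in W]
pruned[row][col] = 0.0
after = matvec(pruned, x)

print(f"candidate super weight at (row={row}, col={col})")
print(f"output channel {row} before pruning: {before[row]:.2f}")
print(f"output channel {row} after pruning:  {after[row]:.2f}")
```

In this toy case, zeroing one entry collapses the dominant output activation. On a real model the same idea would be applied to activations captured during a forward pass, which is the part the notebook handles.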
- Python 3.12
- CUDA 12.8 (for GPU support)
- UV
- Clone the repository:
    git clone https://github.com/aerubanov/SmolLMSuperWeight.git
    cd SmolLMSuperWeight
- Install dependencies:
    uv sync --extra cu128
  or, for CPU only:
    uv sync --extra cpu
- Open SmolLM-super-weight.ipynb in VS Code with the Jupyter extension and run the notebook.
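Before running the notebook, it can help to confirm that the environment matches the requirements. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper, it assumes that the `cu128` and `cpu` extras install PyTorch under the module name `torch`, and it treats Python 3.12 as a minimum version rather than an exact pin.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 12)):
    """Return a small report about the local setup.

    Assumption: the project depends on PyTorch (module name `torch`).
    """
    report = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": None,  # stays None when torch is not installed
    }
    if report["torch_installed"]:
        import torch

        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Saved to a file, the script can be run inside the project environment with `uv run python`. If `cuda_available` is reported as `False` after installing the `cu128` extra, the GPU driver or CUDA setup is a likely place to look.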
- Original paper: https://arxiv.org/abs/2411.07191
- Original code: https://github.com/mengxiayu/LLMSuperWeight