Skip to content

Latest commit

 

History

History
38 lines (24 loc) · 704 Bytes

File metadata and controls

38 lines (24 loc) · 704 Bytes

distill-llm

Example of distilling LLM knowledge using LoRa

Data

We use this dataset juancavallotti/multilingual-gec from the huggingface Hub. It is a synthetic grammar correction dataset.

Install

Torch

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

Other libs:

pip install -r requirements.txt

Run

All steps to run the experiments are listed in order in the file scripts/run.sh that you can run as bash scripts/run.sh

Results

% of exact match between ground truth and prediction:

LLama 2–70B: 42%
Base Tiny-LLama: 11%
Distilled Tiny-LLama: 31%

Blog

https://medium.com/p/12e954d256c2