Conversation

uSaiPrashanth
Collaborator

Other modifications worth mentioning:

  • Changed scripts/convert_hf_checkpoint.py to support loading finetuned Llama-3 models from a safetensors state dict
  • Added finetuned configs to model.py (finetuned models use a vocab size one token larger than base Llama 3); a rough sketch of both changes follows below
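Roughly, the conversion change merges the safetensors shards into a single state dict, and the config change registers model args with the bumped vocab size. A minimal sketch, with assumed function and config names (not the PR's exact code):

```python
# Sketch only: merge safetensors shards from a finetuned Llama-3 checkpoint
# directory and pair them with a config whose vocab is one token larger
# than the base Llama-3 vocab.
from pathlib import Path

from safetensors.torch import load_file


def load_safetensors_state_dict(checkpoint_dir: str) -> dict:
    """Merge every *.safetensors shard in the directory into one state dict."""
    merged = {}
    for shard in sorted(Path(checkpoint_dir).glob("*.safetensors")):
        merged.update(load_file(str(shard)))
    return merged


# Hypothetical finetuned config entry; field names follow gpt-fast-style
# ModelArgs dicts, and the real values live in model.py.
finetuned_llama3_8b = dict(
    block_size=8192,
    vocab_size=128256 + 1,  # one token larger than base Llama 3
    n_layer=32,
    n_head=32,
    dim=4096,
)
```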

Griffin Adams and others added 30 commits May 20, 2024 19:35
- Creates cache.py
- Introduces global_tokens
- Formats repo with ruff
- Speed parity with full KV-cache
Implements window KV-Cache Compression Strategy
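The commits above read like a sliding-window cache that pins a few global tokens at the front and cycles the rest. A minimal sketch of that idea, with assumed class and argument names (only global_tokens and max cache length come from the commits themselves); treating the windowed region as a ring buffer keeps each insertion O(1), which is one way to keep speed parity with a full KV-cache:

```python
import torch


class WindowKVCache:
    """Sliding-window KV cache that pins `global_tokens` slots at the front
    and treats the remaining slots as a ring buffer, so every insertion is
    O(1) regardless of sequence length."""

    def __init__(self, max_cache_length, global_tokens, n_heads, head_dim):
        self.max_cache_length = max_cache_length
        self.global_tokens = global_tokens
        self.k = torch.zeros(1, n_heads, max_cache_length, head_dim)
        self.v = torch.zeros(1, n_heads, max_cache_length, head_dim)
        self.insertions = 0  # total tokens seen so far

    def update(self, k_new, v_new):
        """k_new, v_new: (1, n_heads, head_dim) for one decoded token."""
        if self.insertions < self.max_cache_length:
            idx = self.insertions  # still filling the cache
        else:
            # wrap around inside the windowed region, never touching
            # the global tokens pinned at the front
            window = self.max_cache_length - self.global_tokens
            idx = self.global_tokens + (self.insertions - self.global_tokens) % window
        self.k[:, :, idx] = k_new
        self.v[:, :, idx] = v_new
        self.insertions += 1
        filled = min(self.insertions, self.max_cache_length)
        return self.k[:, :, :filled], self.v[:, :, :filled]
```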
… Hitter methods which require attention probs.
Implement Scissorhands KV-cache compression & SnapKV prompt compression
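Scissorhands-style eviction (and the heavy-hitter variants mentioned above) needs the attention probabilities so it can keep the tokens the model actually attends to. A hedged sketch of that mechanic, with assumed names and shapes rather than the PR's real API:

```python
import torch


class AttnScoreKVCache:
    """Keeps a running attention-mass score per cached token and evicts the
    least-attended slot when full (Scissorhands / heavy-hitter flavour)."""

    def __init__(self, max_cache_length, n_heads, head_dim):
        self.max_cache_length = max_cache_length
        self.k = torch.zeros(1, n_heads, max_cache_length, head_dim)
        self.v = torch.zeros(1, n_heads, max_cache_length, head_dim)
        self.score = torch.zeros(max_cache_length)  # accumulated attention mass
        self.size = 0

    def update_scores(self, attn_probs):
        """attn_probs: (1, n_heads, 1, size) probabilities from the new query;
        this is why these methods require the attention probs to be exposed."""
        per_token = attn_probs.mean(dim=1).reshape(-1)  # average over heads
        self.score[: self.size] += per_token[: self.size]

    def insert(self, k_new, v_new):
        if self.size < self.max_cache_length:
            idx = self.size
            self.size += 1
        else:
            idx = int(self.score.argmin())  # evict the least-attended token
        self.k[:, :, idx] = k_new
        self.v[:, :, idx] = v_new
        self.score[idx] = 0.0  # the new token starts with no accumulated mass
        return self.k[:, :, : self.size], self.v[:, :, : self.size]
```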
commit 0528eef
Author: Faisal Ladhak <[email protected]>
Date:   Thu Jun 13 20:30:35 2024 +0000

    Reformatting with ruff

commit 4c3cfc5
Author: Faisal Ladhak <[email protected]>
Date:   Thu Jun 13 04:35:05 2024 +0000

    Added support for Qwen2 models.
Adding code for evaluation, along with SQuALITY and reference-based metrics.
- This is necessary for training models outside of eval.py
Update Tasks to allow train, val, and test splits to be used.
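The split change is what lets the same Task objects feed a training loop as well as eval.py. A rough sketch of a split-aware task (class and attribute names are assumptions, not the repo's exact interface):

```python
from datasets import load_dataset  # Hugging Face `datasets`


class Task:
    dataset_path: str = None  # hub id or local path for the task's data

    def get_split(self, split: str):
        """Return one of the train / validation / test splits."""
        if split not in ("train", "validation", "test"):
            raise ValueError(f"Unknown split: {split!r}")
        return load_dataset(self.dataset_path, split=split)


class Squality(Task):
    # SQuALITY: question-focused long-document summarization; the actual
    # hub id and loading logic live in the repo's task definitions.
    dataset_path = "path/or/hub-id-for-squality"  # placeholder, not a real id
```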