Commit ecbee1e

Update README.md
1 parent 7c90f61 commit ecbee1e

1 file changed (+1 -1 lines changed)

README.md

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ The current release supports:
 
 - Llama-2 and Mistral based models.
 - Memory efficient 16-bit + 1-bit Δ Linear in PyTorch
-- Triton kernel for fast inference
+- Triton kernel for fast inference (TODO: Update repo with faster [BitBLAS](https://github.com/microsoft/BitBLAS) W1A16 kernel)
 - Gradio demo showcasing batched inference over 6 Mistral-7B based models, using only **30 GB** of GPU memory!
 
 ## News
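
For readers unfamiliar with the "16-bit + 1-bit Δ Linear" item in the list above, here is a minimal sketch of the general idea: keep the base weights at full precision and store each fine-tuned model only as a sign bit per weight plus one scale. It is written in plain Python rather than PyTorch so it runs without dependencies. The function names `compress_delta` and `delta_linear` are hypothetical, and setting the scale to the mean absolute delta is an assumption made for illustration; the repository's actual layer and kernels may differ.

```python
# Illustrative sketch only: these names are hypothetical, not the repository's API.

def compress_delta(base, fine):
    """Return (sign_bits, scale) describing the delta fine - base.

    sign_bits[i][j] is True where the delta is positive (1 bit per weight).
    scale is the mean absolute delta: one float for the whole matrix
    (an assumed choice for this sketch).
    Zero deltas are treated as negative here for simplicity.
    """
    deltas = [[f - b for f, b in zip(f_row, b_row)]
              for f_row, b_row in zip(fine, base)]
    flat = [d for row in deltas for d in row]
    scale = sum(abs(d) for d in flat) / len(flat)
    sign_bits = [[d > 0 for d in row] for row in deltas]
    return sign_bits, scale


def delta_linear(x, base, sign_bits, scale):
    """Compute y = (base + scale * sign) @ x for a single input vector x."""
    out = []
    for w_row, s_row in zip(base, sign_bits):
        acc = 0.0
        for w, s, xi in zip(w_row, s_row, x):
            # Reconstruct the approximate fine-tuned weight on the fly.
            acc += (w + (scale if s else -scale)) * xi
        out.append(acc)
    return out


if __name__ == "__main__":
    base = [[1.0, 2.0], [3.0, 4.0]]
    fine = [[1.5, 1.5], [3.25, 3.75]]
    bits, scale = compress_delta(base, fine)
    print(bits, scale)
    print(delta_linear([1.0, 1.0], base, bits, scale))
```

Because the base matrix is shared, serving several fine-tuned models this way would only add the sign bits and one scale per matrix for each extra model, which is the kind of saving the batched-inference item in the list points at.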
