Commit bc2b1bf
fix typo
Signed-off-by: yiliu30 <[email protected]>
1 parent 41d927c

File tree: 1 file changed (+1, −1)
README.md

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@ model = GenericLoraKbitModel('tiiuae/falcon-7b')
 model.finetune(dataset)
 ```
 
-4. __CPU inference__ - The CPU, including notebook CPUs, is now fully equipped to handle LLM inference. We integrated [Itrex](https://github.com/intel/intel-extension-for-transformers) to conserve memory by compressing the model with [weight-only quantization algorithms](https://github.com/intel/intel-extension-for-transformers/blob/main/docs/weightonlyquant.md) and accelerate the inference by leveraging its highly optimized kernel on Intel platforms.
+4. __CPU inference__ - The CPU, including laptop CPUs, is now fully equipped to handle LLM inference. We integrated [Itrex](https://github.com/intel/intel-extension-for-transformers) to conserve memory by compressing the model with [weight-only quantization algorithms](https://github.com/intel/intel-extension-for-transformers/blob/main/docs/weightonlyquant.md) and accelerate the inference by leveraging its highly optimized kernel on Intel platforms.
 
 5. __Batch integration__ - By tweaking the 'batch_size' in the .generate() and .evaluate() functions, you can expedite results. Using a 'batch_size' greater than 1 typically enhances processing efficiency.
 ```python

0 commit comments