
Commit ea66dc5

wjayesh and strickvl authored
Apply suggestions from code review
Co-authored-by: Alex Strick van Linschoten <[email protected]>
1 parent 5af607d commit ea66dc5

File tree

1 file changed: +3 -3 lines changed


gamesense/README.md

Lines changed: 3 additions & 3 deletions
@@ -305,7 +305,7 @@ python run.py --config phi3.5_finetune_cpu.yaml
 
 Note that training on CPU will be significantly slower than training on a GPU. The CPU configuration uses:
 
-1. A smaller model (Phi-3.5-mini-instruct) which is more CPU-friendly
+1. A smaller model (`phi-3.5-mini-instruct`) which is more CPU-friendly
 2. Reduced batch size and increased gradient accumulation steps
 3. Fewer total training steps (50 instead of 300)
 4. Half-precision (float16) where possible to reduce memory usage
@@ -315,14 +315,14 @@ Note that training on CPU will be significantly slower than training on a GPU. T
 For best results, we recommend:
 - Using a machine with at least 16GB of RAM
 - Being patient! LLM training on CPU is much slower than on GPU
-- If you still encounter memory issues, try reducing the max_train_samples parameter even further in the config file
+- If you still encounter memory issues, try reducing the `max_train_samples` parameter even further in the config file
 
 ### Known Issues and Workarounds
 
 Some large language models like Phi-3.5 have caching mechanisms that are optimized for GPU usage and may encounter issues when running on CPU. Our CPU configuration includes several workarounds:
 
 1. Disabling KV caching for model generation
-2. Using torch.float16 data type to reduce memory usage
+2. Using `torch.float16` data type to reduce memory usage
 3. Disabling flash attention which isn't needed on CPU
 4. Using standard AdamW optimizer instead of 8-bit optimizers that require GPU
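For orientation, the four workarounds in this hunk can be expressed roughly as follows. This is a hedged sketch assuming a recent Hugging Face `transformers` release; the model ID and argument values are assumptions rather than the repository's actual code.

```python
# Rough illustration of the listed workarounds; names are assumptions,
# not taken from this repository.
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3.5-mini-instruct",  # assumed Hub ID
    torch_dtype=torch.float16,          # 2. half-precision weights to save memory
    attn_implementation="eager",        # 3. standard attention; no flash attention on CPU
)
model.config.use_cache = False          # 1. disable KV caching during generation

training_args = TrainingArguments(
    output_dir="phi35-cpu-finetune",    # illustrative path
    optim="adamw_torch",                # 4. plain AdamW rather than an 8-bit optimizer
)
```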
