Commit 076f411

fix: title and missing bibtex
Signed-off-by: Kai Xu <me@xuk.ai>
1 parent 712e3c9 commit 076f411

3 files changed: 15 additions and 2 deletions

_posts/2025-02-06-r1-like-reasoning-update-1.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ---
-title: Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives) - Update 1
+title: Update 1 - Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)
 date: 2025-02-06
 ---

_posts/2025-02-07-r1-like-reasoning-update-2.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ---
-title: Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives) - Update 2
+title: Update 2 - Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)
 date: 2025-02-07
 ---

_posts/2025-02-17-r1-like-reasoning-update-3.md

Lines changed: 13 additions & 0 deletions
@@ -40,3 +40,16 @@ Inference-time scaling can be computationally expensive, requiring both the **tr
 In other words, **can we train the model to not only generate responses but also judge and refine its own drafts?** This could effectively **amortize** the cost of inference-time scaling by training the model to perform this reasoning process upfront.
 
 Moving forward, we’ll be focusing our efforts on testing this hypothesis and will continue sharing our findings. Stay tuned!
+
+---
+
+If you want to cite our work, you can use the following BibTeX entry of the original blog post.
+
+```bibtex
+@misc{srivastava2024lessonsonreproducing,
+title={Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)},
+author={Akash Srivastava, Isha Puri, Kai Xu, Shivchander Sudalairaj, Mustafa Eyceoz, Oleg Silkin, Abhishek Bhandwaldar, Aldo Genaro Pareja Cardona and GX Xu},
+url={https://red-hat-ai-innovation-team.github.io/posts/r1-like-reasoning},
+year={2025},
+}
+```
