index.html (3 additions, 3 deletions)
@@ -54,13 +54,13 @@ <h1 class="header">About</h1>
 <section id="publications">
 <h1 class="header">Select Publications</h1>
 <div class="hero">
-<b><a href="https://arxiv.org/abs/2410.14872">How to Evaluate Reward Models for RLHF [In Review]</a></b>
+<b><a href="https://arxiv.org/abs/2410.14872">How to Evaluate Reward Models for RLHF [ICLR 2025]</a></b>
 <p>Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios N. Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. (2024).</p>
 <br>
 <b><a href="https://openreview.net/pdf?id=GqDntYTTbk">Starling-7B: Improving Helpfulness and Harmlessness with RLAIF. [COLM Spotlight]</a></b>
-<b><a href="https://arxiv.org/abs/2406.11939">From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline. [In Review]</a></b>
+<b><a href="https://arxiv.org/abs/2406.11939">From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline.</a></b>
 <p>Tianle Li*, Wei-Lin Chiang*, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. (2024).</p>
 <br>
 <b><a href="https://nexusflow.ai/blogs/athene"> Athene-70B: Redefining the Boundaries of Post-Training for Open Models.</a></b>