
Commit d8ef452

Evan Frick authored and committed
update
1 parent 3aaa4eb commit d8ef452

File tree

1 file changed: +6 −3 lines changed


index.html

Lines changed: 6 additions & 3 deletions
@@ -45,7 +45,7 @@ <h1 class="header">About</h1>
   <br>
   <p>My research focuses on Reinforcement Learning with Human Feedback (RLHF) for fine-tuning LLMs. Currently, much of my efforts revolve around reward model training and benchmarking.</p>
   <br>
-  <p>I am also a Research Engineer at <a href="https://nexusflow.ai/">Nexusflow</a>, where I work on training LLMs like <a href="https://huggingface.co/Nexusflow/Athene-70B">Athene-70B</a>. I also work with <a href="https://lmsys.org/">LMSYS</a>, mainly on analyzing <a href="https://lmarena.ai/">Chatbot Arena</a> and building LLM benchmarks.</p>
+  <p>I am also a Research Engineer at <a href="https://nexusflow.ai/">Nexusflow</a>, where I work on training LLMs like <a href="https://huggingface.co/Nexusflow/Athene-70B">Athene-70B</a>. I also work with <a href="https://blog.lmarena.ai/about/">Chatbot Arena</a>, mainly on modeling human preferences and building LLM/RM benchmarks.</p>
   <!-- </div> -->
   </div>
   </div>
@@ -54,10 +54,13 @@ <h1 class="header">About</h1>
   <section id="publications">
   <h1 class="header">Select Publications</h1>
   <div class="hero">
-  <b><a href="https://openreview.net/pdf?id=GqDntYTTbk">Starling-7B: Improving Helpfulness and Harmlessness with RLAIF. [COLM]</a></b>
+  <b><a href="https://arxiv.org/abs/2410.14872">How to Evaluate Reward Models for RLHF [In Review]</a></b>
+  <p>Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios N. Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. (2024).</p>
+  <br>
+  <b><a href="https://openreview.net/pdf?id=GqDntYTTbk">Starling-7B: Improving Helpfulness and Harmlessness with RLAIF. [COLM Spotlight]</a></b>
   <p>Banghua Zhu, Evan Frick, Tianhao Wu, Hanlin Zhu, Karthik Ganesan, Wei-Lin Chiang, Jian Zhang, and Jiantao Jiao. (2024).</p>
   <br>
-  <b><a href="https://arxiv.org/abs/2406.11939">From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline. [Neurips: In Review]</a></b>
+  <b><a href="https://arxiv.org/abs/2406.11939">From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline. [In Review]</a></b>
   <p>Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. (2024).</p>
   <br>
   <b><a href="https://nexusflow.ai/blogs/athene"> Athene-70B: Redefining the Boundaries of Post-Training for Open Models.</a></b>
