yay

Evan Frick · Evan Frick · commit eabe0cc99f76 · 2025-01-23T00:15:21.000-08:00
diff --git a/index.html b/index.html
@@ -54,13 +54,13 @@ <h1 class="header">About</h1>
     <section id="publications">
       <h1 class="header">Select Publications</h1>
       <div class="hero">
-        <b><a href="https://arxiv.org/abs/2410.14872">How to Evaluate Reward Models for RLHF [In Review]</a></b>
+        <b><a href="https://arxiv.org/abs/2410.14872">How to Evaluate Reward Models for RLHF [ICLR 2025]</a></b>
         <p>Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios N. Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. (2024).</p>
         <br>
         <b><a href="https://openreview.net/pdf?id=GqDntYTTbk">Starling-7B: Improving Helpfulness and Harmlessness with RLAIF. [COLM Spotlight]</a></b>
         <p>Banghua Zhu*, Evan Frick*, Tianhao Wu*, Hanlin Zhu, Karthik Ganesan, Wei-Lin Chiang, Jian Zhang, and Jiantao Jiao. (2024).</p>
         <br>
-        <b><a href="https://arxiv.org/abs/2406.11939">From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline. [In Review]</a></b>
+        <b><a href="https://arxiv.org/abs/2406.11939">From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline.</a></b>
         <p>Tianle Li*, Wei-Lin Chiang*, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, and Ion Stoica. (2024).</p>
         <br>
         <b><a href="https://nexusflow.ai/blogs/athene"> Athene-70B: Redefining the Boundaries of Post-Training for Open Models.</a></b>
@@ -71,7 +71,7 @@ <h1 class="header">Select Publications</h1>
     <section id="experience">
       <h1 class="header">Experience</h1>
       <div class="hero">
-        <b>Sky Lab</b> | Researcher |  May 24 - Present | Part-time
+        <b>Sky Lab</b> | Researcher |  Apr 24 - Present | Part-time
         <p>Working on Chatbot Arena and model evaluations.</p>
         <br>
         <b>Nexusflow</b> | Machine Learning Engineer | Nov 23 - Present | Full-time