Commit 9389ca5

simplify repoqa html
1 parent 2fe3071 commit 9389ca5

1 file changed

+11
-28
lines changed

repoqa.html

Lines changed: 11 additions & 28 deletions
@@ -151,7 +151,12 @@ <h1 class="text-nowrap mt-5">
 🚩The <i>First</i> Benchmark for Long-Context Code Understanding.🚩<br />
 </div>
 <div class="d-flex flex-row justify-content-center gap-3">
-<a href="https://github.com/evalplus/repoqa"
+<a href="https://arxiv.org/abs/2406.06025"
+><img
+src="https://img.shields.io/badge/arXiv-2406.06025-b31b1b.svg?style=for-the-badge"
+alt="arxiv"
+class="img-fluid" /></a
+><a href="https://github.com/evalplus/repoqa"
 ><img
 src="https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white"
 alt="github"
@@ -167,9 +172,8 @@ <h1 class="text-nowrap mt-5">
 <div class="container-fluid d-flex flex-row flex-nowrap">
 <div class="container-fluid d-flex flex-column align-items-center">
 <p>
-<b>🔊 The goal of RepoQA:</b> is to create a series of long-context
-code understanding tasks to challenge chat/instruction models for
-code:
+RepoQA aims to create a series of long-context code understanding tasks
+to challenge chat/instruction models for code:
 </p>
 <ul>
 <li>
@@ -299,17 +303,13 @@ <h2 id="faq" class="text-nowrap mt-5">🙋🏻‍♀️ FAQ</h2>
 <h3 id="yet-another" class="text-nowrap mt-5">
 Just yet another needle test?
 </h3>
-No. Here are some notes:
 <ul>
 <li>
-<b>SNF != RepoQA, SNF \in RepoQA:</b> Yes, SNF is a variant of
-needle test, but SNF != RepoQA. SNF is a start point and
-elementary test:
+SNF is a variant of needle test and is part of RepoQA as the elementary test:
 <b
 >if a model can't pass SNF, don't expect it to pass more
 challenging tasks.</b
 >
-We will build more challenging tasks in the future.
 </li>
 <li>
 Unlike vanilla needle tests which use single test to perform fully
@@ -339,25 +339,8 @@ <h3 id="limit" class="text-nowrap mt-5">Known limitations</h3>
 </ul>
 <h2 id="sponsor" class="text-nowrap mt-5">🤗 Acknowledgment</h2>
 <p>
-Running long-context evaluations can be costly -- we thank
-<a href="https://deepmind.google/">Google DeepMind</a>
-and
-<a href="https://openai.com/form/researcher-access-program/"
->OpenAI Researcher Access Program</a
->
-for their generous API credits!
-</p>
-<p>
-Meanwhile, note that RepoQA is a transparent research project
-started by students at UIUC. We assure the reproducibility and
-fairness of the evaluation as well as the indenpendence of our
-benchmark design that none of these will be optimized or compromised
-for models from specific organizations. The outputs and results of
-evaluated models can be found at our
-<a
-href="https://github.com/evalplus/repoqa/releases/tag/dev-results"
->GitHub release page</a
->.
+Part of the compute is generously provided by <a href="https://deepmind.google/">Google DeepMind</a>
+and <a href="https://wandb.ai/site">Weights & Biases</a>.
 </p>
 </div>
 </div>
