mh-tang/Utility-Focused-LLM-Annotation
Utility-focused-annotation

[🎉 2025-09] MS MARCO and NQ annotations are released on Hugging Face.
[🎉 2025-08] Our paper "Utility-Focused LLM Annotation for Retrieval and Retrieval-Augmented Generation" has been accepted to the EMNLP 2025 main conference!

Overview

This repository contains the code, datasets, and models used in our paper: "Utility-Focused LLM Annotation for Retrieval and Retrieval-Augmented Generation".

This paper explores the use of large language models (LLMs) for annotating document utility in training retrieval and retrieval-augmented generation (RAG) systems, aiming to reduce dependence on costly human annotations. We address the gap between retrieval relevance and generative utility by employing LLMs to annotate document utility. To effectively utilize multiple positive samples per query, we introduce a novel loss that maximizes their summed marginal likelihood. Using the Qwen-2.5-32B model, we annotate utility on the MS MARCO dataset and conduct retrieval experiments on MS MARCO and BEIR, as well as RAG experiments on MS MARCO QA, NQ, and HotpotQA. Our results show that LLM-generated annotations enhance out-of-domain retrieval performance and improve RAG outcomes compared to models trained solely on human annotations or downstream QA metrics. Furthermore, combining LLM annotations with just 20% of human labels achieves performance comparable to using full human annotations. Our study offers a comprehensive approach to utilizing LLM annotations for initializing QA systems on new corpora.
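The multi-positive loss described above, which maximizes the summed marginal likelihood of all positive documents for a query, can be sketched as follows. This is an illustrative re-derivation, not the repository's implementation; the function name and the temperature parameter are assumptions.

```python
import math

def multi_positive_nll(scores, positive_idx, temperature=1.0):
    """Negative log of the summed marginal likelihood of the positives.

    scores: similarity scores between one query and its candidate
        documents (positives plus hard negatives).
    positive_idx: indices of the positive documents within `scores`.

    With a single positive this reduces to the standard softmax
    cross-entropy used in contrastive retriever training.
    """
    exp_scores = [math.exp(s / temperature) for s in scores]
    denom = sum(exp_scores)                            # mass of all candidates
    pos_mass = sum(exp_scores[i] for i in positive_idx)  # summed mass of positives
    return -math.log(pos_mass / denom)
```

Compared with averaging per-positive cross-entropy terms, summing the positives' probability mass before taking the log lets the model satisfy the objective by ranking any subset of the positives highly, rather than forcing equal probability on each one.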

Download dataset

We use in-domain settings (MS MARCO v1 / TREC-DL and NQ) and an out-of-domain setting (BEIR) for both the retrieval and RAG tasks.

LLMs Annotations

We use the hard negative samples provided by the official Tevatron repository. The prompts used in our paper are listed in prompts.md.

Retrievers Training

We use RetroMAE and Contriever as our retriever backbones; pre-trained checkpoints are available from RetroMAE (pre-trained on MS MARCO Passage) and Contriever.

```shell
sh run.sh
```

Checkpoint

Currently, all the LLM-annotated positive labels and model checkpoints are available on Hugging Face.
