Commit 84b6377

Add Unlimiformer paper summary for retrieval research. (#3167)
1 parent cebeb55 commit 84b6377

1 file changed: +14 -2 lines changed

docs/docs/research/retrieval.md

Lines changed: 14 additions & 2 deletions
@@ -113,11 +113,12 @@ index needs to be re-updated during training.
 - REALM: [https://arxiv.org/abs/2002.08909](https://arxiv.org/abs/2002.08909)
 - RAG: [https://arxiv.org/abs/2005.11401](https://arxiv.org/abs/2005.11401)
 - Atlas [https://arxiv.org/abs/2208.03299](https://arxiv.org/abs/2208.03299)
-- ...
+- Unlimiformer
+[http://arxiv.org/abs/2305.01625](http://arxiv.org/abs/2305.01625)
 
 ## Paper summaries
 
-### Borgeaud et al 2020.: Improving Language Models by Retrieving from Trillions of Tokens - "RETRO"
+### Borgeaud et al. 2020: Improving Language Models by Retrieving from Trillions of Tokens - "RETRO"
 
 Idea: Use BERT (Devlin et al. 2018) as a contextual encoder for chunks of size
 64 of the training data. Then train an encoder-decoder transformer model with
@@ -135,6 +136,17 @@ i.e. the 7B can utilize 40 nearest neighbor chunks, a 172M model only 10 NNs.
 
 [http://arxiv.org/abs/2112.04426](http://arxiv.org/abs/2112.04426)
 
+### Bertsch et al. 2023: Unlimiformer: Long-Range Transformers with Unlimited Length Input
+
+Idea: Use retrieval to actually maximize overlap of "query embeddings" with
+embeddings from an encoder (in an encoder-decoder architecture). Essentially it
+is an ideal approximation of the softmax in the Cross-Attention over all
+previous tokens (in the encoder inputs).
+
+Code:
+[https://github.com/abertsch72/unlimiformer](https://github.com/abertsch72/unlimiformer)
+Paper: [http://arxiv.org/abs/2305.01625](http://arxiv.org/abs/2305.01625)
+
 ### Izacard et al. 2022: Unsupervised Dense Information Retrieval with Contrastive Learning - "Contriever"
 
 They present Contriever, an open-source implementation of their novel approach to
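
The Unlimiformer summary added in this commit describes retrieval as an approximation of the cross-attention softmax over all encoder tokens. A minimal sketch of that idea, assuming plain dot-product scores and a brute-force top-k search, might look like the following. It is not the authors' implementation: the function names, the toy inputs, and the choice of `k` are all hypothetical, and a real system would replace the brute-force scoring with a nearest-neighbour index.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def full_attention(query, keys, values):
    """Exact cross-attention: softmax over every encoder position."""
    weights = softmax([dot(query, key) for key in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def topk_attention(query, keys, values, k):
    """Approximate cross-attention: softmax over only the k best-scoring keys.

    Assumption: brute-force scoring stands in for a nearest-neighbour index.
    """
    scores = [dot(query, key) for key in keys]
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in top])
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, top)) for d in range(dim)]


def make_toy_inputs(n_tokens=200, dim=8, seed=0):
    """Random keys and values plus a query aligned with one key (hypothetical data)."""
    rng = random.Random(seed)
    keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
    values = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
    query = [3.0 * x for x in keys[17]]
    return query, keys, values


def max_abs_error(a, b):
    """Largest per-coordinate difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    q, K, V = make_toy_inputs()
    exact = full_attention(q, K, V)
    for k in (1, 8, 32, len(K)):
        print(k, max_abs_error(topk_attention(q, K, V, k), exact))
```

Running the script prints the approximation error for several values of `k`. With `k` equal to the number of encoder tokens, the top-k variant reduces to exact attention; smaller `k` trades accuracy for attending to fewer tokens.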