This document contains the results of the Model2Vec project.

- [Training Results](#training-results)
- [Ablations](#ablations)

## MTEB Results (English)

Model2Vec is evaluated on MTEB, as well as two additional tasks: [PEARL](https://github.com/tigerchen52/PEARL) (a phrase representation task) and WordSim (a collection of _word_ similarity tasks). The results are shown in the table below.
|*Figure: The average MTEB score plotted against sentences per second. The circle size indicates model size.*|
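For context on how such numbers are produced, the snippet below is a minimal sketch of running an MTEB evaluation with a Model2Vec model via the `mteb` package. The model name and the two tasks are illustrative choices, not the exact benchmark configuration used for the results above, and depending on the `mteb` version a thin wrapper around `StaticModel.encode` may be needed.

```python
# Minimal sketch: evaluating a Model2Vec static model on a couple of MTEB tasks.
# The model and task selection below are illustrative, not the full benchmark.
import mteb
from model2vec import StaticModel

# Load a distilled static embedding model from the Hugging Face Hub.
model = StaticModel.from_pretrained("minishlab/potion-base-8M")

# Pick a small subset of tasks to keep the example fast.
tasks = mteb.get_tasks(tasks=["Banking77Classification", "STSBenchmark"])
evaluation = mteb.MTEB(tasks=tasks)

# MTEB only needs an object with an `encode` method, which StaticModel provides.
results = evaluation.run(model, output_folder="results/potion-base-8M")
print(results)
```

PEARL and WordSim are additional task collections evaluated separately from the standard MTEB suite.
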
### MMTEB Results (Multilingual)

The results for the multilingual models are shown in the table below. We compare against the [LaBSE](https://huggingface.co/sentence-transformers/LaBSE) model, as well as other multilingual static embedding models.

| Model | Mean (Task) | Mean (TaskType) | BitMining | Class | Clust | InstRet | MultiClass | PairClass | Rank | Ret | STS |
|---|---|---|---|---|---|---|---|---|---|---|---|

As can be seen, [potion-multilingual-128M](https://huggingface.co/minishlab/potion-multilingual-128M) is the most performant static multilingual model, reaching 90.86% of the performance of [LaBSE](https://huggingface.co/sentence-transformers/LaBSE). There are differences per task: the [static-similarity-mrl-multilingual-v1](https://huggingface.co/sentence-transformers/static-similarity-mrl-multilingual-v1) model is better for retrieval and STS tasks (which can be explained by the fact that it is trained for STS), while the [potion-multilingual-128M](https://huggingface.co/minishlab/potion-multilingual-128M) model is better for classification and clustering tasks. Note that [potion-multilingual-128M](https://huggingface.co/minishlab/potion-multilingual-128M) supports a total of 101 languages, while [static-similarity-mrl-multilingual-v1](https://huggingface.co/sentence-transformers/static-similarity-mrl-multilingual-v1) supports only 50. Also note that MMTEB does not include tasks for every language, so there may be a bias towards larger languages.
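As a quick usage illustration (separate from the benchmark itself), the multilingual model can be loaded like any other Model2Vec model; the sentences and similarity check below are made up for the example.

```python
# Minimal sketch: encoding sentences in different languages with the
# multilingual model and comparing them. Example sentences are illustrative.
import numpy as np
from model2vec import StaticModel

model = StaticModel.from_pretrained("minishlab/potion-multilingual-128M")

sentences = [
    "The weather is nice today.",  # en
    "Het weer is mooi vandaag.",   # nl
    "今日はいい天気ですね。",        # ja
]
embeddings = model.encode(sentences)

# Cosine similarity of each translation against the English sentence.
normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
print(normed[1:] @ normed[0])
```
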
<details>
<summary> Task Abbreviations </summary>

For readability, the MMTEB task names are abbreviated as follows:

- BitMining: Bitext Mining
- Class: Classification
- Clust: Clustering
- InstRet: Instruction Retrieval
- MultiClass: Multilabel Classification
- PairClass: PairClassification
- Rank: Reranking
- Ret: Retrieval
- STS: Semantic Textual Similarity

</details>

<details>
<summary> Supported Languages </summary>

The languages supported by the [potion-multilingual-128M](https://huggingface.co/minishlab/potion-multilingual-128M) model are:

- en
- multilingual
- af
- am
- ar
- az
- be
- bg
- bn
- ca
- ceb
- co
- cs
- cy
- da
- de
- el
- eo
- es
- et
- eu
- fa
- fi
- fil
- fr
- fy
- ga
- gd
- gl
- gu
- ha
- haw
- hi
- hmn
- ht
- hu
- hy
- id
- ig
- is
- it
- iw
- ja
- jv
- ka
- kk
- km
- kn
- ko
- ku
- ky
- la
- lb
- lo
- lt
- lv
- mg
- mi
- mk
- ml
- mn
- mr
- ms
- mt
- my
- ne
- nl
- no
- ny
- pa
- pl
- ps
- pt
- ro
- ru
- sd
- si
- sk
- sl
- sm
- sn
- so
- sq
- sr
- st
- su
- sv
- sw
- ta
- te
- tg
- th
- tr
- uk
- ur
- uz
- vi
- xh
- yi
- yo
- zh
- zu

</details>

### Retrieval Results

Some of the models we created and compare against are specifically designed for retrieval tasks. The results are shown in the table below, including two general-purpose models and a transformer for comparison.
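As a usage illustration, a retrieval-oriented static model can be used for simple semantic search by ranking documents by cosine similarity to a query. The model name, query, and corpus below are illustrative; this is a sketch, not the benchmark setup.

```python
# Minimal sketch: nearest-neighbour retrieval with a retrieval-oriented
# static model. Model name, query, and corpus are illustrative.
import numpy as np
from model2vec import StaticModel

model = StaticModel.from_pretrained("minishlab/potion-retrieval-32M")

corpus = [
    "Model2Vec distills a sentence transformer into a static embedding model.",
    "Static embeddings make CPU inference very fast.",
    "The capital of France is Paris.",
]
query = "How does Model2Vec speed up inference?"

corpus_emb = model.encode(corpus)
query_emb = model.encode([query])[0]

# Rank documents by cosine similarity to the query.
corpus_emb = corpus_emb / np.linalg.norm(corpus_emb, axis=1, keepdims=True)
query_emb = query_emb / np.linalg.norm(query_emb)
scores = corpus_emb @ query_emb
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {corpus[idx]}")
```
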