Offline Evaluations on MTEB #11

Hello, I'm trying to run MTEB on a cluster without internet access, but I am struggling. Here are the steps I've followed:

1. `$ pip install mteb`
2. `$ pip install --upgrade git+https://github.com/Muennighoff/mteb.git@offlineaccess`
3. Clone the banking77 dataset from Hugging Face: `$ git clone git@hf.co:datasets/mteb/banking77`

I then run:

```python
import torch
from llm2vec import LLM2Vec
from mteb import MTEB


l2v = LLM2Vec.from_pretrained(
    "<path to model>/llama3-llm2vec",
    peft_model_name_or_path="<path to model>/llama3-llm2vec-unsup",
    attn_implementation="flash_attention_2",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)


MODEL_NAME = "llama3-llm2vec"
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(l2v, output_folder=f"results/{MODEL_NAME}")
```

The following environment variables are set in my `.bashrc`:

```shell
export HF_HOME="<path to model>/cache/"
export HF_DATASETS_CACHE="<path to data>/big_data/hf/"
export TRANSFORMERS_CACHE="<path to model>/cache/"
export HF_HUB_OFFLINE=1 # 1 means offline.
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
```
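
Since a `.bashrc` is not always sourced by non-interactive cluster jobs, it may be worth confirming that these variables are actually visible to the Python process. The following is only a minimal sketch: `summarize_hf_env` is a hypothetical helper name, and it treats only the value `1` as "offline", which matches the snippet above but is stricter than what the Hugging Face libraries may accept.

```python
import os

# Variable names are taken from the .bashrc snippet above.
OFFLINE_FLAGS = ("HF_HUB_OFFLINE", "HF_DATASETS_OFFLINE", "TRANSFORMERS_OFFLINE")
CACHE_VARS = ("HF_HOME", "HF_DATASETS_CACHE", "TRANSFORMERS_CACHE")


def summarize_hf_env(env=None):
    """Report which offline flags are set to "1" and where the caches point.

    `env` defaults to the current process environment; passing a plain dict
    makes the function easy to check in isolation.
    """
    env = os.environ if env is None else env
    # Assumption: only the literal string "1" counts as enabled here.
    flags = {name: env.get(name) == "1" for name in OFFLINE_FLAGS}
    # Unset cache variables are reported as None.
    caches = {name: env.get(name) for name in CACHE_VARS}
    return {"offline_flags": flags, "cache_paths": caches}


if __name__ == "__main__":
    summary = summarize_hf_env()
    for name, enabled in summary["offline_flags"].items():
        print(f"{name}: {'on' if enabled else 'NOT set to 1'}")
    for name, path in summary["cache_paths"].items():
        print(f"{name}: {path}")
```

Running this from inside the same job that launches the evaluation shows whether the offline flags and cache paths reach the process at all.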

But I get this error:

`Error while evaluating Banking77Classification: Couldn't find a dataset script at <path to data>/mteb/banking77/banking77.py or any data file in the same directory.`

I'm not sure where to obtain the script `banking77.py`, or whether I am following the offline evaluation instructions correctly. Any help would be greatly appreciated.
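
The error message suggests the loader looks for either a `banking77.py` script or data files in the dataset directory, so a rough check of what the cloned directory actually contains may help narrow this down. This is a sketch under assumptions: `inspect_dataset_dir` is a hypothetical helper, and `DATA_EXTENSIONS` is a guess at common data-file types, not the exact list the `datasets` library uses.

```python
from pathlib import Path

# Assumed set of data-file extensions; not the exact list used by `datasets`.
DATA_EXTENSIONS = {".json", ".jsonl", ".csv", ".parquet", ".arrow", ".txt"}


def inspect_dataset_dir(dataset_dir):
    """Report whether a local dataset directory has a loading script or data files."""
    root = Path(dataset_dir)
    if not root.is_dir():
        return {"exists": False, "has_script": False, "data_files": []}

    # The error message names a script called <directory name>.py,
    # e.g. banking77/banking77.py.
    script = root / f"{root.name}.py"

    data_files = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in DATA_EXTENSIONS
        and ".git" not in p.relative_to(root).parts  # skip git internals
    )
    return {
        "exists": True,
        "has_script": script.is_file(),
        "data_files": data_files,
    }
```

Calling `inspect_dataset_dir("<path to data>/mteb/banking77")` with the path from the error message would show whether the directory exists at that location and whether any data files ended up in it.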
