Hello, I'm trying to run MTEB on a cluster without internet access, but I am struggling. Here are the steps I've followed:
$ pip install mteb
$ pip install --upgrade git+https://github.com/Muennighoff/mteb.git@offlineaccess
- Install banking77 dataset from Hugging Face:
$ git clone [email protected]:datasets/mteb/banking77
I then run:
import torch
from llm2vec import LLM2Vec
from mteb import MTEB
l2v = LLM2Vec.from_pretrained(
    "<path to model>/llama3-llm2vec",
    peft_model_name_or_path="<path to model>/llama3-llm2vec-unsup",
    attn_implementation="flash_attention_2",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)
MODEL_NAME = "llama3-llm2vec"
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(l2v, output_folder=f"results/{MODEL_NAME}")
with the following environment variables set in my .bashrc:
export HF_HOME="<path to model>/cache/"
export HF_DATASETS_CACHE="<path to data>/big_data/hf/"
export TRANSFORMERS_CACHE="<path to model>/cache/"
export HF_HUB_OFFLINE=1 # 1 means offline.
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
But I get this error:
Error while evaluating Banking77Classification: Couldn't find a dataset script at <path to data>/mteb/banking77/banking77.py or any data file in the same directory.
I'm not sure where to obtain the banking77.py script, or whether I am following the offline evaluation instructions correctly. Any help would be immensely appreciated.
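In case it helps narrow this down, here is a minimal sketch I can run on the cluster to see what the clone actually contains, since the error mentions both a missing script and missing data files. It rests on one assumption I have not confirmed for this repository: that the data files are stored with Git LFS, in which case a clone made without git-lfs would leave small pointer stubs in place of the real files. The names `inspect_clone` and `is_lfs_pointer` and the suffix list are my own illustrative choices, not part of mteb or datasets.

```python
from pathlib import Path

# Suffixes treated as data files here (an assumed, non-exhaustive list).
DATA_SUFFIXES = {".json", ".jsonl", ".csv", ".parquet", ".arrow", ".gz"}

# Every Git LFS pointer file begins with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file looks like an unfetched Git LFS pointer stub."""
    with Path(path).open("rb") as fh:
        return fh.read(len(LFS_HEADER)) == LFS_HEADER


def inspect_clone(root):
    """Sort the files of a cloned dataset directory into scripts, real data
    files, and LFS pointer stubs. Anything under .git is ignored."""
    root = Path(root)
    report = {"scripts": [], "data_files": [], "lfs_pointers": []}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or ".git" in rel.parts:
            continue
        if path.suffix == ".py":
            report["scripts"].append(rel.as_posix())
        elif path.suffix in DATA_SUFFIXES:
            key = "lfs_pointers" if is_lfs_pointer(path) else "data_files"
            report[key].append(rel.as_posix())
    return report
```

Calling `inspect_clone("<path to data>/mteb/banking77")` and printing the result would show whether the directory holds any loader script, any real data files, or only pointer stubs. It would not explain how mteb resolves the dataset path while offline.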