Description
Describe the bug
I tried using RAGAS with a model that is not from OpenAI. No matter which model I use, I get this error back:
File /opt/conda/lib/python3.10/site-packages/ragas/evaluation.py:237, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, is_async, run_config, raise_exceptions, column_map)
235 results = executor.results()
236 if results == []:
--> 237 raise ExceptionInRunner()
239 # convert results to dataset_like
240 for i, _ in enumerate(dataset):
ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass raise_exceptions=False incase you want to show only a warning message instead.
/opt/conda/lib/python3.10/site-packages/ipykernel/iostream.py:123: RuntimeWarning: coroutine 'as_completed.<locals>.sema_coro' was never awaited
await self._event_pipe_gc()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
I worked around that error with:
import nest_asyncio
nest_asyncio.apply()
With that in place it no longer raises an error, but the evaluation now returns:
{'faithfulness': nan, 'answer_relevancy': nan, 'context_utilization': nan}
Code to Reproduce
import pandas as pd
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
from sentence_transformers import SentenceTransformer
from langchain import HuggingFacePipeline
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_recall,
    context_precision,
    context_utilization,
)
from ragas import evaluate
from datasets import Dataset
import nest_asyncio

nest_asyncio.apply()

# embedding model
embedding_model = SentenceTransformer("microsoft/mpnet-base")

# evaluator LLM
model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

device = 0  # Use GPU (0 is typically the first GPU device)
pipe = pipeline(
    model=model,
    tokenizer=tokenizer,
    device=device,
    return_full_text=True,   # langchain expects the full text
    task='text-generation',
    temperature=0.1,
    do_sample=True,
    max_new_tokens=200,
    repetition_penalty=1.1,  # without this the output begins repeating
)
evaluator = HuggingFacePipeline(pipeline=pipe)

data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'contexts': [
        ['The First AFL–NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,'],
        ['The Green Bay Packers...Green Bay, Wisconsin.', 'The Packers compete...Football Conference'],
    ],
}
dataset = Dataset.from_dict(data_samples)

# ragas evaluation
result = evaluate(
    dataset=dataset,
    llm=evaluator,
    embeddings=embedding_model,
    raise_exceptions=False,
    metrics=[
        faithfulness,
        answer_relevancy,
        context_utilization,
    ],
)
print(result)
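One thing I am not sure about is whether evaluate() accepts a raw SentenceTransformer for the embeddings argument. A variant I could imagine trying (assuming a LangChain embeddings object such as HuggingFaceEmbeddings is what ragas expects here, which I have not verified) would be:

from langchain.embeddings import HuggingFaceEmbeddings

# Hypothetical variant: wrap the embedding model in a LangChain embeddings
# object instead of passing a raw SentenceTransformer (assumption, not verified).
embedding_model = HuggingFaceEmbeddings(model_name="microsoft/mpnet-base")

result = evaluate(
    dataset=dataset,
    llm=evaluator,
    embeddings=embedding_model,
    raise_exceptions=False,
    metrics=[faithfulness, answer_relevancy, context_utilization],
)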
Error trace
No error is raised; the metrics simply come back as NaN.
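To surface whatever exception the runner is swallowing, the same call could be re-run with raise_exceptions=True (the flag mentioned in the traceback above); a minimal sketch:

# Same evaluation, but let the runner re-raise the underlying exception
# instead of silently producing NaN values.
result = evaluate(
    dataset=dataset,
    llm=evaluator,
    embeddings=embedding_model,
    raise_exceptions=True,
    metrics=[faithfulness, answer_relevancy, context_utilization],
)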
Expected behavior
It should return numeric scores for the selected metrics instead of NaN.
Thank you very much for your help