Closed
Labels: bug (Something isn't working)
Description
Describe the bug
Local LLMs either raise a timeout error or fail to parse the output.
Ragas version: 0.1.15
Python version: 3.11.3
Code to Reproduce
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

# Load the evaluation data; a string in "contexts" is wrapped into a one-item list
import pandas as pd
df = pd.read_csv("output.csv", sep=";")
data_samples = {
    'question': df['question'].tolist(),
    'answer': df['answer'].tolist(),
    'contexts': df['contexts'].apply(lambda x: [x] if isinstance(x, str) else x).tolist(),
    'ground_truth': df['ground_truth'].tolist()
}

from datasets import Dataset
dataset = Dataset.from_dict(data_samples)

from ragas import evaluate
from ragas.metrics import (faithfulness,
                           answer_correctness,
                           answer_relevancy,
                           context_recall,
                           context_precision)
from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint
from langchain_community.chat_models.huggingface import ChatHuggingFace
from langchain_community.embeddings import HuggingFaceEmbeddings

# Judge LLM served through the Hugging Face Inference API, plus the embedding model
end = HuggingFaceEndpoint(repo_id="mistralai/Mistral-7B-Instruct-v0.3", max_new_tokens=512)
huggingface_llm = ChatHuggingFace(llm=end, tokenizer=tokenizer)
huggingface_embeddings = HuggingFaceEmbeddings(model_name="nomic-ai/nomic-embed-text-v1.5")

metrics = [faithfulness,
           answer_correctness,
           answer_relevancy,
           context_recall,
           context_precision]

# raise_exceptions=False: failed jobs are logged and scored as NaN instead of aborting
score = evaluate(dataset=dataset,
                 metrics=metrics,
                 llm=huggingface_llm,
                 embeddings=huggingface_embeddings,
                 raise_exceptions=False
                 )
Error trace
Failed to parse output. Returning None.
Failed to parse output. Returning None.
Failed to parse output. Returning None.
Exception raised in Job[304]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
__root__ -> 0 -> verdict
field required (type=value_error.missing))
Failed to parse output. Returning None.
Exception raised in Job[444]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
__root__ -> 0 -> verdict
field required (type=value_error.missing))
Exception raised in Job[169]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
Failed to parse output. Returning None.
Failed to parse output. Returning None..
Exception raised in Job[309]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
__root__ -> 0 -> verdict
field required (type=value_error.missing))
Failed to parse output. Returning None.
Exception raised in Job[174]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
__root__ -> 0 -> verdict
field required (type=value_error.missing))
Failed to parse output. Returning None.
Failed to parse output. Returning None.
Exception raised in Job[449]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
__root__ -> 0 -> verdict
field required (type=value_error.missing))
Failed to parse output. Returning None.
Exception raised in Job[179]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
__root__ -> 0 -> verdict
field required (type=value_error.missing))
Failed to parse output. Returning None.
Exception raised in Job[314]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
__root__ -> 0 -> verdict
field required (type=value_error.missing))
Failed to parse output. Returning None.
Failed to parse output. Returning None.
Exception raised in Job[184]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
__root__ -> 0 -> verdict
field required (type=value_error.missing))
Failed to parse output. Returning None.
Failed to parse output. Returning None.
Failed to parse output. Returning None.
Failed to parse output. Returning None.
Failed to parse output. Returning None.
Exception raised in Job[454]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
__root__ -> 0 -> verdict
field required (type=value_error.missing))
Failed to parse output. Returning None.
Exception raised in Job[461]: ClientResponseError(429, message='Too Many Requests', url=URL('https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3'))
Exception raised in Job[196]: ClientResponseError(429, message='Too Many Requests', url=URL('https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3'))
Exception raised in Job[462]: ClientResponseError(429, message='Too Many Requests', url=URL('https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3'))
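The trace mixes three kinds of failure: unparseable model output, pydantic validation errors on the parsed output, and HTTP 429 responses from the inference endpoint. To see how they are distributed, the log can be tallied with a small script. This is a minimal sketch using only the standard library; the function name `tally_failures` and the category labels are made up for illustration and are not part of Ragas.

```python
import re
from collections import Counter

# One pattern per kind of failure line seen in the trace.
# The category names are illustrative labels, not Ragas terminology.
PATTERNS = {
    "parse_failure": re.compile(r"^Failed to parse output"),
    "validation_error": re.compile(r"^Exception raised in Job\[\d+\]: ValidationError"),
    "rate_limited": re.compile(r"^Exception raised in Job\[\d+\]: ClientResponseError\(429"),
}


def tally_failures(log_text: str) -> Counter:
    """Count how many log lines fall into each failure category."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        for name, pattern in PATTERNS.items():
            if pattern.match(line):
                counts[name] += 1
                break  # a line belongs to at most one category
    return counts


# A short excerpt copied from the trace; paste the full log to tally everything.
sample = """\
Failed to parse output. Returning None.
Exception raised in Job[304]: ValidationError(2 validation errors for ContextPrecisionVerifications
__root__ -> 0 -> reason
field required (type=value_error.missing)
Exception raised in Job[461]: ClientResponseError(429, message='Too Many Requests', url=URL('https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3'))
"""

if __name__ == "__main__":
    for name, count in tally_failures(sample).items():
        print(f"{name}: {count}")
```

Separating the counts this way helps distinguish rate limiting by the endpoint from output-format problems with the model itself.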
Additional context
Only answer_correctness is evaluated; all other metric values are NaN.
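One way to confirm which metrics came back entirely empty is to scan the per-row scores for columns that contain only NaN. The sketch below assumes the scores are available as a plain dict mapping metric name to a list of values; the helper name `all_nan_metrics` is hypothetical, and the numbers are placeholders for illustration only, not results from this report.

```python
import math


def all_nan_metrics(scores):
    """Return the names of metrics whose every per-row score is NaN or None."""
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    return sorted(
        name
        for name, values in scores.items()
        if values and all(is_missing(v) for v in values)
    )


# Placeholder values for illustration only; these are NOT the reporter's results.
nan = float("nan")
example_scores = {
    "answer_correctness": [0.5, 0.25],
    "faithfulness": [nan, nan],
    "context_precision": [nan, nan],
}

if __name__ == "__main__":
    print(all_nan_metrics(example_scores))
```

With the placeholder data this lists `context_precision` and `faithfulness`, the two columns that hold no usable score.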