Local LLM with Ragas evaluation issue #1100

@SalwaMostafa

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug
I am trying to use a local LLM, imported from LangChain, in the evaluate function, but it fails with the error below and I do not understand what I should do. Should I use these wrappers?

"langchain_llm = LangchainLLMWrapper(langchain_llm)
langchain_embeddings = LangchainEmbeddingsWrapper(langchain_embeddings)"

Evaluating: 0%| | 0/2 [00:00<?, ?it/s]GGML_ASSERT: /run/nvme/job_3577603/tmp/pip-install-aob_qv9b/llama-cpp-python_08d8f9c1210e4408948636399a5c41c8/vendor/llama.cpp/ggml/src/ggml.c:6142: mask->ne[0] == a->ne[0]
[New LWP 3089486]
[New LWP 3089488]
[New LWP 3089489]
[New LWP 3089490]
[New LWP 3089491]
[New LWP 3089492]
[New LWP 3089493]
[New LWP 3089494]
[New LWP 3089495]
[New LWP 3089509]
[New LWP 3089510]
[New LWP 3089512]
[New LWP 3089513]
[New LWP 3089514]
[New LWP 3089515]
[New LWP 3089516]
[New LWP 3089517]
[New LWP 3089518]
[New LWP 3089522]
[New LWP 3089523]
[New LWP 3089534]
[New LWP 3089537]
[New LWP 3089538]
[New LWP 3089539]
[New LWP 3089540]
[New LWP 3089541]
[New LWP 3089542]
[New LWP 3089544]
[New LWP 3089545]
[New LWP 3089546]
[New LWP 3089547]
[New LWP 3089548]
[New LWP 3089549]
[New LWP 3089550]
[New LWP 3089551]
[New LWP 3089552]
[New LWP 3089553]
[New LWP 3089554]
[New LWP 3089555]
[New LWP 3089556]
[New LWP 3089557]
[New LWP 3089558]
[New LWP 3089559]
[New LWP 3089560]
[New LWP 3089561]
[New LWP 3089562]
[New LWP 3089563]
[New LWP 3089564]
[New LWP 3089565]
[New LWP 3089566]
[New LWP 3089567]
[New LWP 3089568]
[New LWP 3089569]
[New LWP 3089570]
[New LWP 3089571]
[New LWP 3089572]
[New LWP 3089573]
[New LWP 3089574]
[New LWP 3089575]
[New LWP 3089576]
[New LWP 3089577]
[New LWP 3089578]
[New LWP 3089579]
[New LWP 3089580]
[New LWP 3089581]
[New LWP 3089582]
[New LWP 3089583]
[New LWP 3089584]
[New LWP 3089585]
[New LWP 3089586]
[New LWP 3089587]
[New LWP 3089588]
[New LWP 3089589]
[New LWP 3089590]
[New LWP 3089591]
[New LWP 3089592]
[New LWP 3089593]
[New LWP 3089594]
[New LWP 3089595]
[New LWP 3089596]
[New LWP 3089597]
[New LWP 3089598]
[New LWP 3089599]
[New LWP 3089600]
[New LWP 3089601]
[New LWP 3089602]
[New LWP 3089603]
[New LWP 3089604]
[New LWP 3089605]
[New LWP 3089606]
[New LWP 3089607]
[New LWP 3089608]
[New LWP 3089609]
[New LWP 3089610]
[New LWP 3089611]
[New LWP 3089612]
[New LWP 3089613]
[New LWP 3089614]
[New LWP 3089615]
[New LWP 3089616]
[New LWP 3089617]
[New LWP 3089618]
[New LWP 3089619]
[New LWP 3089620]
[New LWP 3089621]
[New LWP 3089622]
[New LWP 3089623]
[New LWP 3089624]
[New LWP 3089625]
[New LWP 3089626]
[New LWP 3089627]
[New LWP 3089628]
[New LWP 3089629]
[New LWP 3089630]
[New LWP 3089631]
[New LWP 3089632]
[New LWP 3089633]
[New LWP 3089634]
[New LWP 3089635]
[New LWP 3089636]
[New LWP 3089637]
[New LWP 3089638]
[New LWP 3089639]
[New LWP 3089640]
[New LWP 3089641]
[New LWP 3089642]
[New LWP 3089643]
[New LWP 3089644]
[New LWP 3089645]
[New LWP 3089646]
[New LWP 3089647]
[New LWP 3089648]
[New LWP 3089649]
[New LWP 3089650]
[New LWP 3089651]
[New LWP 3089652]
[New LWP 3089653]
[New LWP 3089654]
[New LWP 3089655]
[New LWP 3089656]
[New LWP 3089657]
[New LWP 3089658]
[New LWP 3089659]
[New LWP 3089660]
[New LWP 3089661]
[New LWP 3089662]
[New LWP 3089663]
[New LWP 3089664]
[New LWP 3089665]
[New LWP 3089666]
[New LWP 3089667]
[New LWP 3089668]
[New LWP 3089669]
[New LWP 3089670]
[New LWP 3089671]
[New LWP 3089672]
[New LWP 3089673]
[New LWP 3089674]
[New LWP 3089675]
[New LWP 3089676]
[New LWP 3089677]
[New LWP 3089678]
[New LWP 3089679]
[New LWP 3089680]
[New LWP 3089681]
[New LWP 3089682]
[New LWP 3089683]
[New LWP 3089684]
[New LWP 3089685]
[New LWP 3089686]
[New LWP 3089687]
[New LWP 3089688]
[New LWP 3089689]
[New LWP 3089690]
[New LWP 3089691]
[New LWP 3089692]
[New LWP 3089693]
[New LWP 3089694]
[New LWP 3089695]
[New LWP 3089696]
[New LWP 3089697]
[New LWP 3089698]
[New LWP 3089699]
[New LWP 3089700]
[New LWP 3089701]
[New LWP 3089702]
[New LWP 3089703]
[New LWP 3089704]
[New LWP 3089705]
[New LWP 3089706]
[New LWP 3089707]
[New LWP 3089708]
[New LWP 3089709]
[New LWP 3089710]
[New LWP 3089711]
[New LWP 3089712]
[New LWP 3089713]
[New LWP 3089714]
[New LWP 3089715]
[New LWP 3089716]
[New LWP 3089717]
[New LWP 3089718]
[New LWP 3089719]
[New LWP 3089720]
[New LWP 3089721]
[New LWP 3089722]
[New LWP 3089723]
[New LWP 3089724]
[New LWP 3089725]
[New LWP 3089726]
[New LWP 3089727]
[New LWP 3089728]
[New LWP 3089729]
[New LWP 3089730]
[New LWP 3089731]
[New LWP 3089732]
[New LWP 3089733]
[New LWP 3089734]
[New LWP 3089735]
[New LWP 3089736]
[New LWP 3089737]
[New LWP 3089738]
[New LWP 3089739]
[New LWP 3089740]
[New LWP 3089741]
[New LWP 3089742]
[New LWP 3089743]
[New LWP 3089744]
[New LWP 3089745]
[New LWP 3089746]
[New LWP 3089747]
[New LWP 3089748]
[New LWP 3089749]
[New LWP 3089750]
[New LWP 3089751]
[New LWP 3089752]
[New LWP 3089753]
[New LWP 3089754]
[New LWP 3089755]
[New LWP 3089756]
[New LWP 3089757]
[New LWP 3089758]
[New LWP 3089759]
[New LWP 3089760]
[New LWP 3089761]
[New LWP 3089762]
[New LWP 3089763]
[New LWP 3089764]
[New LWP 3089765]
[New LWP 3089766]
[New LWP 3089767]
[New LWP 3089768]
[New LWP 3089769]
[New LWP 3089770]
[New LWP 3089771]
[New LWP 3089772]
[New LWP 3089773]
[New LWP 3089774]
[New LWP 3089775]
[New LWP 3089776]
[New LWP 3089777]
[New LWP 3089778]
[New LWP 3089779]
[New LWP 3089780]
[New LWP 3089781]
[New LWP 3089782]
[New LWP 3089783]
[New LWP 3089784]
[New LWP 3089785]
[New LWP 3089786]
[New LWP 3089787]
[New LWP 3089788]
[New LWP 3089789]
[New LWP 3089790]
[New LWP 3089791]
[New LWP 3089792]
[New LWP 3089793]
[New LWP 3089794]
[New LWP 3089795]
[New LWP 3089796]
[New LWP 3089797]
[New LWP 3089798]
[New LWP 3089799]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fffbf2b8da6 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#0 0x00007fffbf2b8da6 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1 0x00007fffbf2b8e98 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2 0x00007fffbf89c678 in PyThread_acquire_lock_timed () from /lib64/libpython3.9.so.1.0
#3 0x00007fffbf89cdd9 in lock_PyThread_acquire_lock () from /lib64/libpython3.9.so.1.0
#4 0x00007fffbf8aef4f in method_vectorcall_VARARGS_KEYWORDS () from /lib64/libpython3.9.so.1.0
#5 0x00007fffbf8f8248 in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
#6 0x00007fffbf8bc465 in _PyFunction_Vectorcall () from /lib64/libpython3.9.so.1.0
#7 0x00007fffbf8f8248 in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
#8 0x00007fffbf8bc465 in _PyFunction_Vectorcall () from /lib64/libpython3.9.so.1.0
#9 0x00007fffbf8f8248 in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
#10 0x00007fffbf87c693 in function_code_fastcall () from /lib64/libpython3.9.so.1.0
#11 0x00007fffbf8bc0ea in _PyFunction_Vectorcall () from /lib64/libpython3.9.so.1.0
#12 0x00007fffbf8f8248 in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
#13 0x00007fffbf8bc465 in _PyFunction_Vectorcall () from /lib64/libpython3.9.so.1.0
#14 0x00007fffbf8f8dfd in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
#15 0x00007fffbf8b8713 in _PyEval_EvalCode () from /lib64/libpython3.9.so.1.0
#16 0x00007fffbf8b970f in _PyEval_EvalCodeWithName () from /lib64/libpython3.9.so.1.0
#17 0x00007fffbf8b9743 in PyEval_EvalCode () from /lib64/libpython3.9.so.1.0
#18 0x00007fffbf96adad in run_eval_code_obj () from /lib64/libpython3.9.so.1.0
#19 0x00007fffbf97eb0a in run_mod () from /lib64/libpython3.9.so.1.0
#20 0x00007fffbf80e2f6 in pyrun_file.cold () from /lib64/libpython3.9.so.1.0
#21 0x00007fffbf97f325 in PyRun_SimpleFileExFlags () from /lib64/libpython3.9.so.1.0
#22 0x00007fffbf97f7d2 in Py_RunMain () from /lib64/libpython3.9.so.1.0
#23 0x00007fffbf97f919 in Py_BytesMain () from /lib64/libpython3.9.so.1.0
#24 0x00007fffbe793d85 in __libc_start_main () from /lib64/libc.so.6
#25 0x000055555555475e in _start ()
[Inferior 1 (process 3089452) detached]
/appl/soft/ai/bin/apptainer_wrapper: line 38: 3089434 Aborted apptainer --silent exec $SING_FLAGS $SING_IMAGE "${@:2}"

Ragas version:
Python version:

Code to Reproduce

from datasets import Dataset  # needed for Dataset.from_dict below
from ragas import evaluate
from ragas.metrics import (answer_relevancy, faithfulness, context_recall, context_precision)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from huggingface_hub import hf_hub_download, snapshot_download
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_community.llms import LlamaCpp
from langchain_core.language_models import BaseLanguageModel
from langchain_core.embeddings import Embeddings

# Two-sample evaluation set: questions, answers, retrieved contexts, references
data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on January 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'contexts': [['The Super Bowl....season since 1966,', 'replacing the NFL...in February.'],
                 ['The Green Bay Packers...Green Bay, Wisconsin.', 'The Packers compete...Football Conference']],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
dataset = Dataset.from_dict(data_samples)

# Embedding model (loaded here but not passed to evaluate below)
embedding_model_name = "sentence-transformers/msmarco-bert-base-dot-v5"
embed_model = HuggingFaceEmbedding(model_name=embedding_model_name)

# Local GGUF model served through llama-cpp-python
critic_llm = LlamaCpp(
    model_path="./Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
    n_gpu_layers=1,
    n_batch=512,
    n_ctx=2048,
    f16_kv=True,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    verbose=True,
)

# Each metric is evaluated separately with the local LLM as the judge
result_context_precision = evaluate(dataset, metrics=[context_precision], llm=critic_llm)
result_context_recall = evaluate(dataset, metrics=[context_recall], llm=critic_llm)

# Merge the two result mappings (the | operator on dicts needs Python 3.9+)
results = result_context_precision | result_context_recall
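
For reference, here is a minimal sketch of how the wrappers mentioned above could be applied, assuming a Ragas release that provides `LangchainLLMWrapper`, `LangchainEmbeddingsWrapper` and `RunConfig`. The helper names `build_ragas_inputs` and `run_serial_evaluation` are hypothetical and not part of either library. Limiting Ragas to a single worker is only a guess at a mitigation, on the assumption that the `GGML_ASSERT` comes from concurrent calls into one llama.cpp context; nothing in this report confirms that cause.

```python
def build_ragas_inputs(model_path, embedding_model_name):
    """Hypothetical helper: wrap LangChain objects so Ragas can use them.

    Heavy imports are deferred to call time so that this module can be
    imported on a machine without the model or GPU stack installed.
    """
    from langchain_community.llms import LlamaCpp
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from ragas.llms import LangchainLLMWrapper
    from ragas.embeddings import LangchainEmbeddingsWrapper

    # Same context and batch sizes as in the reproduction code above.
    llm = LlamaCpp(model_path=model_path, n_ctx=2048, n_batch=512, verbose=False)
    # A LangChain embeddings class (assumption: swapped in for the
    # llama_index HuggingFaceEmbedding used in the report).
    embeddings = HuggingFaceEmbeddings(model_name=embedding_model_name)
    return LangchainLLMWrapper(llm), LangchainEmbeddingsWrapper(embeddings)


def run_serial_evaluation(dataset, metrics, llm, embeddings):
    """Hypothetical helper: run evaluate() with one worker at a time."""
    from ragas import evaluate
    from ragas.run_config import RunConfig

    # max_workers=1 is an assumption about what might avoid the crash,
    # not a confirmed fix.
    return evaluate(
        dataset,
        metrics=metrics,
        llm=llm,
        embeddings=embeddings,
        run_config=RunConfig(max_workers=1),
    )
```

With the objects from the reproduction code, usage would look like `llm, emb = build_ragas_inputs("./Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf", embedding_model_name)` followed by `run_serial_evaluation(dataset, [context_precision, context_recall], llm, emb)`.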

Labels: bug, stale