Replies: 2 comments
-
Hi @nsorros! You are using

doc_index = "tutorial5_docs"
label_index = "tutorial5_labels"
document_store = ElasticsearchDocumentStore(
    host=host,
    username="",
    password="",
    index=doc_index,  # IMPORTANT: this needs to be the same as the doc_index param in add_eval_data
    label_index=label_index,
    embedding_field="emb",
    embedding_dim=768,
    excluded_meta_data=["emb"],
)
preprocessor = PreProcessor(
    split_by="word",
    split_length=200,
    split_overlap=0,
    split_respect_sentence_boundary=False,
    clean_empty_lines=False,
    clean_whitespace=False,
)
document_store.add_eval_data(
    filename="data/tutorial5/nq_dev_subset_v2.json",
    doc_index=doc_index,  # IMPORTANT: this needs to be the same as the index param when initialising the DocumentStore
    label_index=label_index,
    preprocessor=preprocessor,
)

Let me know if you have further questions :)
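To illustrate why the two index names have to agree, here is a minimal, self-contained sketch. It does not use Haystack at all: `ToyDocumentStore` and `fit_retriever` are hypothetical stand-ins, and the default index names `"document"` and `"eval_document"` are assumptions chosen for the illustration. The idea it models is that the evaluation data is written to `doc_index`, while a retriever reads from the store's own default index, so a mismatch can leave the retriever looking at an empty index.

```python
class ToyDocumentStore:
    """Hypothetical stand-in for a document store with named indices."""

    def __init__(self, index="document"):
        self.index = index   # default index that reads fall back to
        self._indices = {}   # index name -> list of documents

    def add_eval_data(self, docs, doc_index="eval_document"):
        # Evaluation documents go into doc_index, which is not
        # necessarily the store's default index.
        self._indices.setdefault(doc_index, []).extend(docs)

    def get_all_documents(self, index=None):
        # Reads use the default index unless another one is named.
        return list(self._indices.get(index or self.index, []))


def fit_retriever(store):
    """Mimic a retriever that builds its state from the default index."""
    docs = store.get_all_documents()
    if not docs:
        raise ValueError("empty document store: nothing to fit on")
    return docs


# Store reads from "document", but the data lands in "tutorial5_docs".
mismatched = ToyDocumentStore(index="document")
mismatched.add_eval_data(["doc a", "doc b"], doc_index="tutorial5_docs")

# Store reads from the same index the evaluation data was written to.
matched = ToyDocumentStore(index="tutorial5_docs")
matched.add_eval_data(["doc a", "doc b"], doc_index="tutorial5_docs")
```

In this toy model, `fit_retriever(matched)` returns both documents, while `fit_retriever(mismatched)` raises because the default index is empty even though the data was loaded.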
-
I did look into the indices to make sure I was using the same ones, but I must have been doing something wrong. I redid it today and it works 👍 Will update if there are any issues. Thanks
-
I am using the annotation tool to create question-answer pairs and then loading that data as described in the evaluation guide https://haystack.deepset.ai/tutorials/05_evaluation to assess my QA system.
I am wondering whether I also need to load the initial documents, i.e. the ones that were used for annotation, into the document store, on top of the evaluation data, in order for the evaluation to work?
I ask because I am seeing errors with some combinations. For example, when I use an in-memory store with a TF-IDF retriever and load only the evaluation data, not the initial docs, I get
Retrieval requires dataframe df and tf-idf matrix but fit() did not calculate them probably due to an empty document store.
which is resolved when loading the initial docs 🤔