Cannot run benchmark #75

@etienne428

I am trying to run the benchmarks, but neither the BEIR nor the KILT scripts work. I am using a freshly created conda environment, and pip reports an incompatibility for elasticsearch during installation:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
elasticsearch-haystack 1.0.1 requires elasticsearch<9,>=8, but you have elasticsearch 7.9.1 which is incompatible.
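
The constraint in the pip message can be checked by hand. The following is a minimal sketch, assuming GNU `sort -V` is available and that the versions are plain dotted numbers; it only approximates pip's own specifier matching. The helper name `version_in_range` and the example version `8.0.0` are hypothetical; `7.9.1` and the `>=8,<9` range come from the message above.

```shell
#!/bin/sh
# Hypothetical helper (not part of pip or fastRAG): succeeds when
# version $1 satisfies ">= $2 and < $3", using `sort -V` ordering.
version_in_range() {
    # $1 >= $2 : the smaller of the two must be $2
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ] || return 1
    # $1 < $3 : the two differ and the smaller of the two is $1
    [ "$1" != "$3" ] || return 1
    [ "$(printf '%s\n' "$1" "$3" | sort -V | head -n 1)" = "$1" ]
}

# 7.9.1 is the installed version reported by pip; 8.0.0 is illustrative.
for v in 7.9.1 8.0.0; do
    if version_in_range "$v" 8 9; then
        echo "elasticsearch $v satisfies >=8,<9"
    else
        echo "elasticsearch $v conflicts with >=8,<9"
    fi
done
```

Running `pip check` inside the environment reports the same kind of conflict for whatever is actually installed.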

Here is the whole bash script I use, with a copy of the error I get for each benchmark:

#!/bin/bash
# script to run fastrag

conda deactivate
conda remove -n rag --all -y

filepath="/home/pelirrojito/rag/fastrag/fastRAG"
cd "$filepath"
conda create -n rag python=3.9 -y
conda activate rag
pip install . # fastrag
pip install "fastrag[elastic]"
pip install "fastrag[colbert]"
# pip install git+https://github.com/facebookresearch/KILT.git
pip install beir


echo -e "\n\nResults\n\n"

# BEIR
# python "$filepath/benchmarks/BEIR/nq-plaid.py"              # installation successful, but the model cannot run because I don't understand the requirements for PLAIDDocumentStore
#   During installation:
#       elasticsearch-haystack 1.0.1 requires elasticsearch<9,>=8, but you have elasticsearch 7.9.1 which is incompatible.
#   When trying to run the code:
"""
2024-11-14 09:08:48 - Loading dataset...
2024-11-14 09:08:48 - Loading Corpus...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2681468/2681468 [00:07<00:00, 356143.51it/s]
2024-11-14 09:08:56 - Loaded 2681468 TEST Documents.
2024-11-14 09:08:56 - Doc Example: {'text': \"In accounting, minority interest (or non-controlling interest) is the portion of a subsidiary corporation's stock that is not owned by the parent corporation. The magnitude of the minority interest in the subsidiary company is generally less than 50% of outstanding shares, or the corporation would generally cease to be a subsidiary of the parent.[1]\", 'title': 'Minority interest'}
2024-11-14 09:08:56 - Loading Queries...
2024-11-14 09:08:56 - Loaded 3452 TEST Queries.
2024-11-14 09:08:56 - Query Example: what is non controlling interest on balance sheet
2024-11-14 09:08:56 - Loading PLAID index...
Traceback (most recent call last):
  File \"/home/pelirrojito/miniforge3/envs/rag/lib/python3.9/site-packages/colbert/infra/config/base_config.py\", line 94, in load_from_index
    loaded_config, _ = cls.from_path(metadata_path)
  File \"/home/pelirrojito/miniforge3/envs/rag/lib/python3.9/site-packages/colbert/infra/config/base_config.py\", line 44, in from_path
    with open(name) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/pelirrojito/rag/fastrag/fastRAG/benchmarks/BEIR/benchmarks/datasets/nq/metadata.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File \"/home/pelirrojito/rag/fastrag/fastRAG/benchmarks/BEIR/nq-plaid.py\", line 41, in <module>
    document_store = PLAIDDocumentStore(
  File \"/home/pelirrojito/miniforge3/envs/rag/lib/python3.9/site-packages/fastrag/stores/plaid.py\", line 113, in __init__
    self._load_index()
  File \"/home/pelirrojito/miniforge3/envs/rag/lib/python3.9/site-packages/fastrag/stores/plaid.py\", line 120, in _load_index
    self.store = Searcher(
  File \"/home/pelirrojito/miniforge3/envs/rag/lib/python3.9/site-packages/colbert/searcher.py\", line 33, in __init__
    self.index_config = ColBERTConfig.load_from_index(self.index)
  File \"/home/pelirrojito/miniforge3/envs/rag/lib/python3.9/site-packages/colbert/infra/config/base_config.py\", line 97, in load_from_index
    loaded_config, _ = cls.from_path(metadata_path)
  File \"/home/pelirrojito/miniforge3/envs/rag/lib/python3.9/site-packages/colbert/infra/config/base_config.py\", line 44, in from_path
    with open(name) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/pelirrojito/rag/fastrag/fastRAG/benchmarks/BEIR/benchmarks/datasets/nq/plan.json'
"""

# python "$filepath/benchmarks/BEIR/msmarco-plaid.py"         # Same as previous

# python "$filepath/benchmarks/BEIR/nq-bm25-sbert.py"         # cannot import name ElasticsearchDocumentStore from haystack.document_stores
#   During installation:
#       elasticsearch-haystack 1.0.1 requires elasticsearch<9,>=8, but you have elasticsearch 7.9.1 which is incompatible.
#   When trying to run the code:
"""
Traceback (most recent call last):
  File \"/home/pelirrojito/rag/fastrag/fastRAG/benchmarks/BEIR/nq-bm25-sbert.py\", line 10, in <module>
    from haystack.document_stores import ElasticsearchDocumentStore
ImportError: cannot import name 'ElasticsearchDocumentStore' from 'haystack.document_stores' (/home/pelirrojito/miniforge3/envs/rag/lib/python3.9/site-packages/haystack/document_stores/__init__.py)
"""


python "$filepath/benchmarks/BEIR/msmarco-bm25-sbert.py"    # cannot import name ElasticsearchDocumentStore from haystack.document_stores
#   During installation:
#       elasticsearch-haystack 1.0.1 requires elasticsearch<9,>=8, but you have elasticsearch 7.9.1 which is incompatible.
#   When trying to run the code:

"""
Traceback (most recent call last):
  File \"/home/pelirrojito/rag/fastrag/fastRAG/benchmarks/BEIR/msmarco-bm25-sbert.py\", line 10, in <module>
    from haystack.document_stores import ElasticsearchDocumentStore
ImportError: cannot import name 'ElasticsearchDocumentStore' from 'haystack.document_stores' (/home/pelirrojito/miniforge3/envs/rag/lib/python3.9/site-packages/haystack/document_stores/__init__.py)
"""

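The PLAID traceback shows ColBERT looking for `metadata.json` and then falling back to `plan.json` in the index directory. A small pre-flight check can make a missing index visible before the dataset is loaded. This is a sketch under the assumption that a usable index directory contains at least one of those two files; the helper name `has_plaid_index` is hypothetical, and a temporary directory stands in for the real index path.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if directory $1 holds one of the two
# config files named in the traceback (metadata.json or plan.json).
has_plaid_index() {
    [ -f "$1/metadata.json" ] || [ -f "$1/plan.json" ]
}

# Demo on a throwaway directory; point index_dir at the real index instead.
index_dir="$(mktemp -d)"

if has_plaid_index "$index_dir"; then before="found"; else before="missing"; fi

# Simulate an index having been built by creating one of the files.
touch "$index_dir/plan.json"
if has_plaid_index "$index_dir"; then after="found"; else after="missing"; fi

echo "empty directory: $before; after creating plan.json: $after"
rm -rf "$index_dir"
```

If the check fails for the path the benchmark script passes to `PLAIDDocumentStore`, the index most likely has not been built or lives somewhere else.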
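A note on the script itself: in bash, `"""` is not a comment delimiter. It parses as an empty string followed by an opening quote, so the shell treats each quoted log block as a command to execute. A minimal sketch of keeping pasted log text inert, assuming a quoted here-document fed to the no-op builtin `:` (the `LOG` delimiter and the placeholder second line are illustrative):

```shell
#!/bin/sh
# ':' does nothing and ignores its stdin. Quoting the delimiter ('LOG')
# stops the shell from expanding or interpreting anything in the block.
: <<'LOG'
2024-11-14 09:08:56 - Loading PLAID index...
(rest of the pasted output, with "quotes" and $variables left untouched)
LOG

# Execution continues normally after the here-document.
status="script continued past the log block"
echo "$status"
```

With this form the quotes inside the log do not need backslash escapes, and nothing in the block is run.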