!here causes ValueError/IndexError #123

@LongHairedHacker

I've just set up PeARS on my local machine for testing, following the README.md.

Steps I used for setup:
$ git clone https://github.com/PeARSearch/PeARS-federated.git 
# The commit that got checked out is the latest main, f4b1c972def1d5b4b9e1a0998209f8676480b31b
$ python -m venv .venv
$ . .venv/bin/activate
$ pip install -r requirements.txt
$ cp .env-template .env
# Replaced SECRET_KEY, SECURITY_PASSWORD_SALT and CSRF_SESSION_KEY with random strings
$ flask pears install-language de
# Set PEARS_LANGS to en,de
$ flask pears create-user sebastian XXXXXXX  sebastian@example.com
$ flask pears setadmin sebastian

If I run it using python3 run.py, open http://localhost:8080/, and search for "!here test", I get:

$ python3 run.py
PATH /home/sebastian/build/PeARS-federated/app
Installed languages: ['en', 'de']
	>> ERROR: filter_instances_by_language: got non-200 status code when trying to access https://pears.cc/api/languages...
 * Serving Flask app 'app'
 * Debug mode: on
PATH /home/sebastian/build/PeARS-federated/app
Installed languages: ['en', 'de']
	>> ERROR: filter_instances_by_language: got non-200 status code when trying to access https://pears.cc/api/languages...

>>>>>>>>>>>>>>>>>>>>>>
>> SEARCH:CONTROLLERS:get_local_search_results: searching in en
>>>>>>>>>>>>>>>>>>>>>>

 Getting results on this instance
QUERY LANG en
QUERY SPLIT: ['test']
WORDS TOKENIZED: [['▁test']]
WORDS TOKENIZED EXPANDED: [['▁experiments', '▁detection', '▁screening', '▁test', '▁evaluation', '▁preliminary', '▁tested', '▁testing', '▁tests', '▁trials']]
Traceback (most recent call last):
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 1536, in __call__
    return self.wsgi_app(environ, start_response)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 1514, in wsgi_app
    response = self.handle_exception(e)
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 1511, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 919, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 917, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 902, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/search/controllers.py", line 46, in index
    clean_query, results = get_local_search_results(query)
                           ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/search/controllers.py", line 132, in get_local_search_results
    r, s = score_pages.run_search(clean_query, lang, extended=app.config['EXTEND_QUERY'])
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/search/score_pages.py", line 194, in run_search
    document_scores = compute_scores(query, q_vectors, lang)
  File "/home/sebastian/build/PeARS-federated/app/utils.py", line 254, in wrap_func
    result = func(*args, **kwargs)
  File "/home/sebastian/build/PeARS-federated/app/search/score_pages.py", line 99, in compute_scores
    m, bins, podnames, urls = load_vec_matrix(lang)
                              ~~~~~~~~~~~~~~~^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/search/score_pages.py", line 90, in load_vec_matrix
    m, bins, podnames, urls = mk_vec_matrix(lang)
                              ~~~~~~~~~~~~~^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/utils.py", line 254, in wrap_func
    result = func(*args, **kwargs)
  File "/home/sebastian/build/PeARS-federated/app/search/score_pages.py", line 78, in mk_vec_matrix
    m = vstack(m)
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/scipy/sparse/_construct.py", line 849, in vstack
    return _block([[b] for b in blocks], format, dtype, return_spmatrix=True)
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/scipy/sparse/_construct.py", line 966, in _block
    raise ValueError('blocks must be 2-D')
ValueError: blocks must be 2-D

I assumed this was because there were no categories/themes/pods (the naming in the code is a bit confusing to me) on this instance.
Still, this should not cause an exception.
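
To illustrate the kind of guard meant here: the traceback ends in scipy's vstack, which rejects an empty list of blocks. Below is a minimal sketch of a wrapper that returns an empty matrix instead. The helper name stack_or_empty and the n_cols parameter are hypothetical and not part of PeARS; whether mk_vec_matrix really passes an empty list is only my assumption.

```python
from scipy.sparse import csr_matrix, vstack


def stack_or_empty(blocks, n_cols):
    """Hypothetical helper: stack sparse blocks, tolerating an empty list.

    n_cols is the assumed vector dimensionality; it is only needed to
    give the empty result a well-defined shape.
    """
    if not blocks:
        # No pods indexed yet: return a 0 x n_cols matrix instead of raising.
        return csr_matrix((0, n_cols))
    return vstack(blocks)


# With no blocks, the caller gets an empty matrix it can check for.
empty = stack_or_empty([], 4)

# With blocks, the behaviour is the same as a plain vstack.
stacked = stack_or_empty(
    [csr_matrix([[1, 0, 0, 2]]), csr_matrix([[0, 3, 0, 0]])], 4
)
```

The caller would still need to treat a matrix with zero rows as "no results", so this only moves the check to a place where it can be handled cleanly.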

So I added a bunch of URLs from my blog to test things out. As soon as I had two pods, the error changed to:

>>>>>>>>>>>>>>>>>>>>>>
>> SEARCH:CONTROLLERS:get_local_search_results: searching in en
>>>>>>>>>>>>>>>>>>>>>>

 Getting results on this instance
QUERY LANG en
QUERY SPLIT: ['test']
WORDS TOKENIZED: [['▁test']]
WORDS TOKENIZED EXPANDED: [['▁screening', '▁preliminary', '▁testing', '▁test', '▁tests', '▁tested', '▁evaluation', '▁trials', '▁experiments', '▁detection']]
>> TIMER: Function 'mk_vec_matrix' executed in 0.0018s
>> TIMER: Function 'compute_scores' executed in 0.0025s

>>>>>>>>>>>>>>>>>>>>>>
>> SEARCH:CONTROLLERS:get_local_search_results: searching in de
>>>>>>>>>>>>>>>>>>>>>>

 Getting results on this instance
QUERY LANG de
QUERY SPLIT: ['test']
WORDS TOKENIZED: [['▁test']]
WORDS TOKENIZED EXPANDED: [['▁test', '▁start', '▁pilot', 'flüge', '▁gestartet', '▁durchgeführt', '▁experiment', 'hubschrauber', '▁versuch', 'training']]
Traceback (most recent call last):
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 1536, in __call__
    return self.wsgi_app(environ, start_response)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 1514, in wsgi_app
    response = self.handle_exception(e)
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 1511, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 919, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 917, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/sebastian/build/PeARS-federated/.venv/lib/python3.13/site-packages/flask/app.py", line 902, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/search/controllers.py", line 46, in index
    clean_query, results = get_local_search_results(query)
                           ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/search/controllers.py", line 132, in get_local_search_results
    r, s = score_pages.run_search(clean_query, lang, extended=app.config['EXTEND_QUERY'])
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/search/score_pages.py", line 194, in run_search
    document_scores = compute_scores(query, q_vectors, lang)
  File "/home/sebastian/build/PeARS-federated/app/utils.py", line 254, in wrap_func
    result = func(*args, **kwargs)
  File "/home/sebastian/build/PeARS-federated/app/search/score_pages.py", line 99, in compute_scores
    m, bins, podnames, urls = load_vec_matrix(lang)
                              ~~~~~~~~~~~~~~~^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/search/score_pages.py", line 90, in load_vec_matrix
    m, bins, podnames, urls = mk_vec_matrix(lang)
                              ~~~~~~~~~~~~~^^^^^^
  File "/home/sebastian/build/PeARS-federated/app/utils.py", line 254, in wrap_func
    result = func(*args, **kwargs)
  File "/home/sebastian/build/PeARS-federated/app/search/score_pages.py", line 74, in mk_vec_matrix
    npz = npz[idvs,:]
          ~~~^^^^^^^^
IndexError: index 2 is out of bounds for axis 0 with size 2

Worth noting: my blog is in German and English, so I naturally have pods in both languages now.
As far as I can tell from the log output, the English search worked just fine. However, the German one failed.

I'm assuming both errors are somewhat related, since both point towards mk_vec_matrix.
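
For reference, the second error is what scipy raises when a sparse matrix is indexed with a row id that does not exist. The sketch below filters the ids before indexing. The helper select_rows is hypothetical and not taken from the PeARS code, and I have not confirmed why the ids and the matrix disagree in the first place, so a guard like this would hide the symptom rather than fix the cause.

```python
import numpy as np
from scipy.sparse import csr_matrix


def select_rows(matrix, row_ids):
    """Hypothetical helper: index rows, dropping ids outside the matrix."""
    row_ids = np.asarray(row_ids, dtype=int)
    in_range = (row_ids >= 0) & (row_ids < matrix.shape[0])
    valid = row_ids[in_range]
    return matrix[valid, :], valid


# A matrix with two rows, indexed with ids 0 and 2 as in the traceback:
# id 2 is out of bounds for axis 0 with size 2, so only id 0 survives.
m = csr_matrix(np.arange(6).reshape(2, 3))
rows, kept = select_rows(m, [0, 2])
```

Logging the dropped ids would probably be more useful than discarding them silently, since they point at whatever got out of sync.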

Please let me know if you need any further input or if there is something I should try to narrow this down further.
