Collaborator

@xmedr xmedr commented May 19, 2025

Overview

This branch boosts results that directly match the search query. For some reason, Haystack/Solr has been inconsistent about placing direct matches for license numbers at the top of the search results.

Demo

Note: for the purposes of this demo, I've added the search scores of each result to the license number cell. This change has not been pushed here.

Searching for "556" before boosting (result found on second page):
image


After boosting (first result):
image

Notes

I don't have any building ids in my flat drawings so I tested those rankings using map numbers instead.

Testing Instructions

  • Pull down this branch and search for licenses or flat drawings
    • May need to create a super user
  • Confirm that results which are more relevant are now higher in search results

Comment on lines +150 to +167
q = self.request.GET.get('q')
if q:
    """
    Boost results that contain direct matches of each term in the query,
    regardless of whether any terms are enclosed in double quotes.
    Terms in double quotes are exact match terms, so give them a bit of
    an extra boost, and boost them as is - whitespace and all.
    """
    exact_match_re = re.compile(r'"(?P<phrase>.*?)"')
    tokens = exact_match_re.split(q)
    exacts = exact_match_re.findall(q)

    for t in tokens:
        if t and not t.strip().startswith("-"):
            if t in exacts:
                sqs = sqs.boost(t, 1)
            else:
                sqs = sqs.boost(t, .5)
Collaborator Author

@xmedr xmedr May 20, 2025

This is the only relevant change so far.
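
As a rough illustration of what the change does to a query, the sketch below reproduces the tokenizing logic as a standalone function with no haystack dependency. `plan_boosts` is a hypothetical helper name, not part of this branch; it collects the (term, weight) pairs that the view would hand to `sqs.boost`.

```python
import re

# Same pattern as the change under review: captures text inside double quotes.
EXACT_MATCH_RE = re.compile(r'"(?P<phrase>.*?)"')


def plan_boosts(q):
    """Return (term, weight) pairs mirroring the boosts applied in the view.

    Hypothetical helper for illustration only; the real code calls
    sqs.boost(term, weight) instead of collecting pairs.
    """
    # Because the pattern has a capturing group, split() returns the quoted
    # phrases interleaved with the unquoted text around them.
    tokens = EXACT_MATCH_RE.split(q)
    exacts = EXACT_MATCH_RE.findall(q)

    boosts = []
    for t in tokens:
        # Skip empty pieces and chunks whose first non-space character is "-".
        if t and not t.strip().startswith("-"):
            boosts.append((t, 1) if t in exacts else (t, 0.5))
    return boosts


print(plan_boosts('556 "main st" -draft'))
```

Two things stand out when running this by hand: unquoted text is passed along as one chunk with its surrounding whitespace intact, and a whitespace-only chunk between two quoted phrases (as in `"a" "b"`) is also boosted at the lower weight.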

The docs for haystack say that setting a boost above 1.0 increases scores, while a boost below 1.0 decreases them. I have noticed that's not the case here: any positive amount increases the score. It's probably because we're using an older version.
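
One possible explanation, offered as an assumption rather than something confirmed against this haystack/Solr version: if the boosted term is added to the query as an extra optional clause, its weighted score is added on top of the base score, so any positive weight raises matching documents relative to non-matching ones. The toy model below is not Solr's real scoring formula; `toy_score` and the numbers are made up for illustration.

```python
def toy_score(base_score, clause_score, boost, matches_boost_term):
    """Toy additive model: base score plus a weighted optional clause.

    Not Solr's actual formula; it only illustrates why a weight below 1.0
    can still move a matching document up the ranking.
    """
    bonus = boost * clause_score if matches_boost_term else 0.0
    return base_score + bonus


# Two documents with the same base score; only the first contains the boosted term.
direct_match = toy_score(base_score=2.0, clause_score=1.0, boost=0.5, matches_boost_term=True)
partial_match = toy_score(base_score=2.0, clause_score=1.0, boost=0.5, matches_boost_term=False)

print(direct_match, partial_match)
```

Under this model, a weight below 1.0 only shrinks the size of the bonus; it never subtracts from the score.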

Also, this change has made it so that exact match queries (wrapped in double quotes) no longer omit anything that isn't an exact match. Some partial matches now come through. However, exact matches are still at the top of results. I'm a bit green to haystack and solr so if there's anything obvious we can do about that, then I'm down. Otherwise we might just be able to ask them if that's a big deal.

@xmedr xmedr marked this pull request as ready for review May 20, 2025 14:05
@xmedr xmedr requested a review from antidipyramid May 20, 2025 14:05
Collaborator

@antidipyramid antidipyramid left a comment

Seems to work on my end. We can get this up on staging and have them test it out.

@xmedr xmedr merged commit 9ae2036 into master May 27, 2025
2 checks passed
@xmedr xmedr deleted the patch/search-rankings branch May 27, 2025 13:44
Development

Successfully merging this pull request may close these issues.

  • Search Result Issue on Flat Drawings
  • Search result issue on License Documents

3 participants