@jazairi jazairi commented Nov 26, 2025

Why these changes are being introduced:

The zipper merge we implemented naively queries n/2 results from each API and interleaves them, where n is the per-page value. This works if both APIs return many results, but it can cause problems in smaller, unbalanced result sets.

For example, the query term `doc edgerton` returns 50 Primo results and 4 TIMDEX results. Page 1 shows only 14 results (4 TIMDEX and 10 Primo), and each subsequent page returns only 10 (all Primo).
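To make the failure mode concrete, here is a hypothetical model of the naive zipper merge (the function name and parameters are illustrative, not the app's actual code):

```ruby
# Illustrative model of the naive zipper merge: each API is asked for
# per_page / 2 results at offset (page - 1) * (per_page / 2), regardless
# of how many hits it actually has.
def naive_page_size(page, per_page:, primo_total:, timdex_total:)
  half = per_page / 2
  offset = (page - 1) * half
  primo_count  = (primo_total - offset).clamp(0, half)
  timdex_count = (timdex_total - offset).clamp(0, half)
  primo_count + timdex_count
end

naive_page_size(1, per_page: 20, primo_total: 50, timdex_total: 4) # => 14
naive_page_size(2, per_page: 20, primo_total: 50, timdex_total: 4) # => 10
```

Every page past the point where the smaller result set is exhausted comes up short, and the shortfall is silent.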

Relevant ticket(s):
- [USE-179](https://mitlibraries.atlassian.net/browse/USE-179)

How this addresses that need:

This implements more sophisticated logic that first checks the number of hits returned by each API and passes that, along with the pagination information, to a Merged Search Paginator class. This service object develops a 'merge plan': it calculates per-API offsets and merges the results for each page.
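As a sketch of the merge-plan idea (a simplified stand-in, not the PR's actual class; the name, the even-interleave policy, and sending odd items to the smaller source are all assumptions):

```ruby
# Simplified sketch of a merge-plan paginator. Given each API's total hit
# count, it computes, for any page, the offset and count to request from
# each source so that no results are skipped or dropped.
class MergedSearchPaginatorSketch
  def initialize(totals, per_page: 20)
    @totals = totals # e.g. { primo: 50, timdex: 4 }
    @per_page = per_page
  end

  # Offsets and counts for one page of the merged list.
  def plan_for(page)
    start  = (page - 1) * @per_page
    finish = [start + @per_page, @totals.values.sum].min
    before = consumed_after(start)
    after  = consumed_after(finish)
    @totals.keys.to_h { |k| [k, { offset: before[k], count: after[k] - before[k] }] }
  end

  private

  # How many records each source has contributed after `n` merged items:
  # interleave evenly until the smaller source runs out, then take the
  # remainder from the larger one. (Odd items go to the smaller source
  # first -- an arbitrary choice for this sketch.)
  def consumed_after(n)
    smaller, larger = @totals.sort_by { |_, total| total }.map(&:first)
    min_total = @totals[smaller]
    if n <= 2 * min_total
      { smaller => (n + 1) / 2, larger => n / 2 }
    else
      { smaller => min_total, larger => [n - min_total, @totals[larger]].min }
    end
  end
end

paginator = MergedSearchPaginatorSketch.new({ primo: 50, timdex: 4 }, per_page: 20)
paginator.plan_for(1) # full first page: all 4 TIMDEX hits plus 16 Primo hits
paginator.plan_for(2) # TIMDEX exhausted: 20 Primo hits at offset 16
```

With the `doc edgerton` totals, all 54 hits surface across three pages, with no short pages and nothing dropped.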

Queries on the 'all' tab now fetch twice from each API: once to determine the total number of hits for the Merged Search Paginator, then again to fetch results at the appropriate offset. While hardly ideal, this was the only option I could find that avoids losing results. I limited these extra calls to queries beyond page 1, the only case where they are needed.
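The two-phase flow might look roughly like this (stub clients; the real Primo/TIMDEX wrappers, method names, and the page-2 offsets are assumptions, with the offsets following the interleave-until-exhausted plan described above):

```ruby
# Hypothetical sketch of the two-phase fetch on the 'all' tab. StubClient
# stands in for the real Primo/TIMDEX API wrappers.
StubClient = Struct.new(:total) do
  def search(per_page:, offset: 0)
    records = (1..total).map { |i| "rec#{i}" }
    { total: total, results: records[offset, per_page] || [] }
  end
end

primo  = StubClient.new(50)
timdex = StubClient.new(4)

# Phase 1: cheap calls just to learn each API's hit count.
primo_total  = primo.search(per_page: 0)[:total]   # => 50
timdex_total = timdex.search(per_page: 0)[:total]  # => 4

# Phase 2: fetch each source at the offset/count the merge plan dictates.
# With totals (50, 4) and 20 per page, page 1 consumed all 4 TIMDEX hits
# and 16 Primo hits, so page 2 is 20 Primo hits starting at offset 16.
page2 = primo.search(per_page: 20, offset: 16)[:results]
page2.length # => 20
```

Whether the phase-1 call can be made cheap depends on each API supporting a hit-count-only (or minimal page size) request.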

Side effects of this change:

  • We now clear the cache before each search controller test. This was done to avoid odd test behavior, but I ran the suite 50 times without any issues, so it may be excessively cautious.
  • The search controller continues to grow with this new logic. I've split it into multiple helper methods so that, if we want to extract more service objects later, it will be easier to do so.
  • A failing cassette has been replaced with a mock.

Developer

Accessibility
  • ANDI or WAVE has been run in accordance with our guide.
  • This PR contains no changes to the view layer.
  • New issues flagged by ANDI or WAVE have been resolved.
  • New issues flagged by ANDI or WAVE have been ticketed (link in the Pull Request details above).
  • No new accessibility issues have been flagged.
New ENV
  • All new ENV is documented in README.
  • All new ENV has been added to Heroku Pipeline, Staging and Prod.
  • ENV has not changed.
Approval beyond code review
  • UXWS/stakeholder approval has been confirmed.
  • UXWS/stakeholder review will be completed retroactively.
  • UXWS/stakeholder review is not needed.
Additional context needed to review

This is a pretty unwieldy changeset, so please reach out if you have questions!

Code Reviewer

Code
  • I have confirmed that the code works as intended.
  • Any CodeClimate issues have been fixed or confirmed as
    added technical debt.
Documentation
  • The commit message is clear and follows our guidelines
    (not just this pull request message).
  • The documentation has been updated or is unnecessary.
  • New dependencies are appropriate or there were no changes.
Testing
  • There are appropriate tests covering any new functionality.
  • No additional test coverage is required.
