Commit faedd17
Add pagination support to 'all' tab
Why these changes are being introduced:

We intentionally decided to implement pagination for the 'all' tab separately to avoid excessively large tickets.

Relevant ticket(s):

- [USE-178](https://mitlibraries.atlassian.net/browse/USE-178)
- [USE-179](https://mitlibraries.atlassian.net/browse/USE-179)

How this addresses that need:

This updates the Search Controller and Analyzer Model to handle pagination from multiple sources (the 'all' tab) in addition to single sources.

Side effects of this change:

I found it very difficult to follow where and how the 'per-page' value is set. I tried to improve the situation by moving this value to env. I'm not sure it fully makes sense yet, but I doubt it will without a full refactor of the data flow architecture.
1 parent 77970ca commit faedd17

File tree

6 files changed: +393 −128 lines changed

README.md

Lines changed: 1 addition & 0 deletions

@@ -116,6 +116,7 @@ may have unexpected consequences if applied to other TIMDEX UI apps.
 - `REQUEST_PERIOD` - time in minutes used along with `REQUESTS_PER_PERIOD`
 - `REDIRECT_REQUESTS_PER_PERIOD`- number of requests that can be made that the query string starts with our legacy redirect parameter to throttle per `REQUEST_PERIOD`
 - `REDIRECT_REQUEST_PERIOD`- time in minutes used along with `REDIRECT_REQUEST_PERIOD`
+- `RESULTS_PER_PAGE`: The number of results to display per page. Use an even number to avoid peculiarities. Defaults to 20 if unset.
 - `SENTRY_DSN`: Client key for Sentry exception logging.
 - `SENTRY_ENV`: Sentry environment for the application. Defaults to 'unknown' if unset.
 - `TACOS_SOURCE`: If set, this value is sent to TACOS (as the `sourceSystem` value) to distinguish which application
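The new `RESULTS_PER_PAGE` variable is the single knob for page sizing; on the 'all' tab each backend receives half of it. A minimal sketch of that resolution logic, using a hypothetical helper name (`per_page_for` is not part of the app):

```ruby
# Hypothetical helper (not the app's actual API) showing how a
# RESULTS_PER_PAGE-style env var can drive page sizing: the 'all' tab
# splits the page evenly between the two backends, which is why the
# README recommends an even value.
def per_page_for(active_tab, env = ENV)
  total = env.fetch('RESULTS_PER_PAGE', '20').to_i
  active_tab == 'all' ? total / 2 : total
end

per_page_for('all', { 'RESULTS_PER_PAGE' => '30' }) # => 15 per source
```

With an odd total, integer division silently drops a result slot per page, which is presumably the "peculiarity" the README warns about.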

app/controllers/search_controller.rb

Lines changed: 61 additions & 37 deletions

@@ -23,7 +23,7 @@ def results
       render 'results_geo'
       return
     end
-
+
     # Route to appropriate search based on active tab
     case @active_tab
     when 'primo'
@@ -65,7 +65,8 @@ def load_geodata_results
     @errors = extract_errors(response)
     return unless @errors.nil?

-    @pagination = Analyzer.new(@enhanced_query, response, :timdex).pagination
+    hits = response.dig(:data, 'search', 'hits') || 0
+    @pagination = Analyzer.new(@enhanced_query, hits, :timdex).pagination
     raw_results = extract_results(response)
     @results = NormalizeTimdexResults.new(raw_results, @enhanced_query[:q]).normalize
     @filters = extract_filters(response)
@@ -87,53 +88,66 @@ def load_timdex_results
   end

   def load_all_results
-    # Parallel fetching from both APIs
-    primo_thread = Thread.new { fetch_primo_data }
-    timdex_thread = Thread.new { fetch_timdex_data }
-
-    # Wait for both threads to complete
-    primo_data = primo_thread.value
-    timdex_data = timdex_thread.value
+    # Fetch results from both APIs in parallel
+    primo_data, timdex_data = fetch_all_data

-    # Collect any errors from either API
-    all_errors = []
-    all_errors.concat(primo_data[:errors]) if primo_data[:errors]
-    all_errors.concat(timdex_data[:errors]) if timdex_data[:errors]
-    @errors = all_errors.any? ? all_errors : nil
+    # Combine errors from both APIs
+    @errors = combine_errors(primo_data[:errors], timdex_data[:errors])

     # Zipper merge results from both APIs
-    primo_results = primo_data[:results] || []
-    timdex_results = timdex_data[:results] || []
-    @results = primo_results.zip(timdex_results).flatten.compact
+    @results = merge_results(primo_data[:results], timdex_data[:results])
+
+    # Use Analyzer for combined pagination calculation
+    @pagination = Analyzer.new(@enhanced_query, timdex_data[:hits], :all,
+                               primo_data[:hits]).pagination

-    # For now, just use primo pagination as a placeholder
-    @pagination = primo_data[:pagination] || {}
-
     # Handle primo continuation for high page numbers
     @show_primo_continuation = primo_data[:show_continuation] || false
   end

+  def fetch_all_data
+    # Parallel fetching from both APIs
+    primo_thread = Thread.new { fetch_primo_data }
+    timdex_thread = Thread.new { fetch_timdex_data }
+
+    [primo_thread.value, timdex_thread.value]
+  end
+
+  def combine_errors(*error_arrays)
+    all_errors = error_arrays.compact.flatten
+    all_errors.any? ? all_errors : nil
+  end
+
+  def merge_results(primo_results, timdex_results)
+    (primo_results || []).zip(timdex_results || []).flatten.compact
+  end
+
   def fetch_primo_data
     current_page = @enhanced_query[:page] || 1
-    per_page = @enhanced_query[:per_page] || 20
+    per_page = if @active_tab == 'all'
+                 ENV.fetch('RESULTS_PER_PAGE', '20').to_i / 2
+               else
+                 ENV.fetch('RESULTS_PER_PAGE', '20').to_i
+               end
     offset = (current_page - 1) * per_page

     # Check if we're beyond Primo API limits before making the request.
     if offset >= Analyzer::PRIMO_MAX_OFFSET
-      return { results: [], pagination: {}, errors: nil, show_continuation: true }
+      return { results: [], pagination: {}, errors: nil, show_continuation: true, hits: 0 }
     end

-    primo_response = query_primo
+    primo_response = query_primo(per_page, offset)
+    hits = primo_response.dig('info', 'total') || 0
     results = NormalizePrimoResults.new(primo_response, @enhanced_query[:q]).normalize
-    pagination = Analyzer.new(@enhanced_query, primo_response, :primo).pagination
+    pagination = Analyzer.new(@enhanced_query, hits, :primo).pagination

     # Handle empty results from Primo API. Sometimes Primo will return no results at a given offset,
     # despite claiming in the initial query that more are available. This happens randomly and
     # seemingly for no reason (well below the recommended offset of 2,000). While the bug also
     # exists in Primo UI, sending users there seems like the best we can do.
     show_continuation = false
     errors = nil
-
+
     if results.empty?
       docs = primo_response['docs'] if primo_response.is_a?(Hash)
       if docs.nil? || docs.empty?
@@ -144,23 +158,37 @@ def fetch_primo_data
       end
     end

-    { results: results, pagination: pagination, errors: errors, show_continuation: show_continuation }
+    { results: results, pagination: pagination, errors: errors, show_continuation: show_continuation,
+      hits: hits }
   rescue StandardError => e
-    { results: [], pagination: {}, errors: handle_primo_errors(e), show_continuation: false }
+    { results: [], pagination: {}, errors: handle_primo_errors(e), show_continuation: false, hits: 0 }
   end

   def fetch_timdex_data
-    query = QueryBuilder.new(@enhanced_query).query
+    # For all tab, modify query to use half page size
+    if @active_tab == 'all'
+      per_page = ENV.fetch('RESULTS_PER_PAGE', '20').to_i / 2
+      page = @enhanced_query[:page] || 1
+      from_offset = ((page - 1) * per_page).to_s
+
+      query_builder = QueryBuilder.new(@enhanced_query)
+      query = query_builder.query
+      query['from'] = from_offset
+    else
+      query = QueryBuilder.new(@enhanced_query).query
+    end
+
     response = query_timdex(query)
     errors = extract_errors(response)
-
+
     if errors.nil?
-      pagination = Analyzer.new(@enhanced_query, response, :timdex).pagination
+      hits = response.dig(:data, 'search', 'hits') || 0
+      pagination = Analyzer.new(@enhanced_query, hits, :timdex).pagination
       raw_results = extract_results(response)
       results = NormalizeTimdexResults.new(raw_results, @enhanced_query[:q]).normalize
-      { results: results, pagination: pagination, errors: nil }
+      { results: results, pagination: pagination, errors: nil, hits: hits }
     else
-      { results: [], pagination: {}, errors: errors }
+      { results: [], pagination: {}, errors: errors, hits: 0 }
     end
   end

@@ -189,16 +217,12 @@ def query_timdex(query)
     end
   end

-  def query_primo
+  def query_primo(per_page, offset)
     # We generate unique cache keys to avoid naming collisions.
     cache_key = generate_cache_key(@enhanced_query)

     Rails.cache.fetch("#{cache_key}/primo", expires_in: 12.hours) do
       primo_search = PrimoSearch.new
-      per_page = @enhanced_query[:per_page] || 20
-      current_page = @enhanced_query[:page] || 1
-      offset = (current_page - 1) * per_page
-
       primo_search.search(@enhanced_query[:q], per_page, offset)
     end
   end
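The "zipper merge" in the controller diff above can be tried in isolation: `zip` pairs each Primo result with a TIMDEX result, `flatten` interleaves the pairs, and `compact` strips the `nil` padding when TIMDEX runs short.

```ruby
# Standalone copy of merge_results from the diff above, for illustration:
# interleave results from the two sources, letting zip's nil padding and
# compact absorb a shorter (or missing) second list.
def merge_results(primo_results, timdex_results)
  (primo_results || []).zip(timdex_results || []).flatten.compact
end

merge_results(%w[p1 p2 p3], %w[t1 t2]) # => ["p1", "t1", "p2", "t2", "p3"]
```

One caveat worth knowing: `Array#zip` truncates to the receiver's length, so TIMDEX results beyond the Primo count are dropped (an empty Primo page yields an empty merge). Because both sources are fetched with the same per-page size, this rarely shows up in practice.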

app/models/analyzer.rb

Lines changed: 35 additions & 31 deletions

@@ -1,54 +1,58 @@
 class Analyzer
   attr_accessor :pagination

-  RESULTS_PER_PAGE = 20
-
   # Primo API theoretical maximum recommended offset is 2000 records (per Ex Libris documentation)
   # but in practice, the API often can't deliver results beyond ~960 records for large result sets,
   # likely due to performance constraints.
   PRIMO_MAX_OFFSET = 960

-  def initialize(enhanced_query, response, source)
+  # Initializes pagination analysis for search results.
+  #
+  # @param enhanced_query [Hash] Query parameters including :page (current page number)
+  # @param hits [Integer] Number of hits from primary source (TIMDEX for :all, source-specific otherwise)
+  # @param source [Symbol] Source tab (:primo, :timdex, or :all)
+  # @param secondary_hits [Integer, nil] Optional hit count from secondary source (Primo hits for :all)
+  def initialize(enhanced_query, hits, source, secondary_hits = nil)
     @source = source
+    @enhanced_query = enhanced_query
     @pagination = {}
-    @pagination[:hits] = hits(response)
-    @pagination[:start] = ((enhanced_query[:page] - 1) * RESULTS_PER_PAGE) + 1
-    @pagination[:end] = [enhanced_query[:page] * RESULTS_PER_PAGE, @pagination[:hits]].min
-    @pagination[:prev] = enhanced_query[:page] - 1 if enhanced_query[:page] > 1
-    @pagination[:next] = next_page(enhanced_query[:page], @pagination[:hits]) if next_page(
-      enhanced_query[:page], @pagination[:hits]
-    )
-    @pagination[:per_page] = RESULTS_PER_PAGE
+    set_pagination(hits, secondary_hits)
   end

   private

-  def hits(response)
-    return 0 if response.nil?
-
-    if @source == :primo
-      primo_hits(response)
-    elsif @source == :timdex
-      timdex_hits(response)
+  # Sets the pagination hash with hit counts and per_page values.
+  #
+  # @param hits [Integer] Hit count from primary source
+  # @param secondary_hits [Integer, nil] Optional hit count from secondary source
+  def set_pagination(hits, secondary_hits = nil)
+    if @source == :all
+      @pagination[:hits] = (secondary_hits || 0) + (hits || 0)
+      @pagination[:per_page] = ENV.fetch('RESULTS_PER_PAGE', '20').to_i
+      calculate_pagination_values
     else
-      0
+      @pagination[:hits] = hits || 0
+      @pagination[:per_page] = ENV.fetch('RESULTS_PER_PAGE', '20').to_i
+      calculate_pagination_values
     end
   end

-  def primo_hits(response)
-    return 0 unless response.is_a?(Hash)
-
-    response.dig('info', 'total') || 0
-  end
-
-  def timdex_hits(response)
-    return 0 unless response.is_a?(Hash) && response.key?(:data) && response[:data].key?('search')
-    return 0 unless response[:data]['search'].is_a?(Hash) && response[:data]['search'].key?('hits')
-
-    response[:data]['search']['hits']
+  # Calculates and sets pagination navigation values (start, end, prev, next).
+  # Uses the already-set @pagination[:hits] and @pagination[:per_page] values.
+  def calculate_pagination_values
+    page = @enhanced_query[:page] || 1
+    @pagination[:start] = ((page - 1) * @pagination[:per_page]) + 1
+    @pagination[:end] = [page * @pagination[:per_page], @pagination[:hits]].min
+    @pagination[:prev] = page - 1 if page > 1
+    @pagination[:next] = next_page(page, @pagination[:hits]) if next_page(page, @pagination[:hits])
   end

+  # Calculates the next page number if more results are available.
+  #
+  # @param page [Integer] Current page number
+  # @param hits [Integer] Total number of results available
+  # @return [Integer, nil] Next page number or nil if no more pages
   def next_page(page, hits)
-    page + 1 if page * RESULTS_PER_PAGE < hits
+    page + 1 if page * @pagination[:per_page] < hits
   end
 end
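The arithmetic in `calculate_pagination_values` above is worth seeing on its own: for the :all tab, `hits` is the sum of both sources' totals and `per_page` is the full page size. A minimal re-implementation (illustrative only, not the app's API):

```ruby
# Minimal re-implementation of the pagination arithmetic from the
# Analyzer diff above: given total hits, the current page, and a
# per-page size, compute 1-based start/end offsets and prev/next pages.
def build_pagination(hits, page, per_page)
  pagination = { hits: hits, per_page: per_page }
  pagination[:start] = ((page - 1) * per_page) + 1
  pagination[:end] = [page * per_page, hits].min
  pagination[:prev] = page - 1 if page > 1              # absent on page 1
  pagination[:next] = page + 1 if page * per_page < hits # absent on last page
  pagination
end

# Page 2 of 45 hits at 20 per page: results 21-40, pages 1 and 3 adjacent.
build_pagination(45, 2, 20)
```

Leaving `:prev`/`:next` unset at the boundaries (rather than storing nil) lets views test `pagination[:next]` directly.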

app/models/query_builder.rb

Lines changed: 4 additions & 4 deletions

@@ -1,7 +1,6 @@
 class QueryBuilder
   attr_reader :query

-  RESULTS_PER_PAGE = 20
   QUERY_PARAMS = %w[q citation contributors fundingInformation identifiers locations subjects title booleanType].freeze
   FILTER_PARAMS = %i[accessToFilesFilter contentTypeFilter contributorsFilter formatFilter languagesFilter
                      literaryFormFilter placesFilter sourceFilter subjectsFilter].freeze
@@ -10,7 +9,8 @@ class QueryBuilder

   def initialize(enhanced_query)
     @query = {}
-    @query['from'] = calculate_from(enhanced_query[:page])
+    @per_page = ENV.fetch('RESULTS_PER_PAGE', '20').to_i
+    @query['from'] = calculate_from(enhanced_query[:page], @per_page)

     if Feature.enabled?(:geodata)
       @query['geobox'] = 'true' if enhanced_query[:geobox] == 'true'
@@ -27,10 +27,10 @@ def initialize(enhanced_query)

   private

-  def calculate_from(page = 1)
+  def calculate_from(page = 1, per_page = ENV.fetch('RESULTS_PER_PAGE', '20').to_i)
     # This needs to return a string because Timdex needs $from to be a String
     page = 1 if page.to_i.zero?
-    ((page - 1) * RESULTS_PER_PAGE).to_s
+    ((page - 1) * per_page).to_s
   end

   def extract_query(enhanced_query)
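The offset calculation in `calculate_from` above has two quirks worth noting: TIMDEX requires `from` as a String, and a missing or zero page is normalized to page 1. A standalone sketch (with the env lookup replaced by a hard-coded default of 20 for simplicity):

```ruby
# Sketch of QueryBuilder#calculate_from from the diff above. TIMDEX
# expects the 'from' offset as a String, and nil.to_i == 0 means a
# missing page param safely normalizes to page 1.
def calculate_from(page = 1, per_page = 20)
  page = 1 if page.to_i.zero?
  ((page - 1) * per_page).to_s
end

calculate_from(3, 10)   # => "20"
calculate_from(nil, 10) # => "0"
```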
