Commit 8542fee

Add models for Primo Search API integration
Why these changes are being introduced:

The first iteration of USE UI will include the ability to toggle between Ex Libris CDI and TIMDEX results.

Relevant ticket(s):

* [USE-30](https://mitlibraries.atlassian.net/browse/USE-30)

How this addresses that need:

This adds a Primo Search model that calls the Primo Search API, and a Normalize Primo model that parses each result and returns a normalized record.

Side effects of this change:

* For better or worse, much of the normalization code has been repurposed from Bento. This felt acceptable as we do not anticipate this to be a long-term solution.
* While working on this, I noticed that most of the TIMDEX normalization logic happens in the views. It would be better to move this to a normalization model, as we've done here with Primo. I opened [USE-73](https://mitlibraries.atlassian.net/browse/USE-73) to address this.
* It's not technically a side effect of this changeset, but it's worth noting that we still periodically see some GeoData test failures on random test runs. I've documented this in [USE-72](https://mitlibraries.atlassian.net/browse/USE-72).
1 parent 598e3da commit 8542fee

11 files changed: +2431 -3 lines changed

.env.test

Lines changed: 9 additions & 3 deletions
@@ -1,4 +1,10 @@
-TIMDEX_HOST=FAKE_TIMDEX_HOST
-TIMDEX_GRAPHQL=https://FAKE_TIMDEX_HOST/graphql
-TIMDEX_INDEX=FAKE_TIMDEX_INDEX
 GDT=false
+MIT_PRIMO_URL=https://mit.primo.exlibrisgroup.com
+PRIMO_API_KEY=FAKE_PRIMO_API_KEY
+PRIMO_API_URL=https://api-na.hosted.exlibrisgroup.com/primo/v1
+PRIMO_SCOPE=cdi
+PRIMO_TAB=all
+PRIMO_VID=01MIT_INST:MIT
+TIMDEX_GRAPHQL=https://FAKE_TIMDEX_HOST/graphql
+TIMDEX_HOST=FAKE_TIMDEX_HOST
+TIMDEX_INDEX=FAKE_TIMDEX_INDEX

README.md

Lines changed: 7 additions & 0 deletions
@@ -91,6 +91,12 @@ See `Optional Environment Variables` for more information.
 
 ### Required Environment Variables
 
+- `MIT_PRIMO_URL`: The base URL for MIT Libraries' Primo instance (used to generate record links).
+- `PRIMO_API_KEY`: The Primo Search API key.
+- `PRIMO_API_URL`: The Primo Search API base URL.
+- `PRIMO_SCOPE`: The Primo Search API `scope` param (set to `cdi` for CDI-scoped results).
+- `PRIMO_TAB`: The Primo Search API `tab` param (typically `all`).
+- `PRIMO_VID`: The Primo Search API `vid` (or 'view ID') param.
 - `TIMDEX_GRAPHQL`: Set this to the URL of the GraphQL endpoint. There is no default value in the application.
 
 ### Optional Environment Variables
@@ -121,6 +127,7 @@ may have unexpected consequences if applied to other TIMDEX UI apps.
 - `GLOBAL_ALERT`: The main functionality for this comes from our theme gem, but when set the value will be rendered as
   safe html above the main header of the site.
 - `PLATFORM_NAME`: The value set is added to the header after the MIT Libraries logo. The logic and CSS for this comes from our theme gem.
+- `PRIMO_TIMEOUT`: The number of seconds before a Primo request times out (default 6).
 - `REQUESTS_PER_PERIOD` - number of requests that can be made for general throttles per `REQUEST_PERIOD`
 - `REQUEST_PERIOD` - time in minutes used along with `REQUESTS_PER_PERIOD`
 - `REDIRECT_REQUESTS_PER_PERIOD`- number of requests that can be made that the query string starts with our legacy redirect parameter to throttle per `REQUEST_PERIOD`

app/models/normalize_primo.rb

Lines changed: 249 additions & 0 deletions
@@ -0,0 +1,249 @@
+# Transforms results from Primo Search API into normalized records
+class NormalizePrimo
+  def initialize(record)
+    @record = record
+  end
+
+  def normalize
+    {
+      'title' => title,
+      'creators' => creators,
+      'source' => source,
+      'year' => year,
+      'format' => format,
+      'links' => links,
+      'citation' => citation,
+      'container' => container_title,
+      'identifier' => record_id,
+      'summary' => summary,
+      'numbering' => numbering,
+      'chapter_numbering' => chapter_numbering
+    }
+  end
+
+  private
+
+  def title
+    if @record['pnx']['display']['title'].present?
+      @record['pnx']['display']['title'].join
+    else
+      'unknown title'
+    end
+  end
+
+  def creators
+    return [] unless @record['pnx']['display']['creator'] || @record['pnx']['display']['contributor']
+
+    author_list = []
+
+    if @record['pnx']['display']['creator']
+      creators = sanitize_authors(@record['pnx']['display']['creator'])
+      creators.each do |creator|
+        author_list << { value: creator, link: author_link(creator) }
+      end
+    end
+
+    if @record['pnx']['display']['contributor']
+      contributors = sanitize_authors(@record['pnx']['display']['contributor'])
+      contributors.each do |contributor|
+        author_list << { value: contributor, link: author_link(contributor) }
+      end
+    end
+
+    author_list.uniq
+  end
+
+  def source
+    'Primo'
+  end
+
+  def year
+    if @record['pnx']['display']['creationdate'].present?
+      @record['pnx']['display']['creationdate'].join
+    else
+      return unless @record['pnx']['search'] && @record['pnx']['search']['creationdate']
+
+      @record['pnx']['search']['creationdate'].join
+    end
+  end
+
+  def format
+    return unless @record['pnx']['display']['type']
+
+    normalize_type(@record['pnx']['display']['type'].join)
+  end
+
+  # While the links object in the Primo response often contains more than the Alma openurl, that is
+  # the one that is most predictably useful to us. The record_link is constructed.
+  def links
+    links = []
+
+    # Add direct record link as the first link
+    if record_link.present?
+      links << { 'url' => record_link, 'kind' => 'full record' }
+    end
+
+    # Add openurl if available
+    if openurl.present?
+      links << { 'url' => openurl, 'kind' => 'openurl' }
+    end
+
+    # Return links if we found any
+    links.any? ? links : []
+  end
+
+  def citation
+    return unless @record['pnx']['addata']
+
+    if @record['pnx']['addata']['volume'].present?
+      if @record['pnx']['addata']['issue'].present?
+        "volume #{@record['pnx']['addata']['volume'].join} issue #{@record['pnx']['addata']['issue'].join}"
+      else
+        "volume #{@record['pnx']['addata']['volume'].join}"
+      end
+    elsif @record['pnx']['addata']['date'].present? && @record['pnx']['addata']['pages'].present?
+      "#{@record['pnx']['addata']['date'].join}, pp. #{@record['pnx']['addata']['pages'].join}"
+    end
+  end
+
+  def container_title
+    return unless @record['pnx']['addata']
+
+    if @record['pnx']['addata']['jtitle'].present?
+      @record['pnx']['addata']['jtitle'].join
+    elsif @record['pnx']['addata']['btitle'].present?
+      @record['pnx']['addata']['btitle'].join
+    end
+  end
+
+  def record_id
+    return unless @record['pnx']['control']['recordid']
+
+    @record['pnx']['control']['recordid'].join
+  end
+
+  def summary
+    return unless @record['pnx']['display']['description']
+
+    @record['pnx']['display']['description'].join(' ')
+  end
+
+  # This constructs a link to the record in Primo.
+  #
+  # We've altered this method slightly to address bugs introduced in the Primo VE November 2021
+  # release. The search_scope param is now required for CDI fulldisplay links, and the context param
+  # is now required for local (catalog) fulldisplay links.
+  #
+  # In order to avoid more surprises, we're adding all of the params included in the fulldisplay
+  # example links provided here, even though not all of them are actually required at present:
+  # https://developers.exlibrisgroup.com/primo/apis/deep-links-new-ui/
+  #
+  # We should keep an eye on this over subsequent Primo releases and revert it to something more
+  # minimalist/sensible when Ex Libris fixes this issue.
+  def record_link
+    return unless @record['pnx']['control']['recordid']
+    return unless @record['context']
+
+    record_id = @record['pnx']['control']['recordid'].join
+    base = [ENV.fetch('MIT_PRIMO_URL'), '/discovery/fulldisplay?'].join
+    query = {
+      docid: record_id,
+      vid: ENV.fetch('PRIMO_VID'),
+      context: @record['context'],
+      search_scope: 'all',
+      lang: 'en',
+      tab: ENV.fetch('PRIMO_TAB')
+    }.to_query
+    [base, query].join
+  end
+
+  def numbering
+    return unless @record['pnx'] && @record['pnx']['addata']
+    return unless @record['pnx']['addata']['volume']
+
+    if @record['pnx']['addata']['issue'].present?
+      "volume #{@record['pnx']['addata']['volume'].join} issue #{@record['pnx']['addata']['issue'].join}"
+    else
+      "volume #{@record['pnx']['addata']['volume'].join}"
+    end
+  end
+
+  def chapter_numbering
+    return unless @record['pnx'] && @record['pnx']['addata']
+    return unless @record['pnx']['addata']['btitle']
+    return unless @record['pnx']['addata']['date'] && @record['pnx']['addata']['pages']
+
+    "#{@record['pnx']['addata']['date'].join}, pp. #{@record['pnx']['addata']['pages'].join}"
+  end
+
+  def openurl
+    return unless @record['delivery'] && @record['delivery']['almaOpenurl']
+
+    @record['delivery']['almaOpenurl'].is_a?(Array) ? @record['delivery']['almaOpenurl'].join : @record['delivery']['almaOpenurl']
+  end
+
+  def sanitize_authors(authors)
+    authors.map! { |author| author.split(';') }.flatten! if authors.any? { |author| author.include?(';') }
+    authors.map { |author| author.strip.gsub(/\$\$Q.*$/, '') }
+  end
+
+  def author_link(author)
+    [ENV.fetch('MIT_PRIMO_URL'),
+     '/discovery/search?query=creator,exact,',
+     encode_author(author),
+     '&tab=', ENV.fetch('PRIMO_TAB'),
+     '&search_scope=all&vid=',
+     ENV.fetch('PRIMO_VID')].join
+  end
+
+  def encode_author(author)
+    URI.encode_uri_component(author)
+  end
+
+  def normalize_type(type)
+    r_types = {
+      'BKSE' => 'eBook',
+      'reference_entry' => 'Reference Entry',
+      'Book_chapter' => 'Book Chapter'
+    }
+    r_types[type] || type.capitalize
+  end
+
+  # It's possible we'll encounter records that use a different server,
+  # so we want to test against our expected server to guard against
+  # malformed URLs. This assumes all URL strings begin with https://.
+  def openurl
+    return unless @record['delivery'] && @record['delivery']['almaOpenurl']
+
+    # Check server match
+    openurl_server = ENV['ALMA_OPENURL'][8, 4]
+    record_openurl_server = @record['delivery']['almaOpenurl'][8, 4]
+    if openurl_server == record_openurl_server
+      construct_primo_openurl
+    else
+      Rails.logger.warn "Alma openurl server mismatch. Expected #{openurl_server}, but received #{record_openurl_server}. (record ID: #{record_id})"
+      @record['delivery']['almaOpenurl']
+    end
+  end
+
+  def construct_primo_openurl
+    return unless @record['delivery']['almaOpenurl']
+
+    # Here we are converting the Alma link resolver URL provided by the Primo
+    # Search API to redirect to the Primo UI. This is done for UX purposes,
+    # as the regular Alma link resolver URLs redirect to a plaintext
+    # disambiguation page.
+    primo_openurl_base = [ENV['MIT_PRIMO_URL'],
+                          '/discovery/openurl?institution=',
+                          ENV['EXL_INST_ID'],
+                          '&vid=',
+                          ENV['PRIMO_VID'],
+                          '&'].join
+    primo_openurl = @record['delivery']['almaOpenurl'].gsub(ENV['ALMA_OPENURL'], primo_openurl_base)
+
+    # The ctx params appear to break Primo openurls, so we need to remove them.
+    params = Rack::Utils.parse_nested_query(primo_openurl)
+    filtered = params.delete_if { |key, _value| key.starts_with?('ctx') }
+    URI::DEFAULT_PARSER.unescape(filtered.to_param)
+  end
+end
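
The author clean-up in `sanitize_authors` can be tried outside Rails. The following is a minimal sketch, not the committed code: `clean_authors` is a hypothetical name, and unlike the original it does not modify the array it is given (the committed method calls `map!` and `flatten!`, which change the record's own array), it trims whitespace after removing the `$$Q...` suffix rather than before, and it drops empty entries.

```ruby
# Standalone sketch of the author clean-up step; plain Ruby, no Rails.
# `clean_authors` is a hypothetical helper, not a method from this commit.
def clean_authors(authors)
  authors
    .flat_map { |author| author.split(';') }               # one entry per name
    .map { |author| author.sub(/\$\$Q.*\z/m, '').strip }   # drop the "$$Q..." suffix, then trim
    .reject(&:empty?)                                      # ignore blanks left by stray semicolons
end

raw = ['Smith, John A.$$QSmith, John A.', 'Jones, Mary B.; Brown, Robert C.']
p clean_authors(raw)
# => ["Smith, John A.", "Jones, Mary B.", "Brown, Robert C."]
```

Returning a new array keeps a second call to the normalizer from seeing already-split values, which may or may not matter for how the model is used here.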

app/models/primo_search.rb

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+# Searches Primo Search API and formats results
+#
+class PrimoSearch
+  PRIMO_API_URL = ENV.fetch('PRIMO_API_URL', nil).freeze
+  PRIMO_API_KEY = ENV.fetch('PRIMO_API_KEY', nil)
+  PRIMO_SCOPE = ENV.fetch('PRIMO_SCOPE', nil)
+  PRIMO_TAB = ENV.fetch('PRIMO_TAB', nil)
+  PRIMO_VID = ENV.fetch('PRIMO_VID', nil)
+
+  def initialize
+    @primo_http = HTTP.persistent(PRIMO_API_URL)
+    @results = {}
+  end
+
+  def search(term, per_page)
+    url = search_url(term, per_page)
+    Rails.logger.info "Primo Search URL: #{url}"
+
+    result = @primo_http.timeout(http_timeout)
+                        .headers(
+                          accept: 'application/json',
+                          Authorization: "apikey #{PRIMO_API_KEY}"
+                        )
+                        .get(url)
+
+    Rails.logger.info "Primo Response Status: #{result.status}"
+    Rails.logger.info "Primo Response Headers: #{result.headers.to_h}"
+
+    raise "Primo Error Detected: #{result.status}" unless result.status == 200
+
+    JSON.parse(result)
+  end
+
+  private
+
+  # Initial search term sanitization
+  def clean_term(term)
+    term.strip.tr(' :,', '+').gsub(/\++/, '+')
+  end
+
+  # Constructs the search URL with required parameters for Primo API
+  def search_url(term, per_page)
+    [
+      PRIMO_API_URL,
+      '/search?q=any,contains,',
+      clean_term(term),
+      '&vid=',
+      PRIMO_VID,
+      '&tab=',
+      PRIMO_TAB,
+      '&scope=',
+      PRIMO_SCOPE,
+      '&limit=',
+      per_page,
+      '&apikey=',
+      PRIMO_API_KEY
+    ].join
+  end
+
+  # Timeout configuration for HTTP requests
+  def http_timeout
+    if ENV.fetch('PRIMO_TIMEOUT', nil).present?
+      ENV['PRIMO_TIMEOUT'].to_f
+    else
+      6
+    end
+  end
+end
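
To see what `clean_term` and `search_url` produce for a given input, here is a small standalone sketch. `clean_term` repeats the committed transformation; `primo_search_url` is a hypothetical helper that builds the remaining parameters with `URI.encode_www_form` instead of string joins. It leaves the API key out of the URL on the assumption that the `Authorization` header sent by `search` is enough for the Primo API, which should be confirmed before relying on it.

```ruby
require 'uri'

# Same transformation as the commit's clean_term: trim, turn spaces,
# colons and commas into "+", then collapse runs of "+".
def clean_term(term)
  term.strip.tr(' :,', '+').gsub(/\++/, '+')
end

# Hypothetical helper (not in the commit): the q parameter is assembled
# by hand as in search_url, the rest is encoded by the standard library.
def primo_search_url(base_url, term, params)
  "#{base_url}/search?q=any,contains,#{clean_term(term)}&#{URI.encode_www_form(params)}"
end

puts clean_term(' data: science ')
# prints "data+science"

puts primo_search_url(
  'https://api-na.hosted.exlibrisgroup.com/primo/v1',
  ' data: science ',
  vid: '01MIT_INST:MIT', tab: 'all', scope: 'cdi', limit: 20
)
```

Because the committed `search` logs the full URL, keeping the key out of the query string would also keep it out of the application log.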
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+{
+  "pnx": {
+    "display": {
+      "title": ["Testing the Limits of Knowledge"],
+      "creator": ["Smith, John A.", "Jones, Mary B."],
+      "contributor": ["Brown, Robert C."],
+      "creationdate": ["2023"],
+      "type": ["book"],
+      "description": ["A comprehensive study of testing methodologies"],
+      "subject": ["Computer Science", "Software Testing"]
+    },
+    "addata": {
+      "btitle": ["Complete Guide to Testing"],
+      "date": ["2023"],
+      "volume": ["2"],
+      "issue": ["3"],
+      "pages": ["123-145"],
+      "jtitle": ["Journal of Testing"]
+    },
+    "search": {
+      "creationdate": ["2023"]
+    },
+    "control": {
+      "recordid": ["MIT01000000001"]
+    }
+  },
+  "context": "contextual",
+  "delivery": {
+    "bestlocation": {
+      "mainLocation": "Hayden Library",
+      "subLocation": "Stacks",
+      "callNumber": "QA76.73.R83 2023"
+    },
+    "link": [],
+    "almaOpenurl": "https://example.com/openurl?param=value"
+  }
+}
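
As an illustration of how a fixture like this feeds the record link, the sketch below rebuilds the fulldisplay URL from the two fixture fields that `record_link` reads. `fulldisplay_link` is a hypothetical stand-in for the private method; the base URL, `vid`, and `tab` values are taken from `.env.test` and passed in as arguments instead of being read from `ENV`. It uses `URI.encode_www_form`, which keeps parameters in insertion order, whereas the Rails `to_query` call in the commit sorts them by key, so parameter order may differ from what the app emits.

```ruby
require 'json'
require 'uri'

# Trimmed copy of the fixture: only the fields the link depends on.
record = JSON.parse(<<~FIXTURE)
  {
    "pnx": { "control": { "recordid": ["MIT01000000001"] } },
    "context": "contextual"
  }
FIXTURE

# Hypothetical stand-in for NormalizePrimo#record_link.
def fulldisplay_link(record, base_url:, vid:, tab:)
  record_id = record.dig('pnx', 'control', 'recordid')&.join
  return if record_id.nil? || record['context'].nil?

  query = URI.encode_www_form(
    docid: record_id, vid: vid, context: record['context'],
    search_scope: 'all', lang: 'en', tab: tab
  )
  "#{base_url}/discovery/fulldisplay?#{query}"
end

puts fulldisplay_link(record,
                      base_url: 'https://mit.primo.exlibrisgroup.com',
                      vid: '01MIT_INST:MIT', tab: 'all')
```

A record without `context` or `recordid` yields `nil`, matching the early returns in the committed method.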
