[VUFIND-1811] Improved format detection for video games and datasets#4946
[VUFIND-1811] Improved format detection for video games and datasets#4946damien-git wants to merge 2 commits intovufind-org:devfrom
Conversation
demiankatz
left a comment
There was a problem hiding this comment.
Thanks, @damien-git, see below for one question (which it's probably safe to ignore if you disagree with any part of it...)
demiankatz
left a comment
There was a problem hiding this comment.
Here's my revised suggestion:
| } | ||
| boolean computerOrCartographicDS = desc.equals("computer dataset") || desc.equals("cartographic dataset"); | ||
| boolean crdOrCod = code.equals("crd") || code.equals("cod"); | ||
| if (source.equals("rdacontent") && (computerOrCartographicDS || crdOrCod)) { |
There was a problem hiding this comment.
I believe this should be:
| if (source.equals("rdacontent") && (computerOrCartographicDS || crdOrCod)) { | |
| if (computerOrCartographicDS || (source.equals("rdacontent") && crdOrCod)) { |
If the description matches, we don't care about the source... but if it's a code, we do need to take the source into account.
There was a problem hiding this comment.
It makes sense, but then shouldn't we do the same thing for "two-dimensional moving image" ?
There was a problem hiding this comment.
Good point, I misread the code on the second time through -- which is why I was asking about checking "rdacontent" consistently/globally the first time.
I think we need to make a decision: either just bail out of the function of it's not rdacontent on the assumption that we don't really know how it should be processed, or else restrict rdacontent checks to codes, on the assumption that plain-English descriptions mean the same thing regardless of the standard being used.
I don't have strong feelings on which is better -- one is more common-sense, the other is more cautious.
There was a problem hiding this comment.
OK. I prefer the common-sense one (restrict rdacontent checks to codes). It will give better results if the source is missing for some reason. I will prepare an update.
There was a problem hiding this comment.
Thanks, this looks reasonable to my eye now -- but I'll leave this open for a while to allow further discussion, and I'll try to find time to do a full index of Villanova's MARC records to check for consequences (either good or bad).
See VUFIND-1811.
This PR improves the format detection for video games and datasets.
336a = "computer program", rely on008/26instead of the33Xfields to detect the format33Xfields withrdacontentsourceA reindex is needed to take advantage of the change for existing records.