Noref more permissive gnd uri handling #114
idutils/utils.py
Outdated
https://support.orcid.org/hc/en-us/articles/360006897674-Structure-of-the-ORCID-Identifier
"""

gnd_resolver_url = "d-nb.info/gnd/"
I think gnd_resolver_url is not needed any more.
It's used at the beginning of the validate function, but I don't think that's needed after this code change.
Line 238 in 181b3db
It's also used at
Line 240 in 26318e4
idutils/utils.py
Outdated
  gnd_regexp = re.compile(
-     r"(gnd:|GND:)?("
+     rf"(gnd:|GND:|http://{re.escape(gnd_resolver_url)}|https://{re.escape(gnd_resolver_url)})?("
I think you can use ? (i.e. https?) instead of repeating http and https, like
Line 62 in 181b3db
I would also lean toward putting the URL directly in the regex instead of having a variable.
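A minimal sketch of that suggestion, for illustration only. The ID body pattern below is a simplified stand-in (an assumption, not the stricter pattern used in idutils), and the names `GND_ID_BODY`, `gnd_prefix_regexp` and `matches` are hypothetical:

```python
import re

# Simplified, hypothetical stand-in for the GND ID body.
# The real pattern in idutils is stricter than this.
GND_ID_BODY = r"\d[\dX-]*"

# "https?" covers both http and https, and the resolver URL is written
# inline (with the dots escaped) instead of coming from a variable.
gnd_prefix_regexp = re.compile(
    rf"(?:gnd:|GND:|https?://d-nb\.info/gnd/)?({GND_ID_BODY})"
)


def matches(value):
    """Return True if the whole value is an optionally prefixed ID."""
    return gnd_prefix_regexp.fullmatch(value) is not None
```

With this shape the two protocol variants share one alternative, so adding or changing the resolver host only touches one place in the pattern.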
idutils/normalizers.py
Outdated
if val.startswith("http://" + gnd_resolver_url):
    val = val[len("http://" + gnd_resolver_url) :]
elif val.startswith("https://" + gnd_resolver_url):
    val = val[len("https://" + gnd_resolver_url) :]
Could you just use the regex in the normalize function, like
idutils/idutils/normalizers.py
Line 73 in 181b3db
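A rough sketch of what using the regex in the normalizer could look like, replacing the startswith chain with a capture group. It assumes the normalized form is gnd:<id> and uses a simplified ID body; `_GND_BODY`, `_gnd_re` and `normalize_gnd_sketch` are hypothetical names, not the idutils API:

```python
import re

# Hypothetical, simplified ID body; not the stricter pattern used in idutils.
_GND_BODY = r"\d[\dX-]*"

# Optional prefix (scheme label or resolver URL), then the ID in a named group.
_gnd_re = re.compile(
    rf"(?:gnd:|GND:|https?://d-nb\.info/gnd/)?(?P<id>{_GND_BODY})"
)


def normalize_gnd_sketch(val):
    """Return 'gnd:<id>' for a recognised value, else the value unchanged.

    The 'gnd:<id>' output form is an assumption made for this sketch.
    """
    m = _gnd_re.fullmatch(val.strip())
    if m is None:
        return val
    return "gnd:" + m.group("id")
```

Because the prefix is consumed by the regex, the normalizer only needs the captured group and no manual slicing.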
I updated the regex to match the one from https://www.wikidata.org/wiki/Property:P227.
idutils/validators.py
Outdated
if val.startswith("d-nb.info/gnd/"):
    val = val[len("d-nb.info/gnd/") :]
minor: is this actually needed now that the regex contains the URL? I feel this if ... clause would only match identifiers without the http(s) protocol in front, i.e. d-nb.info/gnd/12345. Maybe this logic can be captured in the regex?
I agree, this can probably be removed.
Fixed that by changing the regex accordingly, see https://regex101.com/r/C9VJpH/1
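One way the protocol-less case could be folded into the regex, so that the validator needs no extra startswith check. This is a sketch under assumptions and not necessarily the pattern behind the regex101 link; the ID body is simplified, and `GND_BODY`, `gnd_any_form` and `is_gnd_like` are hypothetical names:

```python
import re

# Hypothetical, simplified ID body; the real idutils pattern is stricter.
GND_BODY = r"\d[\dX-]*"

# The scheme itself is optional, so "d-nb.info/gnd/<id>" is accepted
# alongside the http:// and https:// variants.
gnd_any_form = re.compile(
    rf"(?:gnd:|GND:|(?:https?://)?d-nb\.info/gnd/)?({GND_BODY})"
)


def is_gnd_like(val):
    """Return True if val is a bare ID or an ID with a known prefix."""
    return gnd_any_form.fullmatch(val) is not None
```

Nesting the optional scheme inside the URL alternative keeps a bare "https://" without the resolver host from matching.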
… remove additional check in validators

❤️ Thank you for your contribution!
Description
This PR aims to simplify adding a GND ID to a person in the creatibutors modal. Until now, users could only paste the actual GND ID, not one of the GND URIs. When looking up a GND URI, the website of the German National Library looks like this:
So most people will just copy and paste this URI. Saving or publishing the draft in RDM then fails with a "Creators: No valid scheme recognized for identifier." error, which in our opinion is a major inconvenience.
This PR allows for pasting both http://d-nb.info/gnd/<id> and https://d-nb.info/gnd/<id> in addition to GND:<id>, gnd:<id> and <id>. The normalization step is the same as before.
Checklist
Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge: