Alternative option for Affiliation #260

Merged
RKrahl merged 3 commits into icat-schema-extension from icat-schema-extension-affiliation on Feb 7, 2022

@RKrahl
Member

@RKrahl RKrahl commented Dec 8, 2021

This provides a slightly modified option for the implementation of the Affiliation table with respect to #256.

Background is that in an earlier implementation, the name attribute was set to a STRING[1023]. This caused problems, because name is part of the uniqueness constraint and thus needs to be included in an index in the database. But STRING[1023] is too large to be added to an index. In #256 this is solved by shortening name to STRING[511].
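
The size argument above can be checked with a little arithmetic. The following is only a sketch: the PR does not name the database or the exact limit that was hit, so the figures used here (the 3072-byte index key limit that MySQL/InnoDB documents for DYNAMIC row format, and a worst case of 4 bytes per character as in utf8mb4) are assumptions for illustration. The helper names are hypothetical, and the other column in the uniqueness constraint (user) is ignored.

```python
# Assumed figures for illustration only; the PR does not state them.
INDEX_KEY_LIMIT_BYTES = 3072  # InnoDB index key limit (DYNAMIC row format)
BYTES_PER_CHAR = 4            # worst case per character for utf8mb4


def max_index_bytes(length: int, bytes_per_char: int = BYTES_PER_CHAR) -> int:
    """Worst-case number of bytes a STRING[length] column adds to an index key."""
    return length * bytes_per_char


def fits_in_index(length: int,
                  limit: int = INDEX_KEY_LIMIT_BYTES,
                  bytes_per_char: int = BYTES_PER_CHAR) -> bool:
    """True if a STRING[length] column alone stays within the assumed limit."""
    return max_index_bytes(length, bytes_per_char) <= limit


if __name__ == "__main__":
    for length in (1023, 511, 255):
        print(length, max_index_bytes(length), fits_in_index(length))
```

Under these assumptions, STRING[1023] needs up to 4092 bytes and exceeds the limit, while STRING[511] (2044 bytes) and STRING[255] (1020 bytes) fit. Other databases and character sets have different limits, so the exact threshold may differ in a given deployment.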

The present PR suggests a different solution: splitting name into two attributes. In the version provided by #256, name serves a double purpose. It disambiguates the affiliation entries when one user has more than one affiliation in a publication, and it sets the text to be displayed on the landing page and included in the publication metadata. This PR proposes two attributes instead: name is purely internal and serves only for disambiguation, while fullReference provides the text that appears in the visible metadata.

As a result, Affiliation will look like:


Affiliation

The home institute or other affiliation of a user in the context of a data publication

Uniqueness constraint: user, name

Relationships:

| Card | Class               | Field |
|------|---------------------|-------|
| 1,1  | DataPublicationUser | user  |

Other fields:

| Field         | Type                  | Description |
|---------------|-----------------------|-------------|
| name          | String [255] NOT NULL | An internal name for that affiliation entry, possibly the organization name |
| pid           | String [255]          | Identifier such as ROR or ISNI |
| fullReference | String [1023]         | The full reference of the affiliation, optionally including street address and department, as it should appear in the publication |
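
To make the constraints in this description concrete, here is a minimal Python sketch of the proposed entity. It is not the ICAT implementation: the dataclass, the `check_unique` helper, and the use of a plain string as a stand-in for the DataPublicationUser relationship are all assumptions made for illustration. Only the field names, the maximum lengths, the NOT NULL rule for name, and the (user, name) uniqueness constraint come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Maximum lengths taken from the field description above.
NAME_MAX = 255
PID_MAX = 255
FULL_REFERENCE_MAX = 1023


@dataclass(frozen=True)
class Affiliation:
    user: str                            # hypothetical stand-in for the 1,1 relationship
    name: str                            # internal, NOT NULL, part of the uniqueness constraint
    pid: Optional[str] = None            # e.g. a ROR or ISNI identifier
    fullReference: Optional[str] = None  # text shown in the publication

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("name must not be empty")
        limits = (("name", self.name, NAME_MAX),
                  ("pid", self.pid, PID_MAX),
                  ("fullReference", self.fullReference, FULL_REFERENCE_MAX))
        for label, value, limit in limits:
            if value is not None and len(value) > limit:
                raise ValueError(f"{label} exceeds {limit} characters")


def check_unique(affiliations: Iterable[Affiliation]) -> None:
    """Raise ValueError if two entries share the same (user, name) pair."""
    seen = set()
    for aff in affiliations:
        key = (aff.user, aff.name)
        if key in seen:
            raise ValueError(f"duplicate affiliation {key}")
        seen.add(key)
```

In a real deployment these rules would be enforced by the database and the ICAT server rather than by client code; the sketch only shows what they amount to.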

This has the following advantages:

  • We can shorten name even further, to STRING[255]. This may improve the performance of the database index.
  • name can at the same time be used to provide a well-defined order in which to display the affiliations on the landing page.

The only disadvantage I can see is that it adds another attribute.

We briefly discussed this in the November collaboration meeting, but decided we would need more time for discussion. I submit this as a separate PR in order to open a space for this discussion.

- shorten Affiliation.name to 255 characters
- add more verbose comments on the purpose of
  Affiliation.fullReference versus Affiliation.name
@RKrahl RKrahl added the schema this involves changes to the ICAT schema label Dec 8, 2021
@RKrahl RKrahl requested a review from agbeltran December 8, 2021 16:14
@RKrahl
Member Author

RKrahl commented Dec 8, 2021

To illustrate this with a real-world example: with the implementation proposed here, the affiliations for the first author could be set as:

[{'fullReference': 'Optics for Solar Energy, Helmholtz-Zentrum Berlin für Materialien und Energie, Albert-Einstein-Straße 16, 12489 Berlin',
  'name': '01: HZB',
  'pid': 'ROR:02aj13c28'},
 {'fullReference': 'Computational Nano Optics, Zuse Institute Berlin, Takustraße 7, 14195 Berlin',
  'name': '02: ZIB',
  'pid': 'ROR:02eva5865'}]

The prefixes 01: and 02: in the respective name attributes would guarantee the proper ordering when the affiliations are displayed on the landing page.
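
As a sketch of how a landing page might use this, the affiliations can be sorted by name and only fullReference displayed. The function name `display_order` is hypothetical, and whether a facility orders by name at all is its own decision; the data is the example given above, deliberately listed out of order here.

```python
affiliations = [
    {'fullReference': 'Computational Nano Optics, Zuse Institute Berlin, Takustraße 7, 14195 Berlin',
     'name': '02: ZIB',
     'pid': 'ROR:02eva5865'},
    {'fullReference': 'Optics for Solar Energy, Helmholtz-Zentrum Berlin für Materialien und Energie, Albert-Einstein-Straße 16, 12489 Berlin',
     'name': '01: HZB',
     'pid': 'ROR:02aj13c28'},
]


def display_order(entries):
    """Return the fullReference texts, ordered by the internal name attribute."""
    return [e['fullReference'] for e in sorted(entries, key=lambda e: e['name'])]


if __name__ == "__main__":
    for line in display_order(affiliations):
        print(line)
```

Because the comparison is a plain string comparison, the zero padding in the prefix matters: without it, a name starting with 10: would sort before one starting with 2:.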

@kevinphippsstfc
Contributor

I'm not entirely comfortable with the dual use of the name field regarding it also being used for ordering. Would it be better to put this in a separate "orderKey" field like has been done in DataPublicationUser?

@RKrahl
Member Author

RKrahl commented Dec 21, 2021

> I'm not entirely comfortable with the dual use of the name field regarding it also being used for ordering. Would it be better to put this in a separate "orderKey" field like has been done in DataPublicationUser?

How the new entity classes are used in practice, and whether the name attribute in Affiliation is used to establish a well-defined order, is up to the facilities to decide. What my example above illustrated is that the name attribute could be used in this way, a feature that the schema version implemented in #256 does not offer.

As a result of adding fullReference, the only remaining purpose of name is to disambiguate multiple affiliation entries of a user in a given data publication. The actual value essentially doesn't matter, as it is probably never exposed, neither on the publication landing page nor in the DataCite metadata. [1] It seems excessive to me to add two attributes, name and orderKey, whose actual values don't matter other than that they differ and may define a particular order.

We could also rename name to orderKey if that makes you feel more comfortable.

Footnotes

  1. Again, it is up to the facilities to decide how they design their landing pages and whether they expose Affiliation.name there. But the schema in the present PR is designed such that there is no need or compelling reason to expose it.

@kevinphippsstfc
Contributor

OK, I'm happy to go with your implementation as is (no need to rename name to orderKey). You have clearly done a lot more thinking about the specifics of how this is going to be implemented than I have, so I trust your judgement.

@kevinphippsstfc kevinphippsstfc added this to the 5.0.0 milestone Feb 4, 2022
@RKrahl RKrahl merged commit ff5d1d8 into icat-schema-extension Feb 7, 2022
@RKrahl RKrahl deleted the icat-schema-extension-affiliation branch February 7, 2022 19:22

Labels

enhancement schema this involves changes to the ICAT schema
