Full-text Indexing Fixes #11494
Adding @landreev
In both cases, it should not be relevant to the PR. I think testing can use either new datasets (for the non-Globus embargo issue) or reindexing of earlier Globus datasets, i.e. via the index dataset API (to check for Globus/full-text warnings).
I created a bug report, #11546, with more analysis of issue 2).
Merging this. I was finally able to test with Globus up and running.
What this PR does / why we need it: This PR addresses two problems with full-text indexing:
Which issue(s) this PR closes:
Special notes for your reviewer:
This PR is only a few lines, but is built on #11374 which makes many changes in this part of the code. Nominally #11374 will soon be merged, so checking the PR after that makes more sense.
The original design of the static isDataverseAccessible(String driverId) method caused the Globus getInputStream method to fail with an exception rather than return a null stream, whereas every other possible failure returns null. In this PR, I changed it to return null. I also added a new isDataverseAccessible() instance method to StorageIO that defaults to true. The Globus store overrides it to call the static method, which looks up the relevant property to decide. This change is efficient for non-Globus stores and a convenience for Globus ones: in the full-text indexing code, we have a StorageIO instance but don't have the driverId needed to call the static method.
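For reviewers who want the shape of the change at a glance, here is a minimal sketch of the pattern described above. It is not the code in this PR: the `Sketch*` classes, the `extractTextOrNull` helper, and the `sketch.files.<driverId>.dataverse-inaccessible` property name are all hypothetical stand-ins, and the real StorageIO has a much larger surface.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for StorageIO; not the real Dataverse class.
abstract class SketchStorageIO {
    // Default for all stores: content is readable by Dataverse, no lookup needed.
    boolean isDataverseAccessible() {
        return true;
    }

    // Contract in this sketch: return null (do not throw) when content is unavailable.
    abstract InputStream getInputStream();
}

// Ordinary store: inherits the default isDataverseAccessible().
class SketchFileStorageIO extends SketchStorageIO {
    private final String content;

    SketchFileStorageIO(String content) {
        this.content = content;
    }

    @Override
    InputStream getInputStream() {
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }
}

// Globus-like store: the instance method delegates to the static lookup.
class SketchGlobusStorageIO extends SketchStorageIO {
    private final String driverId;

    SketchGlobusStorageIO(String driverId) {
        this.driverId = driverId;
    }

    // Assumed lookup: the property name is invented for this sketch.
    static boolean isDataverseAccessible(String driverId) {
        String key = "sketch.files." + driverId + ".dataverse-inaccessible";
        return !Boolean.parseBoolean(System.getProperty(key));
    }

    @Override
    boolean isDataverseAccessible() {
        return isDataverseAccessible(driverId);
    }

    @Override
    InputStream getInputStream() {
        if (!isDataverseAccessible()) {
            return null; // same "null stream" signal as other failure cases
        }
        return new ByteArrayInputStream(new byte[0]);
    }
}

class FullTextIndexingSketch {
    // Hypothetical indexing helper: it only holds a StorageIO, not a driverId.
    static String extractTextOrNull(SketchStorageIO io) {
        if (!io.isDataverseAccessible()) {
            return null; // skip full-text extraction for inaccessible stores
        }
        try (InputStream in = io.getInputStream()) {
            if (in == null) {
                return null; // a null stream means "nothing to index"
            }
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.setProperty("sketch.files.globus1.dataverse-inaccessible", "true");
        System.out.println(extractTextOrNull(new SketchFileStorageIO("hello")));
        System.out.println(extractTextOrNull(new SketchGlobusStorageIO("globus1")));
    }
}
```

The point of the instance method is that callers such as the indexer can ask the StorageIO object directly, and only stores that can be inaccessible pay for the property lookup.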
Suggestions on how to test this: Add an embargoed file (not restricted), publish, and check whether its content is visible to search. To test the Globus part, reindex a dataset with Globus/NESE files with full-text indexing on, and verify that there's no null exception in the logs.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?: Not sure. #11374 makes broad indexing changes, so its release note probably covers this.
Additional documentation: