Filter Improvements #11454

Merged
ofahimIQSS merged 35 commits into IQSS:develop from GlobalDataverseCommunityConsortium:FilterUpdates on Jun 4, 2025

@qqmyers
Member

@qqmyers qqmyers commented Apr 29, 2025

What this PR does / why we need it: This PR refactors the filters Dataverse uses to add CORS headers, handle API version mapping, and handle API blocking.

There are various changes including moving the API-specific code to a JAX-RS ContainerRequestFilter (so it doesn't run for non-API calls), caching the :AllowCors setting to avoid a db call per request, caching the regular expressions used, etc.
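As a rough illustration of the regex-caching idea, here is a minimal sketch in plain Java. The class name `PatternCache` and its methods are hypothetical and are not taken from the PR. The sketch only shows the general technique of compiling each expression once and reusing it across requests.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

/** Hypothetical helper (not the PR's class): compiles each regex once and reuses it. */
final class PatternCache {
    // Thread-safe map from regex source text to its compiled Pattern.
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    private PatternCache() {}

    /** Returns the compiled Pattern for a regex, compiling it only on first use. */
    static Pattern get(String regex) {
        return CACHE.computeIfAbsent(regex, Pattern::compile);
    }

    /** True when the whole path matches the given regex. */
    static boolean matches(String regex, String path) {
        return get(regex).matcher(path).matches();
    }
}
```

Repeated calls such as `PatternCache.matches("^/api/v1/.*", path)` would then reuse a single compiled `Pattern` instead of recompiling it on every request.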

It also adds an 'X-Dataverse-unblock-key' header so the key isn't exposed in the URL/browser history and adds a warning if the key is weaker than is allowed for passwords.
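For illustration, a client could send the key in the header like this. This is a sketch using the JDK's `java.net.http` API. The helper class `UnblockKeyRequests`, the base URL, and the key value are assumptions made for the example. Only the header name and the `/api/admin/settings` path come from this PR's discussion.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical client-side helper: sends the unblock key as a header, not in the URL. */
final class UnblockKeyRequests {
    // Header name as described in this PR.
    static final String UNBLOCK_HEADER = "X-Dataverse-unblock-key";

    private UnblockKeyRequests() {}

    /** Builds a GET request whose URL carries no key; the key travels in the header. */
    static HttpRequest withHeader(String baseUrl, String apiPath, String unblockKey) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + apiPath))
                .header(UNBLOCK_HEADER, unblockKey)
                .GET()
                .build();
    }
}
```

A request built with `UnblockKeyRequests.withHeader("http://localhost:8080", "/api/admin/settings", key)` has no query string, so the key does not end up in browser history or in URL-based access logs.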

FWIW: The refactoring has one potentially useful extension of functionality: because the mapping is done against the @Path annotations, it would be possible to block individual calls of the form /api/datasets/{id}/delete (for example), where the URL contains a variable id. We usually block at the class level, e.g. /api/admin, so I'm not sure how often this might get used.
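To make the template-matching idea concrete, here is a minimal standalone sketch. It is not how the PR implements it (the PR matches against the JAX-RS `@Path` annotations). This version simply converts a template into a regular expression. The class `PathTemplateMatcher` and its method names are hypothetical, and templates with embedded regexes that contain slashes or braces are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: matches request paths against a blocked path template. */
final class PathTemplateMatcher {
    // A template variable such as {id}; simplified, no nested braces or slashes.
    private static final Pattern VARIABLE = Pattern.compile("\\{[^/}]+\\}");

    private PathTemplateMatcher() {}

    /** Converts "/api/datasets/{id}/delete" into a regex; each variable matches one path segment. */
    static Pattern toPattern(String template) {
        StringBuilder regex = new StringBuilder("^");
        Matcher m = VARIABLE.matcher(template);
        int last = 0;
        while (m.find()) {
            // Quote the literal text before the variable, then accept any single segment.
            regex.append(Pattern.quote(template.substring(last, m.start())));
            regex.append("[^/]+");
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));
        // Also cover everything below the template, so "/api/admin" covers "/api/admin/settings".
        regex.append("(/.*)?$");
        return Pattern.compile(regex.toString());
    }

    /** True when the request path falls under the blocked template. */
    static boolean isBlocked(String template, String requestPath) {
        return toPattern(template).matcher(requestPath).matches();
    }
}
```

With this sketch, `isBlocked("/api/datasets/{id}/delete", "/api/datasets/42/delete")` is true, while other calls under `/api/datasets/42/` stay unblocked.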

Which issue(s) this PR closes:

  • Closes #

Special notes for your reviewer: I've made the new JvmSettings read-once - at startup - and left the deprecated db settings dynamic. (The db settings are only used if the JvmSettings don't exist). I did this because there are multiple variants of install/config for classic and docker install that rely on being able to set the db settings and having them take effect immediately. It seemed like more work than worthwhile in this PR to try updating all of the install variants to use the new static JvmSettings.
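The precedence described above can be sketched as follows. This is an illustrative, self-contained model and not the PR's code: the class `BlockedPolicySetting`, its constructor arguments, and the `defaultValue` parameter are all hypothetical. The JVM-level value is captured once at construction (standing in for "at startup"), while the database lookup is a supplier that is consulted on every call.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical model of "static JVM setting first, dynamic db setting as fallback". */
final class BlockedPolicySetting {
    // Captured once at construction; null means the JVM setting was not configured.
    private final String jvmValueAtStartup;
    // Consulted on every call so that db changes take effect immediately.
    private final Supplier<Optional<String>> dbLookup;

    BlockedPolicySetting(String jvmValueAtStartup, Supplier<Optional<String>> dbLookup) {
        this.jvmValueAtStartup = jvmValueAtStartup;
        this.dbLookup = dbLookup;
    }

    /** Returns the effective value: JVM setting if present, else the current db value, else the default. */
    String current(String defaultValue) {
        if (jvmValueAtStartup != null) {
            return jvmValueAtStartup;
        }
        return dbLookup.get().orElse(defaultValue);
    }
}
```

In this model, changing the database value affects the result only while no JVM value was supplied, which mirrors the behavior that the install scripts rely on.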

Suggestions on how to test this: Verify that the existing db settings still work. Add JvmSettings and verify that they take effect in preference to the db settings. (Can check the server.log at startup to see what is in effect.) Verify that the drop, localhost-only, and unblock-key policies all work. Verify that the CORS headers are still applied (unless :AllowCors is false). Delete :AllowCors, set the new JvmSetting origin setting and verify that CORS headers are added as before. Optionally set the other CORS JvmSettings and verify that the headers are modified as expected.
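When testing the three policies, it may help to keep the expected decisions in mind. The following is a simplified sketch of those decisions, not the filter's actual code. The class `BlockedEndpointPolicy` and its parameters are hypothetical, and rejecting unrecognized policy values is an assumption of this sketch that may differ from the real filter.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

/** Hypothetical sketch of the access decision for a request to a blocked endpoint. */
final class BlockedEndpointPolicy {
    private BlockedEndpointPolicy() {}

    /** Returns true when a request to a blocked endpoint should be let through. */
    static boolean allows(String policy, boolean fromLocalhost, String suppliedKey, String configuredKey) {
        if (policy == null) {
            return false;
        }
        switch (policy) {
            case "drop":
                // Every request to a blocked endpoint is rejected.
                return false;
            case "localhost-only":
                // Only requests originating from the server itself pass.
                return fromLocalhost;
            case "unblock-key":
                if (suppliedKey == null || configuredKey == null || configuredKey.isEmpty()) {
                    return false;
                }
                // Constant-time comparison, so response timing does not leak key contents.
                return MessageDigest.isEqual(
                        suppliedKey.getBytes(StandardCharsets.UTF_8),
                        configuredKey.getBytes(StandardCharsets.UTF_8));
            default:
                // Assumption of this sketch: unknown policy values are rejected.
                return false;
        }
    }
}
```

Each bullet of a manual test plan then corresponds to one branch: a `drop` request fails even from localhost, a `localhost-only` request fails from a remote host, and an `unblock-key` request fails with a missing or wrong key.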

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?: included

Additional documentation: in config guide

@coveralls

coveralls commented Apr 29, 2025

Coverage Status

coverage: 23.142% (+0.03%) from 23.114%
when pulling 5643f51 on GlobalDataverseCommunityConsortium:FilterUpdates
into 5db10ea on IQSS:develop.

@poikilotherm
Contributor

poikilotherm commented Apr 29, 2025

As soon as I'm done with putting out the AI bot fires, I wanted to pick up the slack on #10618 . As a part of that, I wanted to add backport options to ship important updates for older app container image releases (like I did for base images). Is it worth backporting all changes of this PR or limit it to the urgent matter at hand?

@qqmyers qqmyers marked this pull request as ready for review April 30, 2025 14:50
@qqmyers qqmyers moved this to Ready for Review ⏩ in IQSS Dataverse Project Apr 30, 2025
@qqmyers qqmyers added this to the 6.7 milestone Apr 30, 2025
@qqmyers qqmyers added the Size: 10 A percentage of a sprint. 7 hours. label Apr 30, 2025
@cmbz cmbz added the FY25 Sprint 23 FY25 Sprint 23 (2025-05-07 - 2025-05-21) label May 7, 2025
@cmbz cmbz added the FY25 Sprint 24 FY25 Sprint 24 (2025-05-21 - 2025-06-04) label May 22, 2025
@pdurbin pdurbin self-assigned this Jun 2, 2025
@pdurbin pdurbin moved this from Ready for Review ⏩ to In Review 🔎 in IQSS Dataverse Project Jun 2, 2025
Member

@pdurbin pdurbin left a comment

Just some quick feedback from a pass @poikilotherm and I made.

@qqmyers qqmyers removed their assignment Jun 3, 2025
Member

@pdurbin pdurbin left a comment

More feedback. I tested the new X-Dataverse-unblock-key header and it works fine. I'm not sure what we'll do in Docker when we remove the deprecated settings. I started a thread about it.

echo "Revoke the key that allows for creation of builtin users..."
curl -sS -X DELETE "${DATAVERSE_URL}/api/admin/settings/BuiltinUsers.KEY"

# TODO: stop using these deprecated database settings. See https://github.com/IQSS/dataverse/pull/11454
Member

The :doc:`/api/native-api` contains a useful but potentially dangerous set of API endpoints called "admin" that allows you to change system settings, make ordinary users into superusers, and more. The "builtin-users" endpoints let admins do tasks such as creating a local/builtin user account if they know the key defined in :ref:`BuiltinUsers.KEY`.

- By default, most APIs can be operated on remotely and a number of endpoints do not require authentication. The endpoints "admin" and "builtin-users" are limited to localhost out of the box by the settings :ref:`:BlockedApiEndpoints` and :ref:`:BlockedApiPolicy`.
+ By default in the code, most of these API endpoints can be operated on remotely and a number of endpoints do not require authentication. However, the endpoints "admin" and "builtin-users" are limited to localhost out of the box by the installer, using the JvmSettings :ref:`API_BLOCKED_ENDPOINTS <dataverse.api.blocked.endpoints>` and :ref:`API_BLOCKED_POLICY <dataverse.api.blocked.policy>`.
Member

Is the installer really doing this? I don't see any changes to the installer in this PR.

Member

Also, do we really want the link to say API_BLOCKED_ENDPOINTS? dataverse.api.blocked.endpoints is a bit less shout-y and what we normally do. 😄

Member Author

I didn't change the installer, but I think it was doing this before. I just added "by the installer" because the code defaults are not to block any APIs, but something in the normal setup process sets those settings. I just called that the "installer"; we could say "by the normal install process" instead? FWIW, what I found was

curl -X PUT -d burrito "${DATAVERSE_URL}/api/admin/settings/BuiltinUsers.KEY"
curl -X PUT -d localhost-only "${DATAVERSE_URL}/api/admin/settings/:BlockedApiPolicy"

and

curl -X DELETE "${DATAVERSE_URL}/api/admin/settings/BuiltinUsers.KEY"
curl -X PUT -d 'admin,builtin-users' "${DATAVERSE_URL}/api/admin/settings/:BlockedApiEndpoints"

And the guides say:

**IMPORTANT:** Please note, that "out of the box" the installer will configure the Dataverse installation to leave unrestricted access to the administration APIs from (and only from) localhost. Please consider the security implications of this arrangement (anyone with shell access to the server can potentially mess with your Dataverse installation). An alternative solution would be to block open access to these sensitive API endpoints completely; and to only allow requests supplying a pre-defined "unblock token" (password). If you prefer that as a solution, please consult the supplied script ``post-install-api-block.sh`` for examples on how to set it up. See also "Securing Your Installation" under the :doc:`config` section.

There's also a separate post-install-api-block.sh script mentioned in a few places, and the configbaker init.sh blocks those APIs with a key (not localhost).

Member

I started a thread in Slack about it. Happy to do a "talk after" at standup as well.

There's a monthly container meeting on Thursday. I'll bring up this PR there as well, the configbaker stuff especially.

Member

@pdurbin pdurbin left a comment

I haven't really tested this in any meaningful way but I did help improve the docs a bit. I'm sending this to QA.

@github-project-automation github-project-automation bot moved this from In Review 🔎 to Ready for QA ⏩ in IQSS Dataverse Project Jun 3, 2025
@pdurbin pdurbin removed their assignment Jun 4, 2025
@ofahimIQSS ofahimIQSS self-assigned this Jun 4, 2025
@ofahimIQSS ofahimIQSS moved this from Ready for QA ⏩ to QA ✅ in IQSS Dataverse Project Jun 4, 2025
@ofahimIQSS
Contributor

Fix looks good to me. Merging PR.

  • Reproduced the issue in internal using latest dev branch.
  • Migrated deprecated configuration to updated, secure runtime settings.
  • Tested the patch.
  • Retested and confirmed the issue is no longer reproducible

Created #11553 to track deprecated DB settings in installer.

@ofahimIQSS ofahimIQSS merged commit 1d03a2f into IQSS:develop Jun 4, 2025
22 checks passed
@github-project-automation github-project-automation bot moved this from QA ✅ to Merged 🚀 in IQSS Dataverse Project Jun 4, 2025
@ofahimIQSS ofahimIQSS removed their assignment Jun 4, 2025
@pdurbin pdurbin moved this from Merged 🚀 to Done 🧹 in IQSS Dataverse Project Jun 5, 2025
@cmbz cmbz added the FY26 Sprint 4 FY26 Sprint 4 (2025-08-13 - 2025-08-27) label Aug 16, 2025

Labels

FY25 Sprint 22 FY25 Sprint 22 (2025-04-23 - 2025-05-07) FY25 Sprint 23 FY25 Sprint 23 (2025-05-07 - 2025-05-21) FY25 Sprint 24 FY25 Sprint 24 (2025-05-21 - 2025-06-04) FY26 Sprint 4 FY26 Sprint 4 (2025-08-13 - 2025-08-27) Size: 10 A percentage of a sprint. 7 hours.

Projects

Status: Done 🧹

6 participants