Improve logging for harvester UUID collisions #9188

Merged: josegar74 merged 1 commit into geonetwork:main from tylerjmchugh:improve-harvester-uuid-collision-logging on Mar 4, 2026

Conversation

@tylerjmchugh
Contributor

Currently when a UUID collision occurs during harvesting, it is handled internally according to the UUID merge policy (skip, overwrite, etc.), but without any explicit log entry.

As a result, records may be skipped, replaced, or otherwise handled without any visibility in the logs, making it difficult to understand why certain records were not created or were modified.

This PR aims to fix this issue by consistently logging info messages whenever a UUID collision occurs. This makes collision-related behavior easier to trace and debug.

Checklist

  • I have read the contribution guidelines
  • Pull request provided for main branch, backports managed with label
  • Good housekeeping of code, cleaning up comments, tests, and documentation
  • Clean commit history broken into understandable chunks, avoiding big commits with hundreds of files, cautious of reformatting and whitespace changes
  • Clean commit messages, longer verbose messages are encouraged
  • API Changes are identified in commit messages
  • Testing provided for features or enhancements using automatic tests
  • User documentation provided for new features or enhancements in manual
  • Build documentation provided for development instructions in README.md files
  • Library management using pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation

@ianwallen ianwallen added this to the 4.4.10 milestone Feb 26, 2026
log.info(String.format("UUID collision detected for record with uuid '%s'. Record already exists in the catalogue but does not belong to this harvester (%s).",
ri.uuid, params.getName()));

switch (params.getOverrideUuid()) {
Member

@josegar74 josegar74 Feb 27, 2026

I'm not sure about changing the log level here from debug to info. params.getOverrideUuid() is a harvester parameter, so logging this information for every metadata record seems redundant and, in large catalogues, will generate a lot of log entries.

Please check the similar changes in the other files.

Contributor Author

I’ve updated the PR so only SKIP collision logs are changed to info. Would this be more acceptable?

OVERRIDE and RANDOM are expected to handle duplicates, so I left those unchanged.

In our case we use SKIP because we cannot overwrite or generate new UUIDs. The issue is that collisions are currently silent, which makes it hard to understand why some records were not harvested.
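The level choice described above can be sketched as follows. This is only an illustration, not the code merged in this PR: it uses java.util.logging so that it runs on its own (with Level.FINE standing in for debug), and the class name, the OverrideUuid enum, and the helper methods are hypothetical stand-ins for the harvester's own types.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class UuidCollisionLogging {

    /** Hypothetical stand-in for the harvester's UUID merge policy. */
    enum OverrideUuid { SKIP, OVERRIDE, RANDOM }

    private static final Logger LOG = Logger.getLogger("harvester.sketch");

    /**
     * SKIP silently drops the incoming record, so it is reported at INFO.
     * OVERRIDE and RANDOM are expected to handle duplicates and stay at FINE.
     */
    static Level levelFor(OverrideUuid policy) {
        return policy == OverrideUuid.SKIP ? Level.INFO : Level.FINE;
    }

    /** Builds the collision message; the wording is illustrative only. */
    static String collisionMessage(String uuid, String harvesterName, OverrideUuid policy) {
        return String.format(
            "UUID collision detected for record with uuid '%s' (harvester: %s, policy: %s).",
            uuid, harvesterName, policy);
    }

    /** Logs one collision at the level chosen for the given policy. */
    static void logCollision(String uuid, String harvesterName, OverrideUuid policy) {
        Level level = levelFor(policy);
        // Skip building the message when the level is disabled.
        if (LOG.isLoggable(level)) {
            LOG.log(level, collisionMessage(uuid, harvesterName, policy));
        }
    }

    public static void main(String[] args) {
        // Placeholder UUID and harvester name, for demonstration only.
        logCollision("example-uuid", "example-harvester", OverrideUuid.SKIP);
        logCollision("example-uuid", "example-harvester", OverrideUuid.OVERRIDE);
    }
}
```

With a default logging configuration only the SKIP call produces output, which keeps the log volume tied to records that were actually dropped.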

Member

That sounds fine @tylerjmchugh, thanks.

@tylerjmchugh tylerjmchugh force-pushed the improve-harvester-uuid-collision-logging branch from b22dbf7 to 9e84664 Compare March 2, 2026 14:08
Contributor

@jodygarnett jodygarnett left a comment

Thanks @tylerjmchugh, feedback addressed, and logging skipped records at INFO (rather than staying quiet or using WARNING) makes sense.

@josegar74 josegar74 merged commit 9dad234 into geonetwork:main Mar 4, 2026
7 checks passed
4 participants