-
Notifications
You must be signed in to change notification settings - Fork 4
API for creating interconnected FAIR-DOs (a FAIR-DO graph) in a single request #247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API for creating interconnected FAIR-DOs (a FAIR-DO graph) in a single request #247
Conversation
|
Note: bidirectional relations should be properly defined. Some bidirectional relations have different relation types. For example, "hasMetadata" and "isMetadataFor". Should / can we (reliably) support these cases? |
Probably, this would be useful at some point in time. For now, due to time constraints and a lack of a concrete use case, I moved these ideas to the original issue (#109). |
|
Thanks for contributing! Besides the CI failing, can you give an example here? To which degree does this work? Does the value need to fully match a placeholder or will "value where $insert_pid_here" work? I think this should be reflected in the documentation. I'll try to do a review in the next days. |
Signed-off-by: Maximilian Inckmann <[email protected]>
…try to enable cloning of the Builders Signed-off-by: Maximilian Inckmann <[email protected]>
Signed-off-by: Maximilian Inckmann <[email protected]>
Signed-off-by: Maximilian Inckmann <[email protected]>
Signed-off-by: Maximilian Inckmann <[email protected]>
Pfeil
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just small comments, a proper/full review will come at a later point :)
src/test/java/edu/kit/datamanager/pit/web/PIDRecordBuilder.java
Outdated
Show resolved
Hide resolved
…f Batch Record creation endpoint + tests Signed-off-by: Maximilian Inckmann <[email protected]>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis update introduces comprehensive support for batch creation of PID records, including new API endpoints and internal logic for mapping, validating, and registering multiple connected records in a single request. It adds cloning and deep-copy capabilities to core domain classes, introduces new builder utilities and extensive tests for connected PID scenarios, and refines configuration and style consistency. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant RESTController
participant TypingRESTResourceImpl
participant PIDService
participant Database
Client->>RESTController: POST /api/v1/pit/pids (List<PIDRecord>)
RESTController->>TypingRESTResourceImpl: createPIDs(records, dryrun, ...)
TypingRESTResourceImpl->>TypingRESTResourceImpl: generatePIDMapping(records, dryrun)
TypingRESTResourceImpl->>TypingRESTResourceImpl: applyMappingsToRecordsAndValidate(records, mappings, prefix)
alt dryrun
TypingRESTResourceImpl-->>RESTController: BatchRecordResponse (HTTP 200)
else registration
loop for each record
TypingRESTResourceImpl->>PIDService: registerPID(record)
PIDService->>Database: persist(record)
end
alt all succeed
TypingRESTResourceImpl-->>RESTController: BatchRecordResponse (HTTP 201)
else any fail
TypingRESTResourceImpl->>PIDService: rollback successful registrations
TypingRESTResourceImpl-->>RESTController: Error response (HTTP 500)
end
end
Poem
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Signed-off-by: Maximilian Inckmann <[email protected]>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (9)
src/test/java/edu/kit/datamanager/pit/web/PIDBuilder.java (2)
116-132: Consider using SHA-256 instead of SHA-1.While SHA-1 is sufficient for generating test UUIDs, using SHA-256 aligns with modern security practices and eliminates the static analysis warning.
private static UUID generateUUID(String seed) { try { - MessageDigest md = MessageDigest.getInstance("SHA-1"); + MessageDigest md = MessageDigest.getInstance("SHA-256"); byte[] hash = md.digest(seed.getBytes(StandardCharsets.UTF_8)); long msb = 0; long lsb = 0; for (int i = 0; i < 8; i++) { msb = (msb << 8) | (hash[i] & 0xff); } for (int i = 8; i < 16; i++) { lsb = (lsb << 8) | (hash[i] & 0xff); } return new UUID(msb, lsb); } catch (NoSuchAlgorithmException e) { throw new RuntimeException(e); } }
174-182: Consider removing the confusing clone(PIDBuilder) method.Having both
clone()andclone(PIDBuilder)methods creates confusion. Theclone(PIDBuilder)method is actually a copy constructor pattern and should be renamed or removed in favor of using the standardclone()method.Either rename it to clarify its purpose:
-public PIDBuilder clone(PIDBuilder builder) { +public PIDBuilder copyFrom(PIDBuilder builder) { this.seed = builder.seed; this.random = new Random(seed); this.prefix = builder.prefix; this.suffix = builder.suffix; return this; }Or remove it entirely and use the standard clone pattern:
// Usage: PIDBuilder copy = original.clone();src/main/java/edu/kit/datamanager/pit/domain/PIDRecordEntry.java (1)
27-38: Simplify the clone() implementation.The manual field assignments are redundant since
super.clone()already copies all fields, and String fields are immutable.@Override public PIDRecordEntry clone() { try { - PIDRecordEntry clone = (PIDRecordEntry) super.clone(); - clone.setKey(this.key); - clone.setName(this.name); - clone.setValue(this.value); - return clone; + return (PIDRecordEntry) super.clone(); } catch (CloneNotSupportedException e) { throw new AssertionError(); } }src/main/java/edu/kit/datamanager/pit/web/BatchRecordResponse.java (2)
26-26: Consider using consistent terminology: "fictitious" instead of "fictionary".The term "fictionary" appears to be non-standard English. Consider using "fictitious" or stick with "fantasy" as used elsewhere in the codebase for consistency.
29-31: Use standard Javadoc @param tags instead of "Arguments:" section.For better IDE integration and Javadoc generation, consider using standard parameter documentation.
Replace the Arguments section with:
- * Arguments: - * - pidRecords: List of PIDRecord objects representing the processed records. (List<PIDRecord>) - * - mapping: Map where keys are user-provided identifiers (fictionary) and values are the corresponding real record Handle PIDs. (Map<String, String>) + * @param pidRecords List of PIDRecord objects representing the processed records + * @param mapping Map where keys are user-provided identifiers (fictitious) and values are the corresponding real record Handle PIDssrc/main/java/edu/kit/datamanager/pit/domain/PIDRecord.java (1)
88-91: Document that setEntries bypasses validation.The
setEntriesmethod allows direct manipulation of the entries map, bypassing the validation logic inaddEntry. Consider adding Javadoc to warn about this.+ /** + * Sets the entries map directly. Use with caution as this bypasses validation. + * Primarily intended for cloning and testing purposes. + * + * @param entries the entries map to set + */ public void setEntries(Map<String, List<PIDRecordEntry>> entries) { this.entries = entries; }src/test/java/edu/kit/datamanager/pit/web/PIDRecordBuilder.java (1)
502-509: Random string generation may produce problematic characters.The
generateRandomStringmethod can generate control characters, surrogates, and other non-printable characters that might cause issues in tests or systems that process these strings.Consider limiting the character range to printable ASCII or alphanumeric characters:
private String generateRandomString(int length) { StringBuilder result = new StringBuilder(); for (int i = 0; i < length; i++) { - char c = (char) random.nextInt(Character.MAX_VALUE); + // Generate printable ASCII characters (space to ~) + char c = (char) (random.nextInt(94) + 32); result.append(c); } return result.toString(); }src/main/java/edu/kit/datamanager/pit/web/impl/TypingRESTResourceImpl.java (1)
178-186: Consider tracking rollback failures for operational monitoring.The rollback logic logs errors but doesn't track which PIDs failed to rollback. This could lead to orphaned PIDs in the system.
Consider collecting rollback failures and including them in the response or metrics:
if (!failedRecords.isEmpty()) { + List<String> rollbackFailures = new ArrayList<>(); for (PIDRecord successfulRecord : validatedRecords) { try { LOG.debug("Rolling back PID creation for record with PID {}.", successfulRecord.getPid()); this.typingService.deletePid(successfulRecord.getPid()); } catch (Exception e) { LOG.error("Could not rollback PID creation for record with PID {}. Error: {}", successfulRecord.getPid(), e.getMessage()); + rollbackFailures.add(successfulRecord.getPid()); } } + if (!rollbackFailures.isEmpty()) { + LOG.error("Failed to rollback {} PIDs: {}", rollbackFailures.size(), rollbackFailures); + }src/test/java/edu/kit/datamanager/pit/web/ConnectedPIDsTest.java (1)
231-232: Use a more specific exception type.Instead of catching generic
Exception.class, consider catching a more specific exception type thatnullRecord().build()is expected to throw.- assertThrows(Exception.class, nullBuilder::build, "nullRecord should throw exception when built"); + assertThrows(IllegalStateException.class, nullBuilder::build, "nullRecord should throw exception when built");
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
.gitignore(1 hunks)config/application-default.properties(2 hunks)src/main/java/edu/kit/datamanager/pit/common/RecordValidationException.java(1 hunks)src/main/java/edu/kit/datamanager/pit/domain/PIDRecord.java(13 hunks)src/main/java/edu/kit/datamanager/pit/domain/PIDRecordEntry.java(1 hunks)src/main/java/edu/kit/datamanager/pit/web/BatchRecordResponse.java(1 hunks)src/main/java/edu/kit/datamanager/pit/web/ITypingRestResource.java(6 hunks)src/main/java/edu/kit/datamanager/pit/web/impl/TypingRESTResourceImpl.java(1 hunks)src/test/java/edu/kit/datamanager/pit/web/ConnectedPIDsTest.java(1 hunks)src/test/java/edu/kit/datamanager/pit/web/PIDBuilder.java(1 hunks)src/test/java/edu/kit/datamanager/pit/web/PIDRecordBuilder.java(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/main/java/edu/kit/datamanager/pit/common/RecordValidationException.java (1)
src/main/java/edu/kit/datamanager/pit/domain/PIDRecord.java (1)
PIDRecord(34-288)
🪛 ast-grep (0.38.6)
src/test/java/edu/kit/datamanager/pit/web/PIDBuilder.java
[warning] 117-117: Detected SHA1 hash algorithm which is considered insecure. SHA1 is not collision resistant and is therefore not suitable as a cryptographic signature. Instead, use PBKDF2 for password hashing or SHA256 or SHA512 for other hash function applications.
Context: MessageDigest.getInstance("SHA-1")
Note: [CWE-328] Use of Weak Hash. [REFERENCES]
- https://owasp.org/Top10/A02_2021-Cryptographic_Failures
(use-of-sha1-java)
🔇 Additional comments (18)
.gitignore (1)
140-142: LGTM!The additions appropriately exclude test data and sensitive credential files from version control.
src/main/java/edu/kit/datamanager/pit/common/RecordValidationException.java (1)
1-54: LGTM!The formatting improvements and license header addition enhance code consistency.
config/application-default.properties (1)
134-139: LGTM!The property format standardization and RabbitMQ health check configuration are appropriate.
src/main/java/edu/kit/datamanager/pit/domain/PIDRecord.java (1)
270-288: Well-implemented deep cloning logic.The clone method properly performs deep copying of the entries map and all PIDRecordEntry objects, ensuring true independence between cloned instances.
src/main/java/edu/kit/datamanager/pit/web/ITypingRestResource.java (2)
54-55: Use consistent terminology throughout the documentation.The documentation uses both "imaginary" and "fictionary" to describe the temporary PIDs. Please use one term consistently.
Also applies to: 71-71, 74-74, 83-83
Likely an incorrect or invalid review comment.
48-104: Well-designed batch creation endpoint.The new
createPIDsmethod provides a clean interface for batch PID creation with proper support for dry-run validation and comprehensive error handling.src/test/java/edu/kit/datamanager/pit/web/PIDRecordBuilder.java (1)
25-67: Excellent documentation and test utility design.The builder class is very well documented with clear examples and comprehensive feature descriptions. The deterministic approach using seeds is perfect for reproducible tests.
src/main/java/edu/kit/datamanager/pit/web/impl/TypingRESTResourceImpl.java (1)
115-134: Excellent timing instrumentation for performance monitoring.The detailed timing logs for mapping, validation, and registration phases will be valuable for performance analysis and optimization.
Also applies to: 171-177
src/test/java/edu/kit/datamanager/pit/web/ConnectedPIDsTest.java (10)
1-50: LGTM! Well-structured test class setup.The test class is properly configured with appropriate Spring Boot test annotations and profile settings for integration testing.
68-81: Good practice: Test setup verification.The
checkTestSetup()method ensures the test environment is properly initialized before running other tests, which helps catch configuration issues early.
83-162: Comprehensive test coverage for PIDBuilder.The test thoroughly exercises all PIDBuilder methods including edge cases. Note that direct field access (line 96) suggests package-private visibility, which is acceptable for test builders.
247-251: Potential test flakiness with deterministic assumption.The test assumes that builders with the same seed will produce identical results (line 250). This might not hold true if there's any non-deterministic behavior in the builder implementation.
Consider verifying that the PIDRecordBuilder implementation is truly deterministic when using the same seed, or adjust the test expectations accordingly.
258-309: Well-designed connection functionality tests.The tests properly validate both the
addConnectioninstance method andconnectRecordBuildersstatic method, including error cases and custom key configurations.
311-458: Excellent API endpoint test coverage.These tests comprehensively validate the
/api/v1/pit/pidsendpoint behavior for:
- Valid connected records creation
- Single record creation
- Empty list rejection
- Dry-run mode without persistence
- Invalid JSON format handling
- Unsupported media type rejection
The combination of HTTP response assertions and database state verification ensures proper endpoint functionality.
460-588: Thorough edge case testing.These tests effectively validate critical scenarios:
- Circular references between records are handled correctly
- Duplicate temporary PIDs are properly rejected
- Records with incomplete profiles fail validation
- Large batches (200 records) are processed successfully
- External references are preserved
The edge case coverage helps ensure robustness of the batch PID creation feature.
668-987: Comprehensive coverage of builder patterns and failure scenarios.The remaining tests excellently cover:
- PIDBuilder edge cases with various prefix/suffix combinations
- Invalid configuration handling
- Builder method chaining behavior
- Multiple rollback scenarios ensuring atomicity
- Batch operations with duplicate PIDs
The rollback tests are particularly valuable as they ensure the batch creation operation maintains data integrity by preventing partial success states.
45-988: Outstanding test implementation for the batch PID creation feature.This test class provides exceptional coverage of the new
/api/v1/pit/pidsendpoint with:
- Comprehensive builder pattern testing
- Thorough API endpoint validation
- Edge case and error scenario coverage
- Transactional rollback verification
- Performance testing with large batches
The test structure follows best practices with clear naming, proper assertions, and good use of test data builders.
663-663: Action required: Configure Java 21+ or replacegetFirst()usageThe
List.getFirst()call requires Java 21 and above. Your Gradle build file doesn’t explicitly enforce a minimum JDK version, so invokinggetFirst()may break on older JDKs.• File: src/test/java/edu/kit/datamanager/pit/web/ConnectedPIDsTest.java (line 663)
String storedPid = knownPidsDao.findAll().getFirst().getPid();Consider one of the following:
– For Java < 21 compatibility, switch toget(0):- String storedPid = knownPidsDao.findAll().getFirst().getPid(); + String storedPid = knownPidsDao.findAll().get(0).getPid();– To keep
getFirst(), configure a Java 21 toolchain in yourbuild.gradle:java { toolchain { languageVersion = JavaLanguageVersion.of(21) } }
src/main/java/edu/kit/datamanager/pit/web/impl/TypingRESTResourceImpl.java
Show resolved
Hide resolved
Signed-off-by: Maximilian Inckmann <[email protected]>
Signed-off-by: Maximilian Inckmann <[email protected]>
Pfeil
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall very nice. Just some minor comments we should discuss or address. You did some other nice refactors, too: thanks for putting so much work into it.
src/main/java/edu/kit/datamanager/pit/web/ITypingRestResource.java
Outdated
Show resolved
Hide resolved
src/main/java/edu/kit/datamanager/pit/web/ITypingRestResource.java
Outdated
Show resolved
Hide resolved
src/main/java/edu/kit/datamanager/pit/web/ITypingRestResource.java
Outdated
Show resolved
Hide resolved
src/main/java/edu/kit/datamanager/pit/web/ITypingRestResource.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Maximilian Inckmann <[email protected]>
Pfeil
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks a lot!
# Conflicts: # build.gradle
Relationship management API
see issue #109
Why?
While creating a more complex use case of FAIR-DOs, I needed to create connected FAIR-DOs. The Typed PID Maker couldn't do it, therefore I initially planned to write a (highly disposable) Python script. Until I saw this open issue...
In issue #109 an idea of managing relationships of already existing FAIR-DOs was introduced. This was not quite what I needed for my use case, because I had to create a not small amount of FAIR-DOs (>2500 per data source --> 3*2500 = 7.500 FAIR-DOs). This made it undesirable to first create some half-finished FAIR-DOs, just to extend them later... Therefore, I extended the existing
/pit/pidendpoint to create connected FAIR-DOs from a list of PIDRecords with "fantasy PIDs" just for referencing. These "fantasy PIDs" are mapped to freshly-generated PIDs and simply replaced in the value fields of all the records in the same request.What?
/api/v1/pit/pidsendpoint for creating connected FAIR-DOScreatePIDs()methodSee future ideas in issue #109
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Refactor
Tests
Chores
.gitignoreto exclude new test data and sensitive files.