Import export improvements #25542
… circular dependency, generated changeEvents (#25582)

* Fix tag clearing and circular dependency detection in batch CSV imports
  - **Tag clearing fix**: Add deleteTagsByTarget before applying new tags in batch imports to match single entity import behavior, ensuring empty CSV fields properly clear existing tags
  - **Circular dependency detection fix**: Pre-track entities in dryRunCreatedEntities before parent resolution to enable proper circular reference validation during CSV team imports
  - Resolves test failures in TeamResourceIT.test_importCsv_circularDependency_trueRun and tag-related import issues
  - Maintains batch import performance while restoring pre-batch-import validation contracts
* improve storeRelationshipsInternal internal methods - make them truly batched operations
* - Add storeEntities override to all repositories (57 repos)
  - Add batch lock check to HierarchicalLockManager
  - Add batch cache write to EntityRepository
  - Fix createManyEntitiesForImport with batched operations
  - Fix updateManyEntitiesForImport with batched operations
  - Add change event creation in flushPendingEntityOperations

Co-authored-by: sonika-shah <[email protected]>
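The circular dependency fix in the commit message relies on registering each CSV row before its parents are resolved, so that the row's own name is visible to the validation walk. A minimal sketch of that idea follows. The class `TeamImportSketch`, its `pending` map, and the `addRow` method are hypothetical names made up for illustration; this is not the OpenMetadata implementation, and the real dryRunCreatedEntities structure may look quite different.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not OpenMetadata code.
final class TeamImportSketch {
  // Stands in for dryRunCreatedEntities: team name -> parent names seen so far in the CSV.
  private final Map<String, List<String>> pending = new HashMap<>();

  /** Registers the row first, then validates it. Returns false if the row closes a cycle. */
  boolean addRow(String team, String... parents) {
    // Pre-track before parent resolution so the walk below can see this row.
    List<String> previous = pending.put(team, Arrays.asList(parents));
    if (reaches(team, team, new HashSet<>())) {
      // Reject the row and restore whatever was tracked before it.
      if (previous == null) {
        pending.remove(team);
      } else {
        pending.put(team, previous);
      }
      return false;
    }
    return true;
  }

  // Depth-first walk up the parent links; true if target is an ancestor of current.
  private boolean reaches(String current, String target, Set<String> visited) {
    for (String parent : pending.getOrDefault(current, Collections.emptyList())) {
      if (parent.equals(target)) {
        return true;
      }
      if (visited.add(parent) && reaches(parent, target, visited)) {
        return true;
      }
    }
    return false;
  }
}
```

With this sketch, `addRow("a", "b")` is accepted, while a later `addRow("b", "a")` is rejected because walking up from `b` reaches `b` again through `a`. If the row were only tracked after parent resolution, the second call would have nothing to walk and the cycle would go unnoticed.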
🔍 CI failure analysis for 8486d24: Maven SonarCloud CI (MySQL) shows 4 failures (99.9% pass rate), the same infrastructure issues as PostgreSQL CI. Test Report failed as a consequence. Critical TeamResourceTest bug remains fixed across both database backends.

**Issue**
Maven SonarCloud CI (job 62003021906, MySQL backend) shows 1 failure and 3 errors out of 7836 tests (99.9% pass rate: 7831 passed, 1 failure, 3 errors, 701 skipped).

**Root Cause**
Maven Test Failures (Same as PostgreSQL CI):
All failures are infrastructure/configuration issues unrelated to CSV batching changes.

**Details**
Consistency Across Databases:
Critical Success Confirmed:
Test Report (job 62025034391): Failed as a downstream consequence of Maven test failures. This is a reporting/aggregation job, not a separate test suite.
Complete CI Status:
Impact: All Maven failures are infrastructure issues, not blocking for CSV batching functionality.

**Code Review** 👍 Approved with suggestions (4 resolved / 5 findings)

Solid batched CSV import/export implementation. The previously identified unused batchNumber parameter in CsvImportProgressCallback remains unresolved: the parameter is captured in the lambda but not passed to sendCsvImportProgressNotification.

💡 Edge Case: Unused batchNumber parameter in CsvImportProgressCallback
📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java:887

The CsvImportProgressCallback signature includes batchNumber:

```java
void onProgress(int rowsProcessed, int totalRows, int batchNumber, String message);
```

However, in EntityResource.java the lambda does not forward it:

```java
CsvImportProgressCallback progressCallback =
    (rowsProcessed, totalRows, batchNumber, message) ->
        WebsocketNotificationHandler.sendCsvImportProgressNotification(
            jobId, securityContext, rowsProcessed, totalRows, message); // batchNumber not passed
```

This means the batch number information is lost when sending WebSocket notifications, even though it's available.

Suggested fix: Either pass …

✅ 4 resolved
✅ Bug: PendingEntityOperation records deleted CSV failures incorrectly
✅ Bug: Missing change event generation in batch entity operations
✅ Bug: Entity version not incremented on batch updates
✅ Bug: Division by zero possible in progress calculation
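Two of the findings above, the dropped batchNumber and the division by zero in the progress calculation, can be illustrated together. The sketch below reuses the `onProgress` signature quoted in the review, but `ProgressCallback`, `ProgressSketch`, `percent`, and `format` are hypothetical names. The real signature of sendCsvImportProgressNotification is not shown in this conversation, so the sketch formats a plain string instead of calling it.

```java
// Hypothetical sketch, not the code from this PR.
@FunctionalInterface
interface ProgressCallback {
  // Same parameter list as the onProgress signature quoted in the review.
  void onProgress(int rowsProcessed, int totalRows, int batchNumber, String message);
}

final class ProgressSketch {
  private ProgressSketch() {}

  /** Percent complete, returning 0 for an empty CSV instead of dividing by zero. */
  static int percent(int rowsProcessed, int totalRows) {
    if (totalRows <= 0) {
      return 0;
    }
    // Use long arithmetic to avoid overflow on large row counts, and cap at 100.
    return (int) Math.min(100L, (rowsProcessed * 100L) / totalRows);
  }

  /** Builds a notification text that keeps the batch number rather than dropping it. */
  static String format(int rowsProcessed, int totalRows, int batchNumber, String message) {
    return "batch " + batchNumber + ": " + percent(rowsProcessed, totalRows) + "% ("
        + rowsProcessed + "/" + totalRows + ") " + message;
  }
}
```

A callback built on this sketch would forward all four arguments, for example `ProgressCallback cb = (rows, total, batch, msg) -> System.out.println(ProgressSketch.format(rows, total, batch, msg));`. Whether the batch number belongs in the message text or in a separate notification field is a design choice the review leaves open.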
Rules ✅ All requirements met
Describe your changes:
Fixes
I worked on ... because ...
Summary by Gitar
This PR implements batched CSV import/export operations to significantly improve performance for large datasets:
- insertMany()/updateMany() instead of individual INSERT/UPDATE statements
- updateEntitiesBulk()

Type of change:
Checklist:
Fixes <issue-number>: <short explanation>