Conversation
|
The Java checkstyle failed. Please run You can install the pre-commit hooks with |
|
TypeScript types have been updated based on the JSON schema changes in the PR |
🔍 CI failure analysis for bbcd40c: Maven Sonarcloud CI failed with 64 test failures, primarily in translation tests - test expects English but gets Spanish translation, indicating a bug in the translation retrieval logic or test implementation.IssueMaven Sonarcloud CI build failed with 64 test failures:
Root CauseThis failure IS related to the PR changes - there is a bug in how translations are retrieved or tested. Primary Failure Pattern (60+ tests)Error at line 1338 in EntityResourceTest.test_translations: Affected test classes (60+ entity types):
What's HappeningTest sequence (from logs):
The bug: When retrieving an entity, the translation is being applied even when:
AnalysisThis is a critical bug in the translation feature logic: The translation retrieval logic in
Test expectation (line 1338): Secondary Failures (DataProduct tests)5 additional failures in DataProductResourceTest:
These suggest that the translation changes may have broken DataProduct port retrieval logic, possibly related to how entities are serialized/deserialized with the new translations field. DetailsFailure distribution:
All failures trace back to the translation feature implementation. Code Review 🚫 Blocked 0 resolved / 7 findingsAll 7 previous findings remain unresolved: authorization bypass on translation PATCH by ID, FQN locale patch not using patchWithTranslations, duplicate translation entries on every PATCH, EntityUpdater bypass losing change tracking, pervasive LOG.info debug statements on hot paths, empty string fallback blanking displayName/description, and unused opType variable. 🚨 Security: Translation PATCH by ID bypasses authorization entirely📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java:649 When Compare with the regular This means any authenticated user can modify any entity's translations regardless of their permissions, which is a significant access control bypass. The authorization check must be added before calling Suggested fix
|
| Auto-apply | Compact |
|
|
Was this helpful? React with 👍 / 👎 | Gitar
| return patchInternal(uriInfo, securityContext, id, patch, null); | ||
| } | ||
|
|
||
| public Response patchInternal( |
There was a problem hiding this comment.
🚨 Security: Translation PATCH by ID bypasses authorization entirely
When patchInternal(uriInfo, securityContext, id, patch, locale, changeSource) detects a translation patch (locale is non-null and not "en"), it directly calls repository.patchWithTranslations() at line 663 and returns, completely skipping the authorization check.
Compare with the regular patchInternal at line 687, which calls authorizer.authorize(securityContext, operationContext, getResourceContextById(id, ResourceContextInterface.Operation.PATCH)) before executing the patch.
This means any authenticated user can modify any entity's translations regardless of their permissions, which is a significant access control bypass. The authorization check must be added before calling patchWithTranslations.
Suggested fix:
// Process the patch based on locale
JsonPatch processedPatch = patch;
boolean isTranslationPatch = locale != null && !locale.isEmpty() && !"en".equals(locale);
if (isTranslationPatch) {
// Authorize the patch operation
OperationContext operationContext = new OperationContext(entityType, patch);
authorizer.authorize(
securityContext,
operationContext,
getResourceContextById(id, ResourceContextInterface.Operation.PATCH));
// Transform patch operations on displayName/description to translation updates
processedPatch = TranslationPatchUtil.handleTranslationPatch(patch, locale, false);
Was this helpful? React with 👍 / 👎
| return patchInternal(uriInfo, securityContext, fqn, patch, null); | ||
| } | ||
|
|
||
| public Response patchInternal( |
There was a problem hiding this comment.
⚠️ Bug: PATCH by FQN with locale doesn't use patchWithTranslations
The FQN-based patchInternal with locale (line 732-746) transforms the patch via TranslationPatchUtil.handleTranslationPatch() but then passes it to the regular patchInternal(uriInfo, securityContext, fqn, processedPatch, changeSource).
The regular patch path calls repository.patch() which fetches the entity through the standard flow. Unlike patchWithTranslations(), the standard path does NOT initialize the translations field on the entity before applying the patch. The transformed patch contains operations targeting /translations/translations/- (append to translations array), but this path won't exist on the entity if translations have never been set, causing the JSON Patch operation to fail.
This is inconsistent with the by-ID variant at line 649 which correctly calls repository.patchWithTranslations(). The by-FQN variant must follow the same pattern to work correctly.
Was this helpful? React with 👍 / 👎
| newOperations.add(convertToJsonValue(addFieldTranslationOp)); | ||
| } | ||
|
|
||
| // Add or update the translation for this locale |
There was a problem hiding this comment.
⚠️ Bug: Translation PATCH always appends, creating duplicate entries
convertToTranslationPatch() at line 213-217 always generates an "add" operation with path /translations/translations/- (append to end of array). It never checks whether a translation for the given locale already exists.
If a user PATCHes displayName for locale "es" twice, the entity will end up with two Translation entries for "es" in the translations array. Over time, repeated updates will accumulate unbounded duplicates for the same locale, causing:
- Data corruption —
findTranslation()uses.stream().findFirst()so it will always return the oldest (possibly stale) translation - Unbounded storage growth in the entity JSON
The fix should either: (a) find and replace the existing translation for the same locale instead of appending, or (b) add a deduplication step in patchWithTranslations() after applying the patch to merge/replace translations with the same locale.
Was this helpful? React with 👍 / 👎
| return patch(uriInfo, id, user, patch, changeSource, null, null); | ||
| } | ||
|
|
||
| @Transaction |
There was a problem hiding this comment.
⚠️ Bug: patchWithTranslations bypasses EntityUpdater and change tracking
patchWithTranslations() bypasses the normal EntityUpdater flow by directly calling:
updated.setVersion(EntityUtil.nextVersion(original.getVersion()))— always bumps the minor version without considering whether it's a major changestoreEntity(updated, true)— direct store without the EntityUpdater's change description generation, field-level diff tracking, or optimistic concurrency control (no If-Match/version check)postUpdate(original, updated)— creates change events but with noChangeDescription(the entity's change description is never populated)
This means:
- No optimistic locking — concurrent translation patches can silently overwrite each other (last-write-wins)
- No change description — the entity version history won't have details about what changed in translation patches, breaking audit trail
- Version always increments — even if the patch makes no effective change, the version bumps, polluting version history
Consider routing translation patches through the standard patch() method with the translations field properly initialized, or at minimum adding version conflict detection.
Was this helpful? React with 👍 / 👎
|
|
||
| default T jsonToEntity(String json, Object identity) { | ||
| Class<T> clz = getEntityClass(); | ||
| // Debug logging to check if translations field is in JSON |
There was a problem hiding this comment.
⚠️ Performance: Debug LOG.info on hot paths degrades performance for all users
There are approximately 30+ LOG.info() statements with "DEBUG" prefixes scattered across EntityDAO.jsonToEntity(), EntityRepository.setFieldsInBulk(), EntityRepository.listAfter(), EntityRepository.get(), EntityRepository.getByName(), TranslationUtil.applyTranslations(), and related methods. These are on the critical path for every entity read operation.
Key concerns:
- EntityDAO.jsonToEntity() (line 599) — called for EVERY entity deserialization, does
json.contains("\"translations\"")string scan on every entity load - EntityRepository.setFieldsInBulk() (lines 1151-1174) — streams translation locales for every entity in every list operation
- TranslationUtil has 15
LOG.info()calls that fire on every entity retrieval when locale is specified, including per-column/per-field logging
These are clearly development debugging statements that should be removed or downgraded to LOG.debug() before merging. At INFO level, they will generate enormous log volume in production even when no translations are in use, as they fire on every entity read path.
Was this helpful? React with 👍 / 👎
| if (translation.getDescription() != null) { | ||
| entity.setDescription(translation.getDescription()); | ||
| } | ||
| } else { |
There was a problem hiding this comment.
⚠️ Edge Case: Missing translation blanks displayName/description instead of using original
When a locale is requested but no translation exists for it, applyTranslations() at lines 88-91 sets the entity's displayName and description to empty strings:
entity.setDisplayName("");
entity.setDescription("");The same behavior occurs for child columns/fields (lines 236-237, 275-276, 315-316).
This is a destructive behavior that causes data loss in the response. If a user requests locale "ja" for an entity that only has "es" and "fr" translations, they get empty strings instead of the original English content. Most i18n systems use fallback to the default language when a translation is missing, rather than blanking out the content.
The PR description says this is "to signal translation can be provided," but this creates a terrible user experience — any UI component displaying this entity with an unsupported locale will show blank content, and API consumers must handle the empty-string case specially to fall back to re-fetching with locale=en.
Was this helpful? React with 👍 / 👎
|



Describe your changes:
Fixes
I worked on ... because ...
Type of change:
Checklist:
Fixes <issue-number>: <short explanation>Summary by Gitar
This PR introduces comprehensive multi-language (i18n) support for entity content in OpenMetadata, enabling locale-specific storage and retrieval of displayNames and descriptions across the platform.
Key Technical Changes:
Multi-Language Infrastructure: Added
translationsfield to 62+ entity schema JSON files (Table, Topic, Container, SearchIndex, DashboardDataModel, and more) to store locale-specific translations for displayName and description fields with full recursive support for nested columns/fields.Translation Processing Utilities: Created
TranslationUtilfor locale-aware entity retrieval with intelligent fallback logic (e.g., "fr-FR" → "fr" → default "en"), andTranslationPatchUtilfor handling PATCH operations on translations across entity hierarchies with validation and merge semantics.Locale-Aware Repository Layer: Enhanced EntityRepository and entity-specific repositories (TableRepository, TopicRepository, ContainerRepository, etc.) to support locale parameter in get/getList operations with automatic translation application before response serialization.
REST Endpoint Enhancements: Added
@QueryParam("locale")to all REST endpoints across data entity resources (Table, Topic, Container, SearchIndex, DashboardDataModel) for both GET and PATCH operations, enabling client-side language preference selection.Comprehensive Test Coverage: Added translation tests covering multiple entity types, column/field translations, locale fallback mechanisms, PATCH operations with locale handling, and proper serialization of translations.