Skip to content

Conversation

@chirag-madlani
Copy link
Collaborator

@chirag-madlani chirag-madlani commented Jan 28, 2026

This pull request enhances the static asset serving capabilities of the OpenMetadata backend by adding support for Brotli and Gzip pre-compressed assets, prioritizing Brotli when available. It also introduces comprehensive tests to ensure correct behavior and updates the build pipeline to generate and handle Brotli-compressed files.

Asset Serving Improvements:

  • The OpenMetadataAssetServlet now checks for .br (Brotli) and .gz (Gzip) versions of requested assets and serves them if the client supports the corresponding encoding, prioritizing Brotli over Gzip. If neither is available or accepted, it falls back to the normal asset. (openmetadata-service/src/main/java/org/openmetadata/service/socket/OpenMetadataAssetServlet.java) [1] [2] [3] [4] [5]

Testing Enhancements:

  • Added a new test suite covering scenarios for serving Brotli, Gzip, and normal assets, as well as prioritization and fallback logic. (openmetadata-service/src/test/java/org/openmetadata/service/socket/OpenMetadataAssetServletTest.java)
  • Added corresponding test resources for normal, Gzip-only, and Brotli assets. (openmetadata-service/src/test/resources/assets/normal.js, openmetadata-service/src/test/resources/assets/gzip_only.js, openmetadata-service/src/test/resources/assets/gzip_only.js.gz, openmetadata-service/src/test/resources/assets/test.js, openmetadata-service/src/test/resources/assets/test.js.gz, openmetadata-service/src/test/resources/assets/test.js.br) [1] [2] [3] [4] [5] [6]

Build Pipeline Updates:

  • The Maven UI build configuration now treats .br files as non-filtered resources, ensuring they are not altered during packaging. (openmetadata-ui/pom.xml)
  • The Vite build config now generates Brotli-compressed assets alongside Gzip assets in production builds. (openmetadata-ui/src/main/resources/ui/vite.config.ts)

With compression load times:

Screenshot 2026-01-28 at 4 57 13 PM

Without compression load times:

Screenshot 2026-01-28 at 5 12 01 PM

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@github-actions
Copy link
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

karanh37
karanh37 previously approved these changes Jan 28, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Jan 28, 2026

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 65%
65.88% (55752/84628) 45.01% (28859/64123) 47.81% (8771/18346)

@chirag-madlani
Copy link
Collaborator Author

Gitar fix CI

- Fix import ordering by moving static imports to the end
- Add path traversal validation to prevent security vulnerability
- Normalize paths and validate against resource directory to prevent directory traversal attacks
- Handle null returns from getPathToCheck for invalid paths

Co-authored-by: chirag-madlani <[email protected]>
@gitar-bot
Copy link

gitar-bot bot commented Jan 28, 2026

Gitar fix CI

✅ CI issues fixed!

Changes made:

  1. Checkstyle: Fixed import ordering by moving static imports to the end
  2. CodeQL Security: Added path traversal validation to prevent directory traversal attacks
    • Normalized paths using Path.normalize()
    • Validated that resolved paths stay within the resource directory
    • Return null for invalid paths with proper null checking

What was fixed:

  • Import statements now follow proper Java conventions (static imports after regular imports)
  • getPathToCheck() method now validates paths to prevent ../ attacks
  • Both Brotli and Gzip compression paths now check for null before using the path

The integration test failure (TableResourceIT.test_tableLineage) is unrelated to these changes - it's a flaky Elasticsearch version conflict in table lineage functionality, which doesn't interact with asset serving.

aniketkatkar97
aniketkatkar97 previously approved these changes Jan 28, 2026
@gitar-bot
Copy link

gitar-bot bot commented Jan 29, 2026

🔍 CI failure analysis for 4baebd3: CI failures: Python 3.10 & 3.11 S3 tests (AWS SDK issue - recurring), Playwright (17 flaky tests). All unrelated to PR.

Issue

Multiple CI test failures - Python (both 3.10 & 3.11) and Playwright tests


Failure 1: Python 3.10 - S3 Storage Test

Root Cause: AWS SDK Version Mismatch

Error: botocore.errorfactory.MissingRequiredParameterException: Invalid STS API version 2010-08-01, expecting 2011-06-15

Test: ingestion/tests/integration/s3/test_s3_storage.py::test_s3_ingestion

Note: Recurring environmental issue across commits


Failure 2: Python 3.11 - S3 Storage Test (NEW)

Root Cause: Same AWS SDK Issue

Error: WorkflowExecutionError: Sampler reported errors: Sampler Summary: [0 Records, [0 Updated Records, 0 Warnings, 7 Errors, 0 Filtered]

Test: ingestion/tests/integration/s3/test_s3_storage.py

Note: Same "7 Errors" pattern - AWS SDK/CloudWatch metrics issue affecting both Python 3.10 and 3.11


Failures 3 & 4: Playwright (2 shards)

Shard 6/6: 1 failed, 6 flaky (passed on retry), 526 passed
Shard 3/6: 1 failed, 11 flaky (passed on retry), 581 passed

Total: 17 flaky tests (all passed on retry)


Combined Summary

Total:

  1. Python 3.10 & 3.11: S3 integration tests (same AWS SDK recurring issue)
  2. Playwright: 2 shards, 17 flaky tests (all passed on retry)

Assessment: All failures are infrastructure/test environment issues unrelated to PR's HTTP asset compression changes (Brotli/Gzip).

Code Review ✅ Approved 3 resolved / 3 findings

Well-implemented compression feature with proper security controls, Accept-Encoding handling, and comprehensive test coverage. No significant issues found.

✅ 3 resolved
Quality: Silent exception swallowing hides errors

📄 openmetadata-service/src/main/java/org/openmetadata/service/socket/OpenMetadataAssetServlet.java 📄 openmetadata-service/src/main/java/org/openmetadata/service/socket/OpenMetadataAssetServlet.java
The exception handling at lines 53-55 and 70-72 silently catches and ignores all exceptions with just "// Ignore and try next" or "// Fallback to default behavior". While falling back gracefully is appropriate, completely swallowing exceptions makes debugging difficult in production.

Consider:

  1. At minimum, log the exception at DEBUG or TRACE level for troubleshooting
  2. Be more specific about which exceptions are expected (e.g., IOException) vs unexpected

Example fix:

} catch (IOException e) {
    LOG.debug("Failed to check for brotli compressed asset: {}", fullResourcePath, e);
}
Edge Case: Accept-Encoding parsing doesn't handle q-values properly

📄 openmetadata-service/src/main/java/org/openmetadata/service/socket/OpenMetadataAssetServlet.java
The current implementation uses simple contains("br") and contains("gzip") checks on the Accept-Encoding header, which doesn't properly handle the HTTP specification for encoding quality values (q-values).

For example, a client might send:

  • Accept-Encoding: gzip;q=1.0, br;q=0 - meaning they prefer gzip and explicitly don't want brotli
  • Accept-Encoding: br;q=0.5, gzip;q=1.0 - meaning they prefer gzip over brotli

The current code would serve brotli in both cases because it checks brotli first and only uses contains("br").

While most modern clients send simple Accept-Encoding values without q-values, strictly following the HTTP spec would require parsing q-values. For a more robust solution, consider:

  1. Checking for br;q=0 patterns to detect explicitly rejected encodings
  2. Or documenting this as known behavior if q-value support isn't needed
Security: Path traversal check may be bypassed with resource: URLs

📄 openmetadata-service/src/main/java/org/openmetadata/service/socket/OpenMetadataAssetServlet.java
The path traversal validation in getPathToCheck() uses Paths.get() which operates on filesystem paths, but the actual resource loading uses this.getClass().getResource() which loads from the classpath. These use different resolution mechanisms.

The check Paths.get(fullPath).normalize() validates filesystem-style paths, but classpath resource resolution may handle paths differently (e.g., .. handling in JAR files). Additionally, the path comparison using normalizedPath.startsWith(baseResourcePath) may fail on Windows due to different path separators.

While the parent AssetServlet class likely has its own protections, the inconsistency between validation and actual resource loading could potentially be exploited. Consider:

  1. Using URI-based path normalization instead of filesystem paths
  2. Ensuring the check aligns with how getClass().getResource() actually resolves paths
  3. Adding explicit checks for .. and other traversal patterns in the raw input before normalization

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud
Copy link

@sonarqubecloud
Copy link

@chirag-madlani chirag-madlani merged commit 13f2670 into main Jan 29, 2026
31 of 37 checks passed
@chirag-madlani chirag-madlani deleted the fix-initial-load-time branch January 29, 2026 10:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test Add this label to run secure Github workflows on PRs UI UI specific issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants