Skip to content

Conversation

@ShashankFC
Copy link
Collaborator

Test 7nn---nReplicated from ai-code-review-evaluation/keycloak-greptile#7

Closes #37428

Signed-off-by: Alexander Schwartz <[email protected]>
@GitHoobar
Copy link

@claude

1 similar comment
@ShashankFC
Copy link
Collaborator Author

@claude

@claude
Copy link

claude bot commented Jan 12, 2026

Claude finished @ShashankFC's task in 2m 7s —— View job


Code Review Complete

Todo List:

  • Check for repository guidelines (CLAUDE.md)
  • Analyze PR changes and diff
  • Review HTML sanitizer implementation
  • Review test coverage
  • Check message property changes
  • Provide comprehensive feedback

Summary

This PR implements HTML sanitization for translated message resources in Keycloak to prevent XSS vulnerabilities through malicious translations. The implementation is well-designed with comprehensive test coverage. However, there are several issues that need to be addressed.


Critical Issues

1. Logic Error in santizeAnchors Method (misc/theme-verifier/src/main/java/org/keycloak/themeverifier/VerifyMessageProperties.java:150-162)

The anchor sanitization logic has a fundamental flaw. It modifies the value string while iterating, which will cause the matcher position to become invalid:

private String santizeAnchors(String key, String value, String englishValue) {
    Matcher matcher = ANCHOR_PATTERN.matcher(value);
    Matcher englishMatcher = ANCHOR_PATTERN.matcher(englishValue);
    while (matcher.find()) {
        if (englishMatcher.find() && Objects.equals(matcher.group(), englishMatcher.group())) {
            value = value.replaceFirst(Pattern.quote(englishMatcher.group()), "");  // ❌ Modifying string during iteration
        } else {
            messages.add("Didn't find anchor tag " + matcher.group() + " in original string");
            break;
        }
    }
    return value;
}

Problem: When you replace text in value, the matcher positions become invalid for subsequent iterations.

Suggested fix: Build a list of anchor tags to remove and process them after iteration completes, or create a new matcher after each replacement.

Fix this →


2. Typo in Method Name (misc/theme-verifier/src/main/java/org/keycloak/themeverifier/VerifyMessageProperties.java:88, 150)

Method name is santizeAnchors but should be sanitizeAnchors (missing 'i').

Fix this →


High Priority Issues

3. Incomplete HTML Tag Validation (misc/theme-verifier/src/main/java/org/keycloak/themeverifier/VerifyMessageProperties.java:58-61)

The allowed HTML elements policy only permits br, p, strong, and b, but anchor tags <a> are handled separately. This creates an inconsistency:

PolicyFactory POLICY_SOME_HTML = new org.owasp.html.HtmlPolicyBuilder()
        .allowElements(
                "br", "p", "strong", "b"
        ).toFactory();

Issue: The santizeAnchors method validates anchor tags but they're not in the allowed elements list, so they would be stripped by the sanitizer anyway. The validation logic seems backward - anchors are checked first, then stripped.

Recommendation: Either:

  • Add "a" to the allowed elements with .allowUrlProtocols("http", "https") and .allowAttributes("href").onElements("a"), OR
  • Remove the santizeAnchors method entirely if anchors shouldn't be allowed

4. Missing Test for Successful Anchor Validation

The tests verify that changed anchors are detected (misc/theme-verifier/src/test/java/org/keycloak/themeverifier/VerifyMessagePropertiesTest.java:49-52), but there's no test verifying that matching anchors between English and translated versions are correctly accepted.

Recommendation: Add a test case with matching anchors to ensure the validation doesn't produce false positives.


5. Hardcoded Path Manipulation (misc/theme-verifier/src/main/java/org/keycloak/themeverifier/VerifyMessageProperties.java:74-75)

String englishFile = file.getAbsolutePath().replaceAll("resources-community", "resources")
        .replaceAll("_[a-zA-Z-_]*\\.properties", "_en.properties");

Issues:

  • The regex _[a-zA-Z-_]*\\.properties is overly broad and could match unintended patterns
  • No validation that the English file exists before attempting to read it
  • The error message if the file doesn't exist is cryptic (line 79)

Recommendations:

  • Add Files.exists() check with a clear error message
  • Use a more specific regex pattern like _([a-z]{2}(_[A-Z]{2})?)\\.properties$
  • Consider making the English file path configurable

Medium Priority Issues

6. Unclear Variable Naming

In misc/theme-verifier/src/main/java/org/keycloak/themeverifier/VerifyMessageProperties.java:102-109, variables start and end represent the number of matching characters, not positions. Consider renaming to matchingPrefix and matchingSuffix for clarity.


7. Comment Typo (misc/theme-verifier/src/main/java/org/keycloak/themeverifier/VerifyMessageProperties.java:94)

"numberic tags" should be "numeric tags"

Fix this →


8. Missing Edge Case Handling

The normalizeValue method (lines 117-137) has several special cases, but:

  • The email RTL style handling has a TODO comment suggesting this should be moved (line 133)
  • No handling for malformed HTML or unclosed tags before sanitization

9. Test Resource Incompleteness

The test file illegalHtmlTag_en.properties has an unclosed <div> tag (line 17: key=Some <div>tag</div). While this may be intentional to test malformed HTML, it's not clear from the test name. Consider:

  • Adding a separate test for unclosed tags
  • Renaming to be more specific about what's being tested

Low Priority / Code Quality Issues

10. Inconsistent Exception Handling

  • Line 53: Throws MojoExecutionException
  • Line 70, 79: Throws RuntimeException

Recommendation: Standardize exception handling - either wrap all IOExceptions in MojoExecutionException or let them propagate.


11. Magic String in Regex

The regex pattern on line 139 (Pattern HTML_TAGS = Pattern.compile("<[a-z]+[^>]*>")) only matches lowercase tags. HTML tags are case-insensitive, so this might miss uppercase tags like <DIV>.

Recommendation: Use (?i)<[a-z]+[^>]*> for case-insensitive matching.


12. Potential Performance Issue

The verifySafeHtml method reads the English file once for the entire verification, but then reads it again using PropertyResourceBundle. Consider caching or reusing the bundle.


Security Assessment

Good security practices:

  • Uses OWASP HTML Sanitizer library (industry standard)
  • Validates translations against source English text
  • Restricts allowed HTML elements to safe subset
  • Prevents injection of arbitrary HTML

⚠️ Potential concerns:


Message Property Changes

The changes to message properties files appear correct:

  • Standardizing <br/> to <br /> (proper XHTML format)
  • Adding proper choice format for pluralization in error-invalid-multivalued-size

Example from themes/src/main/resources/theme/base/login/messages/messages_en.properties:530:

-error-invalid-multivalued-size=Attribute {0} must have at least {1} and at most {2} value(s).
+error-invalid-multivalued-size=Attribute {0} must have at least {1} and at most {2} {2,choice,0#values|1#value|1<values}.

This is an improvement using Java's ChoiceFormat for proper pluralization.


Test Coverage

✅ Tests cover:

  • Duplicate key detection
  • Illegal HTML tag detection
  • HTML in non-HTML contexts
  • Changed anchor tags

❌ Missing tests for:

  • Valid matching anchor tags (false positive check)
  • Edge cases (empty values, special characters)
  • All special normalization cases (templateHelp, optimizeLookupHelp, etc.)

Recommendations Priority

  1. MUST FIX - Critical Issue Fixing Re-authentication with passkeys #1: Fix the santizeAnchors iteration bug
  2. MUST FIX - Critical Issue Add AuthzClientCryptoProvider for authorization client cryptographic operations #3: Clarify anchor tag handling strategy
  3. SHOULD FIX - High Priority Add rolling-updates feature flag and compatibility framework #4: Add test for successful anchor validation
  4. SHOULD FIX - High Priority Add Client resource type and scopes to authorization schema #5: Improve English file path handling
  5. NICE TO HAVE - Other medium/low priority issues

Overall Assessment

The PR addresses an important security concern (XSS through malicious translations) with a solid approach using OWASP's HTML Sanitizer. The test coverage demonstrates thoughtful design. However, the anchor sanitization logic has critical bugs that will likely cause incorrect behavior. These issues should be resolved before merging.

Recommendation: Request changes to address Critical Issues #1 and #3, and High Priority Issues #4-5.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants