fix: specify UTF-8 charset in getBytes() calls on security-sensitive paths#26946
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughThis PR adds explicit UTF-8 encoding (StandardCharsets.UTF_8) to password handling in an LDAP test utility and to SAML message encoding/parsing in SAML2 SSO test helpers, replacing implicit platform-default byte conversions. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@splinter @gnudeep @hevayo @chaturadilan Hi team, I've opened a PR to fix a medium-severity bug where String.getBytes() was called without an explicit charset in two security-sensitive files. This is a straightforward but important fix — the missing charset causes platform-dependent byte encoding, which can lead to inconsistent SAML assertion encoding and non-deterministic password hashing across different OS environments. Would appreciate a review, particularly on the ExternalLDAPServer password hashing path to confirm the UTF-8 encoding aligns with how passwords are expected to be stored and verified in the LDAP integration tests Thanks regards, |
There was a problem hiding this comment.
Pull request overview
This PR makes byte conversions deterministic in security-sensitive SAML and LDAP password-handling code paths by replacing platform-default String.getBytes() calls with explicit UTF-8 encoding.
Changes:
- Use
getBytes(StandardCharsets.UTF_8)for SAML POST-binding and fallback Base64 encoding. - Parse SAML XML using UTF-8 bytes in
getDocument(...). - Use UTF-8 for password hashing/storage byte conversions in
ExternalLDAPServer.changeConnectionUserPassword(...)and add the missingStandardCharsetsimport.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| product-scenarios/scenarios-commons/src/main/java/org/wso2/identity/scenarios/commons/SAML2SSOTestBase.java | Ensures SAML request encoding and SAML XML parsing use explicit UTF-8 rather than platform-default encoding. |
| modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/identity/integration/test/utils/ExternalLDAPServer.java | Ensures LDAP password digest/update and password attribute bytes use explicit UTF-8 and adds required import. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| messageDigest.update(password.getBytes(StandardCharsets.UTF_8)); | ||
| byte[] bytes = messageDigest.digest(); | ||
| String hash = Base64.encode(bytes); | ||
| passwordToStore = passwordToStore + hash; | ||
| adminPrincipal.setUserPassword(passwordToStore.getBytes()); | ||
| adminPrincipal.setUserPassword(passwordToStore.getBytes(StandardCharsets.UTF_8)); | ||
| Attribute passwordAttribute = new DefaultAttribute(getAttributeType("userPassword")); | ||
| try { | ||
| passwordAttribute.add(passwordToStore.getBytes()); | ||
| passwordAttribute.add(passwordToStore.getBytes(StandardCharsets.UTF_8)); | ||
| } catch (LdapInvalidAttributeValueException e) { |
There was a problem hiding this comment.
ExternalLDAPServer still has platform-default getBytes() calls in addAdminPassword(...) (e.g., md.update(password.getBytes()) and passwordToStore.getBytes()), so password hashing/storage can remain non-deterministic on non-UTF-8 platforms. Consider updating those remaining calls to use StandardCharsets.UTF_8 as well for consistency with the changes here.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/identity/integration/test/utils/ExternalLDAPServer.java`:
- Around line 197-204: The admin password bootstrap path still uses
platform-default charset; update addAdminPassword(...) in
ExternalLDAPServer.java so all uses of getBytes() are explicit about UTF-8: when
hashing (MessageDigest.update / digest inputs) and when storing/adding the
password (calls that pass byte[] to adminPrincipal.setUserPassword and
passwordAttribute.add), replace getBytes() with getBytes(StandardCharsets.UTF_8)
for both occurrences so the generated digest and LDAP-stored values are
deterministic across JVM locales.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b75fc77c-68d3-4afd-977a-022ea8e12ac0
📒 Files selected for processing (2)
modules/integration/tests-integration/tests-backend/src/test/java/org/wso2/identity/integration/test/utils/ExternalLDAPServer.javaproduct-scenarios/scenarios-commons/src/main/java/org/wso2/identity/scenarios/commons/SAML2SSOTestBase.java
...tests-backend/src/test/java/org/wso2/identity/integration/test/utils/ExternalLDAPServer.java
Show resolved
Hide resolved
|
|
@splinter @gnudeep @hevayo @chaturadilan Hi team, I've opened a PR to fix a medium-severity bug where String.getBytes() was called without an explicit charset in two security-sensitive files. This is a straightforward but important fix — the missing charset causes platform-dependent byte encoding, which can lead to inconsistent SAML assertion encoding and non-deterministic password hashing across different OS environments. Would appreciate a review, particularly on the ExternalLDAPServer password hashing path to confirm the UTF-8 encoding aligns with how passwords are expected to be stored and verified in the LDAP integration tests Thanks regards, |
|
@NipunaMadhushan I've opened a PR to fix a medium-severity bug where String.getBytes() was called without an explicit charset in two security-sensitive files. This is a straightforward but important fix — the missing charset causes platform-dependent byte encoding, which can lead to inconsistent SAML assertion encoding and non-deterministic password hashing across different OS environments. Would appreciate a review, particularly on the ExternalLDAPServer password hashing path to confirm the UTF-8 encoding aligns with how passwords are expected to be stored and verified in the LDAP integration tests Thanks regards, |



fix: specify UTF-8 charset in getBytes() calls on security-sensitive paths
String.getBytes() without an explicit charset uses the JVM's default platform encoding, which varies by OS and locale. This causes non-deterministic byte output in security-sensitive code paths.
In SAML2SSOTestBase.java, the redirect-binding branch already used UTF-8 explicitly, but the POST-binding and fallback branches did not. This inconsistency could produce garbled SAML assertions or signature verification failures on non-UTF-8 systems. The getDocument() helper also parsed SAML XML using platform-default bytes.
In ExternalLDAPServer.java, three getBytes() calls in the password hashing path were platform-dependent. Since MessageDigest operates on raw bytes, a charset mismatch between the system storing the hash and the system verifying it could cause authentication failures.Changes made:
SAML2SSOTestBase.java: replaced getBytes() with getBytes(StandardCharsets.UTF_8) in the POST-binding branch, fallback branch, and getDocument().
ExternalLDAPServer.java: replaced getBytes() with getBytes(StandardCharsets.UTF_8) in all three password-handling calls and added the missing import for StandardCharsets.
No behavioral change on systems already using UTF-8 as default. Fixes incorrect encoding on Windows or other non-UTF-8 locales. StandardCharsets.UTF_8 is always available per the Java spec so no new checked exceptions are introduced.
Summary by CodeRabbit