test: add Unicode combining character sequence cases to maxLength #847

Open
abhi-03-kh wants to merge 1 commit into json-schema-org:main from abhi-03-kh:feat/unicode-string-tests

@abhi-03-kh

Summary

Adds Unicode-focused test cases to tests/draft2020-12/maxLength.json to verify that maxLength counts Unicode code points.

Technical Context

Per the JSON Schema specification, maxLength is based on the number of Unicode code points in the string, not on the number of grapheme clusters or on the length after Unicode normalization.

This PR adds:

  • A precomposed character (\u00e9), which is a single code point and is valid for maxLength: 1.
  • A combining character sequence (e\u0301), which consists of two code points and must be invalid for maxLength: 1.
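
The expected outcome of the two cases can be illustrated with a minimal sketch. `validatesMaxLength` is a hypothetical helper written for this illustration, not part of the test suite or of any JSON Schema library; it assumes length is measured by iterating the string's code points.

```javascript
// Hypothetical helper: true when the string has at most `max` Unicode code points.
function validatesMaxLength(instance, max) {
  // String iteration yields code points, so spreading the string counts them.
  return [...instance].length <= max;
}

const precomposed = "\u00e9"; // U+00E9, one code point
const combining = "e\u0301";  // U+0065 followed by U+0301, two code points

console.log(validatesMaxLength(precomposed, 1)); // true
console.log(validatesMaxLength(combining, 1));   // false
```

Both strings render as the same accented letter, yet only the precomposed form satisfies maxLength: 1.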

These cases specifically target the distinction between code point length and user-perceived grapheme clusters, a common point of divergence in JSON Schema implementations.
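
As a rough illustration of why implementations can diverge, the sketch below compares three ways of measuring a string. The `measure` function is a hypothetical name used only here, and the grapheme count assumes a runtime that provides `Intl.Segmenter` (for example, a recent Node.js release). The emoji string is an extra example, not one of the cases added by this PR.

```javascript
// Three ways to measure a string; only the code point count matches maxLength.
function measure(s) {
  // Assumption: Intl.Segmenter is available in the runtime.
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    codeUnits: s.length,                          // UTF-16 code units
    codePoints: [...s].length,                    // Unicode code points
    graphemes: [...segmenter.segment(s)].length,  // user-perceived characters
  };
}

console.log(measure("e\u0301"));   // { codeUnits: 2, codePoints: 2, graphemes: 1 }
console.log(measure("\u{1F600}")); // { codeUnits: 2, codePoints: 1, graphemes: 1 }
```

An implementation that counts graphemes would wrongly accept e\u0301 for maxLength: 1, while one that counts UTF-16 code units would wrongly reject a single character outside the Basic Multilingual Plane.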

Validation

Verified locally with Node.js that the modified file parses as valid JSON:

node -e "JSON.parse(require('fs').readFileSync('tests/draft2020-12/maxLength.json','utf8')); console.log('JSON OK')"
