Use UTF-8 in item metadata and JSON serialization #17010
Open
ryvnf wants to merge 2 commits into luanti-org:master from […]
sfan5 reviewed on Mar 10, 2026
ryvnf (Contributor, Author) commented:
I also added a […]
ryvnf (Contributor, Author) commented:
The test that failed did so by a […]
Edit: appears to have fixed itself.
SmallJoker (Member) approved these changes on Mar 15, 2026
Works, tested with #12167 (comment) using the sample string
local str = "baum\xe4\x01\xF5\xA3birne"
and libjsoncpp26, version 1.9.6-3.
Result
it's the same
str: 98 97 117 109 228 1 245 163 98 105 114 110 101
str_loaded: 98 97 117 109 228 1 245 163 98 105 114 110 101
Works.
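The byte dump above can be cross-checked against the sample string. The following is a small sketch in Python (chosen purely for illustration; it does not call core.write_json or any engine code) that lists the sample's bytes in decimal and checks whether the sample is valid UTF-8. The variable names are hypothetical.

```python
# The sample string from the review, written as a Python bytes literal.
sample = b"baum\xe4\x01\xf5\xa3birne"

# Decimal dump of every byte, in the same layout as the "str:" line above.
dump = " ".join(str(b) for b in sample)
print("str:", dump)

# 0xe4 announces a multi-byte UTF-8 sequence but is followed by 0x01,
# so strict decoding is expected to fail.
try:
    sample.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
print("valid UTF-8:", is_valid_utf8)
```

Since the sample is not valid UTF-8 and contains the control byte \x01, an identical dump before and after loading suggests the round trip preserves arbitrary bytes, not only well-formed text.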
Goal of the PR
Reduce the size of itemstrings that contain non-ASCII Unicode characters. On master, for example, "☺" (3 bytes in UTF-8) is encoded as "\u00e2\u0098\u00ba" (18 bytes) in itemstrings, which is 6 times larger.
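The factor of 6 follows from the escaping scheme: each UTF-8 byte of "☺" becomes a six-character \u00XX escape. Below is a minimal Python sketch of that arithmetic. The helper escape_bytewise is hypothetical and only mimics the output quoted above; it is not the engine's serializer.

```python
def escape_bytewise(text: str) -> str:
    """Escape every non-ASCII UTF-8 byte as \\u00XX (hypothetical helper)."""
    out = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            # ASCII bytes pass through unchanged.
            out.append(chr(byte))
        else:
            # Each non-ASCII byte costs six characters.
            out.append("\\u00%02x" % byte)
    return "".join(out)


smiley = "☺"
raw_size = len(smiley.encode("utf-8"))   # bytes when stored as UTF-8
escaped = escape_bytewise(smiley)
escaped_size = len(escaped)              # bytes when stored escaped

print(escaped)
print(raw_size, escaped_size, escaped_size // raw_size)
```

For "☺" this prints \u00e2\u0098\u00ba followed by 3 18 6, matching the numbers in the description.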
How does the PR work
Make core.write_json emit UTF-8 using the emitUTF8 setting
[…] \u00XX for item and node metadata
core.serialize already preserves UTF-8 encoded text so no change was necessary there
[…] \u00XX so metadata separators like \x01\x02 can be nested
Does it resolve any reported issue?
Implements what was suggested in #17007
Does this relate to a goal in the roadmap?
Probably not.
If not a bug fix, why is this PR needed? What usecases does it solve?
[…] string.len consistent. For example, in Minetest Game the amount of text a user can enter into a book is limited to 10 kB using string.len. If a user decides to spam the book with ☺, they can create a book itemstring that contains 60 kB instead of the 10 kB you would expect.
If you have used an LLM/AI to help with code or assets, you must disclose this.
I have not.
Todo
How to test
Do the following in devtest:
[…] dump_itemstring command and verify it contains "☺" instead of "\u00e2\u0098\u00ba"
There may be other things that need testing that I have not considered.