feat: new string format "extended-unicode" #760
nigrosimone wants to merge 6 commits into fastify:main from
Eomm left a comment:
I wonder if setting the threshold (via fastJson(schema, { stringOptimizations: {...} })) would do the job without introducing a non-standard format. A non-standard format may break other side features, such as a Swagger viewer.
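To make the threshold idea concrete, here is a minimal sketch of a string serializer whose regex check can be switched off by a configurable length threshold. The names buildAsString and regexThreshold, the default of 5000, and the exact regex are assumptions made for illustration; they are not an existing fast-json-stringify API, and the shape of stringOptimizations is not specified in this thread.

```javascript
// Hypothetical sketch: `buildAsString` and `regexThreshold` are illustrative names,
// not an existing fast-json-stringify API. The default of 5000 is an arbitrary assumption.
// The regex matches control characters, the double quote, the backslash, and surrogates.
const STR_ESCAPE = /[\u0000-\u001f\u0022\u005c\ud800-\udfff]/

function buildAsString ({ regexThreshold = 5000 } = {}) {
  return function asString (str) {
    // Below the threshold, pay for the regex test to avoid JSON.stringify when possible.
    if (str.length < regexThreshold && STR_ESCAPE.test(str) === false) {
      return '"' + str + '"'
    }
    // At or above the threshold, or when escaping is needed, use JSON.stringify directly.
    return JSON.stringify(str)
  }
}

// A threshold of 0 skips the regex test for every string.
const alwaysStringify = buildAsString({ regexThreshold: 0 })
const withDefault = buildAsString()

console.log(alwaysStringify('line\nbreak'))
console.log(withDefault('plain'))
```

With a threshold of 0 the serializer behaves as a dedicated format would, but the schema keeps the standard string type, which is the trade-off raised in the comment above.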
I have also introduced the "unsafe" string format into FJS. But stringOptimizations seems good to me, for both unsafe and dirty.
Signed-off-by: Nigro Simone <nigro.simone@gmail.com>
I asked Claude for some technically descriptive alternatives to the name dirty. That's what Claude answered:
Looking at the code and documentation, the dirty format is used for strings that are "known to contain non-printable characters or surrogate pairs."
Here are some more technically descriptive name suggestions:
Most technically accurate:
- non-printable - directly describes what it handles (non-printable characters)
- surrogate-pairs - references the specific Unicode issue it addresses
- extended-unicode - indicates it handles extended Unicode characters
More general but descriptive:
- escaped - indicates the string needs special character escaping
- control-chars - references control/non-printable characters
- binary-safe - indicates it safely handles binary/control characters
My recommendation would be non-printable as it most accurately describes the primary use case - strings containing characters that aren't standard printable ASCII/Unicode characters. This is more technically precise than "dirty" while remaining clear about its purpose.
Alternatively, extended-unicode would work well if you want to emphasize that it's specifically for handling Unicode edge cases like surrogate pairs.
I personally lean toward extended-unicode.
For strings known to contain non-printable characters or surrogate pairs.
For very long strings that are known for certain to contain characters to escape, calling asString is a waste of time because it always executes the STR_ESCAPE regex first, e.g. a long product description with new lines.
Checklist
npm run test and npm run benchmark
and the Code of conduct
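The point made in the description can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's actual source: the function bodies, the name asExtendedUnicodeString, and the exact regex are assumptions chosen to show why the regex test is redundant when a string is already known to need escaping.

```javascript
// Sketch only: a simplified model of the two code paths, not the library's real implementation.
// The regex matches control characters, the double quote, the backslash, and surrogates.
const STR_ESCAPE = /[\u0000-\u001f\u0022\u005c\ud800-\udfff]/

// Default path: run the regex first, and fall back to JSON.stringify when escaping is needed.
function asString (str) {
  if (STR_ESCAPE.test(str) === false) {
    return '"' + str + '"'
  }
  return JSON.stringify(str)
}

// Hypothetical path for strings declared as needing escaping: skip the regex test entirely.
function asExtendedUnicodeString (str) {
  return JSON.stringify(str)
}

// A long description with new lines always ends up in JSON.stringify,
// so the regex test in asString is extra work with the same result.
const description = 'First line of a long product description\n'.repeat(1000)
console.log(asString(description) === asExtendedUnicodeString(description))
```

Both functions produce the same output for every input; the only difference is whether the regex runs before JSON.stringify.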
Benchmark