fix: unified lookahead for unescaped quotes (fixes #129, #144)#149
fix: unified lookahead for unescaped quotes (fixes #129, #144)#149majiayu000 wants to merge 4 commits intojosdejong:mainfrom
Conversation
…r quote Fixes issue josdejong#144 where JSON strings containing unescaped double quotes followed by parentheses or another quote would fail to parse. Examples of now correctly repaired JSON: - { "height": "53"" } -> { "height": "53\"" } - { "height": "(5'3")" } -> { "height": "(5'3\")" } - {"a": "test")" } -> {"a": "test\")" }
|
Thanks @majiayu000, I'll review your PR soon. |
josdejong
left a comment
There was a problem hiding this comment.
I think the issue addressed here is more generic and probably the two PR's that you have provided (#148, #149) should become a single PR which extends the condition to check whether a double-quote " is actually an end quote by looking ahead two steps to see whether it is followed by valid JSON tokens. What do you think?
| }) | ||
|
|
||
| test('should escape unescaped double quotes followed by parentheses (issue #144)', () => { | ||
| expect(jsonrepair('{ "height": "53"" }')).toBe('{ "height": "53\\"" }') |
There was a problem hiding this comment.
Can you also add a test case for #114, for strings like a 40" televison?
…dejong#144) Unified the quote validation logic to handle multiple cases: - Quotes followed by comma (issue josdejong#129) - Quotes followed by parentheses or another quote (issue josdejong#144) The solution uses a single lookahead approach that checks whether a quote is followed by valid JSON tokens to determine if it's a real end quote or an unescaped quote inside the string. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Thanks for the feedback! You're right - I've unified the two PRs into this single one. The new implementation uses a unified lookahead approach that checks whether a quote is followed by valid JSON tokens (looking ahead 2+ steps). This now handles:
I'll close PR #148 as it's now covered here. |
…g#151) Refactor lookahead logic into helper functions and add support for quotes followed by slash and measurement units like 65". 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Thanks for the feedback! I've made the following updates:
Let me know if anything else needs adjustment. |
|
@josdejong please merge and release new version |
|
I'm a bit in doubt about the solution. I think that it duplicates quite some parsing logic. What we basically try to do is parse two steps ahead to see what we encounter. Maybe we need to separate getting the next token from the input from the parsing step? Then we could quite easily look up the next two tokens without duplicating parsing logic. Or do you see other ways to better reuse the existing logic and keep the code better maintainable? What do you think @majiayu000? |
Summary
Unified fix for unescaped double quotes inside strings by using a single lookahead approach that checks whether a quote is followed by valid JSON tokens.
This PR handles:
"53"","(5'3")")"become an "Airbnb-free zone", which...")Examples of now correctly repaired JSON
Test plan
Closes #144
Closes #129
🤖 Generated with Claude Code