fix: unified lookahead for unescaped quotes (fixes #129, #144) #149

Open
majiayu000 wants to merge 4 commits into josdejong:main from majiayu000:fix/issue-144-unescaped-quotes-with-parentheses

@majiayu000 majiayu000 commented Dec 11, 2025

Summary

Unified fix for unescaped double quotes inside strings: a single lookahead checks whether a quote is followed by valid JSON tokens to decide whether it really ends the string.

This PR handles:

Examples of now correctly repaired JSON

// Issue #144
jsonrepair('{ "height": "53"" }')      // -> '{ "height": "53\"" }'
jsonrepair('{ "height": "(5\'3")" }')  // -> '{ "height": "(5\'3\")" }'

// Issue #129  
jsonrepair('{"key": "become an "Airbnb-free zone", which is a political decision."}')
// -> '{"key": "become an \"Airbnb-free zone\", which is a political decision."}'
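
For readers following along, the lookahead idea can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code in this PR: the helper names (`isLikelyEndQuote`, `startsValue`, `escapeInnerQuotes`) are hypothetical, and the sketch only escapes inner quotes instead of performing jsonrepair's other repairs. A quote is treated as an end quote only when it is followed by `}`, `]`, `:`, the end of input, or a comma that is itself followed by something that can start a JSON value or key.

```javascript
// Hypothetical sketch; none of these helpers are part of jsonrepair's API.
const isWhitespace = (ch) => ch === ' ' || ch === '\n' || ch === '\t' || ch === '\r'

function skipWhitespace(text, i) {
  while (i < text.length && isWhitespace(text[i])) i++
  return i
}

// Can a JSON value (or an object key) plausibly start at index i?
function startsValue(text, i) {
  const ch = text[i]
  if (ch === undefined) return false
  if (ch === '"' || ch === '{' || ch === '[' || ch === '-') return true
  if (ch >= '0' && ch <= '9') return true
  return ['true', 'false', 'null'].some((word) => text.startsWith(word, i))
}

// Decide whether the quote at index i closes the current string.
function isLikelyEndQuote(text, i) {
  const next = skipWhitespace(text, i + 1)
  const ch = text[next]
  if (next >= text.length || ch === '}' || ch === ']' || ch === ':') return true
  if (ch === ',') {
    // Second lookahead step: a real delimiter comma is followed by a value or key.
    return startsValue(text, skipWhitespace(text, next + 1))
  }
  return false
}

// Escape every quote inside a string that does not look like an end quote.
function escapeInnerQuotes(text) {
  let out = ''
  let inString = false
  for (let i = 0; i < text.length; i++) {
    const ch = text[i]
    if (!inString) {
      if (ch === '"') inString = true
      out += ch
    } else if (ch === '\\') {
      // Copy an existing escape sequence unchanged.
      out += ch + (text[i + 1] ?? '')
      i++
    } else if (ch === '"' && isLikelyEndQuote(text, i)) {
      inString = false
      out += ch
    } else if (ch === '"') {
      out += '\\"'
    } else {
      out += ch
    }
  }
  return out
}

console.log(escapeInnerQuotes('{ "height": "53"" }'))     // { "height": "53\"" }
console.log(escapeInnerQuotes('{ "height": "(5\'3")" }')) // { "height": "(5'3\")" }
```

The heuristic is deliberately conservative and can still misjudge strings whose content itself looks like JSON (for example text containing `", "`), which is one reason the real implementation needs more care than this sketch.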

Test plan

Closes #144
Closes #129

🤖 Generated with Claude Code

…r quote

Fixes issue josdejong#144 where JSON strings containing unescaped double quotes
followed by parentheses or another quote would fail to parse.

Examples of now correctly repaired JSON:
- { "height": "53"" } -> { "height": "53\"" }
- { "height": "(5'3")" } -> { "height": "(5'3\")" }
- {"a": "test")" } -> {"a": "test\")" }
@josdejong
Owner

Thanks @majiayu000, I'll review your PR soon.

Owner

@josdejong josdejong left a comment

I think the issue addressed here is more generic, and the two PRs that you have provided (#148, #149) should probably become a single PR which extends the condition that checks whether a double quote " is actually an end quote by looking ahead two steps to see whether it is followed by valid JSON tokens. What do you think?

})

test('should escape unescaped double quotes followed by parentheses (issue #144)', () => {
expect(jsonrepair('{ "height": "53"" }')).toBe('{ "height": "53\\"" }')
Owner

Can you also add a test case for #114, for strings like a 40" television?

…dejong#144)

Unified the quote validation logic to handle multiple cases:
- Quotes followed by comma (issue josdejong#129)
- Quotes followed by parentheses or another quote (issue josdejong#144)

The solution uses a single lookahead approach that checks whether
a quote is followed by valid JSON tokens to determine if it's
a real end quote or an unescaped quote inside the string.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@majiayu000
Author

Thanks for the feedback! You're right - I've unified the two PRs into this single one.

The new implementation uses a unified lookahead approach that checks whether a quote is followed by valid JSON tokens (looking ahead 2+ steps). This now handles:

I'll close PR #148 as it's now covered here.

@majiayu000 majiayu000 changed the title from "fix: repair unescaped double quotes followed by parentheses or another quote" to "fix: unified lookahead for unescaped quotes (fixes #129, #144)" on Dec 17, 2025
@josdejong
Owner

Thanks for the update. The feedback I posted in #148 (review) still holds, can you address those concerns please? Can you also address #151 and #114 in this PR? I think #144, #129, #114, and #151 all have a similar cause and the same solution.

…g#151)

Refactor lookahead logic into helper functions and add support for
quotes followed by slash and measurement units like 65".

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@majiayu000
Author

Thanks for the feedback! I've made the following updates:

Let me know if anything else needs adjustment.

@straxico

@josdejong please merge and release new version

@josdejong
Owner

I have some doubts about the solution: I think it duplicates quite a bit of parsing logic. What we are basically trying to do is parse two steps ahead to see what we encounter. Maybe we need to separate getting the next token from the input from the parsing step? Then we could quite easily look up the next two tokens without duplicating parsing logic. Or do you see other ways to better reuse the existing logic and keep the code more maintainable? What do you think, @majiayu000?
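
To make the suggestion concrete, here is a minimal sketch of what separating tokenization from parsing could look like. It is an assumption-laden illustration rather than jsonrepair's design: `tokenize`, `createTokenStream`, and `peek` are hypothetical names, and the tokenizer ignores the hard part discussed in this PR (deciding where a damaged string ends). It only shows how a parser could look one or two tokens ahead without re-implementing any scanning logic.

```javascript
// Hypothetical sketch; these names do not exist in jsonrepair.
function* tokenize(text) {
  let i = 0
  while (i < text.length) {
    const ch = text[i]
    if (/\s/.test(ch)) {
      i++
      continue
    }
    if ('{}[]:,'.includes(ch)) {
      yield { type: 'punct', value: ch }
      i++
      continue
    }
    if (ch === '"') {
      // Naive string scan: stop at the next quote that is not backslash-escaped.
      let j = i + 1
      while (j < text.length && text[j] !== '"') j += text[j] === '\\' ? 2 : 1
      yield { type: 'string', value: text.slice(i, j + 1) }
      i = j + 1
      continue
    }
    // Numbers, literals, and anything else: read up to the next delimiter.
    let j = i
    while (j < text.length && !/[\s{}\[\]:,"]/.test(text[j])) j++
    yield { type: 'word', value: text.slice(i, j) }
    i = j
  }
}

// Wrap the generator in a small buffer so the parser can peek n tokens ahead.
function createTokenStream(text) {
  const iterator = tokenize(text)
  const buffer = []
  const fill = (n) => {
    while (buffer.length < n) {
      const { value, done } = iterator.next()
      if (done) break
      buffer.push(value)
    }
  }
  return {
    // Consume and return the next token (undefined at end of input).
    next() {
      fill(1)
      return buffer.shift()
    },
    // Return the n-th upcoming token without consuming anything.
    peek(n = 1) {
      fill(n)
      return buffer[n - 1]
    }
  }
}

const stream = createTokenStream('{"a": 1, "b": [true]}')
console.log(stream.peek(2).value) // "a"
console.log(stream.next().value)  // {
```

With a stream like this, the "is this quote really an end quote?" check could be phrased in terms of `peek(1)` and `peek(2)`, so the lookahead and the parser share one scanner.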

Development

Successfully merging this pull request may close these issues.

- repair unescaped double quotes followed by parentheses
- Failing when parsing double quotes inside a string which are followed by a Comma

3 participants