fix: prevent apply_diff hanging on XML parsing with external interference (#4852) #6774
…ence (#4852)
- Implement dual-parser system with automatic fallback
- Add XMLParserManager class to encapsulate parser state
- Detect and handle xml2js interference from other extensions
- Add circuit breaker pattern to prevent infinite loops
- Add comprehensive test coverage (22 tests)
- Remove all debug logging for production readiness
Pull Request Overview
This PR fixes a critical issue where the apply_diff tool would hang indefinitely on large or complex XML files due to interference from external XML parsers (specifically xml2js loaded by other VSCode extensions). The solution implements a robust dual-parser system with automatic fallback and circuit breaker protection.
- Implements dual-parser system: fast-xml-parser as primary, regex-based fallback for xml2js interference
- Adds circuit breaker pattern to prevent repeated parser failures
- Enhances error detection and XML validation with improved diagnostics
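The primary-plus-fallback arrangement with a circuit breaker can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `XMLParserManager`: the class name `FallbackParser`, the `Parser` type, and the threshold of 3 are all assumptions, and both parsers are injected as plain functions so the sketch does not depend on any particular XML library.

```typescript
// Hypothetical sketch; names and the default threshold are assumptions.
type Parser<T> = (xml: string) => T

class FallbackParser<T> {
	private consecutiveFailures = 0
	private readonly primary: Parser<T>
	private readonly fallback: Parser<T>
	private readonly maxFailures: number

	constructor(primary: Parser<T>, fallback: Parser<T>, maxFailures = 3) {
		this.primary = primary
		this.fallback = fallback
		this.maxFailures = maxFailures
	}

	// True once the primary parser has failed maxFailures times in a row.
	get circuitOpen(): boolean {
		return this.consecutiveFailures >= this.maxFailures
	}

	parse(xml: string): T {
		// Circuit open: skip the primary parser and go straight to the fallback.
		if (this.circuitOpen) {
			return this.fallback(xml)
		}
		try {
			const result = this.primary(xml)
			this.consecutiveFailures = 0 // a success resets the counter
			return result
		} catch {
			// Any primary failure counts toward the threshold, then we fall back.
			this.consecutiveFailures++
			return this.fallback(xml)
		}
	}
}
```

In this sketch a caller would pass the library-backed parser as `primary` and the regex-based one as `fallback`; the breaker only decides whether the primary parser is attempted at all.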
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/utils/xml.ts | Core implementation of dual-parser system with fallback logic and circuit breaker |
| src/core/tools/multiApplyDiffTool.ts | XML validation and enhanced error handling with telemetry improvements |
| src/core/tools/tests/multiApplyDiffTool.test.ts | Comprehensive test suite covering validation, fallback scenarios, and edge cases |
| src/core/diff/strategies/tests/issue-4852-extended.spec.ts | Extended test suite focusing on xml2js interference detection and fallback parser functionality |
Thank you for your contribution! I've reviewed the changes and the implementation looks solid overall. The dual-parser approach with fallback is a clever solution to handle external XML parser interference. I have some suggestions for improvement that could make the solution even more robust.
- Add comprehensive documentation explaining dual-parser system
- Add size limit check (10MB) for fallback regex parser to prevent memory exhaustion
- Improve error handling for non-Error types thrown by third-party code
- Make path extraction throw explicit error instead of returning null
- Improve tag balance validation to account for self-closing tags
- Simplify error messages to be more user-friendly
- Add thread-safety documentation note for parser instance
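Three of the items above (the size limit, non-Error handling, and self-closing-aware tag balancing) might look roughly like the sketch below. All function names and the constant name are hypothetical, the size check approximates bytes with string length, and the tag pattern deliberately ignores comments, CDATA sections, processing instructions, and attribute values containing `>`.

```typescript
// Hypothetical helpers; not the PR's actual code.
const MAX_FALLBACK_SIZE = 10 * 1024 * 1024 // the 10MB limit named in the commit

// Third-party code may throw strings or plain objects, so normalize to Error.
function toError(thrown: unknown): Error {
	return thrown instanceof Error ? thrown : new Error(String(thrown))
}

// Uses string length (UTF-16 code units) as an approximation of size.
function assertWithinSizeLimit(xml: string, limit: number = MAX_FALLBACK_SIZE): void {
	if (xml.length > limit) {
		throw new Error(`XML input of length ${xml.length} exceeds the limit of ${limit}`)
	}
}

// Checks that opening and closing tags nest correctly, skipping self-closing tags.
function tagsAreBalanced(xml: string): boolean {
	const stack: string[] = []
	// Groups: 1 = "/" for a closing tag, 2 = tag name, 3 = "/" for a self-closing tag.
	const tagPattern = /<(\/?)([A-Za-z_][\w.-]*)[^<>]*?(\/?)>/g
	let match: RegExpExecArray | null
	while ((match = tagPattern.exec(xml)) !== null) {
		const isClosing = match[1] === "/"
		const name = match[2]
		const isSelfClosing = match[3] === "/"
		if (isSelfClosing) continue // <br/> neither opens nor closes anything
		if (isClosing) {
			if (stack.pop() !== name) return false
		} else {
			stack.push(name)
		}
	}
	return stack.length === 0
}
```

Running checks like these before the regex fallback keeps obviously malformed or oversized input from reaching it.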
…MAX_FAILURES logic
- Remove circuit breaker pattern with MAX_FAILURES counter
- Use immediate fallback for any parse error, not just addChild errors
- Simplify error handling logic as retrying the same XML parse multiple times is unnecessary
- Update tests to reflect the simplified error handling approach
@hannesrudolph Do you think it's worth adding this just to prevent conflicts with other XML parsers? Is there a way to prevent interference from them without having to implement this?
- Updated PR description to clarify fast-xml-parser throws the error, not xml2js
- Removed all xml2js interference references from code and comments
- Updated tests to reflect the actual issue (fast-xml-parser error on complex XML)
- The issue is about fast-xml-parser limitations, not external extension interference
PR Updated - Corrected Root Cause Description
I've updated this PR to accurately reflect the actual issue. The problem is not about interference from xml2js or other extensions, but rather a limitation/bug in fast-xml-parser itself when handling complex XML structures.
Changes Made:
The Real Issue:
@daniel-lxs Your question was spot on - there is no actual conflict with other XML parsers. The solution remains valid (fallback parser for when fast-xml-parser fails), but the attribution to external interference was incorrect.
- Fixed the root cause of fast-xml-parser errors by properly escaping content in XML examples
- Used CDATA sections for diff content that contains special characters
- This prevents the parser from misinterpreting markdown backticks as XML structure
Alternative Solution Found
After further investigation, I've identified the actual root cause and created a much cleaner fix in PR #6811.
The Real Issue
The problem wasn't fast-xml-parser having a bug or limitation - it was that the XML examples in …
The Proper Fix
PR #6811 simply uses CDATA sections to properly escape the content, which is the standard XML way to handle special characters. This eliminates the need for:
I recommend closing this PR in favor of #6811, which is a much simpler and more correct solution.
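As a rough illustration of the CDATA approach (not the code from PR #6811), a helper along these lines wraps arbitrary content so an XML parser treats it as literal text. The function name `wrapInCdata` and the `<diff>` element in the usage lines are hypothetical; the one XML-defined detail is that the sequence `]]>` cannot appear inside a CDATA section, so it is split across two adjacent sections.

```typescript
// Hypothetical helper; assumes the content should be passed through verbatim.
function wrapInCdata(content: string): string {
	// "]]>" would terminate the section early, so split it across two sections.
	const safe = content.split("]]>").join("]]]]><![CDATA[>")
	return "<![CDATA[" + safe + "]]>"
}

// Usage: diff content with angle brackets and backticks stays literal text.
const diffBody = "if (a < b && c > d) { run(`cmd`) }"
const example = "<diff>" + wrapInCdata(diffBody) + "</diff>"
```

Because the parser never interprets the wrapped text as markup, characters such as `<`, `&`, and backticks in the diff body need no further escaping.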
Related GitHub Issue
Closes: #4852
Roo Code Task Context (Optional)
Issue Fixer Orchestrator workflow completed with comprehensive analysis, implementation, testing, and critical review phases.
Description
This PR fixes the apply_diff tool throwing "Cannot read properties of undefined (reading 'addChild')" errors when parsing large or complex XML files. The error occurs due to a bug/limitation in the fast-xml-parser library when handling deeply nested or complex XML structures.
The solution implements a robust dual-parser system:
This ensures the tool can handle edge cases where fast-xml-parser encounters parsing errors on complex XML structures.
Key Changes
Dual-parser system (src/utils/xml.ts):
Enhanced error handling (src/core/tools/multiApplyDiffTool.ts):
Comprehensive tests:
Testing
Notes
The fallback parser ensures the tool remains functional even when the primary parser encounters edge cases, providing a more robust solution for handling various XML structures.