fix: handle YAML parsing edge cases in CustomModesManager #5099
- Add BOM (Byte Order Mark) stripping for UTF-8 and UTF-16
- Normalize invisible characters including non-breaking spaces
- Replace fancy quotes and dashes with standard characters
- Remove zero-width characters that can cause parsing issues
- Add comprehensive test coverage for all edge cases

This fixes the YAML parsing limitations documented in PR #237 by implementing proper preprocessing before parsing YAML content.
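The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the replacement table, the `PROBLEMATIC` pattern, and the helper names `stripLeadingBom` and `preprocessYaml` are assumptions made for this example, and the set of characters handled here is deliberately small.

```typescript
// Hypothetical replacement table: each problematic character maps to a safe
// ASCII equivalent, or to "" when it should simply be removed.
const REPLACEMENTS: Record<string, string> = {
  "\u00A0": " ", // non-breaking space
  "\u200B": "", // zero-width space
  "\u200C": "", // zero-width non-joiner
  "\u200D": "", // zero-width joiner
  "\u2018": "'", // left single quotation mark
  "\u2019": "'", // right single quotation mark
  "\u201C": '"', // left double quotation mark
  "\u201D": '"', // right double quotation mark
  "\u2013": "-", // en dash
  "\u2014": "-", // em dash
};

// One character class covering every key above, so the text is scanned once.
const PROBLEMATIC = /[\u00A0\u200B-\u200D\u2018\u2019\u201C\u201D\u2013\u2014]/g;

// A decoded string carries a BOM as the single code point U+FEFF.
function stripLeadingBom(content: string): string {
  return content.charCodeAt(0) === 0xfeff ? content.slice(1) : content;
}

// Replace or drop each problematic character in a single regex pass.
function cleanInvisibleCharacters(content: string): string {
  return content.replace(PROBLEMATIC, (ch) => REPLACEMENTS[ch] ?? "");
}

// Run both steps before handing the text to a YAML parser.
function preprocessYaml(content: string): string {
  return cleanInvisibleCharacters(stripLeadingBom(content));
}
```

One consequence of a blanket replacement like this is that typographic quotes and dashes inside string values are rewritten too, so it trades exact content preservation for parse robustness.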
We have finished reviewing your PR. We have found no vulnerabilities. Reply to this PR with
Pull Request Overview
This PR enhances YAML parsing in CustomModesManager by preprocessing content to strip BOMs and normalize problematic Unicode characters, updating error handling, and adding extensive edge-case tests.
- Added private methods `stripBOM` and `cleanInvisibleCharacters` for content normalization.
- Refactored `parseYamlSafely` to apply preprocessing and surface user-friendly errors.
- Introduced a new test suite `CustomModesManager.yamlEdgeCases.spec.ts` covering BOMs, invisible characters, fancy quotes, dashes, error scenarios, and international characters.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/core/config/CustomModesManager.ts | Implemented BOM stripping, invisible character cleaning, and updated YAML parsing with enhanced error reporting |
| src/core/config/__tests__/CustomModesManager.yamlEdgeCases.spec.ts | Added comprehensive tests for YAML edge cases, including BOMs, invisible chars, quotes, dashes, errors, and UTF-8/Intl |
Comments suppressed due to low confidence (2)
src/core/config/CustomModesManager.ts:76
- Add `@param` and `@returns` annotations to the JSDoc for `stripBOM`, `cleanInvisibleCharacters`, and `parseYamlSafely` to clarify expected input and output types.
/**
src/core/config/__tests__/CustomModesManager.yamlEdgeCases.spec.ts:389
- Consider adding a test for files with Windows-style CRLF line endings to ensure the preprocessing handles all common line ending conventions.
expect(modes[0].name).toBe("📝 Writing Mode")
- Fix BOM handling to correctly handle UTF-16 (all BOMs appear as \uFEFF when decoded)
- Optimize cleanInvisibleCharacters with single regex pass for better performance
- Prevent duplicate error messages by marking errors as already handled
- Refactor test file to use mockFsReadFile helper function to reduce duplication
- Fix YAML indentation in tests (use spaces instead of tabs)
- Add ESLint disable comment for character class warning (regex is correct)
hannesrudolph
left a comment
Thanks for the thorough review! I've addressed all the feedback and pushed the changes. All tests are passing and the implementation is now more robust and performant.
- Added lineWidth: 0 option to all yaml.stringify() calls
- Prevents automatic line wrapping at 80 characters
- Improves readability of YAML output for long strings
- Applied to CustomModesManager, SimpleInstaller, and migrateSettings
✅ Added the YAML line width fix as requested! I've updated all Changes made:
This ensures that long strings in YAML output remain on a single line, improving readability as you requested.
- Added defaultStringType: 'PLAIN' to minimize formatting changes
- This helps preserve plain scalars when possible
- Works alongside lineWidth: 0 to prevent automatic line wrapping
I've pushed an additional update that adds However, I need to explain what's happening with the line breaks you're seeing:
The To truly preserve exact YAML formatting without any changes would require avoiding parsing/re-stringifying altogether, but that's not possible when we need to modify the content (add/remove modes). The current solution ensures:
- Move regex pattern to PROBLEMATIC_CHARS_REGEX static constant
- Add comprehensive documentation for each character range
- Improves maintainability and makes the pattern reusable
- Add test for mixed line endings (CRLF vs LF)
- Add test for multiple BOMs in sequence
- Add test for deeply nested structures with problematic characters
- Ensures robustness across different real-world scenarios
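Two of the edge cases listed above, multiple BOMs in sequence and mixed line endings, can be illustrated with a small sketch. The helper names `stripLeadingBoms` and `splitLines` and the sample inputs are hypothetical; the PR does not state how the real implementation treats repeated BOMs, so removing all of them is an assumption here.

```typescript
// Remove every BOM at the start of the content, not just the first one.
// (Assumption: repeated leading BOMs should all be dropped.)
function stripLeadingBoms(content: string): string {
  return content.replace(/^\uFEFF+/, "");
}

// Split on either CRLF or LF so mixed files yield consistent lines.
function splitLines(content: string): string[] {
  return content.split(/\r?\n/);
}

// Hypothetical inputs in the spirit of the tests listed above.
const doubleBom = "\uFEFF\uFEFFcustomModes: []";
const mixedEndings = "customModes:\r\n  - slug: a\n  - slug: b\r\n";

const withoutBoms = stripLeadingBoms(doubleBom);
const lines = splitLines(mixedEndings);
```

Here `withoutBoms` no longer starts with U+FEFF, and `lines` contains no stray carriage returns regardless of which line-ending convention each line used.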
- Update CustomModesManager tests to expect translation keys
- Fix YAML edge case tests to match new i18n error messages
- All tests now pass with the i18n integration
daniel-lxs
left a comment
This PR is awesome!
LGTM
    /**
     * Strip BOM (Byte Order Mark) from the beginning of a string
     */
    private stripBOM(content: string): string {
I think we have a stripbom method elsewhere too - could we move to a shared utility file?
    .map((issue) => `• ${issue.path.join(".")}: ${issue.message}`)
    .join("\n")

    vscode.window.showErrorMessage(t("common:customModes.errors.schemaValidationError", { issues }))
Nice!
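The issue-formatting step in the snippet under review can be sketched in isolation as follows. The `ValidationIssue` interface and the `formatIssues` name are assumptions for this example, modeled only on the `path` and `message` fields visible in the diff; they are not the schema library's own types.

```typescript
// Minimal shape of a validation issue (hypothetical; only the two fields
// used in the reviewed snippet are modeled).
interface ValidationIssue {
  path: (string | number)[];
  message: string;
}

// Turn a list of issues into one bullet line per issue, joined by newlines.
function formatIssues(issues: ValidationIssue[]): string {
  return issues
    .map((issue) => `• ${issue.path.join(".")}: ${issue.message}`)
    .join("\n");
}

// Example input with made-up paths and messages.
const formatted = formatIssues([
  { path: ["customModes", 0, "slug"], message: "Required" },
  { path: ["customModes", 1, "groups"], message: "Expected array" },
]);
```

The resulting string is what would be passed as the `issues` value to the translated error message, so the user sees one line per failing field.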
- Replace custom stripBOM method with existing strip-bom package
- Fix duplicate error handling in parseYamlSafely by returning empty object instead of re-throwing
- Addresses review comments from PR #5099
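The error-handling change in this commit, returning an empty object rather than re-throwing, might look roughly like the sketch below. The `parseSafely` function, the `Parser` and `Reporter` types, and the injected callbacks are hypothetical stand-ins chosen so the example runs without a YAML library; they are not the signatures used in `CustomModesManager`.

```typescript
// Hypothetical abstractions: any text parser and any error sink.
type Parser = (text: string) => unknown;
type Reporter = (message: string) => void;

// Parse the text; on failure, report once and return an empty object so
// callers do not catch the same error again and show a second message.
function parseSafely(text: string, parse: Parser, report: Reporter): unknown {
  try {
    // An empty document may parse to null or undefined; normalize to {}.
    return parse(text) ?? {};
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    report(message);
    return {};
  }
}
```

Because the failure is swallowed after a single report, the caller has to treat an empty object as "nothing usable was loaded" rather than relying on an exception.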
…#5099)

* fix: handle YAML parsing edge cases in CustomModesManager
  - Add BOM (Byte Order Mark) stripping for UTF-8 and UTF-16
  - Normalize invisible characters including non-breaking spaces
  - Replace fancy quotes and dashes with standard characters
  - Remove zero-width characters that can cause parsing issues
  - Add comprehensive test coverage for all edge cases

  This fixes the YAML parsing limitations documented in PR RooCodeInc#237 by implementing proper preprocessing before parsing YAML content.
* fix: address PR review comments
  - Fix BOM handling to correctly handle UTF-16 (all BOMs appear as \uFEFF when decoded)
  - Optimize cleanInvisibleCharacters with single regex pass for better performance
  - Prevent duplicate error messages by marking errors as already handled
  - Refactor test file to use mockFsReadFile helper function to reduce duplication
  - Fix YAML indentation in tests (use spaces instead of tabs)
  - Add ESLint disable comment for character class warning (regex is correct)
* fix: prevent YAML line breaks by setting lineWidth to 0
  - Added lineWidth: 0 option to all yaml.stringify() calls
  - Prevents automatic line wrapping at 80 characters
  - Improves readability of YAML output for long strings
  - Applied to CustomModesManager, SimpleInstaller, and migrateSettings
* fix: add defaultStringType option to yaml.stringify calls
  - Added defaultStringType: 'PLAIN' to minimize formatting changes
  - This helps preserve plain scalars when possible
  - Works alongside lineWidth: 0 to prevent automatic line wrapping
* refactor: extract problematic characters regex as a named constant
  - Move regex pattern to PROBLEMATIC_CHARS_REGEX static constant
  - Add comprehensive documentation for each character range
  - Improves maintainability and makes the pattern reusable
* test: add comprehensive edge case tests for YAML parsing
  - Add test for mixed line endings (CRLF vs LF)
  - Add test for multiple BOMs in sequence
  - Add test for deeply nested structures with problematic characters
  - Ensures robustness across different real-world scenarios
* feat(i18n): add error messages for custom modes in multiple languages
* fix: update tests to expect i18n keys instead of hardcoded strings
  - Update CustomModesManager tests to expect translation keys
  - Fix YAML edge case tests to match new i18n error messages
  - All tests now pass with the i18n integration
* refactor: use strip-bom package and fix error handling
  - Replace custom stripBOM method with existing strip-bom package
  - Fix duplicate error handling in parseYamlSafely by returning empty object instead of re-throwing
  - Addresses review comments from PR RooCodeInc#5099

Co-authored-by: Daniel Riccio
Thanks folks
- Add JSON fallback to parseYamlSafely for reading old JSON files
- Remove defaultStringType: 'PLAIN' from yaml.stringify to ensure JSON-compatible output
- Fixes issue introduced in #5099
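A JSON fallback of the kind this follow-up commit describes could be sketched like this. The order of attempts (YAML first, then JSON), the `parseWithJsonFallback` name, and the injected `parseYaml` callback are assumptions for illustration; the commit message does not spell out how the real fallback is wired.

```typescript
// Hypothetical stand-in for whatever YAML parse function is in use.
type Parser = (text: string) => unknown;

// Try the YAML parser first; if it throws, retry the same text as JSON.
// If JSON parsing also fails, surface the original YAML error.
function parseWithJsonFallback(text: string, parseYaml: Parser): unknown {
  try {
    return parseYaml(text);
  } catch (yamlError) {
    try {
      return JSON.parse(text);
    } catch {
      throw yamlError;
    }
  }
}
```

Re-throwing the YAML error rather than the JSON one keeps the message relevant for the common case, where the file was meant to be YAML all along.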
Description
This PR addresses YAML parsing edge cases and formatting issues:
- `lineWidth: 0` and `defaultStringType: 'PLAIN'` options

Changes Made
CustomModesManager.ts
- `stripBOM()` method to remove Byte Order Marks
- `cleanInvisibleCharacters()` method to handle problematic Unicode characters
- `parseYamlSafely()` method with enhanced error handling
- `yaml.stringify()` calls to use `{ lineWidth: 0, defaultStringType: 'PLAIN' }`

SimpleInstaller.ts & migrateSettings.ts
- `yaml.stringify()` calls to use the same options for consistency

Testing
Notes
The YAML library may still convert between scalar styles (e.g., from folded `>-` to literal `|-`) when parsing and re-stringifying content. This is inherent to YAML normalization and ensures content integrity while minimizing formatting changes.