Fix filename handling for percent signs and non-German diacritics #215
forketyfork merged 2 commits into main
Issue: Product filenames containing literal `%` (e.g. "Jogurt 3.7% (Pilos)")
caused decodeURIComponent to throw URIError, spamming the console. Additionally,
the filename sanitization regex only preserved German umlauts, replacing Czech,
Slovak, French, and other diacritical characters with underscores.
Solution: Escape bare `%` characters (not followed by two hex digits) as `%25`
before calling decodeURIComponent, so literal percent signs survive decoding
while valid percent-encoded sequences still decode normally. Replaced the
hardcoded character class with Unicode property escapes (\p{L}\p{N}) to
preserve letters from any language in generated filenames, and removed the
now-unnecessary convertGermanUmlauts function.
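
To illustrate the escaping step described above, here is a minimal sketch. The function name `decodeFilename` and the try/catch fallback are assumptions made for this example; the actual code in `src/NutritionCalculator.ts` may look different.

```typescript
// Hypothetical helper, not the PR's actual implementation.
function decodeFilename(name: string): string {
  // Escape every "%" that is NOT followed by two hex digits as "%25",
  // so decodeURIComponent sees it as an encoded literal percent sign.
  const escaped = name.replace(/%(?![0-9A-Fa-f]{2})/g, "%25");
  try {
    return decodeURIComponent(escaped);
  } catch {
    // Assumed fallback: sequences such as "%AB" look like valid escapes
    // but are not valid UTF-8, so decoding can still throw URIError.
    return name;
  }
}

console.log(decodeFilename("Jogurt 3.7% (Pilos)")); // "Jogurt 3.7% (Pilos)"
console.log(decodeFilename("My%20File"));           // "My File"
console.log(decodeFilename("100%25"));              // "100%"
```

The negative lookahead leaves well-formed sequences such as `%20` untouched, so names that were already percent-encoded still decode as before.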
Pull request overview
This PR fixes filename handling issues with percent signs and non-German diacritical characters in product filenames. It resolves console error spam when files contain literal % characters and expands filename sanitization to support Unicode letters from all languages, not just German umlauts.
Changes:
- Removed `convertGermanUmlauts` function and replaced hardcoded German umlaut regex with Unicode property escapes (`\p{L}\p{N}`)
- Added percent-escaping logic to handle literal `%` characters before URI decoding
- Updated tests to verify Unicode character preservation and percent sign handling
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/constants.ts | Replaced INVALID_FILENAME_CHARS_REGEX with Unicode property escapes and removed convertGermanUmlauts function |
| src/NutritionCalculator.ts | Added logic to escape bare percent signs before decodeURIComponent to prevent URI errors |
| src/NutrientModal.ts | Removed convertGermanUmlauts call from filename generation |
| src/tests/constants.test.ts | Replaced German umlaut conversion tests with Unicode character preservation tests |
| src/tests/NutritionCalculator.test.ts | Added comprehensive tests for normalizeFilename including percent sign handling |
| .gitignore | Added .tmp/ and .architect/ directories to ignore list |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ae4f4a15d
Issue: The Unicode regex missed \p{M} (combining marks), so decomposed
accents and scripts like Hindi that rely on combining characters would
have those marks stripped and replaced with underscores.
Solution: Add \p{M} to INVALID_FILENAME_CHARS_REGEX alongside \p{L}
and \p{N}, and add test cases for decomposed accents and Hindi script.
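
The effect of adding `\p{M}` can be sketched as follows. The allowed punctuation (space, underscore, hyphen) and the names `WITHOUT_MARKS`, `WITH_MARKS`, and `sanitize` are assumptions for this example; the real `INVALID_FILENAME_CHARS_REGEX` in `src/constants.ts` may permit a different set of characters.

```typescript
// Built with the RegExp constructor; a regex literal with the `u` flag
// works equally well on ES2018+ targets.
// Assumed allow-list: letters, numbers, space, underscore, hyphen.
const WITHOUT_MARKS = new RegExp("[^\\p{L}\\p{N} _-]", "gu");
// Same allow-list plus combining marks (\p{M}).
const WITH_MARKS = new RegExp("[^\\p{L}\\p{N}\\p{M} _-]", "gu");

// Hypothetical helper: replace every disallowed character with "_".
function sanitize(name: string, invalid: RegExp): string {
  return name.replace(invalid, "_");
}

// "e" followed by U+0301 (combining acute accent), i.e. a decomposed "é".
const decomposed = "Cafe\u0301";

console.log(sanitize(decomposed, WITHOUT_MARKS)); // "Cafe_"
console.log(sanitize(decomposed, WITH_MARKS));    // "Cafe\u0301" (unchanged)
console.log(sanitize("हिंदी", WITHOUT_MARKS));      // "ह__द_"
console.log(sanitize("हिंदी", WITH_MARKS));         // "हिंदी"
```

Precomposed characters such as `é` (U+00E9) already match `\p{L}`, so only decomposed forms and scripts that rely on combining marks are affected by the missing category.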
Closes #214
Summary
Product filenames containing a literal `%` (like "Jogurt 3.7% (Pilos)") caused `decodeURIComponent` to throw `URIError`, spamming the dev console on every file scan. Separately, the filename sanitization regex only allowed ASCII letters, digits, and German umlauts, so Czech, Slovak, French, Spanish, and other diacritical characters got replaced with underscores when creating products through the form.

The fix escapes bare `%` characters (those not followed by two hex digits) as `%25` before passing to `decodeURIComponent`. This lets valid percent-encoded sequences like `%20` decode normally while literal `%` signs pass through cleanly. For the sanitization regex, the hardcoded character class is replaced with Unicode property escapes (`\p{L}\p{N}`), which preserves letters from any script. The `convertGermanUmlauts` function is removed since it's no longer needed: umlauts are now kept as-is in filenames just like any other letter.

Test plan
- … `%` in the name (e.g. "Jogurt 3.7% (Pilos).md"), open Obsidian, and confirm no `Failed to decode filename` errors appear in the developer console