index: skip entries with invalid bytes instead of panicking #295
Open
amaanq wants to merge 2 commits into nix-community:master from
Problem
The frcode format uses newlines as entry delimiters, so it can't handle paths or symlink targets containing literal newline characters. When indexing encounters such an entry (e.g., from rose-pine-icon-theme which has ~117k symlinks with trailing newlines in their targets), the encoder panics.
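To illustrate why an embedded newline is fatal for a newline-delimited format, here is a minimal sketch. It is not the real frcode encoder (which also compresses shared prefixes); `encode` and `decode` are hypothetical helpers that only model the delimiter behaviour described above.

```rust
/// Hypothetical helper: writes each entry followed by a `\n` terminator,
/// the way a newline-delimited format would.
fn encode(entries: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for entry in entries {
        out.extend_from_slice(entry);
        out.push(b'\n');
    }
    out
}

/// Hypothetical helper: recovers entries by splitting on `\n`.
fn decode(buf: &[u8]) -> Vec<Vec<u8>> {
    let mut parts: Vec<Vec<u8>> = buf
        .split(|&b| b == b'\n')
        .map(|s| s.to_vec())
        .collect();
    // Splitting always leaves one empty piece after the final terminator.
    parts.pop();
    parts
}
```

Encoding the two entries `/share/icons/a` and `target\n` and decoding the result gives three entries rather than two: the trailing newline in the symlink target is indistinguishable from a delimiter, so the round trip cannot be lossless.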
There is also a copy-paste bug in write_path, where the second assertion checked for \x00 but the error message said "newlines".
Solution
This PR gracefully handles entries with invalid bytes by skipping them instead of panicking, and reporting them at the end. Here's an example of the output:
By default, only the first 5 broken entries are shown, so I added a verbose flag to show all of them. Frankly, I'd be surprised if anyone had many packages with broken entries, so it might be fine to drop the verbose flag. I'll leave that up to the reviewer.
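The skip-and-report behaviour can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the function names, the report wording, and the choice to treat both `\n` and `\x00` as invalid are all hypothetical, based only on the description above.

```rust
/// Assumption: an entry is invalid if it contains a newline or a NUL byte.
fn has_invalid_bytes(entry: &[u8]) -> bool {
    entry.iter().any(|&b| b == b'\n' || b == b'\0')
}

/// Splits entries into (kept, skipped) instead of panicking on a bad one.
fn partition_entries(entries: Vec<Vec<u8>>) -> (Vec<Vec<u8>>, Vec<Vec<u8>>) {
    entries.into_iter().partition(|e| !has_invalid_bytes(e))
}

/// Builds the end-of-run report: the first 5 skipped entries by default,
/// or all of them when `verbose` is set.
fn report(skipped: &[Vec<u8>], verbose: bool) -> Vec<String> {
    const DEFAULT_LIMIT: usize = 5;
    let shown = if verbose {
        skipped.len()
    } else {
        skipped.len().min(DEFAULT_LIMIT)
    };
    let mut lines: Vec<String> = skipped[..shown]
        .iter()
        // Lossy conversion with debug formatting so stray bytes show up escaped.
        .map(|e| format!("skipped: {:?}", String::from_utf8_lossy(e)))
        .collect();
    if shown < skipped.len() {
        // Hypothetical summary line for the entries that were not listed.
        lines.push(format!("... and {} more", skipped.len() - shown));
    }
    lines
}
```

Collecting the skipped entries first and formatting them afterwards keeps indexing itself free of output concerns, and the verbose switch only changes how many lines the report contains.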