…ignment of encoding.

Summary
Optimized the `decompress` function in `felt252_vec_compression.rs` by replacing the manual limb-by-limb division with bit manipulation. The new implementation uses a bit mask and shift operations to extract values from the packed data, which avoids the per-word division of the previous algorithm. Also extended the `IntoOrPanic` trait to support the `i128` and `u128` types.

Type of change
Please check one:
Why is this change needed?
The previous implementation of the `decompress` function used a relatively expensive division operation for each word extraction. Because the padded code size is a power of 2, masking and shifting achieve the same result with better performance.

What was the behavior or documentation before?
The previous implementation used a manual 4-limb long division approach to extract values from packed data, which was less efficient, especially for large inputs.
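For readers who want to see why the two strategies are interchangeable, here is a minimal sketch on a single `u128` word. It is not the PR's code: the real function works on a 4-limb value, and the names `unpack_by_division` and `unpack_by_mask` are hypothetical. It only illustrates that dividing by a power of 2 and taking the remainder yields the same values as shifting and masking.

```rust
/// Extracts `count` values by repeated remainder and division.
/// Works for any base `padded_code_size >= 1` (hypothetical helper).
fn unpack_by_division(mut packed: u128, padded_code_size: u128, count: usize) -> Vec<u128> {
    let mut out = Vec::with_capacity(count);
    for _ in 0..count {
        out.push(packed % padded_code_size);
        packed /= padded_code_size;
    }
    out
}

/// Extracts the same values with a mask and a shift.
/// Only valid when `padded_code_size` is a power of 2 (hypothetical helper).
fn unpack_by_mask(mut packed: u128, padded_code_size: u128, count: usize) -> Vec<u128> {
    assert!(padded_code_size.is_power_of_two());
    // For a power of 2, the number of trailing zeros is log2 of the value.
    let bits = padded_code_size.trailing_zeros();
    let mask = padded_code_size - 1;
    let mut out = Vec::with_capacity(count);
    for _ in 0..count {
        out.push(packed & mask);
        packed >>= bits;
    }
    out
}
```

Calling both helpers with the same arguments should return identical vectors whenever `padded_code_size` is a power of 2.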
What is the behavior or documentation after?
The new implementation uses bit manipulation (masking and shifting) to extract values from the packed data, which is more efficient. It also adds `i128` and `u128` implementations for the `IntoOrPanic` trait to support the new code.

Additional context
This optimization maintains the same functionality while improving performance. The approach takes advantage of the fact that the padded code size is a power of 2, allowing us to use bit operations instead of division.
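The mask-and-shift idea can also be sketched for a value stored as several limbs. The following is an illustration under stated assumptions, not the implementation in this PR: it assumes four little-endian `u64` limbs, a fixed width of `bits` bits per value (the log2 of the padded code size), and a hypothetical function name `unpack_limbs`. The main point is the handling of a value that straddles a limb boundary.

```rust
/// Reads `count` values of `bits` bits each from four little-endian 64-bit limbs.
/// Hypothetical helper; the layout and the width limit are assumptions.
fn unpack_limbs(limbs: &[u64; 4], bits: u32, count: usize) -> Vec<u64> {
    assert!(bits >= 1 && bits <= 32, "width limited to 32 bits in this sketch");
    assert!(bits as usize * count <= 256, "values must fit in the four limbs");
    let mask = (1u64 << bits) - 1;
    let mut out = Vec::with_capacity(count);
    for i in 0..count {
        let bit_pos = i * bits as usize;
        let limb = bit_pos / 64;
        let offset = (bit_pos % 64) as u32;
        // Low part of the value, taken from the current limb.
        let mut value = limbs[limb] >> offset;
        // If the value crosses into the next limb, pull in the high part.
        // Here `offset` is nonzero, so the shift amount is below 64.
        if offset + bits > 64 {
            value |= limbs[limb + 1] << (64 - offset);
        }
        out.push(value & mask);
    }
    out
}
```

No division of the packed value is needed: each extraction costs a couple of shifts and one mask, regardless of how many limbs the packed value spans.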