Skip to content

Conversation

@nicholas-codecov
Copy link

Description

We're currently unable to handle normalizing strings when hashes have a custom length set [hash:16] for example. This PR resolves this issue by searching for the ] character after finding the starting hash string match.

Notable Changes

  • Update normalizePath to handle custom hash lengths
  • Update tests

Copy link

@codecov codecov bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes overall are mostly good, addressing the concern of path normalisation in the event of a custom hash length being set. However, I have concerns about a minor issue in the normalizePath utility and the test cases which could be a bit more effective and efficient.

// grab the ending delimiter and create a regex group for it
let endingDelimiter =
format.at(match.hashIndex + match.hashString.length) ?? "";
let endingDelimiter = "";
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic could be optimized. Currently you are slicing the string then iterating over the resulting characters. An alternate approach could be to directly find the next closing bracket starting at hashIndex in the string and then getting the following char for delimiter. This could save iterations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also correct, but not worth the performance win because of the size of string, array lookups are cheap because the n is small for this method in general.

Only worry is the extra allocations, but considering this should be immediately GC'd for the most part not a problem.

In general the optimization would be to do this in single pass, and explicitly iterate over format and collect the delimiters accordingly.

expected: "test.*",
},
{
name: "should replace '[hash:22]' with '*'",
Copy link

@codecov codecov bot Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case seems to duplicate the following ones. They all test the same process with the same kind of input, therefore, any error would cause all of them to fail. It would therefore be more efficient to just keep one of them.

},
expected: "test.*.chunk.js",
},
];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the test coverage is good for different kind of hashes, it seems to assume that the hashes will always be of length 3. It would be beneficial to test hashes of different lengths to ensure normalisation works correctly in those scenarios as well.

@codecov-notifications
Copy link

codecov-notifications bot commented Oct 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

Components Coverage Δ
Plugin core 96.70% <100.00%> (-0.01%) ⬇️
Rollup plugin 10.81% <ø> (ø)
Vite plugin 11.02% <ø> (ø)
Webpack plugin 49.88% <ø> (ø)

📢 Thoughts on this report? Let us know!

@codecov-staging
Copy link

Bundle Report

Bundle size has no change ✅

// added in `\-` to account for the `-` character which seems to be used by Rollup through testing
const regexString = `(${leadingRegex}(?<hash>[0-9a-zA-Z\/+=-]+)${endingRegex})`;
// added in `\-` and `\_` to account for the `-` `_` as they are included in the potential hashes: https://rollupjs.org/configuration-options/#output-hashcharacters
const regexString = `(${leadingRegex}(?<hash>[0-9a-zA-Z/\+=_\/+=-]+)${endingRegex})`;

Check failure

Code scanning / CodeQL

Useless regular-expression character escape High

The escape sequence '\+' is equivalent to just '+', so the sequence may still represent a meta-character when it is used in a
regular expression
.

Copilot Autofix

AI about 1 year ago

To fix the problem, we need to remove the unnecessary escape sequence \+ from the regular expression string. This will make the code cleaner and more readable without changing its functionality.

  • In general terms, we should remove the backslash before the + character in the regular expression string.
  • Specifically, we will edit the regexString on line 48 in the file packages/bundler-plugin-core/src/utils/normalizePath.ts to remove the unnecessary escape sequence.
  • No additional methods, imports, or definitions are needed to implement this change.
Suggested changeset 1
packages/bundler-plugin-core/src/utils/normalizePath.ts

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/packages/bundler-plugin-core/src/utils/normalizePath.ts b/packages/bundler-plugin-core/src/utils/normalizePath.ts
--- a/packages/bundler-plugin-core/src/utils/normalizePath.ts
+++ b/packages/bundler-plugin-core/src/utils/normalizePath.ts
@@ -47,3 +47,3 @@
     // added in `\-` and `\_` to account for the `-` `_` as they are included in the potential hashes: https://rollupjs.org/configuration-options/#output-hashcharacters
-    const regexString = `(${leadingRegex}(?<hash>[0-9a-zA-Z/\+=_\/+=-]+)${endingRegex})`;
+    const regexString = `(${leadingRegex}(?<hash>[0-9a-zA-Z/+=_\/+=-]+)${endingRegex})`;
     const HASH_REPLACE_REGEX = new RegExp(regexString, "i");
EOF
@@ -47,3 +47,3 @@
// added in `\-` and `\_` to account for the `-` `_` as they are included in the potential hashes: https://rollupjs.org/configuration-options/#output-hashcharacters
const regexString = `(${leadingRegex}(?<hash>[0-9a-zA-Z/\+=_\/+=-]+)${endingRegex})`;
const regexString = `(${leadingRegex}(?<hash>[0-9a-zA-Z/+=_\/+=-]+)${endingRegex})`;
const HASH_REPLACE_REGEX = new RegExp(regexString, "i");
Copilot is powered by AI and may make mistakes. Always verify output.
Copy link

@codecov codecov bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodecovAI submitted a new review for 5e6f67c

input: {
path: "test.CoScjXRp_rD9HKS--kYO73.chunk.js",
format: "[name].[hash:22].chunk.js",
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s very commendable that test cases have been added for different types and lengths of hashes. I would suggest adding more edge cases, for instance, to handle unexpected hash values or an unusual hash length.

format.at(match.hashIndex + match.hashString.length) ?? "";
let endingDelimiter = "";

endingDelimiter =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this code change 'format.at(match.hashIndex + format.slice(match.hashIndex).indexOf("]") + 1)' correctly finds the ending delimiter by adding 1 to the index of the matching hash, it might be easier to understand and more efficient if the '.indexOf("]")' portion is performed once outside of the function for reuse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree codecov bot :)

format.slice(match.hashIndex).indexOf("]") become it's own variable helps readability of why endingDelimiter was calculated in this way. I would extract it out.

// create a regex that will match the hash
// potential values gathered from: https://en.wikipedia.org/wiki/Base64
// added in `\-` to account for the `-` character which seems to be used by Rollup through testing
const regexString = `(${leadingRegex}(?<hash>[0-9a-zA-Z\/+=-]+)${endingRegex})`;
Copy link

@codecov codecov bot Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the updated regexString, there is a repeated pattern '/+=_/+=-'. This error may affect how the code is interpreted and could cause unexpected behaviors. Please revise this line of code to correct this potential issue.

Copy link

@codecov codecov bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodecovAI submitted a new review for 275d0f0

"name": "main-6c1d26e76f6ba1fc75c8.js",
"normalized": "main-*.js",
"size": 70961,
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in snapshot test names here (lines 55-56 and 73-74) seem inverted. Double-check these name changes since the first test now has a snapshot name referring to ESM output instead of checking for non-inclusion of source maps, and vice versa in the second test.

leadingDelimiter,
)})`;

// grab the ending delimiter and create a regex group for it
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reinitializing endingDelimiter to an empty string here seems unnecessary, as its value is immediately assigned again after this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - we can do.

let endingDelimiter =
      format.at(...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops, my bad. I used to have this in a loop and removed it but forgot to bump it back up.


endingDelimiter =
format.at(
match.hashIndex + format.slice(match.hashIndex).indexOf("]") + 1,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The index slicing here could be prone to out-of-bounds errors. Consider adding a safety check to ensure that index used in at() is less than format.length.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slice doesn't throw errors codecov bot, you're thinking of the wrong language.

"gzipSize": 98808,
"name": "main-H2_1FSsQ.js",
"normalized": "main-H2_1FSsQ.js",
"normalized": "main-*.js",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful with normalizing to a wildcard. Depending on how these paths are consumed, using '*' might have some unexpected side-effects.

{
name: "should replace '[hash:22]' with '*'",
input: {
path: "test.CoScjXRp_rD9HKS--kYO73.chunk.js",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test cases here (lines 96-125) seem to be over-repetitive; they are all checking the same substitution logic but with different has lengths. Suggest simplifying these.

leadingDelimiter,
)})`;

// grab the ending delimiter and create a regex group for it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - we can do.

let endingDelimiter =
      format.at(...

// grab the ending delimiter and create a regex group for it
let endingDelimiter =
format.at(match.hashIndex + match.hashString.length) ?? "";
let endingDelimiter = "";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also correct, but not worth the performance win because of the size of string, array lookups are cheap because the n is small for this method in general.

Only worry is the extra allocations, but considering this should be immediately GC'd for the most part not a problem.

In general the optimization would be to do this in single pass, and explicitly iterate over format and collect the delimiters accordingly.

format.at(match.hashIndex + match.hashString.length) ?? "";
let endingDelimiter = "";

endingDelimiter =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree codecov bot :)

format.slice(match.hashIndex).indexOf("]") become it's own variable helps readability of why endingDelimiter was calculated in this way. I would extract it out.


endingDelimiter =
format.at(
match.hashIndex + format.slice(match.hashIndex).indexOf("]") + 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slice doesn't throw errors codecov bot, you're thinking of the wrong language.

Copy link

@codecov codecov bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodecovAI submitted a new review for 6f8d670

},
expected: "test.*.chunk.js",
},
{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as previously, this test case seems to duplicate others. Removing it would lighten the test suite without losing any value.

expected: "test.*.chunk.js",
},
{
name: "should replace '[fullhash:22]' with '*'",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another example of a redundant test that could be removed to make the test suite more efficient.

expected: "test.*.chunk.js",
},
{
name: "should replace '[chunkhash:22]' with '*'",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, this test case seems to duplicate the previous ones. It tests the same process with similar inputs, therefore, any problematic behavior would cause both of them to fail. We should remove it to lighten the test suite.

@nicholas-codecov nicholas-codecov merged commit 0ea4d42 into main Oct 18, 2024
61 of 62 checks passed
@nicholas-codecov nicholas-codecov deleted the fix-issue-when-hash-length-is-set branch October 18, 2024 17:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants