Skip to content

Conversation

@sdn4z
Copy link
Collaborator

@sdn4z sdn4z commented Oct 30, 2025

The api from where we retrieve the top NPM packages was some times returning 5xx errors, so I've lowered the number of items per page and increased the number of retries and timeout.

@github-actions github-actions bot added the CI label Oct 30, 2025
@sdn4z
Copy link
Collaborator Author

sdn4z commented Oct 30, 2025

/lgtm review

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦉 lgtm Review

Score: Needs Work 🔧

🔍 Summary

This PR does a good job of improving the reliability of the package download script by adjusting request parameters and increasing retries. The refactoring of the Ecosystem definitions into data class instances is also a great change that improves code clarity.

More information
  • Id: e894c90cd32342b8958df1ab207ad848
  • Model: gemini-2.5-pro
  • Created at: 2025-10-30T14:21:00.277457+00:00
Usage summary
  • Request count: 2
  • Request tokens: 216,844
  • Response tokens: 10,990
  • Total tokens: 227,834

See the 📚 lgtm-ai repository for more information about lgtm.

json.dump(data, fp)

logger.info("Saved %d packages to `%s` file.", len(set(all_packages)), fpath)
logger.info("Saved %d packages to `%s` file.", len(all_packages), fpath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦉 ✨ Quality

Severity: LOW 🔵

logger.info("Saved %d packages to `%s` file.", len(all_packages), fpath)

The log message now reports the total number of packages fetched, which might include duplicates if the API returns them across different pages. The previous implementation used len(set(all_packages)) which correctly reports the number of unique packages saved. Reporting the unique count is more informative.

With the current bug in pagination, this change actually hides the issue by reporting a large number of packages when in fact only a small set of unique packages are being fetched repeatedly. Once the pagination is fixed, it would still be better to report on the unique set.

Suggested change
logger.info("Saved %d packages to `%s` file.", len(all_packages), fpath)
logger.info("Saved %d packages to `%s` file.", len(set(all_packages)), fpath)
More information about this comment
  • File: dependencies/scripts/download_packages.py
  • Line: 121
  • Relative line: 126
  • With suggestion: Yes
  • Suggestion ready for replacement: Yes

@elementsinteractive elementsinteractive deleted a comment from github-actions bot Oct 30, 2025
@sdn4z sdn4z closed this Oct 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant