forked from ArchiveBox/ArchiveBox
[pull] dev from ArchiveBox:dev #73
Merged
Deleted dead/duplicate hooks:
- wget/on_Crawl__10_install_wget.py (duplicate of __10_wget_validate_config.py)
- chrome/on_Crawl__00_chrome_install.py (simpler version, kept full one)
- chrome/on_Crawl__20_chrome_launch.bg.js (legacy, kept __30 version)
- singlefile/on_Crawl__20_install_singlefile_extension.js (disabled/dead)
- istilldontcareaboutcookies/on_Crawl__20_install_*.js (legacy)
- ublock/on_Crawl__03_ublock.js (legacy, kept __20 version)
- Entire captcha2/ plugin (legacy version of twocaptcha/)

Renamed hooks to follow consistent pattern:
  on_Crawl__XX_<plugin>_<action>.<ext>

Priority bands:
  00-09: Binary/extension installation
  10-19: Config validation
  20-29: Browser launch and post-launch config

Final hooks:
  00 ripgrep_install.py, 01 chrome_install.py
  02 istilldontcareaboutcookies_install.js
  03 ublock_install.js, 04 singlefile_install.js
  05 twocaptcha_install.js
  10 chrome_validate.py, 11 wget_validate.py
  20 chrome_launch.bg.js, 25 twocaptcha_config.js
## Summary by cubic

Cleaned up Crawl-level hooks by removing legacy/duplicate code and standardizing hook names and priorities. Chrome launch is now a single, updated hook with better extension detection and cleaner outputs.

- **Refactors**
  - Removed dead hooks (legacy chrome install/launch, singlefile extension, old ublock/cookies scripts, duplicate wget validate) and the legacy captcha2 plugin in favor of twocaptcha.
  - Renamed hooks to on_Crawl__XX_<plugin>_<action> with priority bands: 00-09 install, 10-19 validate, 20-29 launch/config.
  - Consolidated Chrome launch into on_Crawl__20_chrome_launch.bg.js; writes outputs to the current dir, resolves real extension IDs via chrome://extensions, and records extensions.json after verification.
- **Migration**
  - If you used captcha2, switch to the twocaptcha hooks (on_Crawl__05_twocaptcha_install.js and on_Crawl__25_twocaptcha_config.js).
  - Update any docs/scripts that reference old hook filenames.

<sup>Written for commit 4c77949. Summary will update on new commits.</sup>
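The naming pattern and priority bands described above can be checked mechanically when updating docs or scripts that reference hook filenames. The following is a minimal sketch, not code from this PR: `parse_hook`, `band_for`, `HookName`, and the regular expression are hypothetical helpers, and the regex assumes plugin names contain no underscores (true of the hooks listed in the commit message, but not stated as a rule).

```python
import re
from typing import NamedTuple, Optional


class HookName(NamedTuple):
    priority: int
    plugin: str
    action: str
    ext: str


# Hypothetical pattern for on_Crawl__XX_<plugin>_<action>.<ext>.
# Assumption: <plugin> has no underscores; <ext> may be compound (".bg.js").
HOOK_RE = re.compile(
    r"^on_Crawl__(?P<priority>\d{2})_(?P<plugin>[A-Za-z0-9]+)"
    r"_(?P<action>[A-Za-z0-9_]+)(?P<ext>(?:\.[A-Za-z0-9]+)+)$"
)

# Priority bands as listed in the commit message.
BANDS = (
    (range(0, 10), "install"),
    (range(10, 20), "validate"),
    (range(20, 30), "launch/config"),
)


def parse_hook(filename: str) -> Optional[HookName]:
    """Split a hook filename into its parts, or return None if it does not fit."""
    m = HOOK_RE.match(filename)
    if m is None:
        return None
    return HookName(int(m["priority"]), m["plugin"], m["action"], m["ext"])


def band_for(priority: int) -> Optional[str]:
    """Return the band label for a priority, or None if it is outside 00-29."""
    for band, label in BANDS:
        if priority in band:
            return label
    return None


if __name__ == "__main__":
    names = [
        "on_Crawl__25_twocaptcha_config.js",
        "on_Crawl__20_chrome_launch.bg.js",
        "on_Crawl__05_twocaptcha_install.js",
        "on_Crawl__03_ublock.js",  # legacy name without an <action> part
    ]
    for name in names:
        hook = parse_hook(name)
        if hook is None:
            print(f"{name}: does not match the pattern")
        else:
            print(f"{name}: {band_for(hook.priority)} ({hook.plugin}, {hook.action})")
```

The check is purely syntactic: a legacy name such as on_Crawl__10_install_wget.py still matches the shape (with the plugin and action swapped), so it would need a separate allow-list of plugin names to be caught.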
Created by
pull[bot] (v2.0.0-alpha.4)