Fix queue num/count metadata for blacklisted child links #9082

Closed
harshang03 wants to merge 1 commit into mikf:master from harshang03:fix/8513-blacklist-num-counter

Conversation

@harshang03

Summary

  • add queue metadata rollback markers for extractors that emit child URLs and pre-increment counters
  • roll back the configured top-level and nested counters when a queued child URL is skipped by extractor filtering
  • add regression tests to keep num/count metadata aligned with actually processed child downloads
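
The rollback idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `QueueCounter`, `process_children`, and the `category:rest` URL form are hypothetical names and conventions invented for the example, and the real extractor filtering is more involved than a set lookup.

```python
class QueueCounter:
    """Hypothetical counter that a parent extractor pre-increments per child URL."""

    def __init__(self):
        self.num = 0

    def advance(self):
        # Pre-increment before the child URL is handed to the queue.
        self.num += 1
        return self.num

    def rollback(self):
        # Undo the most recent pre-increment for a skipped child.
        if self.num > 0:
            self.num -= 1


def process_children(urls, blacklist):
    """Return the (url, num) pairs that are actually processed, plus the final count."""
    counter = QueueCounter()
    processed = []
    for url in urls:
        num = counter.advance()
        # Assumed URL form "category:rest"; a stand-in for extractor matching.
        category = url.split(":", 1)[0]
        if category in blacklist:
            # The child is filtered out, so its number must not stay consumed.
            counter.rollback()
            continue
        processed.append((url, num))
    return processed, counter.num


if __name__ == "__main__":
    processed, count = process_children(["a:1", "bad:2", "a:3"], {"bad"})
    print(processed, count)
```

With the rollback in place, the skipped `bad:2` entry leaves no gap: the remaining children are numbered consecutively and the final count matches the number of processed children.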

Testing

  • python3 -m unittest test.test_job (fails in this environment: missing dependency 'requests')

Fixes #8513

Allow queue metadata to opt into counter rollback when a queued child URL is skipped by extractor filtering so num/count fields stay aligned with actual downloads.
@mikf mikf closed this Feb 18, 2026

Development

Successfully merging this pull request may close these issues.

Links to blacklisted sites are being counted towards num metadata key