add circular reference detection to prevent infinite loops #184

Open
ryo-rm wants to merge 1 commit into seantomburke:master from ryo-rm:master

ryo-rm commented Jul 18, 2025

Issue

Some websites have sitemaps with circular references, which can cause infinite loops and prevent the crawler from completing.

Fix

Added circular reference detection by tracking visited URLs to prevent infinite loops.

- Only check for circular references on first attempt (retryIndex === 0)
- Allow retries to proceed without circular reference interference
- Maintain visitedUrls tracking only on first attempt
- Add comprehensive circular reference test coverage
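
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `crawlSitemaps`, `SitemapDoc`, `Fetcher`, and the in-memory `graph` are hypothetical names standing in for the library's real fetching and parsing logic. Only `visitedUrls` and `retryIndex` come from the description.

```typescript
// Hypothetical stand-in for a parsed sitemap: child sitemap URLs plus page URLs.
type SitemapDoc = { sitemaps: string[]; pages: string[] };
// Hypothetical fetcher; it may throw to simulate a failed request.
type Fetcher = (url: string) => SitemapDoc;

function crawlSitemaps(root: string, fetchDoc: Fetcher, maxRetries = 1): string[] {
  const visitedUrls = new Set<string>();
  const pages = new Set<string>();

  const visit = (url: string, retryIndex: number): void => {
    if (retryIndex === 0) {
      // First attempt only: skip URLs already seen, then record this one.
      if (visitedUrls.has(url)) return;
      visitedUrls.add(url);
    }

    let doc: SitemapDoc | undefined;
    try {
      doc = fetchDoc(url);
    } catch {
      doc = undefined;
    }

    if (doc === undefined) {
      // Retries skip the visited check. Otherwise the retry would be rejected
      // as "circular", because the first attempt already recorded the URL.
      if (retryIndex < maxRetries) visit(url, retryIndex + 1);
      return;
    }

    doc.pages.forEach((page) => pages.add(page));
    doc.sitemaps.forEach((child) => visit(child, 0));
  };

  visit(root, 0);
  return Array.from(pages);
}

// Two sitemaps that reference each other (assumed example data).
const graph = new Map<string, SitemapDoc>([
  ["a.xml", { sitemaps: ["b.xml"], pages: ["/1"] }],
  ["b.xml", { sitemaps: ["a.xml"], pages: ["/2"] }],
]);

const lookup: Fetcher = (url) => {
  const doc = graph.get(url);
  if (!doc) throw new Error(`unknown sitemap: ${url}`);
  return doc;
};

const result = crawlSitemaps("a.xml", lookup);
```

In this sketch the crawl terminates even though `a.xml` and `b.xml` point at each other, and `result` contains each page once. Restricting the check to `retryIndex === 0` is what lets a failed request be retried without being mistaken for a cycle.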