Graceful & Proper Termination, both internally and via Extender #73

@Kleissner

Description

The current stop functionality via Crawler.Stop is insufficient for multiple reasons:

  • Calling Stop twice results in a panic (because c.stop would be closed twice)
  • Inside Extender.Visit and Extender.Error there is currently no way to terminate the crawler. Yet there is often a real need to do so, for example when a limit is reached (such as a maximum number of visited URLs, tracked in a custom Extender.Visit implementation)
  • In Crawler.collectUrls, when res.idleDeath is true, workers may be deleted. However, if the worker count reaches 0, the crawler as a whole is NOT terminated, leaving it in limbo (a memory leak). I consider this a bug.
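To illustrate the first point, here is a minimal sketch (the Crawler type below is a hypothetical stand-in, not the library's actual implementation): closing an already-closed channel panics in Go, so guarding the close with sync.Once makes Stop idempotent and safe to call from multiple places, including Extender callbacks.

```go
package main

import (
	"fmt"
	"sync"
)

// Crawler is a minimal stand-in for the real crawler type: Stop closes
// a channel that worker goroutines select on to know when to exit.
type Crawler struct {
	stop     chan struct{}
	stopOnce sync.Once
}

func NewCrawler() *Crawler {
	return &Crawler{stop: make(chan struct{})}
}

// Stop is safe to call any number of times: sync.Once guarantees the
// channel is closed exactly once, avoiding the
// "close of closed channel" runtime panic.
func (c *Crawler) Stop() {
	c.stopOnce.Do(func() { close(c.stop) })
}

func main() {
	c := NewCrawler()
	c.Stop()
	c.Stop() // a second call would panic without the sync.Once guard
	fmt.Println("stopped twice without panic")
}
```

The same closed channel doubles as a broadcast signal: any number of workers can select on <-c.stop and all of them unblock once it is closed.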

I'll create a pull request with a fix. Since it changes the Extender functions, it will break compatibility, but in this case that's a good thing.
