Better URL analysis during -bake #26

@tdammers

The -bake command needs to analyze link hrefs and similar attributes to figure out what else to scrape. But the algorithm it uses is too crude: anything not starting with http: or https: is treated as a local link. That is wrong, because links can use other schemes, such as mailto:, javascript:, file:, or data:, which the scraper shouldn't touch.

See #23 for example.
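
A rough sketch of the kind of check that would avoid this, written in Python purely for illustration (it is not the project's code, and the function name `is_local_link` is made up): parse the href and treat it as local only if it carries no scheme and no host at all, instead of testing for an http: or https: prefix.

```python
from urllib.parse import urlsplit


def is_local_link(href: str) -> bool:
    """Hypothetical helper: True only for links relative to the current site.

    Any href with an explicit scheme (http:, mailto:, javascript:, file:,
    data:, ...) or with a host part (protocol-relative //host/path) is
    rejected, so the scraper would leave it alone.
    """
    href = href.strip()
    if not href:
        return False  # nothing to follow
    parts = urlsplit(href)
    # No scheme and no network location means a path, query, or fragment
    # relative to the page being scraped.
    return parts.scheme == "" and parts.netloc == ""


if __name__ == "__main__":
    for candidate in ["/about/", "img/logo.png", "mailto:someone@example.org",
                      "javascript:void(0)", "//cdn.example.org/x.js"]:
        print(candidate, is_local_link(candidate))
```

Checking for the absence of a scheme rather than keeping a list of known non-web schemes means new or unusual schemes are skipped by default. Fragment-only links such as `#top` still count as local under this sketch, so a real implementation might want to filter those separately.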
