This repository was archived by the owner on Jan 17, 2023. It is now read-only.

Retrieve and parse web page (remote resource in general) incrementally #19

@llucax

Description

We only want to get some information about the URL, and we acknowledge this information won't be perfect, as we'll need to make assumptions and use heuristics to figure out where to get the information from.

Because of this, and to avoid retrieving and parsing huge documents, we should ideally fetch and parse the remote resource incrementally, stopping as soon as we have enough information. For example, generated links will always have a maximum length, so if we are asked to generate a link for a resource storing the complete works of Shakespeare, we only need the first 4K at most and then we are done. A lot of CPU power and network traffic can be saved this way.
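A rough sketch of the idea in Python, using only the standard library. The function names, the 4K cap, and the choice of `<title>` as the metadata of interest are all assumptions for illustration, not part of any existing implementation:

```python
from html.parser import HTMLParser
from urllib.request import urlopen

MAX_BYTES = 4096  # assumed cap; generated links have a maximum length anyway


class TitleParser(HTMLParser):
    """Incremental parser that captures the first <title> it sees."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_data(self, data):
        if self.in_title and self.title is None:
            self.title = data.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False


def fetch_title(url, max_bytes=MAX_BYTES):
    """Download and parse `url` in chunks, stopping as soon as a title
    is found or `max_bytes` have been read, whichever comes first."""
    parser = TitleParser()
    with urlopen(url) as resp:
        remaining = max_bytes
        while parser.title is None and remaining > 0:
            chunk = resp.read(min(1024, remaining))
            if not chunk:
                break  # end of document
            # errors="replace" tolerates a multi-byte char split across chunks
            parser.feed(chunk.decode("utf-8", errors="replace"))
            remaining -= len(chunk)
    return parser.title
```

The key point is that `HTMLParser.feed()` accepts partial input, so parsing can keep pace with the download and both stop early once the needed information is in hand.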

Metadata


    Labels

optimization: Makes the software use fewer resources or run faster
