Do not assume input to be a URL if the local path doesn't exist #1595

@mre

Context

At the moment, lychee uses the http:// scheme as a fallback for local input paths that don't exist:

> lychee foo
Error: Network error

Caused by:
    0: error sending request for url (http://foo/)
    1: client error (Connect)
    2: dns error: failed to lookup address information: nodename nor servname provided, or not known
    3: failed to lookup address information: nodename nor servname provided, or not known

The idea was to model the behavior after curl, which does the same:
https://github.com/curl/curl/blob/70ac27604a2abfa809a7b2736506af0da8c3c8a9/lib/urlapi.c#L1104-L1124

Issue

Getting scheme guessing right is quite tricky.

Furthermore, it can be misleading if the user assumes a local path exists, but it doesn't. In this case, we make an unexpected network request.
This can even go unnoticed in CI/CD if a host with the same name happens to resolve.

For example, assume we want to check a ZIP archive (which isn't supported, but might be in the future), and that the file doesn't exist.
If we run:

lychee --dump-inputs url.zip
http://url.zip/

lychee treats the input as http://url.zip/ instead! 💥
(Yes, .zip is a TLD.)

Proposal

I propose removing the fallback to http://. This is a breaking change, which we should make before 1.0.

Necessary changes

Remove the condition here and return an error instead:

// Invalid path; check if a valid URL can be constructed from the input
// by prefixing it with a `http://` scheme.
// Curl also uses http (i.e. not https), see
// https://github.com/curl/curl/blob/70ac27604a2abfa809a7b2736506af0da8c3c8a9/lib/urlapi.c#L1104-L1124
let url = Url::parse(&format!("http://{value}")).map_err(|e| {
    ErrorKind::ParseUrl(e, "Input is not a valid URL".to_string())
})?;
InputSource::RemoteUrl(Box::new(url))

Add a test to prove correct behavior.
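
The proposed behavior could look roughly like the sketch below. It is not lychee's actual code: `InputSource` is reduced to two variants holding plain strings and paths, and `InputError` and `resolve_input` are hypothetical names made up for illustration. The point is only that an input without an explicit scheme is either an existing local path or an error, never a guessed URL.

```rust
use std::path::{Path, PathBuf};

/// Simplified stand-in for lychee's `InputSource` (assumption: the real
/// enum has more variants and stores a parsed `Url`, not a `String`).
#[derive(Debug, PartialEq)]
enum InputSource {
    RemoteUrl(String),
    FsPath(PathBuf),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum InputError {
    /// The input has no http(s) scheme and is not an existing path.
    InvalidInput(String),
}

/// Hypothetical resolver: no `http://` fallback for missing paths.
fn resolve_input(value: &str) -> Result<InputSource, InputError> {
    // Only an explicit scheme marks the input as remote.
    if value.starts_with("http://") || value.starts_with("https://") {
        return Ok(InputSource::RemoteUrl(value.to_string()));
    }

    // Otherwise the input must be a path that exists on disk.
    let path = Path::new(value);
    if path.exists() {
        return Ok(InputSource::FsPath(path.to_path_buf()));
    }

    // Previously this case was turned into `http://{value}`;
    // here it is reported as an error instead.
    Err(InputError::InvalidInput(value.to_string()))
}
```

With this shape, a test for the new behavior would assert that a missing file such as `url.zip` yields an error rather than a `RemoteUrl`, while `https://example.com` is still accepted as a remote input.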

Pull requests greatly appreciated.
