
samueloph (Collaborator)

There are two ways of doing this now:

  1. wget way: -i, --input-file
  2. curl way: providing a URL argument starting with "@"; wcurl will
    see "@filename" and download the URLs listed in "filename".

Lines starting with "#" inside input files are ignored.
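
For illustration, assuming an input file named urls.txt (the file name and URLs below are made up for this example), the two forms would be used like this:

# urls.txt -- lines starting with "#" are ignored
https://example.com/a.tar.gz
https://example.com/b.tar.gz

# equivalent invocations
./wcurl -i urls.txt
./wcurl --input-file=urls.txt
./wcurl @urls.txt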

This is a continuation of #58

I don't want to rush a change like this; I'm publishing the PR so that it gets more
visibility and so that I can verify the .md changes. This PR is still missing tests.

Overall, we shouldn't try to implement every wget feature, but this one might be
important enough to warrant the extra code.

@sergiodj (Collaborator) left a comment

Thanks @samueloph.

I'm leaving a few inline comments about possible enhancements.

I won't oppose if you think this use case is important enough to be implemented in wcurl, but my two cents is that I can't remember one single time when I needed to provide a list of URLs for wget/curl to download and I decided to use the tool instead of invoking things using for and xargs on shell. But maybe that's just me.

;;

--input-file=*)
add_urls "@$(printf "%s\n" "${1}" | sed 's/^--input-file=//')"
sergiodj (Collaborator):

Since the program now takes an input file, I believe it's necessary to check if it exists and is readable.

samueloph (Collaborator, Author):

I was hoping we could let the shell handle this:

./wcurl -i non_existent
./wcurl: 135: cannot open non_existent: No such file

./wcurl -i not_allowed
./wcurl: 135: cannot open not_allowed: Permission denied

Is it better to do it ourselves instead?

sergiodj (Collaborator):

It depends on how "easy" we want wcurl to be. If our intention is to make it "user friendly", then I think it's worth checking whether the file exists and throwing a nice error (possibly with a suggestion to run --help). Otherwise, I'm fine with relying on the shell to do the error throwing for us :-).
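
For concreteness, a minimal sketch of such a check, assuming it runs right before the file is read; the input_file variable, the message wording, and the exit 1 are placeholders, not something this PR currently does:

input_file="${1#@}"    # strip the leading "@" of the curl-style form
if [ ! -r "${input_file}" ]; then
    # Fail early with a friendly message instead of the shell's redirection error
    printf "wcurl: cannot read input file '%s' (see --help)\n" "${input_file}" >&2
    exit 1
fi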

# If the argument starts with "@", then it's an input file name.
# When parsing an input file, ignore lines starting with "#".
# This function also percent-encodes the whitespaces in URLs.
add_urls()
sergiodj (Collaborator):

I wrote a version of add_urls that uses printf instead of case, but I think this one is alright :-).

samueloph (Collaborator, Author):

Nice, is it simpler to read and understand?

sergiodj (Collaborator):

Not necessarily. This is a back-of-the-napkin implementation to give you an idea:

set -x    # trace commands as they run (debug aid)

# Accumulated list of URLs (plays the role of wcurl's URL list)
U=""

# b: percent-encode spaces in a URL and append it to the list
b()
{
    n=$(printf "%s\n" "${1}" | sed 's/ /%20/g')
    U="${U} ${n}"
}

# a: if the argument starts with "@", read URLs from the named file,
# skipping lines that start with "#"; otherwise treat the argument as a URL
a()
{
    if [ "$(printf '%c' "$1")" = "@" ]; then
        while read -r line; do
            if [ "$(printf '%c' "$line")" != '#' ]; then
                b "$line"
            fi
        done < "${1#@}"
    else
        b "${1}"
    fi
}

# Example invocations; "2as" is a file containing URLs
a "o"
a "aaa"
a "qweq"
a "@2as"

echo "$U"

@samueloph (Collaborator, Author)

> I'm leaving a few inline comments about possible enhancements.

All good suggestions, thank you.

> I won't oppose if you think this use case is important enough to be implemented in wcurl, but my two cents is that I can't remember one single time when I needed to provide a list of URLs for wget/curl to download and I decided to use the tool instead of invoking things using for and xargs on shell. But maybe that's just me.

Yeah, for context: my flight got delayed due to the cyberattack and I wanted to see what an implementation would look like. It ended up simpler than I expected, but I'm not sure yet whether we should do it. This implementation is not equivalent to wget's -i, as it doesn't support the form -i - to read from stdin, and I'm wondering whether the fact that we do parallel downloads by default will be an issue for some users (due to throttling).
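
(As a rough sketch of what the missing -i - form could look like: only add_urls below is from this PR; the read_input helper and input_file variable are made up for illustration.)

read_input()
{
    while read -r line; do
        case "${line}" in
            '#'*|'') ;;                  # skip comments and blank lines
            *) add_urls "${line}" ;;
        esac
    done
}

if [ "${input_file}" = "-" ]; then
    read_input                           # no redirection: read URLs from stdin
else
    read_input < "${input_file}"
fi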

Thank you for the review!

 There are two ways of doing this now:
 1) wget way: -i, --input-file
 2) curl way: providing a URL argument starting with "@", wcurl will
    see "@filename" and download URLs from "filename".

 Lines starting with "#" inside input files are ignored.

 This is a continuation of #58

Co-authored-by: Sergio Durigan Junior <[email protected]>
@sergiodj (Collaborator)

> I won't oppose if you think this use case is important enough to be implemented in wcurl, but my two cents is that I can't remember one single time when I needed to provide a list of URLs for wget/curl to download and I decided to use the tool instead of invoking things using for and xargs on shell. But maybe that's just me.

> Yeah, for context: my flight got delayed due to the cyberattack and I wanted to see what an implementation would look like. It ended up simpler than I expected, but I'm not sure yet whether we should do it. This implementation is not equivalent to wget's -i, as it doesn't support the form -i - to read from stdin, and I'm wondering whether the fact that we do parallel downloads by default will be an issue for some users (due to throttling).

The issue with throttling has always existed, because it's always been possible to invoke wcurl with multiple URLs via the command line. So I wouldn't worry too much about that. It is well documented that wcurl will parallelize downloads; if there is a need, we can make this optional.

But I find it funny to see that there's a bit of a disconnect between what wcurl's goals are vs. what we find ourselves discussing sometimes :-). For example, it's OK not to treat certain errors (e.g., a nonexistent file provided via -i) and let the shell bail out for us, but we're also discussing more complex ways to deal with multi-URL downloads and parallelization. Just something that crossed my mind :-).
