13 changes: 11 additions & 2 deletions README.md
@@ -42,8 +42,9 @@ sudo mandb

```text
wcurl <URL>...
-wcurl [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [--dry-run] [--] <URL>...
-wcurl [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--dry-run] [--] <URL>...
+wcurl -i <INPUT_FILE>...
+wcurl [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [-i|--input-file <PATH>]... [--dry-run] [--] [<URL>]...
+wcurl [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--input-file=<PATH>]... [--dry-run] [--] [<URL>]...
wcurl -V|--version
wcurl -h|--help
```
@@ -85,6 +86,12 @@ should be using curl directly if your use case is not covered.
the end (curl >= 7.83.0). If this option is provided multiple times, only the
last value is considered.

+* `-i, --input-file=<PATH>`
+
+  Download all URLs listed in the input file. Can be used multiple times and
+  mixed with URLs passed as arguments. This is equivalent to passing `@<PATH>`
+  as a URL argument. Lines starting with `#` are ignored.
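
  As a quick sketch (the file name and URLs here are illustrative, not part of the patch):

  ```sh
  # Create an input file; lines starting with '#' are ignored.
  cat > urls.txt << 'EOF'
  # mirrors
  https://example.com/a.tar.gz
  https://example.com/b.tar.gz
  EOF

  wcurl -i urls.txt    # downloads a.tar.gz and b.tar.gz
  ```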

* `--no-decode-filename`

Don't percent-decode the output filename, even if the percent-encoding in the
@@ -112,6 +119,8 @@ instead forwarded to the curl invocation.
URL to be downloaded. Anything that is not a parameter is considered
a URL. Whitespaces are percent-encoded and the URL is passed to curl, which
then performs the parsing. May be specified more than once.
+An argument starting with `@` is treated as a file containing multiple URLs
+to be downloaded; `@<PATH>` is equivalent to using `--input-file <PATH>`.
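
For example, input files and plain URLs can be mixed in one invocation (names illustrative):

```sh
wcurl @urls.txt https://example.com/extra.iso
```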

# Examples

62 changes: 53 additions & 9 deletions wcurl

@@ -49,8 +49,9 @@ usage()
${PROGRAM_NAME} -- a simple wrapper around curl to easily download files.
Usage: ${PROGRAM_NAME} <URL>...
-${PROGRAM_NAME} [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [--dry-run] [--] <URL>...
-${PROGRAM_NAME} [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--dry-run] [--] <URL>...
+${PROGRAM_NAME} -i <INPUT_FILE>...
+${PROGRAM_NAME} [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [-i|--input-file <PATH>]... [--dry-run] [--] [<URL>]...
+${PROGRAM_NAME} [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--input-file=<PATH>]... [--dry-run] [--] [<URL>]...
${PROGRAM_NAME} -h|--help
${PROGRAM_NAME} -V|--version
@@ -64,6 +65,10 @@ Options:
number appended to the end (curl >= 7.83.0). If this option is provided
multiple times, only the last value is considered.
+-i, --input-file <PATH>: Download all URLs listed in the input file. Can be used multiple times
+and mixed with URLs passed as arguments. This is equivalent to setting
+"@<PATH>" as a URL argument. Lines starting with "#" are ignored.
--no-decode-filename: Don't percent-decode the output filename, even if the percent-encoding in
the URL was done by wcurl, e.g.: The URL contained whitespaces.
@@ -79,6 +84,8 @@ Options:
<URL>: URL to be downloaded. Anything that is not a parameter is considered
a URL. Whitespaces are percent-encoded and the URL is passed to curl, which
then performs the parsing. May be specified more than once.
+An argument starting with "@" is treated as a file containing multiple URLs to be
+downloaded; "@<PATH>" is equivalent to using "--input-file <PATH>".
_EOF_
}

@@ -116,6 +123,34 @@ readonly PER_URL_PARAMETERS="\
# Whether to invoke curl or not.
DRY_RUN="false"

+# Add URLs to the list of URLs to be downloaded.
+# If the argument starts with "@", then it's a file containing the URLs
+# to be downloaded (an "input file").
+# When parsing an input file, ignore lines starting with "#".
+# This function also percent-encodes the whitespaces in URLs.
+add_urls()

Collaborator: I wrote a version of add_urls that uses printf instead of case, but I think this one is alright :-).

Collaborator (author): Nice, is it simpler to read and understand?

Collaborator: Not necessarily. This is a back-of-the-napkin implementation to give you an idea:

```sh
set -x

U=""

b()
{
	n=$(printf "%s\n" "${1}" | sed 's/ /%20/g')
	U="${U} ${n}"
}

a()
{
	if [ "$(printf '%c' "$1")" = "@" ]; then
		while read -r line; do
			if [ "$(printf '%c' "$line")" != '#' ]; then
				b "$line"
			fi
		done < "${1#@}"
	else
		b "${1}"
	fi
}

a "o"
a "aaa"
a "qweq"
a "@2as"

echo "$U"
```

+{
+    case "$1" in
+        @*)
+            while read -r url; do
+                case "$url" in
+                    \#*) : ;;
+                    *)
+                        # Percent-encode whitespaces into %20, since wget supports those URLs.
+                        newurl=$(printf "%s\n" "${url}" | sed 's/ /%20/g')
+                        URLS="${URLS} ${newurl}"
+                        ;;
+                esac
+            done < "${1#@}"
+            ;;
+        *)
+            # Percent-encode whitespaces into %20, since wget supports those URLs.
+            newurl=$(printf "%s\n" "${1}" | sed 's/ /%20/g')
+            URLS="${URLS} ${newurl}"
+            ;;
+    esac
+}
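
As a quick illustration of what the new helper does (inputs hypothetical, assuming `URLS` starts out empty):

```sh
URLS=""
add_urls "https://example.com/some file"   # appends https://example.com/some%20file to URLS
add_urls "@urls.txt"                       # appends every non-'#' line of urls.txt, encoded the same way
```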

# Sanitize parameters.
sanitize()
{
@@ -279,6 +314,19 @@ while [ -n "${1-}" ]; do
OUTPUT_PATH="${opt}"
;;

+        --input-file=*)
+            add_urls "@$(printf "%s\n" "${1}" | sed 's/^--input-file=//')"

Collaborator: Since the program now takes an input file, I believe it's necessary to check if it exists and is readable.

Collaborator (author): I was hoping we could let the shell handle this:

```text
./wcurl -i non_existent
./wcurl: 135: cannot open nonexistent: No such file

./wcurl -i not_allowed
./wcurl: 135: cannot open not_allowed: Permission denied
```

Is it better to do it ourselves instead?

Collaborator: It depends on how "easy" we want wcurl to be. If our intention is to make it "user friendly", then I think it's worth checking whether the file exists and throwing a nice error (possibly with a suggestion to run --help). Otherwise, I'm fine with relying on the shell to do the error throwing for us :-).
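
A minimal sketch of what such an explicit check could look like (illustrative only; `PROGRAM_NAME` exists in the script, but the message wording and placement are assumptions):

```sh
# Hypothetical early check before reading an input file.
input_file="${1#@}"
if [ ! -r "${input_file}" ]; then
    printf "%s: cannot read input file '%s'. Try '%s --help'.\n" \
        "${PROGRAM_NAME}" "${input_file}" "${PROGRAM_NAME}" >&2
    exit 1
fi
```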

+            ;;

+        -i | --input-file)
+            shift
+            add_urls "@${1}"
+            ;;
+
+        -i*)
+            add_urls "@$(printf "%s\n" "${1}" | sed 's/^-i//')"
+            ;;

--no-decode-filename)
DECODE_FILENAME="false"
;;
@@ -296,10 +344,8 @@ while [ -n "${1-}" ]; do
--)
# This is the start of the list of URLs.
shift
-            for url in "$@"; do
-                # Encode whitespaces into %20, since wget supports those URLs.
-                newurl=$(printf "%s\n" "${url}" | sed 's/ /%20/g')
-                URLS="${URLS} ${newurl}"
+            for arg in "$@"; do
+                add_urls "${arg}"
done
break
;;
@@ -310,9 +356,7 @@ while [ -n "${1-}" ]; do

*)
# This must be a URL.
-            # Encode whitespaces into %20, since wget supports those URLs.
-            newurl=$(printf "%s\n" "${1}" | sed 's/ /%20/g')
-            URLS="${URLS} ${newurl}"
+            add_urls "${1}"
;;
esac
shift
14 changes: 12 additions & 2 deletions wcurl.md

@@ -18,9 +18,11 @@ Added-in: n/a

**wcurl \<URL\>...**

-**wcurl [--curl-options \<CURL_OPTIONS\>]... [--dry-run] [--no-decode-filename] [-o|-O|--output \<PATH\>] [--] \<URL\>...**
-**wcurl [--curl-options=\<CURL_OPTIONS\>]... [--dry-run] [--no-decode-filename] [--output=\<PATH\>] [--] \<URL\>...**
+**wcurl -i \<INPUT_FILE\>...**
+
+**wcurl [--curl-options \<CURL_OPTIONS\>]... [--dry-run] [--no-decode-filename] [-o|-O|--output \<PATH\>] [-i|--input-file \<PATH\>]... [--] [\<URL\>]...**
+
+**wcurl [--curl-options=\<CURL_OPTIONS\>]... [--dry-run] [--no-decode-filename] [--output=\<PATH\>] [--input-file=\<PATH\>]... [--] [\<URL\>]...**

**wcurl -V|--version**

@@ -82,6 +84,12 @@ URLs are provided, resulting files share the same name with a number appended to
the end (curl \>= 7.83.0). If this option is provided multiple times, only the
last value is considered.

+## -i, --input-file=\<PATH\>
+
+Download all URLs listed in the input file. Can be used multiple times and
+mixed with URLs passed as arguments. This is equivalent to passing `@<PATH>`
+as a URL argument. Lines starting with `#` are ignored.
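
For instance, assuming a hypothetical urls.txt, all of the following are equivalent:

```sh
wcurl -i urls.txt
wcurl -iurls.txt
wcurl --input-file urls.txt
wcurl --input-file=urls.txt
wcurl @urls.txt
```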

## --no-decode-filename

Don't percent-decode the output filename, even if the percent-encoding in the
@@ -109,6 +117,8 @@ is instead forwarded to the curl invocation.
URL to be downloaded. Anything that is not a parameter is considered
a URL. Whitespaces are percent-encoded and the URL is passed to curl, which
then performs the parsing. May be specified more than once.
+An argument starting with `@` is treated as a file containing multiple URLs to be
+downloaded; `@<PATH>` is equivalent to using `--input-file <PATH>`.
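
Since input files may repeat and mix with URLs, an invocation like this is valid (file names hypothetical):

```sh
wcurl @first.txt @second.txt https://example.com/extra.iso
```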

# EXAMPLES
