Introduce new options for downloading a list of URLs from file #66
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -49,8 +49,9 @@ usage() | |
${PROGRAM_NAME} -- a simple wrapper around curl to easily download files. | ||
Usage: ${PROGRAM_NAME} <URL>... | ||
${PROGRAM_NAME} [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [--dry-run] [--] <URL>... | ||
${PROGRAM_NAME} [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--dry-run] [--] <URL>... | ||
${PROGRAM_NAME} -i <INPUT_FILE>... | ||
${PROGRAM_NAME} [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [-i|--input-file <PATH>]... [--dry-run] [--] [<URL>]... | ||
${PROGRAM_NAME} [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--input-file=<PATH>]... [--dry-run] [--] [<URL>]... | ||
${PROGRAM_NAME} -h|--help | ||
${PROGRAM_NAME} -V|--version | ||
|
@@ -64,6 +65,10 @@ Options: | |
number appended to the end (curl >= 7.83.0). If this option is provided | ||
multiple times, only the last value is considered. | ||
-i, --input-file <PATH>: Download all URLs listed in the input file. Can be used multiple times | ||
and mixed with URLs as parameters. This is equivalent to setting | ||
"@<PATH>" as an URL argument. Lines starting with "#" are ignored. | ||
--no-decode-filename: Don't percent-decode the output filename, even if the percent-encoding in | ||
the URL was done by wcurl, e.g.: The URL contained whitespaces. | ||
|
@@ -79,6 +84,8 @@ Options: | |
<URL>: URL to be downloaded. Anything that is not a parameter is considered | ||
an URL. Whitespaces are percent-encoded and the URL is passed to curl, which | ||
then performs the parsing. May be specified more than once. | ||
Arguments starting with "@" are considered as a file containing multiple URLs to be | ||
downloaded; "@<PATH>" is equivalent to using "--input-file <PATH>". | ||
_EOF_ | ||
} | ||
|
||
|
@@ -116,6 +123,34 @@ readonly PER_URL_PARAMETERS="\ | |
# Whether to invoke curl or not. | ||
DRY_RUN="false" | ||
|
||
# Add URLs to list of URLs to be downloaded. | ||
# If the argument starts with "@", then it's a file containing the URLs | ||
# to be downloaded (an "input file"). | ||
# When parsing an input file, ignore lines starting with "#". | ||
# This function also percent-encodes the whitespaces in URLs. | ||
add_urls() | ||
{ | ||
case "$1" in | ||
@*) | ||
while read -r url; do | ||
case "$url" in | ||
\#*) : ;; | ||
*) | ||
# Percent-encode whitespaces into %20, since wget supports those URLs. | ||
newurl=$(printf "%s\n" "${url}" | sed 's/ /%20/g') | ||
URLS="${URLS} ${newurl}" | ||
;; | ||
esac | ||
done < "${1#@}" | ||
;; | ||
*) | ||
# Percent-encode whitespaces into %20, since wget supports those URLs. | ||
newurl=$(printf "%s\n" "${1}" | sed 's/ /%20/g') | ||
URLS="${URLS} ${newurl}" | ||
;; | ||
esac | ||
} | ||
|
||
# Sanitize parameters. | ||
sanitize() | ||
{ | ||
|
@@ -279,6 +314,19 @@ while [ -n "${1-}" ]; do | |
OUTPUT_PATH="${opt}" | ||
;; | ||
|
||
--input-file=*) | ||
add_urls "@$(printf "%s\n" "${1}" | sed 's/^--input-file=//')" | ||
Since the program now takes an input file, I believe it's necessary to check if it exists and is readable.

I was hoping we could let the shell handle this:

Is it better to do it ourselves instead?

It depends on how "easy" we want wcurl to be. If our intention is to make it "user friendly", then I think it's worth checking if the file exists and throwing a nice error (possibly with a suggestion to run |
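
For reference, the existence and readability check discussed in this thread could look roughly like the sketch below. The helper name `check_input_file` and the error wording are hypothetical and are not part of this PR. The sketch also assumes that only regular files should be accepted, which would reject directories but also pipes.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): fail early with a readable
# message when an input file cannot be used.
check_input_file()
{
    # Assumption: only regular files are valid input files.
    if [ ! -f "${1}" ]; then
        printf "error: input file '%s' does not exist or is not a regular file.\n" "${1}" >&2
        return 1
    fi
    if [ ! -r "${1}" ]; then
        printf "error: input file '%s' is not readable.\n" "${1}" >&2
        return 1
    fi
    return 0
}
```

A caller such as `add_urls` could run `check_input_file "${1#@}" || exit 1` before the `while read` loop. Whether that is nicer than the shell's own "No such file or directory" message is the judgment call being debated here.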
||
;; | ||
|
||
-i | --input-file) | ||
shift | ||
add_urls "@${1}" | ||
;; | ||
|
||
-i*) | ||
add_urls "@$(printf "%s\n" "${1}" | sed 's/^-i//')" | ||
;; | ||
|
||
--no-decode-filename) | ||
DECODE_FILENAME="false" | ||
;; | ||
|
@@ -296,10 +344,8 @@ while [ -n "${1-}" ]; do | |
--) | ||
# This is the start of the list of URLs. | ||
shift | ||
for url in "$@"; do | ||
# Encode whitespaces into %20, since wget supports those URLs. | ||
newurl=$(printf "%s\n" "${url}" | sed 's/ /%20/g') | ||
URLS="${URLS} ${newurl}" | ||
for arg in "$@"; do | ||
add_urls "${arg}" | ||
done | ||
break | ||
;; | ||
|
@@ -310,9 +356,7 @@ while [ -n "${1-}" ]; do | |
|
||
*) | ||
# This must be a URL. | ||
# Encode whitespaces into %20, since wget supports those URLs. | ||
newurl=$(printf "%s\n" "${1}" | sed 's/ /%20/g') | ||
URLS="${URLS} ${newurl}" | ||
add_urls "${1}" | ||
;; | ||
esac | ||
shift | ||
|
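
To make the behavior of the new `add_urls` function easier to follow, here is a standalone sketch that copies its logic from the diff above and runs it against a small sample file. The sample file contents and the `example.com` URLs are made up for illustration; nothing here invokes curl.

```shell
#!/bin/sh
# Standalone sketch that mirrors the add_urls logic from the diff above.
# The sample file and example.com URLs are made up for illustration.
URLS=""

add_urls()
{
    case "${1}" in
        @*)
            # "@<PATH>": read one URL per line, skipping "#" comment lines.
            while read -r url; do
                case "${url}" in
                    \#*) : ;;
                    *)
                        # Percent-encode spaces, as the PR does.
                        newurl=$(printf "%s\n" "${url}" | sed 's/ /%20/g')
                        URLS="${URLS} ${newurl}"
                        ;;
                esac
            done < "${1#@}"
            ;;
        *)
            # Plain URL argument: percent-encode spaces and append.
            newurl=$(printf "%s\n" "${1}" | sed 's/ /%20/g')
            URLS="${URLS} ${newurl}"
            ;;
    esac
}

# Build a throwaway input file with one comment line and two URLs.
input_file=$(mktemp)
printf '%s\n' \
    '# mirrors' \
    'https://example.com/a file.txt' \
    'https://example.com/b.iso' > "${input_file}"

add_urls "@${input_file}"
add_urls "https://example.com/c d"
rm -f "${input_file}"

# Show the accumulated, space-separated list of URLs.
printf '%s\n' "${URLS}"
```

One thing the sketch makes visible: because the loop is a plain `while read -r url`, a final line that lacks a trailing newline is not processed, since `read` returns a non-zero status at end of file even though it has filled `url`. Extending the loop condition to `while read -r url || [ -n "${url}" ]` would be one possible way to pick up that last line.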
I wrote a version of `add_urls` that uses `printf` instead of `case`, but I think this one is alright :-).

Nice, is it simpler to read and understand?

Not necessarily. This is a back-of-the-napkin implementation to give you an idea: