Skip to content

Latest commit

 

History

History
457 lines (334 loc) · 11.1 KB

File metadata and controls

457 lines (334 loc) · 11.1 KB

Introduction

The wget command is a widely used non-interactive command-line utility for downloading files from the internet. It supports various protocols, including HTTP, HTTPS, and FTP, and is designed to work in the background without requiring user interaction. This makes it particularly useful for downloading large files, retrieving content from unreliable connections, mirroring websites, and scheduling downloads via scripts or cron jobs. This guide consolidates all key concepts, syntax, and examples to help you master wget.


Basic Syntax

wget [options] [URL]
  • options: Command-line switches that control wget's behavior.
  • URL: The web address of the file or website you want to download.

Download a File

wget https://example.com/file.zip

Downloads file.zip to the current directory.


Download and Save with a Specific Filename

wget -O newname.zip https://example.com/file.zip

Saves the downloaded file as newname.zip.


Resume an Interrupted Download

wget -c https://example.com/file.zip

Continues downloading a partially downloaded file.


Download in the Background

wget -b https://example.com/largefile.iso

Runs the download process in the background and logs output to wget-log.

Check progress with:

tail -f wget-log

Limit Download Speed

wget --limit-rate=200k https://example.com/file.zip

Limits download speed to 200 KB/s.


Download to a Specific Directory

wget -P /home/user/Downloads https://example.com/file.zip

Saves the file in the specified directory.


Read URLs from a File (Download Multiple Files)

Create a file urls.txt:

https://example.com/file1.zip
https://example.com/file2.zip

Download all:

wget -i urls.txt

Recursive Download (Entire Website)

wget -r https://example.com/

Downloads the entire site recursively.

Limit recursion depth:

wget -r -l 2 https://example.com/

Download for Offline Viewing

wget -r -p --convert-links https://example.com/

Downloads entire website, including assets, and converts links for local viewing.


Mirror a Website

wget --mirror -p --convert-links -P ./local-dir https://example.com/

Creates a full local mirror of a website with all necessary files and links.


Download All PDFs from a Website

wget -r -A pdf https://example.com/

Recursively downloads all .pdf files from the website.


Use a Custom User-Agent

wget --user-agent="Mozilla/5.0" https://example.com/file.zip

Spoofs a browser user-agent to avoid download blocking.


Handle Authentication

With credentials:

wget --user=username --password=password https://example.com/protected.zip

Prompt for password:

wget --user=username --ask-password https://example.com/protected.zip

Check URLs Without Downloading (Link Checking)

wget --spider https://example.com/

Validates the URL or checks website availability without downloading files.


Set Retry Attempts

wget --tries=10 https://example.com/file.zip

Sets the number of retries (default is 20).


Set Wait Time Between Requests

wget -w 10 https://example.com/file.zip

Waits 10 seconds between each download to reduce server load.


Redirect Output to a Log File

wget https://example.com/file.txt -o download-log.txt

Logs system-generated messages to download-log.txt.

Append instead of overwrite:

wget -a download-log.txt https://example.com/file.txt

Display Version and Help

wget -v

Displays the current version.

wget -h

Displays the help message and available options.


Summary Table of Common Options

Option Description
-v Display version of wget
-h Show help message
-o logfile Save output to specified logfile
-a logfile Append output to logfile
-b Run in background
-i file Read URLs from a file
-c Resume partially downloaded file
-r Enable recursive download
-l depth Set recursion depth
-p Download all required page assets
--convert-links Make links suitable for offline viewing
--mirror Enable mirror mode
--user-agent Set a custom user-agent string
--user, --password Use HTTP authentication
--limit-rate Limit download speed
--spider Check if the URL exists without downloading
--tries=number Set the number of retry attempts
-w time Wait time between downloads
-P dir Save files to a specified directory

Conclusion

The wget command is a versatile and robust tool for downloading files and entire websites directly from the terminal. Its non-interactive nature allows background operations, script integration, and support for slow or unstable connections. With numerous options for customization—such as limiting speed, resuming downloads, mirroring, and handling authentication—wget is essential for efficient web content retrieval in Linux and Unix-based systems.


cURL Command Notes

cURL is a command-line tool for transferring data to or from a server, supporting various protocols including HTTP, HTTPS, FTP, SCP, SFTP, and more. It is lightweight, scriptable, and widely used for tasks like downloading files, testing APIs, and debugging web applications. cURL is pre-installed on most Linux distributions (e.g., Ubuntu, Debian, CentOS) and is accessible directly from the terminal.

Key Features

  • Versatility: Supports numerous protocols (e.g., HTTP, HTTPS, FTP, SMTP, DICT).
  • Scriptable: Ideal for automation and cron jobs.
  • Lightweight: Fast and efficient, launches in seconds.
  • Common Uses:
    • Test API functionality.
    • Display HTTP headers.
    • Send HTTP requests (GET, POST, PUT, DELETE).
    • Transfer data to/from servers.

Syntax

curl [options] [URL]
  • [options]: Command-line flags to modify behavior (e.g., -o, -X, -u).
  • [URL]: The target location for data transfer.

Supported Protocols

  • DICT, FILE, FTP, FTPS, GOPHER, GOPHERS, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET, TFTP, WS, WSS.

Common Use Cases and Examples

1. Fetching Data

Retrieve and display content from a URL.

curl https://example.com
curl https://www.geeksforgeeks.org
  • Multiple URLs:
curl http://site.{one,two,three}.com
  • Numeric Sequence:
curl ftp://ftp.example.com/file[1-20].jpeg

Progress Meter

Displays transfer rate, data transferred, and time remaining.

  • Default Meter:
curl -O ftp://ftp.example.com/file.zip
  • Progress Bar:
curl -# -O ftp://ftp.example.com/file.zip
  • Silent Mode:
curl --silent ftp://ftp.example.com/file.zip

2. Handling HTTP Requests

Send custom HTTP requests with various methods.

  • GET Request:
curl -X GET https://api.sampleapis.com/coffee/hot
  • POST Request:
curl -X POST -d "key1=value1&key2=value2" https://api.sampleapis.com/coffee/hot

3. Downloading Files

Download files from a URL.

  • Save with Custom Name (-o):
curl -o hello.zip ftp://speedtest.tele2.net/1MB.zip
  • Save with Original Name (-O):
curl -O ftp://speedtest.tele2.net/1MB.zip

4. Uploading Files

Upload files to a server (e.g., via FTP).

curl -T uploadfile.txt ftp://example.com/upload/

5. Handling Authentication

Access protected resources using credentials.

curl -u username:password https://example.com/api

6. Resuming Interrupted Downloads

Resume a partially downloaded file.

curl -C - -O ftp://speedtest.tele2.net/1MB.zip

7. Limiting Data Rate

Restrict the transfer rate (in bytes).

curl --limit-rate 1000K -O ftp://speedtest.tele2.net/1MB.zip

8. Authenticated FTP Downloads

Download files from a user-authenticated FTP server.

curl -u demo:password -O ftp://test.rebex.net/readme.txt

9. Uploading to FTP

Upload a file to an FTP server.

curl -u username:password -T filename ftp://example.com/
  • Append to Existing File:
curl -u username:password -T filename --append ftp://example.com/

10. Generating C Source Code

Output C code using libcurl for the specified command.

curl https://www.geeksforgeeks.org > log.html --libcurl code.c

11. Sending Email

Send emails using SMTP.

curl --url [SMTP_URL] --mail-from [sender_mail] --mail-rcpt [receiver_mail] -n --ssl-reqd -u {email}:{password} -T [Mail_text_file]

12. Using DICT Protocol

Retrieve word definitions.

curl dict://dict.org/d:overclock

Additional Options

  • Show Help:
curl --help
curl --help all  # Full list of options
  • Verbose Output (-v or --verbose):
curl -v https://example.com
  • Version Info (-V or --version):
curl -V
  • Follow Redirects (--location or -L):
curl -L https://example.com
  • Show HTTP Headers (-I or --head):
curl -I https://example.com
  • Show Headers and Content (-i or --include):
curl -i https://example.com
  • Set User Agent (-A or --user-agent):
curl -A "Mozilla/5.0" https://example.com

Installation Verification

Check if cURL is installed:

curl
  • Expected Output:
curl: try 'curl --help' for more information
  • If not installed, refer to the cURL documentation for installation instructions specific to your OS.

Notes

  • Case Sensitivity: Options are case-sensitive (e.g., -v-V).
  • Default HTTP Method: GET is used unless specified (e.g., -d implies POST).
  • Error Handling: For password-protected pages without credentials, expect a 401 error.
  • Redirects: Use -L to follow redirects automatically.
  • FTP Usage: Requires valid credentials and correct file paths.

Conclusion

Developed by Daniel Stenberg, cURL is a powerful and flexible tool for Linux users, enabling efficient data transfer across multiple protocols. Its extensive options and scriptability make it indispensable for automation, debugging, and server communication. For further details, consult the cURL manual or project website.