Linux wget Command: A Practical Guide from Basics to Advanced Downloads
wget is one of the most classic command-line download tools on Linux. It handles resumable transfers, background downloads, and rate limiting entirely from the command line, which makes it ideal for automation on servers. This article covers everything from its working principles to practical tips.
The Core Mechanism of wget#
wget’s design philosophy is “non-interactive download” — it handles retries, timeouts, and redirects without user intervention.
Basic workflow:
- Parse URL, establish TCP connection
- Send HTTP/HTTPS/FTP request
- Receive response, write to local file
- Track progress for resume on interruption
This means you can leave it running in the background, schedule it with crontab, and it handles everything automatically.
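The crontab scheduling mentioned above can look like the following sketch; the URL, paths, and schedule are placeholders:

```
# Run at 02:00 every night: resume (-c) an interrupted previous run,
# append output to a log file (-a), save into /srv/backups (-P).
0 2 * * * wget -c -a /var/log/backup-fetch.log -P /srv/backups https://example.com/nightly.tar.gz
```

Because wget is non-interactive, nothing else is needed: a failed night is picked up where it stopped on the next run.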
Essential Parameters#
1. Specify Filename and Path#
# Rename while downloading
wget -O backup.tar.gz https://example.com/file.tar.gz
# Specify download directory
wget -P /tmp/downloads https://example.com/file.zip
-O writes the download to the given filename (overwriting an existing file), while -P only sets the target directory and keeps the original filename. Note that -P is ignored when -O is used, so to control both directory and name, pass a full path to -O.
2. Resume Downloads: The Network Savior#
# Enable resume
wget -c https://example.com/large-file.iso
The principle: wget checks the size of the partially downloaded local file and sends a Range: bytes=&lt;offset&gt;- header on reconnection; the server then returns only the remaining data. This requires server support (the response includes Accept-Ranges: bytes).
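Before relying on -c for a huge file, you can check whether the server advertises byte-range support. A small sketch; the helper name and URL are illustrative, not wget features:

```shell
# has_byte_ranges: reads HTTP response headers on stdin and succeeds
# if the server advertises Accept-Ranges: bytes (the precondition
# for resumable downloads with wget -c).
has_byte_ranges() {
  grep -qi 'Accept-Ranges: *bytes'
}

# Headers only, nothing downloaded (placeholder URL):
#   wget --spider -S https://example.com/large-file.iso 2>&1 | has_byte_ranges \
#     && echo "server supports resume"
```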
3. Background Download with Logging#
# Run in background, log to wget-log
wget -b https://example.com/bigfile.zip
# Specify log file
wget -b -o download.log https://example.com/bigfile.zip
In background mode, wget redirects output to a file, so closing the terminal won’t interrupt it.
4. Rate Limiting: Don’t Kill Your Network#
# Limit download speed to 1MB/s
wget --limit-rate=1m https://example.com/file.iso
This is especially useful on servers to prevent downloads from saturating bandwidth. Supports k (KB/s) and m (MB/s) units.
5. Recursive Website Download#
# Mirror entire website
wget --mirror --convert-links --adjust-extension https://example.com
- --mirror: enable mirror mode (equivalent to -r -N -l inf --no-remove-listing)
- --convert-links: convert links in downloaded pages to local paths
- --adjust-extension: add the correct file extension based on Content-Type
6. Handling HTTPS Certificate Issues#
# Skip certificate check (testing only)
wget --no-check-certificate https://self-signed.badssl.com/
Useful when testing self-signed servers. Not recommended for production.
Real-World Scenarios#
Batch Download from URL List#
# Read URLs from file
cat > urls.txt << EOF
https://example.com/file1.zip
https://example.com/file2.zip
https://example.com/file3.zip
EOF
wget -i urls.txt
The -i parameter reads URLs from a file, one per line — perfect for batch downloads.
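When filenames follow a numeric pattern (an assumption here: file1.zip through file20.zip), the URL list can be generated instead of typed by hand:

```shell
# Generate urls.txt: one URL per line, file1.zip .. file20.zip
# (placeholder host and naming scheme).
seq 1 20 | sed 's|.*|https://example.com/file&.zip|' > urls.txt

# then: wget -c -P /tmp/downloads -i urls.txt
```

Combining -i with -c means an interrupted batch can simply be re-run: files already completed are skipped or resumed.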
Simulate Browser Requests#
# Add User-Agent and Referer
wget --user-agent="Mozilla/5.0" \
--referer="https://example.com" \
https://example.com/download/file.zip
Some sites check User-Agent and Referer. This bypasses simple anti-scraping checks.
Download Authenticated Resources#
# HTTP Basic Authentication
wget --user=username --password=secret https://example.com/protected.zip
# Use Cookies
wget --header="Cookie: session=abc123" https://example.com/user/file.zip
Timeout and Retry Control#
# Set 30-second timeout, max 5 retries
wget --timeout=30 --tries=5 https://example.com/file.zip
Sensible timeout and retry settings noticeably improve success rates on unstable networks; adding --waitretry (seconds to wait between retries) and --retry-connrefused (also retry when the connection is refused) hardens them further.
wget vs curl: When to Use Which#
| Feature | wget | curl |
|---|---|---|
| Design Goal | Download files | Data transfer |
| Resume Support | Native (-c) | Via -C - |
| Recursive Download | Yes | No |
| Protocol Support | HTTP/HTTPS/FTP | HTTP/HTTPS/FTP/SCP/SFTP/etc |
| Upload Support | No | Yes |
| Library Support | None | libcurl |
Simple rule: Use wget for downloads, curl for API calls.
Performance Optimization#
1. Parallel Download Large Files#
wget doesn’t support multi-threading natively, but you can use xargs to run multiple instances:
# Chunked download (requires server Range support)
# First get the file size (tr strips the trailing CR from the header line)
size=$(curl -sI https://example.com/bigfile.zip | grep -i content-length | awk '{print $2}' | tr -d '\r')
# Then download each chunk with its own Range: bytes=start-end header
Better alternatives: axel or aria2 for multi-threaded downloads.
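The manual chunking step can be sketched as follows. The function name make_ranges is illustrative, not a wget feature, and the URL is a placeholder:

```shell
# make_ranges TOTAL PARTS: split a byte count into PARTS "start-end"
# ranges, one per line, each destined for one parallel wget call.
make_ranges() {
  total=$1
  parts=$2
  chunk=$(( (total + parts - 1) / parts ))   # ceiling division
  start=0
  while [ "$start" -lt "$total" ]; do
    end=$(( start + chunk - 1 ))
    [ "$end" -ge "$total" ] && end=$(( total - 1 ))
    echo "${start}-${end}"
    start=$(( end + 1 ))
  done
}

# Each range maps to one wget call, 4 at a time via xargs:
#   make_ranges "$size" 4 | xargs -P4 -I{} \
#     wget --header="Range: bytes={}" -O "part-{}" https://example.com/bigfile.zip
# Reassemble in generation order (lexical part-* sorting is unreliable):
#   for r in $(make_ranges "$size" 4); do cat "part-$r"; done > bigfile.zip
```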
2. Session Reuse for Many Small Files#
wget already reuses a keep-alive HTTP connection for the URLs in a single invocation, so batch downloads with -i are efficient by default. What the cookie options below add is session persistence: a login cookie obtained once is saved and replayed for subsequent requests:
wget --keep-session-cookies --save-cookies cookies.txt \
--load-cookies cookies.txt \
-i urls.txt
3. DNS Cache Optimization#
# Force specific DNS settings
wget --no-dns-cache --inet4-only https://example.com/file.zip
--inet4-only forces IPv4, which avoids stalls when a host advertises a broken IPv6 route; --no-dns-cache makes wget re-resolve the hostname for every connection, which helps when DNS answers change mid-session.
Troubleshooting Common Issues#
Download Hangs#
Check network and firewall:
# Test connectivity
curl -I https://example.com/file.zip
# View detailed process
wget -d https://example.com/file.zip
The -d flag outputs debug info showing complete request and response headers.
Filename Garbled#
Non-ASCII characters in URLs (e.g. Chinese) arrive percent-encoded, such as %E4%B8%AD%E6%96%87, and by default wget escapes those bytes again when naming the saved file. Solution:
# Control encoding with --restrict-file-names
wget --restrict-file-names=nocontrol https://example.com/chinese-file.zip
Incomplete Downloads#
Check disk space and permissions:
df -h /tmp
ls -l /tmp/downloads
Some filesystems (like FAT32) have 4GB file size limits.
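A preflight check can catch the disk-space case before the download starts. A sketch comparing the advertised Content-Length with free space; need_kb and the URL are illustrative, not wget built-ins:

```shell
# need_kb BYTES: convert a byte count to KiB, rounded up, so it can be
# compared against the "Available" column of df -Pk.
need_kb() {
  echo $(( ($1 + 1023) / 1024 ))
}

free_kb=$(df -Pk /tmp | awk 'NR==2 {print $4}')

# Against a live server (placeholder URL), headers only:
#   size=$(wget --spider -S https://example.com/file.zip 2>&1 \
#            | awk 'tolower($1)=="content-length:" {print $2}')
#   [ "$free_kb" -gt "$(need_kb "$size")" ] || echo "not enough space" >&2
echo "free space in /tmp: ${free_kb} KiB"
```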
Summary#
wget’s core value is automation-friendly — background execution, resume support, and rate limiting make it the go-to choice for server downloads. Once you master the essential parameters, you can build reliable download workflows in scripts.
Key Takeaways:
- -c for resume is a must-remember parameter
- -b for background downloads of large files
- --limit-rate protects bandwidth
- --mirror for recursive website downloads
- -i for batch downloads from a file
Related Tools:
- cURL Command - More powerful data transfer tool
- tar Command - Archive and compression tool
- HTTP Status Codes - Look up status codes during downloads