wget command reference
wget is a console utility for downloading files and entire sites.
It can fetch files recursively, following links automatically.
Download wget:
1) For Windows - requires the libeay32.dll library.
2) For Windows - requires no libraries [ver. 1.11.4].
3) For other platforms.
Original manual (in English)
wget http://example.com/file.zip |download file.zip into the current directory
wget -P /path/to/save http://example.com/file.zip |download file.zip into the /path/to/save directory
wget -c http://example.com/file.zip |resume downloading file.zip after an interrupted transfer
wget -O arch.zip http://example.com/file.zip |download file.zip and save it under the name arch.zip
wget -i files.txt |download the files listed in files.txt
wget --tries=10 http://example.com/file.zip |set the number of download attempts
wget -Q5m -i files.txt |quota on the total size of downloaded files; it has no effect on a single file and only applies to recursive retrieval (-r) or retrieval from a list (-i)
wget --save-cookies cookies.txt --post-data 'username=proft&password=1' http://example.com/auth.php |log in to the server, saving the cookies for later access
wget --user-agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.9 Safari/536.5" http://example.com/ |set the User-Agent
wget ftp://example.com/dir/*.zip |download all files matching the pattern (FTP globbing)
wget http://example.com/dir/file{1..10}.zip |download a numbered series of files (the braces are expanded by the shell)
wget -S http://example.com/ |print the headers sent by HTTP servers and the responses of FTP servers
wget --spider -i urls.txt |check the links in the file for availability without downloading anything
wget -b http://example.com/file.zip |download the file in the background; the log goes to wget-log, wget-log.1, etc.
wget -e use_proxy=yes -e http_proxy=proxy.example.com:3128 http://example.com/file.zip |download file.zip through a proxy (the proxy address here is a placeholder)
wget -m -w 2 http://example.com/ |mirror the site, keeping links absolute, with a 2-second wait between requests
wget --limit-rate=200k http://example.com/file.zip |limit the download speed (here to 200 KB/s)
wget -R bmp http://example.com/ |do not download bmp files
wget -A png,jpg http://example.com/ |download only png and jpg files
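The options combine freely. As a sketch using only flags from the table above: a background download that resumes after interruptions, retries up to 10 times, limits the rate, and saves to a chosen directory:
Code:
wget -b -c --tries=10 --limit-rate=200k -P /path/to/save http://example.com/file.zip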
Example: downloading the Django documentation:
Code:
wget -r -k -l 5 -p -E -nc -np https://docs.djangoproject.com/en/1.5/
-r - follow links (recursive download)
-k - convert links to local form
-p - download the resources needed to render an HTML page (styles, images, etc.)
-l - download depth; 0 means unlimited link nesting
-E - save HTML documents with an `.html' extension
-nc - do not overwrite existing files
-np - never ascend above the starting address during recursive download
Each switch has an alias (synonym);
for example, -h corresponds to --help (show help for the command).
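To illustrate, the Django command above can be rewritten entirely with long options; the two forms are equivalent (the long names are taken from the help listing below):
Code:
wget --recursive --convert-links --level=5 --page-requisites --html-extension --no-clobber --no-parent https://docs.djangoproject.com/en/1.5/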
For resuming downloads, I personally recommend these switches:
Code:
wget.exe --no-cache --no-dns-cache --continue --timestamping --tries=5 --timeout=15 "http://download.geo.drweb.com/pub/drweb/cureit/cureit.exe"
--timestamping - re-download the file only if it is newer than the local copy
--tries=5 - number of connection attempts on failure
--timeout=15 - timeout for all network operations, in seconds
--no-cache - disallow server-cached data
--no-dns-cache - disable DNS lookup caching
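The same settings can be made permanent in a wgetrc file instead of being passed on every call (the -e switch executes exactly these `.wgetrc'-style commands). A sketch, assuming the standard wgetrc directive names:
Code:
# ~/.wgetrc - applied to every wget run
continue = on
timestamping = on
tries = 5
timeout = 15
cache = off
dns_cache = off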
A batch file that reads the Internet Explorer proxy settings and passes them to wget:
CMD/BATCH:
@echo off
SetLocal EnableExtensions
call :GetProxyConfig
set Link=http://live.sysinternals.com/Handle.exe
set Tools=c:\temp
rem Assumption: wget.exe sits next to this script; adjust bin if it lives elsewhere
set "bin=%~dp0"
set Wget_UpdateMode=UpdateMode
call :RequestFile "%Link%" "%Tools%\temp" "%Wget_UpdateMode%"
if not errorlevel 1 echo "Downloaded (updated) successfully"
pause
Exit /B
:GetProxyConfig
:: Reads the proxy settings configured for Internet Explorer and builds the proxy options for wget
:: (:msgbox is an external Yes/No dialog helper; it is not included in this listing)
if Defined ProxyAddress_Force (
;;; call :msgbox " Proxy server %ProxyAddress_Force% was set explicitly.\n Use it?" YesNo
if errorlevel 1 (
set "wgetProxy=-e use_proxy=yes"
set "ProxyServer=%ProxyAddress_Force%"
Exit /B
))
set hive=HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings
for %%N in (ProxyServer ProxyOverride ProxyEnable) do For /F "Tokens=2*" %%A In ('Reg.exe Query "%hive%" ^| Find /I "%%N"') do set "%%N=%%B"
if "%ProxyEnable%"=="0x1" (
;;; call :msgbox " Local proxy settings detected. Server - %ProxyServer% \n Do you recognize this address?" YesNo
if errorlevel 1 (set "use_proxy=yes") else (set "use_proxy=no")
) else (
set "use_proxy=no"
)
if "%use_proxy%"=="yes" set "wgetProxy=-e use_proxy=%use_proxy%"
exit /B
:RequestFile [HTTP(s) or FTP file] [folder to download into] [UpdateMode - download in update mode] {LogFile - file to append the log to}
:: Downloads the given file only if it has changed (the --timestamping switch)
if "%~3"=="UpdateMode" (set "Wget_UpdateMode=--timestamping ") else (set Wget_UpdateMode=)
if "%~4" neq "" (
set "verbose="
(set wget_log=--append-output="%~4" )
set "minimize=/MIN "
) else (
set "wget_log="
set "verbose=--no-verbose "
set "minimize="
)
set "sURL=%~1"
if "%use_proxy%"=="yes" (
if /i "%sURL:~0,3%"=="ftp" set "ProxyProtocol=-e ftp_proxy=%ProxyServer%"
if /i "%sURL:~0,4%"=="http" set "ProxyProtocol=-e http_proxy=%ProxyServer%"
if /i "%sURL:~0,5%"=="https" set "ProxyProtocol=-e https_proxy=%ProxyServer%"
)
start "" %minimize%/WAIT "%bin%\wget.exe" %wgetProxy% %ProxyProtocol% %verbose%--no-cache --no-dns-cache --tries=1 %wget_log%%Wget_UpdateMode%--directory-prefix="%~2" "%~1"
exit /B %errorlevel%
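For reference, the subroutine can also be asked to append its log to a file via the optional fourth parameter, which enables full logging and runs wget in a minimized window. A sketch of such a call (paths are placeholders):
CMD/BATCH:
call :RequestFile "http://live.sysinternals.com/Handle.exe" "c:\temp" "UpdateMode" "c:\temp\wget.log"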
GNU Wget 1.11.4, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...
Mandatory arguments to long options are mandatory for short options too.
Startup:
-V, --version display the version of Wget and exit.
-h, --help print this help.
-b, --background go to background after startup.
-e, --execute=COMMAND execute a `.wgetrc'-style command.
Logging and input file:
-o, --output-file=FILE log messages to FILE.
-a, --append-output=FILE append messages to FILE.
-d, --debug print lots of debugging information.
-q, --quiet quiet (no output).
-v, --verbose be verbose (this is the default).
-nv, --no-verbose turn off verboseness, without being quiet.
-i, --input-file=FILE download URLs found in FILE.
-F, --force-html treat input file as HTML.
-B, --base=URL prepends URL to relative links in -F -i file.
Download:
-t, --tries=NUMBER set number of retries to NUMBER (0 unlimits).
--retry-connrefused retry even if connection is refused.
-O, --output-document=FILE write documents to FILE.
-nc, --no-clobber skip downloads that would download to
existing files.
-c, --continue resume getting a partially-downloaded file.
--progress=TYPE select progress gauge type.
-N, --timestamping don't re-retrieve files unless newer than
local.
-S, --server-response print server response.
--spider don't download anything.
-T, --timeout=SECONDS set all timeout values to SECONDS.
--dns-timeout=SECS set the DNS lookup timeout to SECS.
--connect-timeout=SECS set the connect timeout to SECS.
--read-timeout=SECS set the read timeout to SECS.
-w, --wait=SECONDS wait SECONDS between retrievals.
--waitretry=SECONDS wait 1..SECONDS between retries of a retrieval.
--random-wait wait from 0...2*WAIT secs between retrievals.
--no-proxy explicitly turn off proxy.
-Q, --quota=NUMBER set retrieval quota to NUMBER.
--bind-address=ADDRESS bind to ADDRESS (hostname or IP) on local host.
--limit-rate=RATE limit download rate to RATE.
--no-dns-cache disable caching DNS lookups.
--restrict-file-names=OS restrict chars in file names to ones OS allows.
--ignore-case ignore case when matching files/directories.
--user=USER set both ftp and http user to USER.
--password=PASS set both ftp and http password to PASS.
Directories:
-nd, --no-directories don't create directories.
-x, --force-directories force creation of directories.
-nH, --no-host-directories don't create host directories.
--protocol-directories use protocol name in directories.
-P, --directory-prefix=PREFIX save files to PREFIX/...
--cut-dirs=NUMBER ignore NUMBER remote directory components.
HTTP options:
--http-user=USER set http user to USER.
--http-password=PASS set http password to PASS.
--no-cache disallow server-cached data.
-E, --html-extension save HTML documents with `.html' extension.
--ignore-length ignore `Content-Length' header field.
--header=STRING insert STRING among the headers.
--max-redirect maximum redirections allowed per page.
--proxy-user=USER set USER as proxy username.
--proxy-password=PASS set PASS as proxy password.
--referer=URL include `Referer: URL' header in HTTP request.
--save-headers save the HTTP headers to file.
-U, --user-agent=AGENT identify as AGENT instead of Wget/VERSION.
--no-http-keep-alive disable HTTP keep-alive (persistent connections).
--no-cookies don't use cookies.
--load-cookies=FILE load cookies from FILE before session.
--save-cookies=FILE save cookies to FILE after session.
--keep-session-cookies load and save session (non-permanent) cookies.
--post-data=STRING use the POST method; send STRING as the data.
--post-file=FILE use the POST method; send contents of FILE.
--content-disposition honor the Content-Disposition header when
choosing local file names (EXPERIMENTAL).
--auth-no-challenge Send Basic HTTP authentication information
without first waiting for the server's
challenge.
HTTPS (SSL/TLS) options:
--secure-protocol=PR choose secure protocol, one of auto, SSLv2,
SSLv3, and TLSv1.
--no-check-certificate don't validate the server's certificate.
--certificate=FILE client certificate file.
--certificate-type=TYPE client certificate type, PEM or DER.
--private-key=FILE private key file.
--private-key-type=TYPE private key type, PEM or DER.
--ca-certificate=FILE file with the bundle of CA's.
--ca-directory=DIR directory where hash list of CA's is stored.
--random-file=FILE file with random data for seeding the SSL PRNG.
--egd-file=FILE file naming the EGD socket with random data.
FTP options:
--ftp-user=USER set ftp user to USER.
--ftp-password=PASS set ftp password to PASS.
--no-remove-listing don't remove `.listing' files.
--no-glob turn off FTP file name globbing.
--no-passive-ftp disable the "passive" transfer mode.
--retr-symlinks when recursing, get linked-to files (not dir).
--preserve-permissions preserve remote file permissions.
Recursive download:
-r, --recursive specify recursive download.
-l, --level=NUMBER maximum recursion depth (inf or 0 for infinite).
--delete-after delete files locally after downloading them.
-k, --convert-links make links in downloaded HTML point to local files.
-K, --backup-converted before converting file X, back up as X.orig.
-m, --mirror shortcut for -N -r -l inf --no-remove-listing.
-p, --page-requisites get all images, etc. needed to display HTML page.
--strict-comments turn on strict (SGML) handling of HTML comments.
Recursive accept/reject:
-A, --accept=LIST comma-separated list of accepted extensions.
-R, --reject=LIST comma-separated list of rejected extensions.
-D, --domains=LIST comma-separated list of accepted domains.
--exclude-domains=LIST comma-separated list of rejected domains.
--follow-ftp follow FTP links from HTML documents.
--follow-tags=LIST comma-separated list of followed HTML tags.
--ignore-tags=LIST comma-separated list of ignored HTML tags.
-H, --span-hosts go to foreign hosts when recursive.
-L, --relative follow relative links only.
-I, --include-directories=LIST list of allowed directories.
-X, --exclude-directories=LIST list of excluded directories.
-np, --no-parent don't ascend to the parent directory.
Mail bug reports and suggestions to <bug-wget@gnu.org>.
Drawbacks of the utility:
- It cannot download a file into a folder under a specific name: it always takes the part of the URL after the last slash as the file name (though -O does accept a full path).
- It has not been actively developed for a long time; for example, it does not support newer protocols for encrypted connections.
- Comparison of curl vs wget (in English)
Alternatives: curl
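As a point of comparison, curl addresses the first drawback directly: -o accepts an arbitrary output path, and -C - resumes an interrupted transfer. A minimal sketch, reusing the placeholder URL from the examples above:
Code:
curl -L -C - -o /path/to/save/arch.zip http://example.com/file.zip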