Linux wget: Your Ultimate Command Line Downloader

It is common practice to manage UNIX/Linux/BSD servers remotely over an ssh session. You may need to download software or other files for installation. A few powerful graphical download managers exist for Linux and UNIX-like operating systems:

  • d4x: Downloader for X is a user-friendly Linux/Unix program with a nice X interface for downloading files from the Internet. It supports both FTP and HTTP protocols and can resume downloads.
  • kget: KGet is a versatile and user-friendly download manager for the KDE desktop.
  • gwget / gwget2: Gwget is a download manager for the Gnome desktop.

However, when it comes to the command line (shell prompt), wget, the non-interactive downloader, rules. It supports the http, https, and ftp protocols along with authentication and many other options. Here are some tips to get the most out of it:

Download a Single File Using wget

Type the following command:
$ wget http://www.cyberciti.biz/here/lsst.tar.gz
$ wget ftp://ftp.freebsd.org/pub/sys.tar.gz
$ wget -O output.file http://nixcraft.com/some/path/file.name.tar.gz
$ wget http://www.kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.36-rc3.tar.bz2

Sample outputs:

Fig.01: wget example


How Do I Download Multiple Files Using wget?

Use the following syntax:
$ wget http://www.cyberciti.biz/download/lsst.tar.gz ftp://ftp.freebsd.org/pub/sys.tar.gz ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm
You can also store all the URLs in a shell variable and use a bash for loop to download them:

URLS="http://www.cyberciti.biz/download/lsst.tar.gz \
ftp://ftp.freebsd.org/pub/sys.tar.gz \
ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm \
http://xyz.com/abc.iso"
for u in $URLS
do
 wget "$u"
done

How Do I Read URLs From a File?

You can put all the URLs in a text file and pass it to wget with the -i option to download every file listed. First, create a text file:
$ vi /tmp/download.txt
Append a list of URLs, one per line:

http://www.cyberciti.biz/download/lsst.tar.gz
ftp://ftp.freebsd.org/pub/sys.tar.gz
ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm
http://xyz.com/abc.iso

Type the wget command as follows:
$ wget -i /tmp/download.txt
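
If the list of URLs is produced by a program rather than typed by hand, the same steps can be scripted. The following is a minimal Python sketch, not part of the original tip: the helper names write_url_list and wget_command are made up for illustration, and the sketch only builds the command line (it does not run wget or touch the network).

```python
import shlex
import tempfile
from pathlib import Path


def write_url_list(urls, path):
    """Write one URL per line, which is the layout wget -i reads."""
    Path(path).write_text("\n".join(urls) + "\n", encoding="utf-8")
    return Path(path)


def wget_command(list_file, log_file=None, resume=False):
    """Build (but do not run) the argument list for `wget -i list_file`."""
    argv = ["wget"]
    if resume:
        argv.append("-c")            # continue partially downloaded files
    if log_file is not None:
        argv += ["-o", str(log_file)]  # write messages to a log file
    argv += ["-i", str(list_file)]
    return argv


if __name__ == "__main__":
    urls = [
        "http://www.cyberciti.biz/download/lsst.tar.gz",
        "ftp://ftp.freebsd.org/pub/sys.tar.gz",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        list_file = write_url_list(urls, Path(tmp) / "download.txt")
        # Print the command so it can be inspected or pasted into a shell.
        print(shlex.join(wget_command(list_file, resume=True)))
```

To actually start the download, the returned list could be handed to subprocess.run().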

Resume Downloads

You can also force wget to get a partially-downloaded file. This is useful when you want to finish up a download started by a previous instance of wget, or by another program:
$ wget -c http://www.cyberciti.biz/download/lsst.tar.gz
$ wget -c -i /tmp/download.txt

Please note that the -c option only works with FTP servers and with HTTP servers that support the "Range" header.
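
To see why the Range header matters, here is a rough Python sketch of what a resuming HTTP client has to do: look at how many bytes are already on disk and ask the server for the rest. This is an illustration of the general idea, not wget's actual code; the function names are hypothetical.

```python
import os
import tempfile


def resume_request_headers(partial_path):
    """Return the extra HTTP headers needed to fetch only the missing bytes."""
    try:
        have = os.path.getsize(partial_path)
    except OSError:
        have = 0  # no partial file yet, so request the whole thing
    if have == 0:
        return {}
    # "bytes=N-" means: send everything from byte offset N to the end.
    return {"Range": f"bytes={have}-"}


def server_honoured_range(status_code):
    """206 Partial Content means the server sent only the requested range."""
    return status_code == 206


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        partial = os.path.join(tmp, "lsst.tar.gz")
        with open(partial, "wb") as fh:
            fh.write(b"\0" * 1024)  # pretend 1024 bytes were already downloaded
        print(resume_request_headers(partial))
```

A server that ignores the header answers with a plain 200 and the full file, which is why resuming cannot work against such servers.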

Force wget To Download All Files In Background

The -b option forces wget to go into the background immediately after startup. If no output file is specified via the -o option, output is redirected to the wget-log file:
$ wget -cb -o /tmp/download.log -i /tmp/download.txt
OR
$ nohup wget -c -o /tmp/download.log -i /tmp/download.txt &
nohup runs the given COMMAND (in this example wget) with hangup signals ignored, so that the command can continue running in the background after you log out.

How Do I Limit the Download Speed?

You can limit the download speed to a given number of bytes per second. The amount may be expressed in bytes, kilobytes with the k suffix, or megabytes with the m suffix. For example, --limit-rate=100k will limit the retrieval rate to 100KB/s. This is useful when, for whatever reason, you don't want Wget to consume the entire available bandwidth, for example when downloading a large file such as an ISO image:
$ wget -c -o /tmp/susedvd.log --limit-rate=50k ftp://ftp.novell.com/pub/suse/dvd1.iso
The above command will limit the retrieval rate to 50KB/s. Use the m suffix for megabytes (--limit-rate=1m). It is also possible to specify a disk quota for automatic retrievals to avoid a disk DoS attack. The following download will be aborted once the 100MB quota is exceeded:
$ wget -cb -o /tmp/download.log -i /tmp/download.txt --quota=100m
From the wget man page:

Please note that Wget implements the limiting by sleeping the appropriate amount of time after a network read that took less time than specified by the rate. Eventually this strategy causes the TCP transfer to slow down to approximately the specified rate. However, it may take some time for this balance to be achieved, so don’t be surprised if limiting the rate doesn’t work well with very small files.
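
The sleep-based strategy described in the man page can be sketched in a few lines of Python. This is only an illustration of the idea, not wget's implementation: the function names are hypothetical, and treating k and m as multiples of 1024 is an assumption made for the sketch.

```python
# Assumption for this sketch: k = 1024 bytes, m = 1024 * 1024 bytes.
_SUFFIXES = {"k": 1024, "m": 1024 ** 2}


def parse_rate(text):
    """Convert a rate such as '50k', '1m' or '2048' to bytes per second."""
    text = text.strip().lower()
    if text and text[-1] in _SUFFIXES:
        return int(float(text[:-1]) * _SUFFIXES[text[-1]])
    return int(text)


def sleep_needed(nbytes, elapsed, rate):
    """Seconds to sleep after a read so the average speed stays at `rate`.

    nbytes  -- bytes returned by the last network read
    elapsed -- seconds that read actually took
    rate    -- target speed in bytes per second
    """
    expected = nbytes / rate           # how long the read should have taken
    return max(0.0, expected - elapsed)  # never sleep if we are already slow


if __name__ == "__main__":
    rate = parse_rate("50k")
    # A 50 KB read that finished in a quarter of a second was too fast.
    print(sleep_needed(51200, 0.25, rate))
```

With a very small file the whole transfer may finish in a single read, so there is nothing left to slow down, which matches the caveat quoted above.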

Use wget With the Password Protected Sites

You can supply the HTTP username and password for the server as follows:
$ wget --http-user=vivek --http-password=Secrete http://cyberciti.biz/vivek/csits.tar.gz
Another way to specify the username and password is in the URL itself:
$ wget 'http://username:password@cyberciti.biz/file.tar.gz'
Either method reveals your password to anyone who bothers to run the ps command:
$ ps aux
Sample outputs:

vivek     27370  2.3  0.4 216156 51100 ?        S    05:34   0:06 /usr/bin/php-cgi
vivek     27744  0.1  0.0  97444  1588 pts/2    T    05:38   0:00 wget http://test:test@www.kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.36-rc3.tar.bz2
vivek    27746  0.5  0.0  97420  1240 ?        Ss   05:38   0:00 wget -b http://test:test@www.kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.36-rc3.tar.bz2

To prevent the passwords from being seen, store them in .wgetrc or .netrc, and make sure to protect those files from other users with “chmod”.

If the passwords are really important, do not leave them lying in those files either—edit the files and delete them after Wget has started the download.
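
As a rough illustration of the .netrc approach, the Python sketch below creates a netrc-style file that only its owner can read and then parses it back with the standard netrc module. The helper name write_netrc, the host, and the credentials are placeholders invented for this example.

```python
import netrc
import os
import stat
import tempfile


def write_netrc(path, host, login, password):
    """Create a netrc-style file with mode 0600 (owner read/write only)."""
    # Open with restrictive permissions from the start, so the password
    # is never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(f"machine {host} login {login} password {password}\n")
    os.chmod(path, 0o600)  # also tighten a file that already existed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "netrc")
        write_netrc(path, "cyberciti.biz", "vivek", "s3cret")  # placeholder values
        print(oct(stat.S_IMODE(os.stat(path).st_mode)))
        login, _account, password = netrc.netrc(path).authenticators("cyberciti.biz")
        print(login)
```

In real use the file would live at ~/.netrc, and the credentials would then no longer appear in the ps output.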

Download All mp3 or pdf Files From a Remote FTP Server

Generally you can use shell wildcard characters such as *, ?, and [] to specify selection criteria for files. The same wildcards can be used with FTP servers while downloading files. Quote the URL so that your shell does not try to expand the wildcard itself:
$ wget 'ftp://somedom.com/pub/downloads/*.pdf'
OR
$ wget -g on 'ftp://somedom.com/pub/downloads/*.pdf'

Use aget When You Need a Multithreaded HTTP Download

aget fetches HTTP URLs in a manner similar to wget, but segments the retrieval into multiple parts to increase download speed. It can be many times as fast as wget in some circumstances (it is just like Flashget under MS Windows but with a CLI):
$ aget -n=5 http://download.soft.com/soft1.tar.gz
The above command will download soft1.tar.gz in 5 segments.

Please note that the wget command is available on Linux and UNIX/BSD-like operating systems.

 

Run wget In Background For an Unattended Download of Files

 

Here is a quick tip: if you wish to perform an unattended download of a large file, such as a Linux DVD ISO image, use the wget command as follows:

wget -bqc http://path.com/url.iso
Where,

=> -b : Go to background immediately after startup. If no output file is specified via the -o, output is redirected to wget-log.

=> -q : Turn off Wget’s output aka save disk space.

=> -c : Resume broken download i.e. continue getting a partially-downloaded file. This is useful when you want to finish up a download started by a previous instance of Wget, or by another program.

This tip will save you time when downloading large ISO files from the Internet.

nohup command

You can also use the nohup command to execute commands after you exit from a shell prompt:
$ nohup wget http://domain.com/dvd.iso &
$ exit

Python Time Conversion

Python Time Conversion Table

Table 1 shows the Python time conversion table. Each pattern is described in its own section below.

Table 1. Python time conversion table

input \ output   datetime                 time tuple                 time stamp                 string
datetime         -                        datetime → time tuple      datetime → time stamp      datetime → string
time tuple       time tuple → datetime    -                          time tuple → time stamp    time tuple → string
time stamp       time stamp → datetime    time stamp → time tuple    -                          time stamp → string
string           string → datetime        string → time tuple        string → time stamp        -

datetime → time tuple

>>> import datetime, time  # used by all examples below
>>> dt = datetime.datetime(2010, 12, 31, 23, 59, 59)
>>> tt = dt.timetuple()
>>> print(tt)
time.struct_time(tm_year=2010, tm_mon=12, tm_mday=31, tm_hour=23, tm_min=59, tm_sec=59, ...)

datetime → time stamp

>>> dt = datetime.datetime(2010, 12, 31, 23, 59, 59)
>>> ts = time.mktime(dt.timetuple())  # interprets dt as local time (UTC-8 in these samples)
>>> print(ts)
1293868799.0

datetime → string

>>> dt = datetime.datetime(2010, 12, 31, 23, 59, 59)
>>> st = dt.strftime('%Y-%m-%d %H:%M:%S')
>>> print(st)
2010-12-31 23:59:59

time tuple → datetime

>>> tt = (2010, 12, 31, 23, 59, 59, 4, 365, 0)
>>> dt = datetime.datetime(tt[0], tt[1], tt[2], tt[3], tt[4], tt[5])
>>> print(dt)
2010-12-31 23:59:59
>>>
>>> dt = datetime.datetime(*tt[0:6])  # same as the code above
>>> print(dt)
2010-12-31 23:59:59

time tuple → time stamp

>>> tt = (2010, 12, 31, 23, 59, 59, 4, 365, 0)
>>> ts = time.mktime(tt)  # tt is interpreted as local time
>>> print(ts)
1293868799.0

time tuple → string

>>> tt = (2010, 12, 31, 23, 59, 59, 4, 365, 0)
>>> st = time.strftime('%Y-%m-%d %H:%M:%S', tt)
>>> print(st)
2010-12-31 23:59:59

time stamp → datetime

>>> ts = 1293868799.0
>>> dt = datetime.datetime.fromtimestamp(ts)     # for local time
>>> print(dt)
2010-12-31 23:59:59
>>>
>>> dt = datetime.datetime.utcfromtimestamp(ts)  # for UTC
>>> print(dt)
2011-01-01 07:59:59

time stamp → time tuple

>>> ts = 1293868799.0
>>> tt = time.localtime(ts)  # for local time
>>> print(tt)
time.struct_time(tm_year=2010, tm_mon=12, tm_mday=31, tm_hour=23, tm_min=59, tm_sec=59, ...)
>>>
>>> tt = time.gmtime(ts)     # for UTC
>>> print(tt)
time.struct_time(tm_year=2011, tm_mon=1, tm_mday=1, tm_hour=7, tm_min=59, tm_sec=59, ...)

time stamp → string

>>> ts = 1293868799.0
>>> st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')     # local time
>>> print(st)
2010-12-31 23:59:59
>>>
>>> st = datetime.datetime.utcfromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')  # UTC
>>> print(st)
2011-01-01 07:59:59

string → datetime

>>> s = '2010-12-31 23:59:59'
>>> dt = datetime.datetime.strptime(s, '%Y-%m-%d %H:%M:%S')
>>> print(dt)
2010-12-31 23:59:59

string → time tuple

>>> st = '2010-12-31 23:59:59'
>>> tt = time.strptime(st, '%Y-%m-%d %H:%M:%S')
>>> print(tt)
time.struct_time(tm_year=2010, tm_mon=12, tm_mday=31, tm_hour=23, tm_min=59, tm_sec=59, ...)

string → time stamp

>>> s = '2010-12-31 23:59:59'
>>> ts = time.mktime(time.strptime(s, '%Y-%m-%d %H:%M:%S'))  # s is interpreted as local time
>>> print(ts)
1293868799.0
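
timezone conversion of RFC 822 time

from datetime import timedelta, datetime
import time

# Sample input (hypothetical value); the last five characters are the UTC offset.
time_data = "Fri, 31 Dec 2010 23:59:59 -0800"

try:
    offset = int(time_data[-5:])  # e.g. -800 for "-0800"
except ValueError:
    print("Error")
    raise

print(offset)
# Split the +/-HHMM offset into hours and minutes.
sign = -1 if offset < 0 else 1
delta = sign * timedelta(hours=abs(offset) // 100, minutes=abs(offset) % 100)
print(delta)
fmt = "%a, %d %b %Y %H:%M:%S"
t = datetime.strptime(time_data[:-6], fmt)
t -= delta  # t is now expressed in UTC

print(t.year)
print(t.strftime('%m-%d-%Y'))
print(t.strftime('%H%M'))
print(int(time.mktime(t.timetuple())))  # note: mktime treats t as local time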
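
The manual offset arithmetic above can also be left to the standard library. The following is a minimal alternative sketch for Python 3 that uses email.utils.parsedate_to_datetime; the function name rfc822_to_utc and the sample string are made up for illustration.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def rfc822_to_utc(time_data):
    """Parse an RFC 822 style date string and return an aware UTC datetime."""
    # parsedate_to_datetime understands the trailing +/-HHMM offset,
    # so no manual slicing of the string is needed.
    return parsedate_to_datetime(time_data).astimezone(timezone.utc)


if __name__ == "__main__":
    sample = "Fri, 31 Dec 2010 23:59:59 -0800"  # hypothetical input
    t = rfc822_to_utc(sample)
    print(t.year)
    print(t.strftime("%m-%d-%Y"))
    print(t.strftime("%H%M"))
    # timestamp() on an aware datetime does not depend on the local timezone.
    print(int(t.timestamp()))
```

Because the resulting datetime carries its own timezone, the final timestamp is the same no matter where the script runs, unlike a time.mktime() call on a naive value.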

Setup chroot ftp user in Centos with Selinux enabled

The requirement is to set up an FTP server that can be mounted in Mac OS X.

a) Install vsftpd

yum install vsftpd

b) Add a user

useradd ftpu

c) Change the user's shell and home directory by editing /etc/passwd

Change the shell to /sbin/nologin, as we do not want this user to log in to a shell.

Change the home directory to /ftpdir

d) Enable chroot in /etc/vsftpd/vsftpd.conf

chroot_local_user=YES

This ensures the ftp user cannot move outside the home directory.

e) If SELinux is enabled, the user will not be able to log in yet. You need to allow the access in SELinux:

setsebool -P ftp_home_dir on

f) After this the user will be able to log in but will not be able to write inside the directory. That needs to be enabled in SELinux as well:

setsebool -P allow_ftpd_full_access=1

Please note that the setsebool commands take some time to complete, so be patient.

Set Up Tomcat To Redirect HTTP Requests to HTTPS

We had a Tomcat app server accepting both HTTP and HTTPS requests. We wanted to restrict it so that all JIRA access was only via SSL. The config changes were pretty simple.

1. Change Tomcat's server.xml.
Edit the non-SSL <Connector> entry listening on port 80 and add or edit the redirectPort attribute to point to the port on which the SSL <Connector> is listening. By default, the redirectPort was pointing to port 8443.
Was:
<Connector port="80"
enableLookups="false" redirectPort="8443"
maxThreads="100" minSpareThreads="100" maxSpareThreads="100"/>

Changed to:
<Connector port="80"
enableLookups="false" redirectPort="443"
maxThreads="100" minSpareThreads="100" maxSpareThreads="100"/>
This was needed because the SSL <Connector> was listening on port 443.

2. In the Tomcat web.xml file the following <security-constraint> has to be added within the <web-app> element. This new element must be added after the <servlet-mapping> element:

<!-- SSL settings. only allow HTTPS access to JIRA -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Entire Application</web-resource-name>
    <url-pattern>/*</url-pattern>
  </web-resource-collection>
  <user-data-constraint>
    <transport-guarantee>CONFIDENTIAL</transport-guarantee>
  </user-data-constraint>
</security-constraint>
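
After editing web.xml it can be handy to confirm that the constraint really is in place. The sketch below is one possible sanity check written in Python, not something from the original setup: the function name requires_https is hypothetical, and it only looks for a transport-guarantee element set to CONFIDENTIAL anywhere in the file.

```python
import xml.etree.ElementTree as ET


def requires_https(web_xml_text):
    """Return True if the web.xml text declares a CONFIDENTIAL transport guarantee."""
    root = ET.fromstring(web_xml_text)
    for element in root.iter():
        # Strip an XML namespace such as "{http://...}tag" if one is present.
        tag = element.tag.rsplit("}", 1)[-1]
        if tag == "transport-guarantee" and (element.text or "").strip() == "CONFIDENTIAL":
            return True
    return False


if __name__ == "__main__":
    sample = """
    <web-app>
      <security-constraint>
        <web-resource-collection>
          <web-resource-name>Entire Application</web-resource-name>
          <url-pattern>/*</url-pattern>
        </web-resource-collection>
        <user-data-constraint>
          <transport-guarantee>CONFIDENTIAL</transport-guarantee>
        </user-data-constraint>
      </security-constraint>
    </web-app>
    """
    print(requires_https(sample))
```

The check says nothing about whether the redirectPort in server.xml is correct, so a quick request to the plain HTTP port is still worth doing after a restart.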

LogicalDoc startup shutdown script

Save the following script as /etc/init.d/logicaldoc. Once it is linked from a runlevel directory, it will be read and run automatically at boot time. Check the log files if it does not start properly.

Make a link to it from /etc/rc3.d such as:

cd /etc/rc3.d
sudo ln -s ../init.d/logicaldoc S71LogicalDoc

The same pattern applies to other runlevels and services, for example in /etc/rc5.d:

cd /etc/rc5.d
sudo ln -s ../init.d/tomcat S71tomcat
sudo ln -s ../init.d/apache S72apache

----------------------------  /etc/init.d/logicaldoc  ------------------------
#!/bin/bash
#
# logicaldoc
#
# chkconfig: 
# description: 	Start up the LogicalDOC Tomcat servlet engine.

# Source function library.
. /etc/init.d/functions

RETVAL=0
CATALINA_HOME="/LogicalDOC/tomcat"

case "$1" in
  start)
        if [ -f "$CATALINA_HOME/bin/startup.sh" ]; then
            echo $"Starting Tomcat"
            "$CATALINA_HOME/bin/startup.sh"
            RETVAL=$?    # report the result of the startup script
        fi
        ;;
  stop)
        if [ -f "$CATALINA_HOME/bin/shutdown.sh" ]; then
            echo $"Stopping Tomcat"
            "$CATALINA_HOME/bin/shutdown.sh"
            RETVAL=$?    # report the result of the shutdown script
        fi
        ;;
  *)
        echo $"Usage: $0 {start|stop}"
        exit 1
        ;;
esac

exit $RETVAL
-------------------  end of /etc/init.d/logicaldoc  ---------------