Tips: File Compression and Archiving



The following tips deal with the use of tar, gzip, bzip2 and zip.

make a tarball of certain types of files

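You can make a tarball of only certain types of files from a directory with the following one-liner. Handing the file list to tar with -T - (rather than through xargs) means tar runs exactly once and file names containing spaces survive intact; -print0 and --null are understood by GNU find and GNU tar:

find reports/ -name "*.txt" -print0 | tar -zcpf reports.tar.gz --null -T -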

Change the compression type to bzip2 (-j instead of -z) and you'll usually get a smaller file, at the cost of slower compression:

find reports/ -name "*.txt" -print0 | tar -jcpf reports.tar.bz2 --null -T -
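
To see that only the matching files end up in the archive, the one-liner can be tried on a throwaway tree. This is a minimal sketch: the directory and file names are made up for illustration, and it assumes a tar that understands --null, as GNU tar does.

```shell
# Throwaway sandbox; every name below is illustrative.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p reports/2023
printf 'q1\n'  > reports/summary.txt
printf 'q4\n'  > "reports/2023/year end.txt"   # a name with a space in it
printf 'img\n' > reports/chart.png             # should NOT be archived

# find picks the files, tar reads the NUL-separated list from stdin.
find reports/ -name "*.txt" -print0 | tar -zcpf reports.tar.gz --null -T -

# List what went in; only the two .txt files should appear.
tar -ztf reports.tar.gz
```

Listing the archive right after creating it is a cheap way to catch a pattern that matched more (or less) than intended.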

untar in a different directory

If you've got a gzipped tarball and you want to untar it in a directory other than the one you're in, do the following:

zcat file.tar.gz | ( cd ./otherdir && tar xvf - )
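
Most tar implementations can also change directory themselves with -C, which avoids the pipe and the subshell. A minimal sketch, assuming a tar with -z and -C support (GNU tar has both); the file and directory names are placeholders:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir src otherdir
printf 'hello\n' > src/a.txt
tar -czf file.tar.gz src            # sample archive to unpack

# -C switches to otherdir before extracting; the archive itself is
# still looked up relative to the directory you started in.
tar -xzvf file.tar.gz -C otherdir
```

The target directory has to exist already; tar does not create it.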

extract individual files from a tarball

If you need a file that you've put into a tarball and you don't want to extract the whole archive, you can do the following.
First, get a list of the files and find the one you want:

tar -ztf file.tar.gz

Then extract the one you want, giving its path exactly as it appears in the listing:

tar zxvf file.tar.gz indiv.file
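
The member name given to tar has to match the listing character for character, including any leading directory. A small sketch with made-up names; grep is only there to narrow down a long listing:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p project/docs
printf 'keep\n'  > project/docs/indiv.file
printf 'other\n' > project/docs/other.file
tar -czf file.tar.gz project
rm -r project                        # start again from an empty directory

# Step 1: find the exact member name.
tar -ztf file.tar.gz | grep 'indiv'

# Step 2: extract just that member, using the name exactly as listed.
tar -zxvf file.tar.gz project/docs/indiv.file
```

tar recreates the leading directories (project/docs/ here) for the extracted file but leaves the other members alone.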

backup everything with tar

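To make a backup of everything in a particular directory, first list its contents, hidden files included (-A, unlike -a, leaves out . and .., which would otherwise drag the whole current and parent directories into the tarball):

ls -A > backup.all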

If you don't really want *everything*, you can also edit backup.all and get rid of things you don't want (backup.all lists itself, for one).
To make the tarball, just do this:

tar -cvf newtarfile.tar `cat backup.all`

(remember, those are backticks)
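
Backticks split the list on every space, so a file called "my notes.txt" would be handed to tar as two names. A hedged alternative is to let tar read the list itself with -T, one name per line. The sketch below uses made-up file names and keeps the list outside the directory being backed up, so that the list does not end up listing itself:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir data
printf 'a\n' > "data/my notes.txt"   # name with a space
printf 'b\n' > data/.hidden          # dot file, picked up by ls -A
list="$workdir/backup.all"

cd data || exit 1
ls -A > "$list"                      # one entry per line, no . or ..
tar -cvf "$workdir/newtarfile.tar" -T "$list"
tar -tf "$workdir/newtarfile.tar"    # check what was archived
```

The list can still be edited by hand between the ls and tar steps, just as with the backtick version.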

incremental backups with tar

An incremental backup contains only the files created or changed since the previous backup. First you need to create a tarball with everything, as we did in the previous example, or simply by creating a tarball of the entire directory:

tar cvzf homedir-complete.tar.gz /home/homedir/

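Then put the following in a script and run it as a cron job every day. It writes the files modified in the last 24 hours to a separate, dated tarball. Writing to homedir-complete.tar.gz again would overwrite the full backup, because tar cannot add files to a compressed archive. The -type f test keeps directories out of the list; otherwise tar would archive a changed directory with everything in it.

find /home/homedir -type f -mtime -1 -print |
tar cvzf homedir-incr-$(date +%F).tar.gz -T -

(don't forget that dash '-' at the end!)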
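
A variation on the same idea uses a marker file instead of -mtime -1, so nothing is missed if a night's run is skipped: find's -newer test selects files modified after the marker, and the marker is touched after each backup. This is a sketch under assumptions rather than the method described above: the marker name last-backup.stamp and the archive name are made up, and the dates are forced with touch -t so the example behaves the same whenever it is run.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir homedir
printf 'old\n' > homedir/old.txt
touch -t 202001010000 homedir/old.txt     # pretend this predates the last backup
touch -t 202001020000 last-backup.stamp   # marker: when the last backup ran
printf 'new\n' > homedir/new.txt          # created after the marker

# Archive only regular files newer than the marker.
find homedir -type f -newer last-backup.stamp -print0 |
  tar -czf homedir-incr.tar.gz --null -T -

touch last-backup.stamp                   # advance the marker for the next run
tar -ztf homedir-incr.tar.gz              # lists only homedir/new.txt
```

To restore, unpack the full tarball first and then each incremental one in date order.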

zip syntax

Most Linux distributions come with zip, the most popular file compression format in the MS Windows world. The command line arguments are not the same as when you're creating a tar.gz, however: the name of the archive comes first, then the files to put in it. To create a zipped file on Linux, do this:

zip images.zip *.png

This would create a zipped file of all the *.pngs in that particular directory. Keep in mind:
  • "images.zip" is the archive to create; zip adds the .zip extension if you leave it off
  • As you can see, you can use wildcards and extension names
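
To check what went into the archive, unzip -l prints its table of contents without extracting anything. A small sketch, assuming the zip and unzip programs are installed; images.zip and the file names are placeholders:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'one\n' > a.png
printf 'two\n' > b.png
printf 'txt\n' > notes.txt          # not matched by *.png

zip images.zip *.png                # archive name first, then the files
unzip -l images.zip                 # list contents without extracting
```

Unlike tar with gzip, zip compresses each file separately, so single files can be pulled out of the archive later without reading the rest.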

tar and untar over ssh

You can use tar combined with ssh to move files more easily over a local network or even the Internet. In the following example, we will take our 'development' web work and transfer it to the production server.

tar cf - *.png | ssh -l username production_server \
"( cd /home/website/html/images/ && tar xf - )"

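Before pointing the pipeline at a real server, it can be rehearsed locally by putting a plain subshell where the ssh command would go. The sketch below is a stand-in under stated assumptions, not a remote transfer: the directory names are made up and no network is involved, but the tar-to-tar pipe is the same.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dev production
printf 'logo\n' > dev/logo.png
printf 'css\n'  > dev/site.css       # not matched by *.png

cd dev || exit 1
# The writer packs to stdout; the reader (a subshell here, ssh in the
# real thing) changes directory and unpacks from stdin.
tar cf - *.png | ( cd ../production && tar xf - )
ls ../production
```

Using && after cd matters on the receiving side: if the directory does not exist, tar is not run at all instead of unpacking into the wrong place.
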
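The process can also be reversed to pull remote files into the current directory. The quotes keep *.png from being expanded locally, so it is matched on the remote side:

ssh -l username production_server \
"( cd /path/to/files && tar cf - *.png )" | tar xf -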