CST8229 - Lab Exercise 9
The gzip and tar utilities
If you are ever required to hand in an assignment as a single compressed file, you will be using the gzip and tar utility programs. This is your chance to learn how to use these tools.
Note that gzip and tar are often used together, which is why you will often see a .tar.gz or .tgz extension. To create such an archive, first use tar to combine all the pieces into a single .tar file, then use gzip to compress that file. To extract the files, reverse the order: gunzip the file first, then use tar to restore the original directory and file structure.
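The two-step workflow described above can be sketched as follows. This is only an illustration: the directory name project and the files inside it are made up for the example, and everything happens in a scratch directory created with mktemp.

```shell
# Work in a scratch directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a small, hypothetical directory tree to archive.
mkdir -p project/docs
echo "hello" > project/readme.txt
echo "notes" > project/docs/notes.txt

# Step 1: tar combines the whole tree into one file, project.tar.
tar cf project.tar project

# Step 2: gzip compresses it; project.tar is replaced by project.tar.gz.
gzip project.tar

# To extract, reverse the order: gunzip first, then tar.
rm -r project
gunzip project.tar.gz      # project.tar.gz becomes project.tar again
tar xf project.tar         # recreates project/ and everything under it
```

After the last command the original tree is back, and project.tar is still present because tar does not delete the archive it reads.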
On Windows, WinZip is often used to do both operations (the Linux zip and unzip programs can handle WinZip files, among other formats).
Take a copy of the demo files in the read-only directory ~allisor/Student/gzip-tar and place them in a subdirectory of your own. First, un-gzip the file large.gz.gz. What happens to it? Don't play with this file too much, since it makes the sysadmin grumpy with me.
You should then un-gzip the other file, look inside the file, un-tar it, then re-tar and finally re-gzip it. Be sure to examine the changes in file size with gzip and gunzip, and see how tar includes files from subdirectories. Watch the extensions change, find when files get deleted and when they don't, and find out how to set the .tgz extension yourself (you could always use mv if there's no other way).
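As a rough sketch of what to watch for with gzip on its own, the commands below compress and restore a throwaway file. The file names sample.txt and kept.gz are made up for the example; substitute the demo files when you do the exercise. Run ls -l after each step to compare sizes and see which names exist.

```shell
# Work in a scratch directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Make a very repetitive (and therefore very compressible) sample file.
i=0
while [ "$i" -lt 2000 ]; do
    echo "the quick brown fox"
    i=$((i + 1))
done > sample.txt
ls -l sample.txt

# gzip replaces sample.txt with sample.txt.gz.
gzip sample.txt
ls -l sample.txt.gz

# gzip -d (the same as gunzip) replaces sample.txt.gz with sample.txt.
gzip -d sample.txt.gz
ls -l sample.txt

# With -c, gzip writes to standard output and leaves the original in place.
gzip -c sample.txt > kept.gz
ls -l sample.txt kept.gz
```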
gzip [-options] [filename ...]
"gzip reduces the size of the named files using Lempel-Ziv coding (LZ77). Whenever possible, each file is replaced by one with the extension .gz, while keeping the same ownership modes, access and modification times."...
"Compressed files can be restored to their original form using gzip -d or gunzip or gzcat. If the original name saved in the compressed file is not suitable for its file system, a new name is constructed from the original one to make it legal." [Adapted from a man page]
tar key[f tar-filename] [options] list ...
tar saves and restores multiple files in a single file (originally a magnetic tape, but it can be any file).
"A tarfile", or tarball, "may be made on a tape drive", like /dev/tape, "however, it is also common to write a tarfile to a normal disk file. The first argument to tar must be one of the options: Acdrtux", sometimes called the key, "followed by any optional functions. The final arguments to tar are the names of the files or directories which are to be archived. The use of a directory name always implies that the subdirectories below should be included in the archive." [Adapted from the GNU man page]
The key argument controls tar's actions. It's a string of characters (no dash) containing a function letter and often one or more modifiers.
Here are the most common key functions you will likely need:
c Create a new tar archive (tarfile) (e.g. tar cf mytarfile.tar dir1 dir2 dir3 to create mytarfile.tar containing the 3 directories and all their sub-directories).
r Append (called replace in some older documentation): adds the named files to the end of the archive (tar rf mytarfile.tar dir4).
t Table of contents: lists the names of the specified files each time they occur in the archive. If no file argument is given, lists all of the names in the archive (tar tf mytarfile.tar).
u Update; put the named files at the end of the archive if they are not currently in the archive, or if the file is more recent than the file in the archive (tar uf mytarfile.tar dir2).
x eXtract the named files from the archive. If a named file matches a directory whose contents had been written into the archive, this directory is (recursively) extracted. If no file argument is given, the entire content of the archive is extracted (tar xf mytarfile.tar) into the current directory.
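The key functions above can be tried in sequence with a sketch like the one below. The directories dir1 and dir2, the files in them, and the restore directory are made up for the example.

```shell
# Work in a scratch directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir"
mkdir dir1 dir2
echo one > dir1/a.txt
echo two > dir2/b.txt

tar cf mytarfile.tar dir1      # c: create a new archive holding dir1
tar rf mytarfile.tar dir2      # r: append dir2 to the end of the archive
tar tf mytarfile.tar           # t: list the names stored in the archive

# u: add anything under dir2 that is not yet in the archive (or is newer).
echo three > dir2/c.txt
tar uf mytarfile.tar dir2
tar tf mytarfile.tar

# x: extract only dir1, into a separate directory. The subshell keeps
# the cd from affecting the rest of the session.
mkdir restore
(cd restore && tar xf ../mytarfile.tar dir1)
ls -R restore
```

Naming dir1 on the x command line extracts that directory and everything below it, while dir2 stays in the archive untouched.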
Some function modifiers for the key:
f tar uses the next argument as the name of the archive instead of using /dev/tape. Since we have no tape drives, always use this to specify your tar filename, as the last item in the key.
j Use bzip2/bunzip2 to (un)compress the tar file; cannot be used together with z.
z Use gzip/gunzip to (un)compress the tar file; conflicts with j.
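With the z modifier, the separate gzip and gunzip steps can be folded into the tar command itself. The sketch below assumes a tar that supports z (GNU tar does) and uses a made-up directory lab9. Because f takes the next argument as the archive name, you choose the .tgz extension yourself.

```shell
# Work in a scratch directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p lab9/sub
echo data > lab9/sub/file.txt

tar czf lab9.tgz lab9     # create and gzip in one step
tar tzf lab9.tgz          # list the contents of the compressed archive

rm -r lab9
tar xzf lab9.tgz          # gunzip and extract in one step
ls -R lab9
```

Note that f is the last letter of the key in each command, so the argument that follows it is read as the archive name.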