LIS 312--Information in Cyberspace
Fall 1997

UNIX File Compression

Compression saves storage space and transfer time. Unfortunately, no single program compresses and uncompresses every kind of file.
You need to recognize which type of compression was used in order to pick the correct decompression program.

Note: Always treat compressed files as binary files. Type binary at the ftp prompt before issuing the get command. The remote system will respond with a confirmation, usually a line such as 200 Type set to I.
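
The order of commands in the note above can be sketched as a small shell function that prints the lines of a non-interactive anonymous ftp session. This is only an illustration: the function name ftp_binary_get and its two arguments are made up for this handout, and the e-mail address and file name you pass in are your own.

```shell
# Sketch only: print the commands for an anonymous ftp session that
# fetches one compressed file. ftp_binary_get is a hypothetical name.
ftp_binary_get() {
    address=$1    # your e-mail address, used as the anonymous password
    file=$2       # name of the compressed file on the remote system
    printf 'user anonymous %s\n' "$address"
    printf 'binary\n'             # must come before get for compressed files
    printf 'get %s\n' "$file"
    printf 'quit\n'
}
```

On most Unix systems the output could be piped into ftp -n followed by the remote host name (the -n option turns off automatic login so that the user line is honored), but check the manual page of your own ftp client first.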

You also might want to review the commands used in our ftp exercise!
(the following list was adapted from "The EFF's Guide to the Internet" by Patrick Crispen)

 

FILE           TRANSFER UNCOMPRESS
EXTENSION      MODE     PACKAGE      ADDITIONAL COMMENTS
------------   ------   ----------   -----------------------------------

.txt or .TXT   ASCII                 By itself, this means the file is
                                     a document rather than a program,
                                     and does not need to be uncompressed

.ps or .PS     ASCII                 A PostScript document (in Adobe's
                                     page description language). You can
                                     print this file on any PostScript-
                                     capable printer or use a previewer,
                                     like the GNU project's Ghostscript.

.doc or .DOC   ASCII                 Another common extension for text
                                     documents. Be careful, though: .doc
                                     and .DOC extensions are also used for
                                     Microsoft Word documents, which are
                                     binary files. The duck theory will
                                     help you determine the difference.
                                     No decompression is needed, unless it
                                     is followed by:

.Z             Binary   uncompress   This indicates the Unix compress
                                     format. To uncompress, type

                                          uncompress filename.Z

                                     and hit enter at your host system's
                                     command line.

                                     u16.zip is an MS-DOS program that
                                     will let you download .Z files and
                                     uncompress them on your own computer.
                                     The Macintosh equivalent program is
                                     called MacCompress (use archie to
                                     find these).

.zip or .ZIP   Binary   PKZip or     This indicates the file has been
                        Zip/Unzip    compressed with a common MS-DOS
                                     compression program, known as PKZIP
                                     (use archie to find PKZIP204.EXE).
                                     Many Unix systems will let you un-ZIP
                                     a file with a program called unzip.

.gz            Binary   gunzip       The GNU gzip format (despite the
                                     name, not compatible with ZIP). To
                                     uncompress, type

                                          gunzip filename.gz

                                     at your host system's command line.

.zoo or .ZOO   Binary   zoo          A Unix and MS-DOS compression
                                     format. Use a program called zoo
                                     to uncompress.

.shar or .Shar ASCII    unshar       A Unix shell archive: plain text
                                     that bundles files but does not
                                     compress them. Use unshar to unpack.

.tar           Binary   tar          Another Unix format, often used
                                     to bundle several related files
                                     into one large file. All Unix
                                     systems will have a program called
                                     tar for "un-tarring" such files.
                                     Often, a "tarred" file will also
                                     be compressed (.tar.Z or .tar.gz),
                                     so you first have to use uncompress
                                     or gunzip and then tar.

.sit or .Sit   Binary   StuffIt      A Macintosh format that requires
                                     the unsit program.

.ARC           Binary   ARC or       Another MS-DOS format, which
                        ARCE         requires the use of the ARC
                                     or ARCE programs.

.LZH           Binary   LHARC        Another MS-DOS format; requires
                                     the use of LHARC.
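
The table above can be condensed into a short shell function that looks at a file name and prints the command the table suggests. This is a sketch, not a complete tool: the name suggest_unpack is made up for this handout, it only covers the Unix programs named in the table, and for .tar.gz it uses gunzip, the program the table lists for .gz files.

```shell
# Sketch only: print the command suggested for a given file name.
# suggest_unpack is a hypothetical name; nothing is actually unpacked.
suggest_unpack() {
    case "$1" in
        # compressed tar files come first so they win over plain .Z and .gz
        *.tar.Z)       echo "uncompress $1 && tar xf ${1%.Z}" ;;
        *.tar.gz)      echo "gunzip $1 && tar xf ${1%.gz}" ;;
        *.Z)           echo "uncompress $1" ;;
        *.gz)          echo "gunzip $1" ;;
        *.zip|*.ZIP)   echo "unzip $1" ;;
        *.tar)         echo "tar xf $1" ;;
        *.shar|*.Shar) echo "unshar $1" ;;
        *.txt|*.TXT|*.ps|*.PS)
                       echo "nothing to do for $1" ;;
        *)             echo "unknown extension: $1" ;;
    esac
}
```

For example, suggest_unpack notes.tar.Z prints a two-step command: uncompress first, then tar on the resulting notes.tar file.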

Exercise

Make sure you're connected and start a telnet session. Log in.

You will ftp a file anonymously from a remote system, then uncompress it and read the uncompressed file. Ready?

Finally, look in the file using more. Feel free to remove emily-postnews now using the rm command.
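
If you would like to practice the uncompress, read, and remove steps without a remote system, the following sketch does a round trip on a small local file. It assumes gzip, gunzip, and mktemp are installed on your host, and the file name practice.txt is made up for this illustration.

```shell
# Sketch only: compress and uncompress a scratch file, then clean up.
dir=$(mktemp -d)                 # scratch directory so nothing else is touched
cd "$dir" || exit 1
printf 'line one\nline two\n' > practice.txt
gzip practice.txt                # replaces practice.txt with practice.txt.gz
ls                               # lists practice.txt.gz
gunzip practice.txt.gz           # restores practice.txt
restored=$(cat practice.txt)     # at a terminal you would type: more practice.txt
rm practice.txt                  # clean up, as in the exercise
```

The same pattern applies to a .Z file, with uncompress in place of gunzip.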


Heiko Haubitz