Compression

Discussion in 'Linux Kernel' started by Jarret W. Buse, Jul 9, 2013.


    Many people know compression from shrinking a file so it is small enough to e-mail. Others compress files to fit them on a small storage device such as a USB thumb drive or memory card. In these cases, a file utility compresses the data into a format such as ZIP, RAR, or a gzip-compressed TAR archive.

    Some file systems support compression natively. That is, the compression is part of the file system and is not a third-party utility.

    With hard disk capacities growing, many people wonder why compression is still needed.

    Compression helps in more than one way on a hard disk. It saves space, which matters because applications keep getting larger and require more room. It can also improve hard drive performance.

    Let’s look at what compression is and how it works.

    Compression is the process of encoding a file's data so that it takes up less space. There are two types of compression: lossy and lossless. Lossy compression is used mainly for audio, images, and video, where detail that is not needed can be discarded permanently. Lossless compression reduces the size of data without discarding anything. Once compressed, the information can be decompressed back to its original size, and the data is exactly as it was before compression. File system compression is always lossless.
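
    The lossless round trip described above can be sketched in a few lines. This is only an illustration using Python's standard zlib module; the sample text and variable names are made up for the example.

```python
import zlib

# Sample data (hypothetical): repetitive text compresses well.
original = b"The quick brown fox jumps over the lazy dog. " * 200

compressed = zlib.compress(original)    # shrink the data
restored = zlib.decompress(compressed)  # expand it again

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print("identical after round trip:", restored == original)
```

    The last line is the point of lossless compression: the restored bytes match the original exactly.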

    Different algorithms are used to “shrink” the data to a smaller size. One popular example is zlib, a compression library that implements a single algorithm, DEFLATE.

    NOTE: Compression algorithms other than DEFLATE exist, such as LZO and LZ4. Each file system that supports compression has its own preferred algorithm.
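
    To give a feel for how different algorithms treat the same data, here is a minimal sketch using the three lossless codecs that ship with Python (zlib, bz2, and lzma). These are stand-ins chosen because they are readily available, not the algorithms any particular file system uses, and the sample data is invented.

```python
import bz2
import lzma
import zlib

# Hypothetical sample data: one short line repeated many times.
data = b"native file system compression example\n" * 1000

codecs = {
    "zlib (DEFLATE)": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

sizes = {}
for name, (compress, decompress) in codecs.items():
    packed = compress(data)
    # Every codec here is lossless, so the round trip must match.
    assert decompress(packed) == data
    sizes[name] = len(packed)
    print(f"{name}: {len(data)} -> {len(packed)} bytes")
```

    Each algorithm trades compressed size against processor time differently, which is one reason file systems do not all pick the same one.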

    When a supporting file system has compression enabled, data files written to the drive are compressed (system files are not compressed).

    NOTE: System files, such as boot files, are not compressed. Any file that must be read before compression support is active cannot be compressed. If boot files were compressed, the system would not boot, because the files could not be decompressed until the compression library was loaded.

    As files are written, the data is run through the algorithm to reduce its size, and the smaller result is written to disk. Later, when the file is accessed, the data is read from the hard disk and decompressed with the same algorithm before it is passed to the application requesting the file.
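
    The write and read paths just described can be modeled with a toy example. The class below is hypothetical and keeps everything in memory; a real file system does this work inside the kernel, usually on blocks or extents rather than whole files.

```python
import zlib


class ToyCompressedVolume:
    """Hypothetical in-memory stand-in for a volume with compression enabled."""

    def __init__(self):
        self._blocks = {}  # file name -> bytes as stored "on disk"

    def write(self, name, data):
        # Write path: compress first, then store the smaller result.
        self._blocks[name] = zlib.compress(data)

    def read(self, name):
        # Read path: fetch the stored bytes, decompress, return the original.
        return zlib.decompress(self._blocks[name])

    def stored_size(self, name):
        return len(self._blocks[name])


volume = ToyCompressedVolume()
document = b"Dear reader, this line repeats.\n" * 500
volume.write("letter.txt", document)

print("application wrote:", len(document), "bytes")
print("stored on volume: ", volume.stored_size("letter.txt"), "bytes")
print("read back intact: ", volume.read("letter.txt") == document)
```

    The application only ever sees the original bytes; the compressed form exists only inside the volume.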

    How much a file shrinks depends on the type of data being compressed. Audio and video files are usually compressed already. For example, MP4 files are already compressed and will shrink very little, if at all. Text files, on the other hand, can be reduced to a much smaller size. If a drive will be used to store large amounts of already compressed data, then disk compression will not help save space.
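
    A quick way to see this difference is to compress two buffers of the same size. In the sketch below, random bytes stand in for already compressed data such as an MP4. That is an assumption made for illustration, since real media files are not literally random, but they behave similarly when compressed a second time.

```python
import os
import zlib

text = b"Compression works well on repetitive text. " * 2000
# Random bytes as a stand-in for already compressed data (an assumption).
random_like = os.urandom(len(text))


def ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


text_ratio = ratio(text)
random_ratio = ratio(random_like)
print(f"text:        {text_ratio:.3f}")
print(f"random-like: {random_ratio:.3f}")
```

    The text buffer shrinks to a small fraction of its size, while the random-like buffer stays at roughly its original size.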

    Compressing and decompressing files takes processor time. So when compression is enabled, accessing files uses more of the processor.

    On the other hand, fewer hard disk resources are needed. Since the data is compressed, there is less data to read and write. On most computers, the hard disk is much slower than the processor. In these cases, compression can help overall system performance. For example, if a file is originally 10 MB in size but compresses to 2.5 MB, the hard disk can read and write the smaller file faster.
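
    The 10 MB example can be turned into rough arithmetic. The throughput figures below are hypothetical, picked only to keep the numbers simple, and real hardware varies widely.

```python
# All throughput figures are hypothetical, chosen for easy arithmetic.
DISK_MB_PER_S = 100.0        # assumed sequential read speed of the disk
DECOMPRESS_MB_PER_S = 500.0  # assumed decompression speed (output MB/s)

original_mb = 10.0   # file size from the example in the text
compressed_mb = 2.5  # compressed size from the example in the text

# Uncompressed: the whole 10 MB comes off the disk.
plain_read_s = original_mb / DISK_MB_PER_S

# Compressed: 2.5 MB comes off the disk, then the CPU expands it to 10 MB.
compressed_read_s = (compressed_mb / DISK_MB_PER_S
                     + original_mb / DECOMPRESS_MB_PER_S)

print(f"uncompressed read: {plain_read_s * 1000:.0f} ms")
print(f"compressed read:   {compressed_read_s * 1000:.0f} ms")
```

    With these assumed numbers, the compressed read takes about 45 ms against 100 ms for the uncompressed one. If the processor were slow enough relative to the disk, the result would flip, which is why the benefit depends on the hardware.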


    Keep in mind that compression native to a file system works in the background and is invisible to the user. The files do not need to be manually compressed and decompressed with a utility.

    When a file system is already in use with compression off and compression is enabled later, the existing files are not compressed. Only new files and older files that are modified will be compressed. When enabling compression on an active file system, it may be best to back up the data, delete it, enable compression, and then restore the data. Not only will all data then be compressed (except system files), but the files will not be fragmented (see the article “Intro to Extents”).
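
    The behavior described above, where only data written after compression is enabled gets compressed, can be modeled with a toy volume. The class and file names are hypothetical, and everything lives in memory. Real file systems differ in the details.

```python
import zlib


class ToyVolume:
    """Hypothetical in-memory volume whose compression can be enabled later."""

    def __init__(self):
        self.compression_enabled = False
        self._files = {}  # name -> (is_compressed, stored_bytes)

    def write(self, name, data):
        # Only data written while the flag is on gets compressed.
        if self.compression_enabled:
            self._files[name] = (True, zlib.compress(data))
        else:
            self._files[name] = (False, data)

    def read(self, name):
        is_compressed, stored = self._files[name]
        return zlib.decompress(stored) if is_compressed else stored

    def is_compressed(self, name):
        return self._files[name][0]


volume = ToyVolume()
payload = b"log entry\n" * 1000

volume.write("old.log", payload)   # written before compression is enabled
volume.compression_enabled = True  # enabling it does not touch old.log
volume.write("new.log", payload)   # new files are compressed

print("old.log compressed:", volume.is_compressed("old.log"))
print("new.log compressed:", volume.is_compressed("new.log"))

# Back up and restore, as suggested in the text: read out, then write back.
backup = {name: volume.read(name) for name in ["old.log", "new.log"]}
for name, data in backup.items():
    volume.write(name, data)
print("old.log compressed after restore:", volume.is_compressed("old.log"))
```

    Rewriting the old file while compression is enabled is what finally compresses it, which is the effect the backup and restore cycle achieves.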

    One thing to be aware of is what happens when a file is copied from one file system to another. Even if both file systems support compression, the file is decompressed after it is read and then recompressed before it is written. If the target drive does not support compression, the file is written in its uncompressed state. Keep in mind that any file read from a compressed drive is decompressed before it is handed to the calling process. For example, suppose a document is stored on a compressed hard disk and a word processor is used to open it. The document is read, decompressed, and placed in the word processor's work folder and memory before it is displayed in the application. When the document is modified and saved, it is compressed and then written back to the disk. If you save the document from the word processor to a thumb drive (without compression), the file is written in a normal uncompressed state.
