Intro to Extents

Discussion in 'Filesystem' started by Jarret W. Buse, Jul 9, 2013.

    Extents

    To understand extents, you first need some knowledge of hard disk layout. When a hard disk is formatted, each platter surface is divided into concentric rings called tracks. Radial "slices" running from the disk's center to its outer edge divide the tracks into sectors. The better the magnetic media and the read/write heads, the more tracks and sectors can be created, and the more sectors a disk has, the more data it can hold.

    To access data, pointers are used to hold the address of a block. A block is a group of sectors that is given a single address, and each block holds data from only one file. All blocks on a hard disk (within one partition) contain the same number of sectors: 1, 2, 4, 8 or 16. On older hard disks, sectors were 512 bytes. On a hard disk where a block is one sector and a file is only 100 bytes in size, 412 bytes of the block are unused, or wasted. If a block contains more sectors, such as 16 (making each block 8,192 bytes, or 8 KB), then even more space is wasted (a total of 8,092 bytes).

    The sector size defaults to 4 KB (4,096 bytes) on newer hard disks, so you can already guess that a hard disk storing a large number of small files wastes a lot of space.
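
    To put numbers on the wasted space, here is a minimal Python sketch. The function name slack_bytes and its parameters are my own, chosen for illustration, and are not part of any file system tool. It assumes a file always occupies a whole number of blocks.

```python
def slack_bytes(file_size, sector_size, sectors_per_block):
    """Return the unused bytes in the blocks occupied by one file."""
    block_size = sector_size * sectors_per_block
    # Round up to whole blocks: a file cannot own part of a block.
    blocks_needed = -(-file_size // block_size)
    return blocks_needed * block_size - file_size


# A 100-byte file on three hypothetical layouts.
print(slack_bytes(100, 512, 1))    # 412  (one 512-byte sector per block)
print(slack_bytes(100, 512, 16))   # 8092 (sixteen 512-byte sectors per block)
print(slack_bytes(100, 4096, 1))   # 3996 (one 4,096-byte sector per block)
```

    The larger the block, the more space a small file leaves unused.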

    Extents do not deal with this kind of wasted space. They deal with fragmentation instead. Fragmented files do not waste space, they waste time. The only unit you need to be concerned about here is the block. Extents are runs of contiguous blocks on the hard disk that keep the parts of a file close together and reduce fragmentation.

    Fragments occur when parts of a file are scattered across a hard disk and do not exist in contiguous blocks. Fragments are caused by a file growing in size when the blocks directly after it are already filled. When the file grows, the new portions are placed in non-contiguous blocks. When the file is accessed, the hard disk requires more time to retrieve the whole file since its parts are spread over the disk. Please note that no file system can prevent fragmentation entirely.

    With extents, a run of contiguous blocks after a file can be "reserved" for when the file grows. Not every file grows, however, and the reserved space may later be needed by other files.
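
    The effect of reserving blocks can be sketched with a toy allocator. This is not how any real file system allocates space. The 16-block "disk", the lowest-free-block policy and the names allocate, count_fragments and grow_file_a are all assumptions made for illustration.

```python
def allocate(disk, owner, count):
    """Claim the `count` lowest-numbered free blocks for `owner`."""
    free = [i for i, used_by in enumerate(disk) if used_by is None][:count]
    if len(free) < count:
        raise RuntimeError("not enough free blocks")
    for i in free:
        disk[i] = owner
    return free


def count_fragments(blocks):
    """Count the runs of consecutive block numbers in a file's block list."""
    if not blocks:
        return 0
    breaks = sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur != prev + 1)
    return breaks + 1


def grow_file_a(reserve):
    """Write file A, then file B, then grow A by two blocks."""
    disk = [None] * 16                          # None marks a free block
    file_a = allocate(disk, "A", 3)             # A gets blocks 0-2
    held = allocate(disk, "reserved", reserve)  # blocks held back behind A
    allocate(disk, "B", 3)                      # B lands after A and the reservation
    for i in held:                              # A grows: release the reservation
        disk[i] = None
    file_a += allocate(disk, "A", 2)
    return file_a


no_reserve = grow_file_a(reserve=0)
with_reserve = grow_file_a(reserve=2)
print(no_reserve, count_fragments(no_reserve))      # [0, 1, 2, 6, 7] 2
print(with_reserve, count_fragments(with_reserve))  # [0, 1, 2, 3, 4] 1
```

    Without the reservation, file B sits directly behind file A and the new blocks end up elsewhere. With it, the file stays in one piece.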

    Without extents, the metadata section of an inode contains direct and indirect block pointers. These pointers are the hard disk addresses of the file's blocks. An extent list is another list, pointed to by the inode, that records the runs of contiguous blocks that make up the file. For example, suppose a file takes up 5 contiguous blocks and then another 3 contiguous blocks in another section of the disk. The inode will point to an extent list that contains two entries. The first entry is the address of the first run of 5 blocks and specifies that 5 blocks are used. The second entry is the address of the second run of 3 contiguous blocks and specifies that there are 3 blocks in the set.
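
    The 5 + 3 block example can be written out as a small Python sketch. The Extent type, the two helper functions and the disk addresses 1000 and 2000 are invented for illustration and do not reflect the on-disk layout of any particular file system.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    logical: int   # first block of the run, counted from the start of the file
    physical: int  # disk address of that first block
    length: int    # number of contiguous blocks in the run


def to_extents(block_pointers):
    """Collapse a per-block pointer list into a list of contiguous runs."""
    extents = []
    for logical, physical in enumerate(block_pointers):
        last = extents[-1] if extents else None
        if last is not None and physical == last.physical + last.length:
            # This block continues the previous run, so just lengthen it.
            extents[-1] = last._replace(length=last.length + 1)
        else:
            extents.append(Extent(logical, physical, 1))
    return extents


def physical_block(extents, logical):
    """Find the disk address of the file's `logical`-th block."""
    for ext in extents:
        if ext.logical <= logical < ext.logical + ext.length:
            return ext.physical + (logical - ext.logical)
    raise IndexError("block is past the end of the file")


# Eight block pointers: 5 contiguous blocks, then 3 more somewhere else.
pointers = list(range(1000, 1005)) + list(range(2000, 2003))
extents = to_extents(pointers)
print(len(pointers), len(extents))   # 8 2
print(extents[0])                    # Extent(logical=0, physical=1000, length=5)
print(physical_block(extents, 6))    # 2001
```

    Eight pointers shrink to two entries, and the saving grows with the length of each contiguous run.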

    A third field in an extent list entry is an offset. If a file physically starts in extent 10, but more precisely in the third block of that extent, then the offset is 3. Extents are a set number of blocks. If one file ends at the second block of extent 10, then the next file would start at block 3, making its offset 3 in extent 10.

    The number of extents for a file can be found by using the "filefrag" program. The only required parameter is the filename (more than one file can be given, for example through shell wildcards). The flag "-v" gives verbose output with more details.


    For example, to check a file named file1.txt, enter:

    Code:
    filefrag file1.txt
    
    The output is:
    Code:
    buse@Buse-PC:/media/buse/Frag$ filefrag file1.txt
    file1.txt: 7 extents found
    
    The output shows that file1.txt uses 7 extents. A single extent would be ideal, so the file is fragmented into 7 separate pieces.

    And the verbose command is:
    Code:
    filefrag -v file1.txt
    
    The output is:
    Code:
    buse@Buse-PC:/media/buse/Frag$ filefrag -v file1.txt
    Filesystem type is: ef53
    File size of file1.txt is 178353379 (174174 blocks, blocksize 1024)
    Ext  logical  physical  expected  length  flags
      0        0    212993              8192
      1     8192    223233    221185   32768
      2    40960    256001              6144
      3    47104    270337    262145   32768
      4    79872    303105             32768
      5   112640    335873             32768
      6   145408    368641              8192
      7   153600    378881    376833   14336
      8   167936    397313    393217    4096
      9   172032      4338    401409    2048
     10   174080    137217      6386      94  eof
    file1.txt: 7 extents found
    
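    The verbose listing has 11 rows, yet filefrag reports 7 extents. The numbers are consistent if a row that starts physically where the previous row ended is counted as part of the same extent, which is what a blank "expected" column suggests. The sketch below checks that reading against the listing. The rows are copied from the output as (logical, physical, length). The length of the last row (94) is my inference from the file's 174174 total blocks minus that row's logical start of 174080. The function name count_extents is hypothetical and this is not filefrag's actual code.

```python
# (logical, physical, length) for rows 0-10 of the verbose listing.
rows = [
    (0, 212993, 8192),
    (8192, 223233, 32768),
    (40960, 256001, 6144),
    (47104, 270337, 32768),
    (79872, 303105, 32768),
    (112640, 335873, 32768),
    (145408, 368641, 8192),
    (153600, 378881, 14336),
    (167936, 397313, 4096),
    (172032, 4338, 2048),
    (174080, 137217, 94),   # length inferred: 174174 - 174080
]


def count_extents(rows):
    """Count rows that do not start where the previous row ended on disk."""
    count = 0
    expected = None  # physical block right after the previous row
    for _logical, physical, length in rows:
        if physical != expected:
            count += 1          # a jump on disk starts a new extent
        expected = physical + length
    return count


print(count_extents(rows))                        # 7
print(sum(length for _, _, length in rows))       # 174174
```

    Rows 2, 4, 5 and 6 each continue the row before them, which leaves 7 separate pieces, and the lengths add up to the block count in the file size line.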
    Some file systems allow in-place upgrades, such as from ext3 to ext4. This means that if your file system is ext3, it can be upgraded to the newer ext4 file system without formatting the hard disk and restoring the data. The file system is upgraded with the data in place.

    NOTE: It is always best to back up the data before performing a task like this, which could cause data loss. It is better to be safe than sorry.

    An issue with the in-place upgrade is that ext3 inodes use direct and indirect block pointers rather than extents. When the upgrade is finished, the existing files still use block pointers and not extents. Only files written after the upgrade will use extents. For this reason, it is usually best to back up, format and restore the data. This procedure is favorable in two ways. First, all files will use the new features of the newer file system. In this case, extents will be used for all files. Second, the files are written to a freshly formatted disk in contiguous order, so there is little or no fragmentation. With little fragmentation and extents in use on all files, file system performance should improve noticeably.

    Keep in mind that extents help reduce fragmentation and shorten the block lists in the inode entry, which allows better file system performance. A long block list in an inode entry can degrade file access performance.
