High Performance File System (HPFS)

Jarret W. Buse

The High Performance File System (HPFS) was introduced in November 1989. Microsoft created HPFS for OS/2 1.2, at a time when Microsoft and IBM were jointly developing OS/2.

NOTE: This file system, created by Microsoft, was an improvement on FAT.

The improvements include:
  • The root directory is placed near the middle of the volume to help with faster access times; the root in FAT is stored at the beginning of the disk.
  • B+ Trees are used where possible to improve performance.
  • Files carry more timestamps than just the last modification time.
  • Uses extents (a sketch follows this list).
  • Stores similar files closer together on the disk.
  • Stores an individual file contiguously when possible.
  • Has less fragmentation.
  • Preserves the case of folder and file names.
  • Allows file and folder names up to 255 characters long instead of 8.3.
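
To illustrate the idea of an extent (a contiguous run of disk blocks recorded as a starting block plus a length), here is a minimal Python sketch. The function name, block numbers, and extent values are invented for the example and do not reflect HPFS's actual on-disk structures.

```python
# Minimal sketch of extent-based file mapping (illustrative only; not HPFS's
# real on-disk format). An extent records a contiguous run of blocks as
# (first physical block, number of blocks), so a contiguously stored file
# needs only a few entries instead of one pointer per block.

from typing import List, Tuple

Extent = Tuple[int, int]  # (first physical block, number of blocks)

def logical_to_physical(extents: List[Extent], logical_block: int) -> int:
    """Map a file-relative block number to a physical block via the extent list."""
    remaining = logical_block
    for start, length in extents:
        if remaining < length:
            return start + remaining
        remaining -= length
    raise ValueError("logical block is beyond the end of the file")

# A 3,000-block file stored in just two contiguous runs:
extents = [(10_000, 2_048), (50_000, 952)]
print(logical_to_physical(extents, 0))      # 10000
print(logical_to_physical(extents, 2_500))  # 50452
```
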
The maximum volume size is 64 GB (2 TB theoretical), and the maximum file size is 2 GB. The maximum file name length is 255 characters, with an unlimited number of files allowed on a volume. Each file can carry up to 64 KB of metadata stored as extended attributes. Each extended attribute is a key/value pair; the attributes can hold permissions, author information, signatures, checksums, and so on.
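
As a rough illustration of extended attributes as key/value pairs with a 64 KB per-file cap, here is a hedged Python sketch. The class and method names are invented for the example; this is not HPFS's real extended-attribute layout or API.

```python
# Illustrative sketch: per-file extended attributes as key/value pairs,
# capped at 64 KB per file as described above. Not HPFS's actual EA format.

MAX_EA_BYTES = 64 * 1024

class ExtendedAttributes:
    def __init__(self):
        self._attrs = {}  # key (str) -> value (bytes)

    def set(self, key: str, value: bytes) -> None:
        trial = dict(self._attrs)
        trial[key] = value
        size = sum(len(k.encode()) + len(v) for k, v in trial.items())
        if size > MAX_EA_BYTES:
            raise ValueError("extended attributes would exceed 64 KB")
        self._attrs = trial

    def get(self, key: str) -> bytes:
        return self._attrs[key]

ea = ExtendedAttributes()
ea.set("author", b"Jarret W. Buse")
ea.set("checksum", b"0x1a2b3c4d")
print(ea.get("author"))  # b'Jarret W. Buse'
```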

When Microsoft made HPFS, it was not built into the OS/2 kernel itself. HPFS was created as an Installable File System (IFS): a driver loaded through an API, so it can be changed and updated without recompiling the OS's kernel. The IFS mechanism can also be used for network protocols, not just file systems.
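
The idea behind an installable file system can be pictured as a table of drivers that all expose the same operations and that the OS consults at mount time. The Python sketch below is purely conceptual; the class and function names are invented and do not represent OS/2's actual IFS interface.

```python
# Conceptual sketch of an installable file system: drivers register a common
# set of operations with the OS, which selects one at mount time. New drivers
# can be installed without rebuilding the kernel. Names are invented;
# this is not OS/2's real IFS API.

class FileSystemDriver:
    name = "base"
    def mount(self, device: str) -> None:
        raise NotImplementedError

class HpfsDriver(FileSystemDriver):
    name = "HPFS"
    def mount(self, device: str) -> None:
        print(f"Mounting {device} as {self.name}")

_installed = {}

def install_ifs(driver: FileSystemDriver) -> None:
    """Register a file system driver without recompiling the kernel."""
    _installed[driver.name] = driver

def mount(device: str, fs_name: str) -> None:
    _installed[fs_name].mount(device)

install_ifs(HpfsDriver())
mount("/dev/hda1", "HPFS")  # prints: Mounting /dev/hda1 as HPFS
```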

NOTE: While on the subject of compiling kernels, the Linux kernel should not be compiled on an HPFS volume. Some kernel source files share the same name and differ only in case, and because HPFS preserves case but does not distinguish names by case alone, those files collide on an HPFS volume.
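
As a quick demonstration of the problem, the Python sketch below folds names to lower case and reports any that collide. The file names are only illustrative examples of names that differ solely in case.

```python
# Detect names that differ only in case. On a case-preserving but
# case-insensitive file system such as HPFS, the second name would refer to
# the same directory entry as the first. File names here are illustrative.

names = ["Makefile", "makefile", "xt_CONNMARK.h", "xt_connmark.h"]

seen = {}
for name in names:
    folded = name.lower()
    if folded in seen:
        print(f"collision: {name!r} vs {seen[folded]!r}")
    else:
        seen[folded] = name
```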

Two different IFS drivers exist for HPFS, both supplied by IBM:
  1. HPFS – the standard HPFS driver with a 2 MB cache
  2. HPFS386 – a higher-end server version of HPFS with a cache limited only by the amount of RAM
NOTE: A file system cache stores data in RAM after it has been read from the disk. When an application requests a file, the file system retrieves it and places the data in the cache; the application then accesses the data from RAM. There are two types of memory to keep in mind (a small sketch of the caching behavior follows this list):
  1. Available Memory – the RAM not being used by the OS, applications, drivers, etc. This memory is used by new applications when they are executed.
  2. Cache Memory – memory taken from Available Memory and used to store data read from the HPFS386 volume. Once data is placed here, it is available to the applications requesting it.
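
The caching behavior described above is essentially a read-through cache: the first request pulls data from the disk into RAM, and later requests are served from RAM. The Python sketch below shows the concept only; the dictionaries and function name are invented, and this is not the HPFS or HPFS386 cache implementation.

```python
# Minimal read-through cache sketch: data is read from "disk" once, copied
# into cache memory (RAM), and served from RAM on later requests.
# Illustrative only; not the HPFS/HPFS386 cache implementation.

disk = {"report.txt": b"quarterly numbers", "notes.txt": b"meeting notes"}
cache = {}

def read_file(name: str) -> bytes:
    if name in cache:                  # fast path: served from cache memory
        print(f"cache hit: {name}")
        return cache[name]
    print(f"cache miss: {name} (reading from disk)")
    data = disk[name]                  # slow path: read from the volume
    cache[name] = data                 # keep a copy in RAM for next time
    return data

read_file("report.txt")   # miss, goes to disk
read_file("report.txt")   # hit, served from RAM
```
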
HPFS386 allows for direct hardware and kernel interaction, which makes it a higher-performance file system than standard HPFS: it is better optimized and better suited to server use. HPFS386 also supports Server Message Block (SMB), allowing faster network access to the file system, so it works well on a system running network protocols. There is also support for Access Control Lists (ACLs) to provide stronger file security. Permissions set through ACLs determine not only who can access which file, but also what they can do to the files (read, write, delete, etc.).
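
An ACL of the kind described above pairs each user with the operations they are allowed to perform on a file. The Python sketch below shows the concept only; the entries and operation names are invented, and this is not HPFS386's actual ACL format.

```python
# Conceptual ACL check: each file carries entries mapping a user to the set
# of operations that user may perform. Illustrative only; not HPFS386's
# real ACL structures.

acl = {
    "payroll.dat": {"alice": {"read", "write"}, "bob": {"read"}},
}

def allowed(user: str, path: str, operation: str) -> bool:
    return operation in acl.get(path, {}).get(user, set())

print(allowed("bob", "payroll.dat", "read"))    # True
print(allowed("bob", "payroll.dat", "delete"))  # False
```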

HPFS386 also has a maximum volume size limit of 512 GB.

For better performance and redundancy, HPFS386 supports RAID 1. RAID 1 requires a second disk, which may or may not be on a separate controller for better fault tolerance. RAID 1 is mirroring: the data written to one disk is also written to a second disk, so the two disks are mirrors of one another. When one disk is busy and an application requests data, the data can be read from the second disk since it is not busy; in this manner, the mirrored pair provides better read performance. If one disk fails, the other continues to operate until the failed disk can be replaced and the mirror recreated. While one disk is down, the performance boost is lost until the mirror is re-established.
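
The mirroring and read-balancing behavior described above can be sketched as: every write goes to both disks, and each read is served by whichever mirror is not busy (or by the surviving disk if one has failed). The Python sketch below is a simplified illustration, not a real RAID implementation.

```python
# Simplified RAID 1 (mirroring) sketch: writes go to both disks; reads prefer
# an idle mirror; if one disk fails, the other keeps serving the data.
# Illustrative only; not a real RAID implementation.

class Disk:
    def __init__(self, name: str):
        self.name = name
        self.blocks = {}
        self.busy = False
        self.failed = False

class Mirror:
    def __init__(self, a: Disk, b: Disk):
        self.disks = [a, b]

    def write(self, block: int, data: bytes) -> None:
        for d in self.disks:               # mirror the write to both disks
            if not d.failed:
                d.blocks[block] = data

    def read(self, block: int) -> bytes:
        alive = [d for d in self.disks if not d.failed]
        idle = [d for d in alive if not d.busy] or alive
        return idle[0].blocks[block]       # prefer a mirror that is not busy

m = Mirror(Disk("disk0"), Disk("disk1"))
m.write(7, b"data")
m.disks[0].busy = True
print(m.read(7))            # served by disk1 while disk0 is busy
m.disks[1].failed = True
m.disks[0].busy = False
print(m.read(7))            # disk1 has failed; disk0 still has the data
```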

NOTE: When RAID 1 is set up with two drive controllers, fault tolerance improves further: if one controller fails, the other controller and its drive keep functioning. Using two controllers in this way is called duplexing. Duplexing can be, and usually is, combined with other forms of RAID. Duplexing also boosts performance by providing a second pathway on the motherboard, or bus, to reach a hard disk when one path is busy. If two hard disks are attached to one controller, access to the disks is serial rather than parallel; that means if one drive is reading or writing, the other drive is inaccessible until the first disk finishes.

IBM eventually found HPFS limiting and switched to the Journaled File System (JFS) as the default file system on OS/2 systems.
 
