Disk Performance

(excerpt from Spencer W. Ng, "Advances in disk technology : performance issues", IEEE Computer, May 1998, pp.75-81)

Advances in disk technology improve disk performance.  These advances include increased rotational speed, faster seek times, and higher data rates.  Other advances, such as higher recording density and larger total drive capacity, also affect performance.  We will discuss these advances.

Disk performance is measured by "total job completion time" for a complex task involving a long sequence of disk I/Os.

The time for a disk drive to complete a user request consists of :

Command overhead -- the time it takes the disk controller to process and handle an I/O request.  It depends on the type of interface (IDE or SCSI), the type of command (read or write), and the use of the drive's buffer.  Typical values are 0.5 ms for a buffer miss and 0.1 ms for a buffer hit.

Seek time -- the time for the head to move from its current cylinder to the target cylinder.  It includes the settling time, the time to position the head over the target track until the correct track identification is confirmed.  A typical seek time is 10 ms.

Rotational latency -- the time spent waiting for the target sector to rotate under the head.  In the past, disks spun at 3,600 rpm.  Today the highest speed is 10,000 rpm and a typical speed is 5,400 rpm, which gives an average latency (half a revolution) of 5.6 ms.
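
The 5.6 ms figure can be checked with a short calculation.  The sketch below assumes the average latency is the time for half a revolution; the function name is illustrative only and does not come from the article.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms, taken as half a revolution."""
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_revolution / 2.0


for rpm in (3600, 5400, 10000):
    print(f"{rpm:>6} rpm -> {average_rotational_latency_ms(rpm):.1f} ms")
# prints 8.3 ms, 5.6 ms and 3.0 ms respectively
```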

Data transfer time -- depends on "data rate" and "transfer size".  There are two kinds of data rate : media rate and interface rate.  Media rate depends on recording density and rotational speed.  For example, a disk rotating at 5,400 rpm with 111 sectors (512 bytes each) per track has a media rate of about 5 Mbytes per second.  Interface rate is how fast data can be transferred between the host and the disk drive over its interface.  SCSI drives can transfer up to 20 Mbytes per second over an 8-bit-wide transfer path.  IDE drives with the Ultra-ATA interface support up to 33.3 Mbytes per second.

Transfer time equals transfer size divided by data rate.  For a 4-Kbyte request, the average media transfer time is about 0.8 ms and the average interface transfer time is about 0.4 ms.
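
As a rough check of the media rate and the media transfer time, the sketch below redoes the arithmetic for the 5,400 rpm, 111-sector example.  It assumes 1 Mbyte = 10^6 bytes and a 4-Kbyte (4,096-byte) transfer; the function names are illustrative.

```python
SECTOR_BYTES = 512


def media_rate_bytes_per_s(rpm, sectors_per_track, sector_bytes=SECTOR_BYTES):
    """Bytes passing under the head per second: revolutions/s x bytes/track."""
    return rpm / 60.0 * sectors_per_track * sector_bytes


def transfer_time_ms(transfer_bytes, rate_bytes_per_s):
    """Transfer time = transfer size / data rate, converted to ms."""
    return transfer_bytes / rate_bytes_per_s * 1000.0


rate = media_rate_bytes_per_s(5400, 111)
print(f"media rate: {rate / 1e6:.1f} Mbytes/s")                        # 5.1
print(f"4-Kbyte media transfer: {transfer_time_ms(4096, rate):.1f} ms")  # 0.8
```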

Example : the typical average time for a random 4-Kbyte disk I/O is
overhead + seek + latency + transfer =
0.5 ms + 10 ms + 5.6 ms + 0.8 ms =
16.9 ms

Locality of access -- most I/Os are not random.  The effect is that the real seek time is about one third of the random seek time.  Taking this into account, the above example becomes
overhead + seek + latency + transfer =
0.5 ms + 1/3 * 10 ms + 5.6 ms + 0.8 ms =
10.2 ms

Caching
With caching, the mechanical components, i.e. seek and rotational latency, are eliminated on a cache hit.  Data transfer takes place at the interface data rate.  The typical time for a 4-Kbyte I/O becomes
overhead + transfer = 0.1 ms + 0.4 ms = 0.5 ms
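
The three cases (random access, access with locality, and a cache hit) can be put side by side with one small function.  This is only a sketch using the typical values quoted in the text; the function and parameter names, including seek_fraction, are assumptions made for illustration.

```python
def io_time_ms(overhead, seek=0.0, latency=0.0, transfer=0.0, seek_fraction=1.0):
    """Service time = overhead + (scaled) seek + rotational latency + transfer."""
    return overhead + seek_fraction * seek + latency + transfer


# Random 4-Kbyte I/O: full average seek, media transfer time.
random_io = io_time_ms(0.5, seek=10.0, latency=5.6, transfer=0.8)
# Locality of access: real seek is about one third of the random seek.
local_io = io_time_ms(0.5, seek=10.0, latency=5.6, transfer=0.8, seek_fraction=1 / 3)
# Cache hit: no mechanical delay, buffer-hit overhead, interface transfer time.
cache_hit = io_time_ms(0.1, transfer=0.4)

print(f"random    : {random_io:.1f} ms")   # 16.9
print(f"locality  : {local_io:.1f} ms")    # 10.2
print(f"cache hit : {cache_hit:.1f} ms")   # 0.5
```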
 

Increased recording density

Increase BPI

BPI (bits per inch), also called "linear density", determines the number of sectors on a track.  With "zoned recording", the number of sectors per track is constant within each zone, so the BPI toward the outer diameter of a zone is somewhat lower than the BPI toward the inner diameter of the same zone.  Increasing BPI has four effects : a higher media data rate, a constraint on rpm, fewer head switches, and a bigger cylinder.

Higher media rate --  media data rate = 2 pi x radius x bpi x rotational speed
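
The formula can be tried out numerically.  In the sketch below the radius is taken in inches, BPI in bits per inch, and the rotational speed is converted from rpm to revolutions per second, so the result is in bits per second.  The radius and BPI values are hypothetical, chosen only to show that the rate scales linearly with BPI.

```python
import math


def media_rate_bits_per_s(radius_in, bpi, rpm):
    """media data rate = 2 pi x radius x bpi x rotational speed (rev/s)."""
    track_length_in = 2.0 * math.pi * radius_in  # circumference of the track
    return track_length_in * bpi * (rpm / 60.0)


# Hypothetical track at a 1.5 inch radius on a 5,400 rpm drive.
for bpi in (100_000, 200_000):
    rate = media_rate_bits_per_s(1.5, bpi, 5400)
    print(f"{bpi:>7} bpi -> {rate / 8 / 1e6:.1f} Mbytes/s")
```

Doubling BPI at a fixed radius and rpm doubles the media rate, which is why a higher BPI eventually runs into the data channel limit.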

Constraint on rpm -- increasing BPI can push the media data rate beyond what the drive's data channel can handle.  Today's disk electronics can handle up to 25 Mbytes per second.

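Fewer head switches -- Switching to the next track on the same cylinder is called a "track switch", and switching to the next track on the next cylinder is called a "cylinder switch".  With more sectors per track, a request of a given size crosses fewer track boundaries.
average switch time = ((request size - 1) / track size) x head switch time
where request size and track size are measured in sectors.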
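
A small numeric sketch of the switch-time formula follows.  It reads the formula as ((request size - 1) / track size) x head switch time, with sizes in sectors, i.e. the expected number of track boundaries crossed by a request that starts at a random sector.  The 1 ms head switch time and the 222-sector track are hypothetical values.

```python
def average_switch_time_ms(request_sectors, track_sectors, head_switch_ms):
    """Expected switch time for a request starting at a random sector."""
    expected_switches = (request_sectors - 1) / track_sectors
    return expected_switches * head_switch_ms


# An 8-sector (4-Kbyte) request, before and after doubling sectors per track.
for track_sectors in (111, 222):
    t = average_switch_time_ms(8, track_sectors, head_switch_ms=1.0)
    print(f"{track_sectors} sectors/track -> {t:.3f} ms")
```

Doubling the number of sectors per track halves the average switch time for the same request size.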

Bigger cylinder -- When BPI increases, there are more sectors per track and hence more sectors per cylinder.  When operating within a small range of data, having more sectors in a cylinder has 2 effects :

  1. The seek distance is reduced
  2. The number of seeks is reduced

Higher tracks per inch (TPI)

Seek time is composed of two parts :

  1. travel time
  2. settling time

seek time = A + B x sqrt( seek distance ) + C x log(TPI)
where A, B, C are constants specific to the disk drive.

TPI has two opposing effects on the seek time.  Higher TPI means a shorter physical seek distance and therefore a shorter travel time.  On the other hand, narrower tracks require a longer settling time.
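
The two opposing effects can be seen by evaluating the terms of the seek time formula separately.  The sketch below makes several assumptions not stated in the text : the seek distance is the physical distance in inches (tracks crossed divided by TPI), log is the natural logarithm, and the constants A, B, C are arbitrary hypothetical values rather than figures for any real drive.

```python
import math


def seek_terms_ms(tracks_crossed, tpi, a=1.0, b=20.0, c=0.5):
    """Return the (constant, travel, settling) terms of the seek time model."""
    distance_in = tracks_crossed / tpi   # physical seek distance in inches
    travel = b * math.sqrt(distance_in)  # shrinks as TPI grows
    settling = c * math.log(tpi)         # grows as TPI grows
    return a, travel, settling


for tpi in (5_000, 10_000, 20_000):
    a, travel, settling = seek_terms_ms(tracks_crossed=1000, tpi=tpi)
    total = a + travel + settling
    print(f"TPI {tpi:>6}: travel {travel:.2f} + settling {settling:.2f} "
          f"-> seek {total:.2f} ms")
```

Whether the total goes up or down with TPI depends on the constants, which is the point of the "opposing effects" remark.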

No ID record format

In the conventional format, each sector has an ID field (or header).  To increase capacity, no-ID recording eliminates the ID field, allowing more data sectors on each track.  The drive locates sectors by keeping a table that relates sectors to the embedded servos.

File system -- Allocation unit

For example, in the file allocation table (FAT) file system of DOS and Windows, the allocation unit is called a "cluster".  The cluster size is 16 Kbytes (32 sectors of 512 bytes) for a drive with a capacity of 512 - 1,023 Mbytes.

For large files -- a bigger cluster size is better.
For small files -- a larger cluster size means a larger distance between files, hence longer seek times.  When a file occupies only a small portion of a cluster, the look-ahead buffer is less effective.  Result : a smaller cluster size is better.
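
The small-file case can be illustrated by estimating how much disk a set of small files spans under different cluster sizes.  The sketch assumes each file occupies a whole number of clusters and that files are laid out back to back; the file count, file size, and cluster sizes are hypothetical.

```python
import math

SECTOR_BYTES = 512


def clusters_needed(file_bytes, cluster_sectors):
    """A file always occupies at least one whole cluster."""
    return max(1, math.ceil(file_bytes / (cluster_sectors * SECTOR_BYTES)))


def span_sectors(num_files, file_bytes, cluster_sectors):
    """Sectors spanned by num_files equal-sized files stored back to back."""
    return num_files * clusters_needed(file_bytes, cluster_sectors) * cluster_sectors


# 1,000 files of 1 Kbyte each, with 4-sector and 32-sector clusters.
for cluster_sectors in (4, 32):
    span = span_sectors(1000, 1024, cluster_sectors)
    used = 1024 / (cluster_sectors * SECTOR_BYTES)
    print(f"{cluster_sectors:>2}-sector clusters: span {span} sectors, "
          f"{used:.0%} of each cluster used")
```

With the larger cluster the same files are spread over eight times as many sectors, so the head has further to travel between them, and most of what a look-ahead read picks up from each cluster is unused space.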