Interpreting iostat Disk I/O Statistics

I was asked a few questions about iostat that I couldn't answer off the top of my head, so I wrote down a few notes to help me learn it.

From the manual (on macOS):

  • iostat displays kernel I/O stats on terminal, device, and cpu operations.
  • The first stats you see are averaged over the system uptime.
  • To get info about current activity, a suitable wait time (in seconds) should be specified (with -w), so that subsequent sets of stats will be averaged over that time.
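The last two points suggest a simple pattern for scripts: run iostat with a wait time, then throw away the first row, since the manual says it is averaged over uptime. Here is a minimal Python sketch of that idea. The helper names (iostat_command, interval_rows) are my own and purely illustrative, and it assumes the macOS flags -w (wait) and -c (count):

```python
def iostat_command(wait_seconds, count, device=None):
    """Build an argv list such as ['iostat', '-w', '5', '-c', '6', 'disk0'].

    Assumes the macOS iostat flags -w (wait, in seconds) and -c (count).
    """
    argv = ["iostat", "-w", str(wait_seconds), "-c", str(count)]
    if device is not None:
        argv.append(device)
    return argv


def _is_number(token):
    """Return True if the token parses as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def interval_rows(output):
    """Return the numeric rows of iostat output, minus the first one.

    Header lines contain non-numeric tokens and are skipped. The first
    numeric row is dropped because the manual says it is averaged over
    the system uptime rather than over the wait interval.
    """
    rows = [line.split() for line in output.splitlines()]
    numeric = [row for row in rows if row and all(_is_number(t) for t in row)]
    return numeric[1:]
```

On a Mac, the text passed to interval_rows could come from something like subprocess.run(iostat_command(5, 6, "disk0"), capture_output=True, text=True).stdout.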

Here is the output from running iostat on my primary disk with a wait time of 5 seconds. This was done while copying a 4GB file over 25 seconds:

% iostat -w 5 disk0
              disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
  261.14  372 94.88  29 14 58  2.53 2.32 2.35
  333.73  452 147.24  29 11 60  2.49 2.31 2.35
  395.37  406 156.82  27  8 65  2.45 2.31 2.35
  254.52  605 150.47  37 11 52  2.41 2.30 2.35
  409.59  391 156.26  25  6 69  2.46 2.31 2.35

KB/t: kilobytes per transfer
tps: transfers per second
MB/s: megabytes per second
us: % of cpu in user mode
sy: % of cpu in system mode
id: % of cpu in idle mode
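The three disk columns are related: multiplying KB/t by tps gives kilobytes per second, and dividing by 1024 gets you to MB/s. Here is a small Python check against the sample rows above. The 1024 divisor is my assumption based on how well it fits this sample, and since tps is printed as a whole number, the derived value will only be approximately equal to the reported one:

```python
# (KB/t, tps, MB/s) copied from the iostat sample output above.
SAMPLES = [
    (261.14, 372, 94.88),
    (333.73, 452, 147.24),
    (395.37, 406, 156.82),
    (254.52, 605, 150.47),
    (409.59, 391, 156.26),
]


def derived_mb_per_s(kb_per_transfer, tps):
    """Estimate MB/s as KB/t * tps / 1024 (divisor assumed, not documented here)."""
    return kb_per_transfer * tps / 1024


for kb_t, tps, reported in SAMPLES:
    derived = derived_mb_per_s(kb_t, tps)
    print(f"{kb_t:7.2f} KB/t x {tps:3d} tps = {derived:6.2f} MB/s (iostat reported {reported:.2f})")
```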

The tps figure is the number of I/O operations per second, or IOPS. You can compare it to Wikipedia’s list of typical IOPS figures for different storage devices.

My Mac’s SSD peaked at 605 IOPS during the file copy, which is roughly three times what the fastest mechanical disks manage, but nowhere near as fast as some of the enterprise SSDs that you can buy.
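The arithmetic behind that comparison is simple enough to sketch in Python. The tps values are the ones from the sample output, while the 200 IOPS baseline for a fast mechanical disk is an assumed round number for illustration, not a measurement of mine:

```python
# tps column from the iostat sample output above.
TPS_SAMPLES = [372, 452, 406, 605, 391]

# Assumption: a fast mechanical disk manages roughly 200 IOPS.
ASSUMED_HDD_IOPS = 200

peak_tps = max(TPS_SAMPLES)
ratio = peak_tps / ASSUMED_HDD_IOPS

print(f"peak: {peak_tps} IOPS, about {ratio:.1f}x the assumed mechanical-disk figure")
```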

One note that I found interesting was this, from vaneyckt’s article on iostat at Coderwall:

[On Linux versions of iostat], some people put a lot of faith in the %iowait metric as an indicator for I/O performance. However, %iowait is first and foremost a CPU metric that measures the percentage of time the CPU is idle while waiting for an I/O operation to complete. This metric is heavily influenced by both your CPU speed and CPU load and is therefore easily misinterpreted.

For servers, you should be sending your iostat statistics to an internal data collection and graphing service, so you can build up a baseline over time. You can then try to correlate spikes in disk I/O with other data, such as slow web site performance or slow database queries.
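As one way to picture this, here is a minimal Python sketch that formats a single iostat sample for Graphite's plaintext protocol ("path value timestamp", one metric per line). Graphite is just my example of a graphing service, and the metric prefix, the metric names, and the host in the usage note are all hypothetical:

```python
import socket
import time


def graphite_lines(prefix, sample, timestamp=None):
    """Format one sample as Graphite plaintext lines: 'path value timestamp'.

    `sample` maps metric names (e.g. 'tps') to numeric values.
    """
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return [f"{prefix}.{name} {value} {ts}" for name, value in sorted(sample.items())]


def send_to_graphite(lines, host, port=2003):
    """Send the lines to a Carbon plaintext listener (2003 is its usual port)."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
```

A collector might then call send_to_graphite(graphite_lines("servers.mymac.disk0", {"tps": 605, "mb_per_s": 150.47}), "graphite.example.internal") once per interval, where the host name is a placeholder.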
