I was asked a few questions about iostat that I was unable to answer off the top of my head, so I decided to write down a few notes on it to help me learn.
From the manual (on macOS):
- iostat displays kernel I/O stats on terminal, device, and cpu operations.
- The first stats you see are averaged over the system uptime.
- To get info about current activity, a suitable wait time (in seconds) should be specified (with -w), so that subsequent sets of stats will be averaged over that time.
Here is the output from running iostat against my primary disk with a wait time of 5 seconds, captured while copying a 4GB file that took about 25 seconds:
```
% iostat -w 5 disk0
          disk0       cpu    load average
    KB/t  tps  MB/s  us sy id   1m   5m   15m
  261.14  372  94.88 29 14 58  2.53 2.32 2.35
  333.73  452 147.24 29 11 60  2.49 2.31 2.35
  395.37  406 156.82 27  8 65  2.45 2.31 2.35
  254.52  605 150.47 37 11 52  2.41 2.30 2.35
  409.59  391 156.26 25  6 69  2.46 2.31 2.35
```
- KB/t: kilobytes per transfer
- tps: transfers per second
- MB/s: megabytes per second
- us: % of cpu in user mode
- sy: % of cpu in system mode
- id: % of cpu in idle mode
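The first three columns are related: throughput in MB/s is roughly KB-per-transfer times transfers-per-second, divided by 1024. A quick sanity check against the first sample row of the output above:

```python
# Sanity-check the relationship between the iostat columns:
# MB/s ≈ KB/t × tps / 1024.
# Values are taken from the first sample row of the output above.
kb_per_transfer = 261.14
transfers_per_second = 372

mb_per_second = kb_per_transfer * transfers_per_second / 1024
print(round(mb_per_second, 2))  # 94.87, close to the reported 94.88 MB/s
```

The small discrepancy comes from iostat rounding KB/t before display.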
The tps number is the I/O Operations Per Second, or IOPS. You can compare this to Wikipedia’s list of average IOPS for different storage devices.
My Mac’s SSD hit a high of 605 IOPS during the file copy, which is 3X higher than the fastest mechanical disks, but nowhere near as fast as some of the enterprise SSDs that you can buy.
One note that I found interesting was this, from vaneyckt’s article on iostat at Coderwall:
> [On Linux versions of iostat], some people put a lot of faith in the %iowait metric as an indicator for I/O performance. However, %iowait is first and foremost a CPU metric that measures the percentage of time the CPU is idle while waiting for an I/O operation to complete. This metric is heavily influenced by both your CPU speed and CPU load and is therefore easily misinterpreted.
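The load-dependence can be illustrated with some toy numbers (the tick counts below are made up, not from a real system): the same amount of time spent waiting on I/O produces a much smaller %iowait when the CPU is otherwise busy, even though the disk is just as slow.

```python
# Toy illustration (made-up tick counts): %iowait is the share of
# total CPU time spent idle *while* an I/O operation is outstanding,
# so adding unrelated CPU work shrinks it without the disk changing.
def iowait_pct(iowait_ticks, busy_ticks, idle_ticks):
    total = iowait_ticks + busy_ticks + idle_ticks
    return 100 * iowait_ticks / total

# Lightly loaded CPU: 20 of 100 ticks waiting on I/O.
print(iowait_pct(20, 30, 50))   # 20.0

# Same 20 ticks of I/O wait, but the CPU is now busy with other work.
print(iowait_pct(20, 180, 0))   # 10.0
```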
For servers, you should be sending your iostat statistics to an internal data collection and graphing service, so you can establish a baseline over time. You can then try to correlate spikes in disk I/O with other data, such as slow web site performance, database queries, etc.
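As a rough sketch of what that collection might look like, here is a hypothetical parser for the data rows of the `iostat -w 5 disk0` output shown earlier. The `send_metric` function is a stand-in for whatever your metrics client actually does (StatsD, Graphite, etc.); here it just prints.

```python
# Hypothetical sketch: turn iostat -w data rows into named metrics
# for shipping to a collection/graphing service. The column order
# matches the `iostat -w 5 disk0` output shown above.
FIELDS = ["kb_per_transfer", "tps", "mb_per_sec",
          "cpu_user", "cpu_sys", "cpu_idle",
          "load_1m", "load_5m", "load_15m"]

def send_metric(name, value):
    # Stand-in for a real metrics client.
    print(f"{name}={value}")

def parse_iostat_row(line):
    values = [float(v) for v in line.split()]
    return dict(zip(FIELDS, values))

sample = "261.14 372 94.88 29 14 58 2.53 2.32 2.35"
for name, value in parse_iostat_row(sample).items():
    send_metric(name, value)
```

In practice you would read these rows from a pipe (`iostat -w 5 disk0 | your-script`), skipping the two header lines.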