Storage performance is core to application performance and data access.  When we talk about storage performance, we typically talk about IOPS and throughput, but there is a third variable: latency.  Latency measures the time it takes for storage to respond to a request or instruction issued by the CPU.  The lowest latency is achieved by delivering data from memory at the speed of memory, which is typically on the order of 1us.  If data could be delivered at that latency, we would have a highly efficient server architecture, but a number of factors prevent our ability to see latency at that level.

  • Application latency – the inherent architecture of the application may make it impossible to achieve microsecond latency; the application's own operations add to the overall latency.
  • Local file system – since DRAM is volatile, data that requires persistence must be committed to persistent media before an acknowledgement is returned.  The local file system is responsible for taking blocks off DRAM and copying them to other media on the I/O bus.  A common Linux file system such as XFS or EXT4 adds as much as 250us.  Even with the replacement of DRAM with NVDIMM (persistent memory), latencies remain at a minimum of 250us.  Though 250us may seem like nothing, in a typical database environment the elimination of that 250us alone would increase IOPS and throughput per core by 350% and 410% respectively (a rough arithmetic sketch follows this list).
  • Network – when data travels over the network, whether FC or IP, there are added latencies.  Most SSD/Flash arrays deliver performance at 1ms or more of latency.  If the SSD/Flash sits on the PCIe bus, that latency may be reduced to a range of 500 to 800 microseconds.  Recently, a new protocol has been developed to allow shared storage (SAN) to deliver latency comparable to storage on the PCIe bus: the NVMe standard, extended across the network as NVMe over Fabrics (NVMe-oF).
  • Drive media – Flash has a lower latency profile than HDD.  This is not surprising, since an HDD is a mechanical device where the speed at which the platters spin correlates to the time it takes to pull data off the drive.  Flash is not a mechanical medium and doesn't have the same delays built in.
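
To make the latency stack concrete, here is a minimal back-of-the-envelope sketch in Python.  The component values are illustrative assumptions drawn from the ranges above, not measurements; it simply sums the contributions and shows that, for a synchronous workload, per-thread IOPS is roughly the inverse of end-to-end latency, which is why shaving 250us of file-system overhead matters so much.

```python
# Back-of-the-envelope latency stack (illustrative numbers, in microseconds).
# A synchronous workload with one outstanding I/O per thread completes
# roughly 1 / latency operations per second per thread.

COMPONENTS_US = {
    "memory":            1,     # DRAM access, the ideal case
    "file_system":       250,   # XFS/EXT4 commit path (figure from the text)
    "network_fabric":    300,   # FC/IP fabric hop (assumed)
    "drive_media_flash": 100,   # flash media service time (assumed)
}

def iops_per_thread(latency_us: float) -> float:
    """IOPS achievable by a single thread issuing one I/O at a time."""
    return 1_000_000 / latency_us

total = sum(COMPONENTS_US.values())
print(f"end-to-end latency: {total} us "
      f"-> {iops_per_thread(total):,.0f} IOPS per thread")

# Remove the file-system overhead (e.g., a persistent-memory direct-access path).
without_fs = total - COMPONENTS_US["file_system"]
print(f"without 250 us file-system cost: {without_fs} us "
      f"-> {iops_per_thread(without_fs):,.0f} IOPS per thread")
```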

Of course, we can’t leave out IOPS and throughput.  IOPS measures how many operations can be performed per second, while throughput is how much data can be transferred through a given pipe.  Depending on the application, one or the other of these metrics will be more relevant.

Applications that stream data sequentially require more bandwidth and are therefore more concerned with throughput.  Throughput is limited by the slowest link in the path: the aggregate bandwidth of the drives in a given system, the controllers, and the network.  Even if you have a system capable of delivering gigabytes of data per second, it still needs the network to carry that data, and there is often an imbalance between the network and system capabilities.  Recently a client expressed a concern that exemplifies this issue.  As a research institution, its labs create a lot of data that is then processed by the investigators.  The challenge they face is that the amount of data being created and moved to a centralized location is much greater than what the network can handle.  As a result, they are unable to transfer data over the wire; some groups use tape, and others don’t move data at all.
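
As a rough illustration of that "slowest link" point, the sketch below is a simplified model with made-up figures: deliverable throughput is taken as the minimum of drive, controller, and network bandwidth.

```python
# Simplified throughput model: the deliverable rate is capped by the
# slowest stage in the path (drives -> controllers -> network).
# All figures are illustrative, in MB/s.

drives_mb_s     = 24 * 500     # 24 drives at ~500 MB/s each
controller_mb_s = 2 * 4_000    # two controllers at ~4 GB/s each
network_mb_s    = 1_250        # a single 10 GbE link (~1.25 GB/s)

deliverable = min(drives_mb_s, controller_mb_s, network_mb_s)
print(f"drives: {drives_mb_s} MB/s, controllers: {controller_mb_s} MB/s, "
      f"network: {network_mb_s} MB/s")
print(f"deliverable throughput: {deliverable} MB/s (network is the bottleneck)")
```

In this made-up configuration the drives and controllers could move several gigabytes per second, but a single 10 GbE link caps what actually reaches the consumers of the data, which is exactly the imbalance the client above ran into.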

IOPS measures the number of operations a drive or a system can perform per second.  We have seen huge gains with the adoption of SSD/Flash.  Where a 15K RPM drive can deliver around 180 IOPS, a flash drive can deliver thousands of IOPS.  About 10-15 years ago, storage administrators were forced to over-provision capacity in order to get enough drives in a RAID set to deliver the required number of IOPS.  As an example: if your application needed 1 TB of capacity and 1,500 IOPS, using 15K drives of 300GB each, an administrator would need 4 drives to reach the required capacity but 9 drives to reach the required IOPS.  Today, capacity and IOPS can be balanced.
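
The drive-count math in that example is simple enough to capture in a few lines.  This sketch uses the figures from the paragraph above and sizes a drive set as the larger of the capacity-driven and IOPS-driven counts (it ignores RAID parity/mirroring overhead for simplicity).

```python
import math

def drives_needed(capacity_gb: float, iops: float,
                  drive_capacity_gb: float, drive_iops: float) -> int:
    """Drives required to satisfy both capacity and IOPS (RAID overhead ignored)."""
    for_capacity = math.ceil(capacity_gb / drive_capacity_gb)
    for_iops = math.ceil(iops / drive_iops)
    return max(for_capacity, for_iops)

# 1 TB and 1,500 IOPS on 300 GB, ~180 IOPS 15K drives -> 9 drives (IOPS-bound)
print(drives_needed(1_000, 1_500, 300, 180))

# The same workload on 1 TB flash drives rated in the tens of thousands of IOPS
# becomes capacity-bound instead: a single drive covers the IOPS requirement.
print(drives_needed(1_000, 1_500, 1_000, 50_000))
```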

Not all applications require microsecond latency, thousands of IOPS, and gigabytes of throughput, but with higher performance, when properly designed, a system can operate at a much higher level of efficiency, both operationally and financially.  Next time we talk about performance, let’s make sure we are clear about which performance we need.
