Live Optics Basics: Latency Basics
Latency is one of the most important factors to address when designing or optimizing a solution, and reducing it is a primary objective for ensuring optimal application and workload performance.
By definition, latency is the amount of time it takes to complete a task, and it is generally measured in milliseconds.
It should be noted that latency is a measurement that appears in many places in an IT environment; in Optical Prime, however, latency is measured at the host level and reflects the latency of a disk I/O.
The latency of a disk I/O reported by a storage array can differ from what the host reports, as arrays implement latency measurements that are specific to their own storage management operations.
However, as a general rule of thumb, in a well-designed environment the latency seen by the storage array and by the host will be very similar.
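To make the host-level definition concrete, the sketch below times individual disk reads and averages them in milliseconds. This is only a toy illustration, not how Optical Prime collects data: a real collector would not issue its own I/O, and cached reads will look artificially fast here.

```python
import time

def measure_read_latency_ms(path, block_size=4096, reads=100):
    """Time individual disk reads and return the average latency in ms.

    A minimal host-level sketch; cache effects are ignored, so the
    numbers it produces are illustrative only.
    """
    latencies = []
    with open(path, "rb") as f:
        for _ in range(reads):
            start = time.perf_counter()
            data = f.read(block_size)
            # Convert seconds to milliseconds, the unit latency is
            # normally reported in.
            latencies.append((time.perf_counter() - start) * 1000.0)
            if not data:  # wrap around at end of file
                f.seek(0)
    return sum(latencies) / len(latencies)
```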
Note: Optical Prime accurately measures the average latency of a host or hosts with sustained performance greater than 100 IOPS. High latencies during periods of low IOPS should typically be discounted, as they are normally attributable to skew in the latency calculation.
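The note above can be sketched as a filter that discards low-IOPS intervals before averaging. The (iops, latency) sample format and the IOPS-weighted average are assumptions of this sketch, not the actual Optical Prime calculation:

```python
def average_latency_ms(samples, iops_floor=100):
    """Average latency across intervals, discarding low-IOPS samples.

    `samples` is a list of (iops, latency_ms) pairs, one per interval.
    Intervals below `iops_floor` are dropped because their latency
    figures are prone to skew.  Returns None if nothing qualifies.
    """
    valid = [(iops, lat) for iops, lat in samples if iops >= iops_floor]
    if not valid:
        return None  # nothing trustworthy to report
    # Weight each interval's latency by its IOPS so busy intervals
    # dominate, approximating a per-I/O rather than per-interval average.
    total_ios = sum(iops for iops, _ in valid)
    return sum(iops * lat for iops, lat in valid) / total_ios
```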
However, high latency is not always attributable to slow disks or a poorly designed disk subsystem. The source of the latency can be very difficult to diagnose and may stem from multiple design issues.
As a general rule of thumb, you can quickly make sense of latency feedback as follows:
High latency / low queue depth: potential network or server bottleneck
Low latency / low queue depth: under-utilized or properly designed resources
High latency / high queue depth: potential disk bottleneck
Low latency / high queue depth: no particular problem
High latency / low IOPS: very low I/O (below a sustained 100 IOPS) can report false positives and can be ignored
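The rule-of-thumb table above can be expressed as a small classifier. The numeric cut-offs used here (20 ms for "high" latency, a queue depth of 32, the 100 IOPS floor from the note) are illustrative assumptions, not Live Optics defaults:

```python
def diagnose(latency_ms, queue_depth, iops,
             high_latency_ms=20.0, high_queue_depth=32, iops_floor=100):
    """Map latency/queue-depth combinations to a rule-of-thumb verdict.

    Thresholds are illustrative; tune them to the environment.
    """
    if iops < iops_floor:
        # Low sustained IOPS makes the latency figure itself suspect.
        return "low IOPS - latency figure unreliable, ignore"
    high_lat = latency_ms >= high_latency_ms
    high_qd = queue_depth >= high_queue_depth
    if high_lat and high_qd:
        return "potential disk bottleneck"
    if high_lat:
        return "potential network or server bottleneck"
    if high_qd:
        return "no particular problem"
    return "under-utilized or properly designed resources"
```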
Smaller I/O transfer sizes generally demand lower latency values.
Very large I/O transfer sizes can produce higher latency that is still acceptable.
For most applications, the following values are a useful guide:
Latency < 10 ms is very good
Latency < 20 ms is acceptable
Latency > 40 ms is concerning, especially if sustained
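These guideline bands can be captured in a small helper. The text leaves the 20-40 ms range unnamed, so labeling it "watch" is an assumption of this sketch:

```python
def rate_latency(latency_ms):
    """Classify an average latency figure against the guideline bands."""
    if latency_ms < 10.0:
        return "very good"
    if latency_ms < 20.0:
        return "acceptable"
    if latency_ms > 40.0:
        return "concerning, especially if sustained"
    # 20-40 ms is unnamed in the guideline; treated here as a watch zone.
    return "watch"
```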
Read latency will generally be more erratic than write latency, because writes typically benefit from write caching.