CRISP Blog: A little latency goes a long way: Memory latency and its impact on CPU-driven inference performance

Thursday, January 31, 2019

(Ameen Akel of Micron is presenting Mon. Feb 4, 2019)

Memory media latency is often a hot topic: systems architects seek the lowest possible latencies, which in turn drives memory manufacturers to architect memories for minimum access latency. While some applications may benefit from lower media latencies, most applications exhibit very little sensitivity to them. We aim to dispel the memory latency myth: the performance of workloads like DNN inference, across a wide variety of models, is not strongly correlated with memory media latency. Freeing memory companies from the shackles of memory latency enables favorable memory media architecture tradeoffs that systems architects may not expect.