(Zaid Qureshi, David Min, and Vikram Sharma Mailthody presenting 4/22/20.)
Storage class memories (SCMs) are considered prime candidates for meeting
applications’ growing memory footprint requirements. An
ideal SCM for tomorrow’s data center has TBs of memory capacity, has a latency
of a few hundred nanoseconds to a couple of microseconds, is energy efficient,
offers high memory-level parallelism, is scalable, and is very cheap. Among the
several types of SCM, 3D XPoint and Flash have shown promising results. Compared to 3D XPoint, Flash offers higher
throughput, thanks to several levels of parallelism, has higher density,
consumes very little power per memory access, and is proven to be scalable and
cost-efficient.
However, studying Flash as part of the main memory system is
challenging because existing simulators and emulators do not provide the needed
flexibility and cannot address practical system-level challenges. In this talk,
we will discuss our attempt at modeling an SSD using an FPGA. We show that CPUs
offer very little memory-level parallelism over PCIe and are inefficient at
exploiting the massive parallelism offered by these emerging NVM devices. To
increase the memory-level parallelism, we connect a GPU to the FPGA. We shall
then discuss what we learned and several application- and system-level
challenges we encountered.