Sunday, January 27, 2019

PIMProf: A Performance Profiler for Processing-in-Memory Architectures


Monday, 1/28/19 at 2:00PM ET – Task 1.5 (Evaluation Through Architectural Simulation and Prototyping)
Processing-in-memory (PIM) architectures have drawn increasing research interest as a way to mitigate the data movement bottleneck in current DRAM-based architectures, and a variety of them have been proposed to accelerate data-intensive workloads. However, for a given workload, it is difficult to determine which parts of a program should be offloaded to a given PIM architecture for the best performance, and how much performance gain is possible. We propose PIMProf, a tool that combines static and runtime analysis to automatically detect PIM offloading candidates and estimate the resulting speedup of the program. Our key ideas are as follows. First, PIMProf uses static analysis to capture the dependencies between computation and data access, and constructs both a control flow graph and a data dependency graph of the program. Second, PIMProf profiles the computation cost and memory access cost of the program, and attributes these costs to the nodes and edges of the graph. Finally, we show how to formalize the PIM offloading problem as a cost minimization problem on the weighted graph.
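
To make the last step concrete, here is a minimal sketch of one way an offloading decision can be posed as cost minimization on a weighted graph. It is an illustration only, not PIMProf's actual formulation, which the abstract does not specify. The assumptions are: each node has a cost on the CPU and a cost on PIM, each edge has a weight that is charged (scaled by a hypothetical switch_cost) only when its two endpoints are placed on different sides, and the graph is small enough for exhaustive search. The function and parameter names are made up for this example.

```python
from itertools import product


def best_offload(cpu_cost, pim_cost, edges, switch_cost):
    """Exhaustively search CPU/PIM placements and return (min_cost, placement).

    cpu_cost, pim_cost: dict mapping node name -> cost on that side (assumed inputs).
    edges: list of (u, v, weight) tuples; weight is charged only if u and v
           end up on different sides (an assumed model of data movement).
    switch_cost: hypothetical multiplier applied to each crossing edge weight.
    """
    nodes = sorted(cpu_cost)
    best = None
    # Try every assignment of nodes to "CPU" or "PIM" (2^n candidates).
    for choice in product(("CPU", "PIM"), repeat=len(nodes)):
        place = dict(zip(nodes, choice))
        # Node term: execution cost on the chosen side.
        cost = sum(cpu_cost[n] if place[n] == "CPU" else pim_cost[n] for n in nodes)
        # Edge term: penalty for every edge that crosses the CPU/PIM boundary.
        cost += sum(w * switch_cost for u, v, w in edges if place[u] != place[v])
        if best is None or cost < best[0]:
            best = (cost, place)
    return best


# Toy example: B is memory-bound and much cheaper on PIM; A and C are not.
cost, place = best_offload(
    cpu_cost={"A": 10, "B": 40, "C": 10},
    pim_cost={"A": 30, "B": 10, "C": 30},
    edges=[("A", "B", 1), ("B", "C", 1)],
    switch_cost=5,
)
print(cost, place)
```

In this toy instance, keeping everything on the CPU costs 60, while offloading only B costs 40 even after paying for the two crossing edges, so B is the only PIM candidate. Exhaustive search is exponential in the number of nodes and is used here purely for clarity; a real tool would need a scalable solver.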