Monday, 1/28/19 at 2:00PM ET – Task 1.5 (Evaluation Through Architectural Simulation and Prototyping)

Processing-in-memory (PIM) architectures have drawn
increasing research interest as a mitigation for the data movement bottleneck
in current DRAM-based architectures, and a variety of them have been
proposed for accelerating various data-intensive workloads. However, for a
given workload, it is difficult to determine which parts of a program should be
offloaded to a given PIM architecture for the best performance, and how much
performance gain is possible. We propose PIMProf, a
tool that uses a combination of static and runtime analysis to automatically
identify PIM offloading candidates and estimate the speedup of the program. Our key
ideas are as follows: First, PIMProf uses
static analysis to capture the dependencies between computation and data access,
and constructs both a control flow graph and a data dependency graph of the
program. Second, PIMProf
profiles the computation cost and memory access cost of the program, and
attributes the costs to the nodes and edges of the graph. Finally, we show how
to formulate the PIM offloading problem as a cost minimization problem on the
weighted graph.
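
To make the cost minimization idea concrete, the following is a minimal sketch of one possible formulation, not PIMProf's actual algorithm. It assumes each graph node carries a hypothetical execution cost on the CPU and on the PIM side, and each edge carries a data movement cost that is paid only when its two endpoints are placed on different sides. The function name, the cost model, and all numbers are illustrative assumptions; the exhaustive search is only practical for very small graphs.

```python
from itertools import product


def best_offloading(cpu_cost, pim_cost, edge_cost):
    """Exhaustively search CPU/PIM placements for the cheapest one.

    cpu_cost, pim_cost: dict mapping node -> execution cost on that side.
    edge_cost: dict mapping (u, v) -> data movement cost, paid only when
               u and v are placed on different sides (assumed cost model).
    """
    nodes = sorted(cpu_cost)
    best_total, best_placement = float("inf"), None
    for choice in product(("CPU", "PIM"), repeat=len(nodes)):
        placement = dict(zip(nodes, choice))
        # Node weights: computation cost on the chosen side.
        total = sum(
            cpu_cost[n] if placement[n] == "CPU" else pim_cost[n]
            for n in nodes
        )
        # Edge weights: data movement across the CPU/PIM boundary.
        total += sum(
            cost
            for (u, v), cost in edge_cost.items()
            if placement[u] != placement[v]
        )
        if total < best_total:
            best_total, best_placement = total, placement
    return best_total, best_placement


# Hypothetical three-node program: B is memory-bound, A and C are not.
cpu_cost = {"A": 10, "B": 50, "C": 12}
pim_cost = {"A": 30, "B": 15, "C": 40}
edge_cost = {("A", "B"): 5, ("B", "C"): 8}

total, placement = best_offloading(cpu_cost, pim_cost, edge_cost)
baseline = sum(cpu_cost.values())  # cost of running everything on the CPU
print(placement, total, baseline / total)
```

With these made-up costs, the cheapest placement offloads only B, and the ratio of the all-CPU cost to the minimized cost gives the estimated speedup. Under this assumed two-sided cost model the same problem can also be solved as a minimum cut on the graph, which scales far better than enumeration.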