Yi-Hsiang Lai presenting on Wed. 11/20/19--
With the pursuit of improving compute performance under strict power constraints, there is an increasing need for deploying applications to heterogeneous hardware architectures with accelerators, such as PIMs and FPGAs. However, although these heterogeneous computing platforms are becoming widely available, they are very difficult to program. As a result, the use of such platforms has been limited to a small subset of programmers with specialized hardware knowledge.
To tackle this challenge, we introduce HeteroCL, a programming infrastructure composed of a Python-based domain-specific language (DSL) and a compilation flow. The HeteroCL DSL provides a clean programming abstraction that decouples algorithm specification from three important types of hardware customization: compute, data types, and memory architectures. HeteroCL further captures the interdependence among these customization techniques, allowing programmers to explore performance/area/accuracy trade-offs in a systematic and productive manner. In addition, our framework produces highly efficient hardware implementations for a variety of popular workloads by targeting spatial architecture templates such as systolic arrays and stencils with dataflow architectures. HeteroCL further incorporates the T2S framework developed by Intel Labs. T2S is an intermediate programming model extended from Halide for high-performance systolic architectures. Similar to HeteroCL, T2S cleanly decouples the temporal definition from the spatial mapping, which enables productive programming and efficient design space exploration.
Experimental results show that HeteroCL allows programmers to explore the design space efficiently in both performance and accuracy by combining different types of hardware customization and targeting spatial architectures, while keeping the algorithm code intact.
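As a rough illustration of what decoupling the algorithm from its customizations can mean, the sketch below uses plain Python only. It is not the HeteroCL API: the names `stencil`, `Customization`, `quantize`, and `run` are hypothetical and chosen for this example. The algorithm is a 3-point stencil; the data type customization (fixed-point fraction bits) and the compute customization (loop tile size) live in a separate object, so they can be varied without touching the algorithm.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


def stencil(a: List[float], i: int) -> float:
    """Algorithm specification only: a 3-point sum with no hardware detail."""
    return a[i - 1] + a[i] + a[i + 1]


@dataclass
class Customization:
    """Hypothetical container for customizations kept apart from the algorithm."""
    frac_bits: Optional[int] = None  # data type: fixed-point fraction bits (None = float)
    tile: int = 1                    # compute: loop tile size


def quantize(x: float, frac_bits: Optional[int]) -> float:
    """Round x to a fixed-point grid with the given number of fraction bits."""
    if frac_bits is None:
        return x
    scale = 1 << frac_bits
    return round(x * scale) / scale


def run(algorithm: Callable[[List[float], int], float],
        data: List[float],
        custom: Customization) -> List[float]:
    """Apply the algorithm to interior points under the given customization."""
    q = [quantize(v, custom.frac_bits) for v in data]
    n = len(q)
    out = [0.0] * n
    # Tiled traversal of the interior indices 1 .. n-2.
    for start in range(1, n - 1, custom.tile):
        for i in range(start, min(start + custom.tile, n - 1)):
            out[i] = quantize(algorithm(q, i), custom.frac_bits)
    return out


data = [0.5, 1.25, 2.0, 3.75, 4.5]
baseline = run(stencil, data, Customization())
tiled = run(stencil, data, Customization(tile=2))
coarse = run(stencil, data, Customization(frac_bits=0, tile=2))
```

In this sketch, changing the tile size leaves the result unchanged, while reducing the fraction bits trades accuracy for a narrower data type; `stencil` itself is the same in all three runs.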