Monday, October 25, 2021

DRAM-CAM: Extending Sieve for General-Purpose Exact Pattern Matching

(Lingxi Wu, UVA, presenting on Wednesday, October 27, 2021 at 1:00 PM & 7:00 PM ET.) 

Exact pattern matching is a widely used kernel in many application domains. A prominent example is k-mer matching in bioinformatics, where input DNA sequences of size k are exactly matched against a library of reference DNA patterns for genome classification. Previously, we proposed three DRAM-based in-situ accelerator designs, dubbed Sieve, to alleviate the bottleneck stage of k-mer matching in genomics.
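
As a rough software illustration of the k-mer matching kernel described above (not the Sieve hardware design), the sketch below slides a window of size k over a read and looks each k-mer up in a reference table. The names `kmers`, `classify`, and `REFERENCE`, and the toy genome labels, are assumptions made for this example.

```python
from collections import Counter

def kmers(sequence, k):
    """Yield every length-k substring (k-mer) of a sequence."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]

def classify(read, reference, k):
    """Count, per label, the k-mers of `read` that exactly match a reference entry."""
    hits = Counter()
    for kmer in kmers(read, k):
        label = reference.get(kmer)  # exact match only: no mismatches tolerated
        if label is not None:
            hits[label] += 1
    return hits

# Hypothetical toy reference library mapping k-mer -> genome label.
REFERENCE = {"ACGT": "genome_A", "CGTA": "genome_A", "TTGA": "genome_B"}

votes = classify("ACGTAC", REFERENCE, k=4)
```

Each lookup is independent of the others, which is one reason this kind of workload is a natural candidate for in-memory acceleration.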


We notice that k-mer matching shares many similarities with other workloads that rely heavily on exact pattern matching (e.g., text processing and data mining). We therefore extend Sieve with several cost-effective modifications so that it can accommodate applications beyond bioinformatics. This enhanced architecture is named Sieve ++.

We show that Sieve ++, as an exact pattern matching accelerator, offers significant advantages over traditional architectures. Sieve ++ outperforms a CPU by two orders of magnitude on five of the six selected benchmarks, which represent a broad set of real-world exact pattern matching use cases.

Monday, October 18, 2021

Ayudante: A Deep Reinforcement Learning Approach to Assist Persistent Memory Programming

(Hanxian Huang, UCSD, presenting on Wednesday, October 20, 2021 at 1:00pm and 7:00pm ET) 

Programming persistent memory (PM) requires non-trivial effort to write code that adopts new PM-aware libraries and APIs. In addition, PM code written by non-experts can be error-prone. To ease the burden on PM programmers, we propose Ayudante, a deep reinforcement learning (RL)-based PM programming assistant framework consisting of two key components: a deep RL-based PM code generator and a code refining pipeline. Given a piece of C, C++, or Java source code developed for conventional volatile memory systems, our code generator automatically generates the corresponding PM code and checks its data persistence. The code refining pipeline parses the generated code to provide a report for further program testing and performance optimization. Our evaluation on an Intel server equipped with Optane DC PM demonstrates that both microbenchmark programs and a key-value store application generated by Ayudante pass PMDK checkers. Performance evaluation on the microbenchmarks shows that the generated code achieves speedup and memory access performance comparable to those of PMDK code examples.

Friday, October 8, 2021

Reticle: A Virtual Machine for Programming Modern FPGAs

(Luis Vega presenting on Wednesday, October 13, 2021 at 1:00 p.m. and 7:00 p.m. ET)

Modern field-programmable gate arrays (FPGAs) have recently powered high-profile efficiency gains in systems from datacenters to embedded devices by offering ensembles of heterogeneous, reconfigurable hardware units. Programming stacks for FPGAs, however, are stuck in the past; they are based on traditional hardware languages, which were appropriate when FPGAs were simple, homogeneous fabrics of basic programmable primitives. We describe Reticle, a new low-level abstraction for FPGA programming that, unlike existing languages, explicitly represents the special-purpose units available on a particular FPGA device. Reticle has two levels: a portable intermediate language and a target-specific assembly language. We show how to use a standard instruction selection approach to lower intermediate programs to assembly programs, which can be both faster and more effective than the complex metaheuristics that existing FPGA toolchains use. We use Reticle to implement linear algebra operators and coroutines and find that Reticle compilation runs up to 100 times faster than current approaches while producing comparable or better run-time and utilization.
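
To give a feel for what instruction selection means here, the following is a minimal sketch of tree-pattern matching that prefers a larger fused pattern over generic ones. It is not Reticle's actual language or compiler: the tuple-based expression format and the instruction names `muladd_dsp`, `add_lut`, and `mul_lut` are hypothetical, chosen only to suggest mapping onto a special-purpose unit versus general-purpose logic.

```python
# Expressions are nested tuples such as ("add", a, b) or ("mul", a, b);
# a plain string is an input variable.

def select(expr, out):
    """Lower an expression tree to assembly-like text appended to `out`.

    Returns the name holding the result of `expr`.
    """
    if isinstance(expr, str):  # leaf: already a named value
        return expr
    op, lhs, rhs = expr
    # Try the largest pattern first: add(mul(a, b), c) fits one fused unit.
    if op == "add" and isinstance(lhs, tuple) and lhs[0] == "mul":
        a = select(lhs[1], out)
        b = select(lhs[2], out)
        c = select(rhs, out)
        dst = f"t{len(out)}"
        out.append(f"{dst} = muladd_dsp {a}, {b}, {c}")  # hypothetical instruction
        return dst
    # Fallback: one generic instruction per operator.
    a = select(lhs, out)
    b = select(rhs, out)
    dst = f"t{len(out)}"
    out.append(f"{dst} = {op}_lut {a}, {b}")  # hypothetical instruction
    return dst

program = []
result = select(("add", ("mul", "x", "y"), "z"), program)
```

In this sketch the multiply-add tree becomes a single fused instruction, while a tree that does not match the fused pattern falls back to one instruction per operator.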