Tuesday, August 11, 2020

Acceleration of Bioinformatics Workloads

(Cameron Martino, UCSD, presenting on 8/12/20)

Humans are host to a unique set of trillions of microbes that encode 99% of the genetic function found in the body. The sequencing of microbial genetic material has led to a revolution in the ability to profile the microbial communities living in and on us all. These microbial profiles have been recognized as effective biomarkers in fields ranging from cancer to forensics. Despite these advances, the ability to apply microbial profiling at the scale and speed that many applications require has lagged behind sequencing technology. This is often due to the cost, in both time and compute power, of processing these large datasets. Here, we describe a 10-fold acceleration of processing pipelines that also improves processing accuracy. We then describe a GPU implementation of UniFrac, a widely used metric for evaluating microbial community profiles, which reduces run time from hours to minutes. Finally, we discuss the impact of immediately applying these improvements to the current COVID-19 pandemic, highlighting the importance of accelerating bioinformatics workloads.