Tuesday, March 12, 2019

Persistence Parallelism Optimization: A Holistic Approach from Memory Bus to RDMA Network

Xing Hu from the University of California, Santa Barbara
Presenting Wednesday, March 13th, at 2:00 PM CST
Emerging non-volatile memories (NVMs), such as phase change memory (PCM) and Resistive RAM (ReRAM), combine fast byte-addressability with data persistence, which benefits data services such as file systems and databases. To support data persistence, a persistent memory system must enforce ordering among write requests. The data path of a persistent request consists of three segments: through the cache hierarchy to the memory controller, across the bus from the memory controller to the memory devices, and over the network from a remote node to a local node. Prior work has contributed significantly to improving persistence parallelism in the first segment of this data path. However, we observe that the memory bus and the Remote Direct Memory Access (RDMA) network remain severely under-utilized, because the persistence parallelism available in these two segments is not fully exploited during ordering.
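
To illustrate the ordering requirement the abstract refers to, here is a minimal sketch (my own example, not material from the talk) of a strict per-update persist pattern on x86: each durable store is followed by a cache-line write-back and a fence, which serializes the write-backs and leaves little parallelism for the memory bus to exploit. The undo-log layout and function names are illustrative assumptions; it assumes CLWB support and compiles with gcc -mclwb persist_order.c.

/* persist_order.c - sketch of strictly ordered persistent updates. */
#include <immintrin.h>
#include <stdint.h>

/* Write one cache line back toward persistent memory. */
static inline void flush_line(const void *addr) {
    _mm_clwb((void *)addr);
}

/* Fence: earlier write-backs must complete before later stores proceed. */
static inline void persist_barrier(void) {
    _mm_sfence();
}

struct log_entry { uint64_t addr; uint64_t old_val; };

/* Hypothetical undo-logged update of one persistent 64-bit word:
 * the log entry must be durable BEFORE the in-place data update. */
void persistent_update(uint64_t *pmem_word, uint64_t new_val,
                       struct log_entry *pmem_log) {
    pmem_log->addr    = (uint64_t)(uintptr_t)pmem_word;
    pmem_log->old_val = *pmem_word;
    flush_line(pmem_log);
    persist_barrier();        /* log durable before data update */

    *pmem_word = new_val;
    flush_line(pmem_word);
    persist_barrier();        /* data durable before the log may be reclaimed */
}

int main(void) {
    static uint64_t word;               /* DRAM stands in for NVM here */
    static struct log_entry log;
    persistent_update(&word, 42, &log);
    return 0;
}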

We propose an architecture that further improves persistence parallelism on the memory bus and over the RDMA network. First, we exploit inter-thread persistence parallelism for barrier epoch management, achieving better bank-level parallelism (BLP). Second, we enable intra-thread persistence parallelism for remote requests over the RDMA network through buffered strict persistence. With these features, the architecture efficiently supports persistence across all three segments of the write data path. Experimental results show that for local applications, the proposed mechanism achieves a 1.3× performance improvement over the original buffered persistence work, and for remote applications serviced through the RDMA network it achieves a 1.93× improvement.
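
For contrast with the strictly ordered pattern above, the following sketch (again my own illustration under assumptions, not the proposed hardware mechanism) shows an epoch-style software pattern: write-backs within one epoch are issued with no intervening fences, so the memory system can drain them to different banks concurrently, and ordering is enforced only at the epoch boundary. It uses the same x86 intrinsics and compiles with gcc -mclwb epoch_flush.c.

/* epoch_flush.c - sketch of epoch-batched persistence. */
#include <immintrin.h>
#include <stdint.h>
#include <stddef.h>

#define LINE 64

/* Flush every cache line of [buf, buf+len) back-to-back, with no fences,
 * so the write-backs can overlap on the memory bus. */
static void flush_range(const void *buf, size_t len) {
    const char *p = (const char *)buf;
    for (size_t off = 0; off < len; off += LINE)
        _mm_clwb((void *)(p + off));
}

/* One fence per epoch instead of one per store. */
static inline void epoch_barrier(void) {
    _mm_sfence();
}

int main(void) {
    static uint64_t region[1024];       /* DRAM stands in for NVM here */

    /* Epoch 1: many independent updates, flushed without mutual ordering. */
    for (size_t i = 0; i < 1024; i++)
        region[i] = i;
    flush_range(region, sizeof region);
    epoch_barrier();                    /* ordering only at the epoch edge */

    /* Epoch 2 is ordered after epoch 1 as a whole, not store by store. */
    region[0] = 1;
    flush_range(&region[0], LINE);
    epoch_barrier();
    return 0;
}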