Datacenter applications at companies such as Google, Facebook, and Microsoft serve enormous workloads spanning petabytes of data. Caching plays a critical part in serving these workloads at high throughput and low cost, but datacenters’ massive scale introduces many new challenges for caching systems. To keep cost low, datacenter caches must combine technologies (DRAM, NVM, flash) that differ in cost and performance and that have their own unique properties, such as the limited write endurance of flash and NVM.
This project aims to build smart, high-performance, and low-cost caching systems for datacenter applications. We are focused on both theory and practice. On the theory side, we extend and apply caching theory to address the new challenges at datacenter scale. On the practical side, we are designing and implementing caching systems that combine lessons from theory with engineering insights to make the best use of diverse technologies.
The image above sketches how LHD (NSDI’18) implements a theoretically grounded eviction policy based on Bayesian inference, in a design inspired by recent high-associativity processor caches that rely on statistical sampling. We are currently building caching systems that make efficient use of flash-based SSDs for very small objects, and we are co-designing caches with the backing storage system to optimize end-to-end cost across the datacenter.
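To make the idea of sampling-based eviction concrete, here is a minimal Python sketch. It is not LHD's implementation: the class name `SampledCache`, its parameters, and especially the `_score` function (a simple hits-per-byte-per-idle-time heuristic) are hypothetical stand-ins chosen for illustration. It only shows the general pattern of ranking a small random sample of cached objects and evicting the lowest-ranked one, rather than maintaining a global ordering.

```python
import random


class SampledCache:
    """Byte-capacity cache that picks eviction victims from a random sample.

    Illustrative only: the scoring rule below is an assumed placeholder,
    not the Bayesian estimate used by LHD.
    """

    def __init__(self, capacity_bytes, sample_size=8, seed=0):
        self.capacity = capacity_bytes
        self.sample_size = sample_size
        self.rng = random.Random(seed)
        self.clock = 0      # logical time, advanced on every get/put
        self.used = 0       # bytes currently stored
        self.entries = {}   # key -> [size, hits, last_access]

    def _score(self, key):
        # Hypothetical value estimate: more hits, smaller size, and a more
        # recent access all make an object more worth keeping.
        size, hits, last_access = self.entries[key]
        age = self.clock - last_access + 1
        return (hits + 1) / (size * age)

    def _evict_one(self):
        # Rank only a few randomly chosen candidates. Copying the keys is
        # O(n) here for simplicity; a real system would sample from an array.
        keys = list(self.entries)
        candidates = self.rng.sample(keys, min(self.sample_size, len(keys)))
        victim = min(candidates, key=self._score)
        self.used -= self.entries.pop(victim)[0]
        return victim

    def get(self, key):
        """Return True on a hit and update the object's statistics."""
        self.clock += 1
        entry = self.entries.get(key)
        if entry is None:
            return False
        entry[1] += 1
        entry[2] = self.clock
        return True

    def put(self, key, size):
        """Insert an object of `size` bytes, evicting until it fits."""
        self.clock += 1
        if size <= 0 or size > self.capacity:
            return False
        if key in self.entries:
            self.used -= self.entries.pop(key)[0]
        while self.used + size > self.capacity:
            self._evict_one()
        self.entries[key] = [size, 0, self.clock]
        self.used += size
        return True
```

For example, after `put("a", 40)`, `put("b", 40)`, and several `get("a")` calls on a 100-byte cache, a further `put("c", 40)` forces one eviction. Because both resident keys fall within the sample, the rarely used `"b"` scores lowest and is removed. Sampling keeps the per-eviction cost independent of how the ranking function is defined, which is what makes it easy to swap in a more principled score.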
Check out our frequent collaborators on the PDL caching project.
Nathan Beckmann, Phillip Gibbons, Bernhard Haeupler, Charles McGuffey. APoCS 2020. (Best Paper.)
Nathan Beckmann, Phillip Gibbons, Bernhard Haeupler, Charles McGuffey. SPAA 2019. (Brief Announcement.)
Daniel Berger, Nathan Beckmann, Mor Harchol-Balter. SIGMETRICS 2018.
Nathan Beckmann, Haoxian Chen, Asaf Cidon. NSDI 2018.
Nathan Beckmann, Daniel Sanchez. HPCA 2017.
Nathan Beckmann, Daniel Sanchez. IEEE CAL 2016.
Nathan Beckmann, Daniel Sanchez. HPCA 2016.
Nathan Beckmann, Daniel Sanchez. HPCA 2015.