Improving WebScale Cache Performance through Adaptive Policies

Caches are a critical component of Internet applications. Caching the results of user requests (e.g., database queries or dynamic web content) significantly reduces the latency of subsequent requests and lowers the load on backend servers. However, as datasets have grown exponentially in recent years, caches have grown as well, so that they now consume thousands of datacenter servers. Improving their performance can thus significantly reduce the operating costs of Internet services. Yet caches today are managed using simple caching policies, such as least-recently used (LRU) or random replacement, that leave significant performance on the table.

We propose to borrow ideas from processor caches, where achieving high performance with limited resources has received years of attention, to improve cache performance in Internet services. In particular, our recent work has developed an adaptive, statistical replacement policy for processor caches. The key insight is a novel and intuitive metric that, without any tunable parameters, ranks objects for eviction. Preliminary results are promising: On a real production trace of memcached requests, we more than halve the number of misses, achieving a hit rate of 90% vs.\ just 78% for LRU. An LRU cache would need to be 8X larger to match this performance. Many opportunities remain to support the new features of Internet services (e.g., varying replacement costs) or offer new user features (e.g., differentiated quality-of-service).

Sponsor

Google

Projects

Smart Caching at Datacenter Scale

Publications

Practical Bounds on Offline Caching with Variable Object Sizes [pdf]

Daniel Berger, Nathan Beckmann, Mor Harchol-Balter. SIGMETRICS 2018.

LHD: Improving Cache Hit Rate by Maximizing Hit Density [pdf]

Nathan Beckmann, Haoxian Chen, Asaf Cidon. NSDI 2018.

CORGi @ CMU

Navigate to

Funded by

CMU Affiliates