Decades of exponential growth in computing power have yielded transformative benefits for research, industry, and society. To sustain this growth as the benefits of Dennard scaling fade, computer architects must design new systems with orders-of-magnitude better performance and energy efficiency. Unfortunately, computer systems are increasingly limited by the rising cost of accessing data, which often dominates the time and energy spent on actual computation. This project will design and evaluate a new computer system that makes data accesses much faster and cheaper by dynamically specializing the memory hierarchy for each application, sustaining the growth in computing power that researchers, industry, and society have come to rely upon. It will particularly accelerate important applications such as machine learning, social networking, and robotics, whose core computations remain beyond the capability of current chip designs. The project will also develop new curricula on memory hierarchies and specialized architectures, involve high school, undergraduate, and graduate students in research, and improve diversity through research workshops for undergraduate women and a summer internship program for under-represented minorities.
Current computer systems incur significant unnecessary data movement because their memory hierarchy is fixed in hardware and hidden from software, leaving applications no control over how their data are managed. This project will develop a new hardware-software co-design that treats data as a first-class citizen alongside compute. Applications will express their performance goals (e.g., throughput, quality of service, and/or security) and, with compiler support, break their computation into tasks with associated data. The operating system (OS) and hardware will collaboratively co-schedule tasks and data so that applications’ goals are met with minimal data movement. The OS scheduler will account for the diverse performance characteristics of each core and accelerator in a heterogeneous system-on-chip (SoC), and the hardware will incorporate energy-efficient cores throughout the memory hierarchy to enable near-data computation while maximally exploiting data locality. This hardware-software co-design provides an extensible platform for further research on reducing data movement, and it complements industry investment in heterogeneous SoCs by making it inexpensive to integrate new accelerators.
Rajat Kateja, Nathan Beckmann, Greg Ganger. ISCA 2020.
Anurag Mukkara, Nathan Beckmann, Daniel Sanchez. MICRO 2019.