CORGi @ CMU

Computer Organization Research Group led by Prof. Nathan Beckmann

Hardware/Software Co-Designed Vector Dataflow Architectures for Energy-Minimal AI/ML

This project is building hardware and software that minimize energy for AI/ML workloads, leveraging our work on vector-dataflow execution and recent work in coarse-grain reconfigurable arrays (CGRAs). The hardware architecture will adapt to maximize performance or efficiency, depending on energy availability. The software will minimize circuit-level switching activity to maximize efficiency.

CGRAs are well suited to energy-minimal computing because they operate in a spatial, vector-dataflow fashion. Vector-dataflow execution minimizes energy by amortizing instruction fetch across many operations (vector) and by routing values directly between instructions (dataflow). Prior CGRAs achieve high performance, but at high power and with an unfamiliar programming interface. In contrast, our compiler maps a familiar vector ISA (the RISC-V V extension) to CGRA bitstreams.
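The amortization argument above can be illustrated with a toy energy model. This is only a sketch: the function names, the per-event energy costs, and the kernel shape are hypothetical values chosen for illustration, not measurements from this project or its papers.

```python
def scalar_energy(n_elems, n_instrs, e_fetch, e_op, e_regfile):
    """Toy model of a scalar core: every instruction is fetched once per
    element, and values pass between instructions through the register file."""
    return n_elems * n_instrs * (e_fetch + e_op + e_regfile)


def vector_dataflow_energy(n_elems, n_instrs, e_fetch, e_op, e_route, vector_len):
    """Toy model of vector-dataflow execution: each instruction is fetched
    once per batch of `vector_len` elements, and values are routed directly
    between instructions instead of going through a register file."""
    batches = -(-n_elems // vector_len)  # ceiling division
    fetch = batches * n_instrs * e_fetch
    compute = n_elems * n_instrs * (e_op + e_route)
    return fetch + compute


# Hypothetical costs in arbitrary energy units (assumptions, not measured data).
E_FETCH, E_OP, E_REGFILE, E_ROUTE = 10, 2, 4, 1

scalar = scalar_energy(64, 4, E_FETCH, E_OP, E_REGFILE)
vector = vector_dataflow_energy(64, 4, E_FETCH, E_OP, E_ROUTE, vector_len=16)
print(scalar, vector)  # 4096 928
```

In this model the fetch term shrinks by a factor of the vector length, while the per-element compute term shrinks only as far as direct routing is cheaper than a register-file access. Real designs have further costs (configuration, memory, control) that the sketch ignores.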

Energy-harvesting computing opens many opportunities for both the architecture and the compiler. When harvested power is abundant, processors should maximize performance to minimize computation time; but when harvested power is weak, processors should focus on energy efficiency to minimize energy-collection time. A critical challenge is that harvestable power varies, so no single architectural operating point is optimal. Prior work has focused on surviving power failures and has largely ignored variable input power.
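The trade-off between computation time and energy-collection time can be sketched with a simple model. The sketch assumes an ideal energy buffer, so a task finishes as soon as both enough time has passed to compute it and enough energy has been harvested to pay for it. The operating points, their names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """A hypothetical hardware configuration (not taken from the project)."""
    name: str
    ops_per_second: float
    watts: float

    @property
    def joules_per_op(self):
        return self.watts / self.ops_per_second


def completion_time(point, ops, harvested_watts):
    """Time to finish `ops` operations under the ideal-buffer assumption:
    the larger of pure computation time and energy-collection time."""
    compute_time = ops / point.ops_per_second
    collect_time = ops * point.joules_per_op / harvested_watts
    return max(compute_time, collect_time)


def best_point(points, ops, harvested_watts):
    """Pick the operating point that finishes the task soonest."""
    return min(points, key=lambda p: completion_time(p, ops, harvested_watts))


POINTS = [
    OperatingPoint("fast", ops_per_second=1e6, watts=1e-3),       # 1 nJ/op
    OperatingPoint("efficient", ops_per_second=2e5, watts=1e-4),  # 0.5 nJ/op
]

for harvested in (1e-3, 1e-5):
    print(harvested, best_point(POINTS, ops=1e6, harvested_watts=harvested).name)
```

With these made-up numbers, the fast point wins when 1 mW is harvested because computation time dominates, and the efficient point wins at 10 µW because energy-collection time dominates. This matches the observation that no single operating point is optimal across input power levels.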

Finally, in ultra-low-power (ULP) systems, leakage is negligible and energy consumption is largely determined by circuit-level switching, which current simulators and compilers do not account for. Ignoring switching is acceptable for high-performance, high-power designs that heavily multiplex large structures, but it is insufficient for energy-minimal designs where switching of small structures is a significant concern. We observe that there are significant opportunities higher in the stack for compilers and library developers to reduce circuit-level switching. The key insight is that, for given input data, the order in which operations are performed (i.e., the instruction schedule) has a major impact on switching activity. We will develop compiler optimizations that schedule memory accesses and lay out data to minimize switching activity in control-path (address signals) and data-path (pipeline registers) elements.
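To make the effect of ordering on switching concrete, the sketch below counts bit flips (Hamming distance) on a register as it holds a sequence of values, then reorders the sequence with a greedy nearest-neighbor heuristic. This is an illustration only and not the project's compiler optimization: the function names are hypothetical, the heuristic is not optimal, and reordering like this is only legal when the operation consuming the values is order-independent (for example, an integer sum).

```python
def hamming(a, b, width=8):
    """Number of bit positions in which two `width`-bit values differ."""
    mask = (1 << width) - 1
    return bin((a ^ b) & mask).count("1")


def toggles(values, width=8):
    """Total bit flips on a register that holds each value in turn."""
    return sum(hamming(a, b, width) for a, b in zip(values, values[1:]))


def greedy_low_toggle_order(values, width=8):
    """Keep the first value, then repeatedly pick the remaining value
    closest in Hamming distance to the last one (greedy heuristic)."""
    remaining = list(values)
    if not remaining:
        return []
    order = [remaining.pop(0)]
    while remaining:
        last = order[-1]
        i = min(range(len(remaining)),
                key=lambda j: hamming(last, remaining[j], width))
        order.append(remaining.pop(i))
    return order


data = [0x00, 0xFF, 0x01, 0xFE, 0x03, 0xFC]
reordered = greedy_low_toggle_order(data)

print(toggles(data))       # 38
print(toggles(reordered))  # 10
```

The same values, visited in a different order, flip far fewer bits. The same counting applies to address signals, which is why the order of memory accesses and the layout of data both matter.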

See the SRC Research Catalog.

Sponsor

Projects

Energy-Minimal Programmable Architectures for Intelligence Beyond the Edge

Publications

Pipestitch: An energy-minimal dataflow architecture with lightweight threads

Nathan Serafin, Souradip Ghosh, Harsh Desai, Nathan Beckmann, Brandon Lucia. MICRO 2023.

RipTide: A programmable, energy-minimal dataflow compiler and architecture

Graham Gobieski, Souradip Ghosh, Marijn Heule, Todd Mowry, Tony Nowatzki, Nathan Beckmann, Brandon Lucia. MICRO 2022.

SNAFU: An Ultra-Low-Power, Energy-Minimal CGRA-Generation Framework and Architecture

Graham Gobieski, Oguz Atli, Ken Mai, Brandon Lucia, Nathan Beckmann. ISCA 2021.