Enhancing GPU Energy Efficiency with an Asymmetric Streaming Multiprocessor Architecture
Supachai Thongsuk, Prabhas Chongstitvatana
Abstract
Graphics Processing Units (GPUs) significantly enhance computational
performance through parallel processing but often suffer from energy
inefficiency due to resource underutilization, particularly in
memory-bound workloads. Conventional GPUs are typically designed with
symmetric streaming multiprocessors operating under a unified frequency
domain, which limits their ability to adapt to diverse workload
requirements. To address this limitation, this paper proposes an
Asymmetric Streaming Multiprocessor (ASM) architecture that partitions
streaming multiprocessors into high-frequency and low-frequency
clusters. A neural network-based classifier analyzes static Parallel
Thread Execution (PTX) code to predict the most suitable cluster for each
application at compile time. This approach eliminates runtime profiling
overhead and enables efficient workload-aware mapping. Experimental
evaluations on standard benchmark applications demonstrate that ASM
reduces execution time by 49%, lowers power consumption by 39%, and
improves energy efficiency by 124% compared with conventional Dynamic
Voltage and Frequency Scaling. For comparison, prior work (SSAGA)
achieved about a 20% improvement in energy efficiency by customizing
streaming multiprocessors for different voltage–frequency domains, with
further gains from workload-aware scheduling and power gating. These
findings indicate that the proposed ASM architecture is a practical and
scalable approach to enhancing GPU performance and energy efficiency.
Keywords: asymmetric streaming multiprocessor, GPU energy efficiency, static source code analysis, machine learning
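
To make the idea of compile-time mapping from static PTX more concrete, the following minimal sketch counts instruction classes in PTX text and maps the result to a cluster. It is an illustration under stated assumptions, not the classifier evaluated in this paper: the opcode groupings, the function names extract_features and predict_cluster, the 0.3 threshold, and the rule that memory-heavy kernels go to the low-frequency cluster are all hypothetical, and a simple threshold stands in for the neural network.

```python
import re
from collections import Counter

# Hypothetical opcode groupings (assumption): the feature set used by the
# paper's classifier is not specified in the abstract.
MEMORY_PREFIXES = ("ld.global", "st.global", "ld.local", "st.local", "atom", "red")
COMPUTE_PREFIXES = ("add", "sub", "mul", "mad", "fma", "div",
                    "sqrt", "rcp", "sin", "cos", "ex2", "lg2")

# Optional guard predicate (e.g. "@%p1" or "@!%p1"), then a lowercase opcode
# with dotted suffixes. Directives (".reg"), labels, and braces do not match.
INSTR_RE = re.compile(r"^\s*(?:@!?%\w+\s+)?([a-z][\w.]*)(?=[\s;])")


def extract_features(ptx_text):
    """Return static instruction-mix ratios for a PTX listing."""
    counts = Counter()
    for line in ptx_text.splitlines():
        line = line.split("//", 1)[0]  # drop trailing line comments
        match = INSTR_RE.match(line)
        if not match:
            continue
        opcode = match.group(1)
        if opcode.startswith(MEMORY_PREFIXES):
            counts["memory"] += 1
        elif opcode.startswith(COMPUTE_PREFIXES):
            counts["compute"] += 1
        else:
            counts["other"] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return {
        "memory_ratio": counts["memory"] / total,
        "compute_ratio": counts["compute"] / total,
    }


def predict_cluster(features, memory_threshold=0.3):
    """Threshold rule standing in for the neural network (assumption):
    memory-heavy kernels are sent to the low-frequency cluster."""
    if features["memory_ratio"] >= memory_threshold:
        return "low_frequency"
    return "high_frequency"
```

Because both functions operate only on the PTX text, the decision can be made before the kernel is launched, which is the property that removes runtime profiling from the mapping step.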