14.09.2023 | Research
A New Profiling Tool for Achieving Energy-Efficient Applications
DSC research associate Christopher Metz has developed a new approach for analysing PTX code. HyPA makes it possible to examine PTX code both statically and dynamically and can acquire metrics beyond those available from static analysis alone. The generated profiles can be used for power and performance prediction of GPU applications.
General-purpose graphics processing units (GPGPUs) are crucial for accelerating computation, and their usage has increased over the last decades. However, GPUs consume a lot of power to achieve high performance levels. Consequently, it is becoming increasingly important to reduce power consumption and build energy-efficient applications.
One strategy for improving energy usage and performance is to optimize the choice of device for the given application. To achieve this goal, one needs information about the candidate devices, e.g. the different GPUs. To predict power consumption and performance, standard metrics are used, which are gathered through code profiling and execution analysis. Academia and industry have proposed various profilers that support developers with code optimization. However, these profilers often require an actual device (e.g. a GPU), and the profiling process is very time-consuming.
Together with Christina Plump (DFKI) and Bernhard J. Berger (Institute of Embedded Systems – Hamburg University of Technology), and under the direction of Rolf Drechsler, DSC research associate Christopher Metz has developed a hybrid Parallel Thread Execution (PTX) Analyser that can analyse PTX code statically and dynamically without having to run it on a real GPGPU. The generated profile can be used for power and performance prediction of GPU applications as well as for better code understanding (you can read more about Christopher’s previous research on GPGPU performance prediction »here). The new hybrid analysis approach (HyPA) is discussed further in the paper “Hybrid PTX Analysis for GPU accelerated CNN inferencing aiding Computer Architecture Design”.
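To give a rough idea of what a purely static pass over PTX can collect, the following minimal sketch counts instructions per base opcode in a small PTX fragment. It is not HyPA's implementation: the function name `count_ptx_opcodes` and the sample kernel are assumptions made up for this illustration, and the parsing handles only the simple line-per-instruction layout shown here.

```python
import re
from collections import Counter

# Hypothetical PTX fragment, written for this example only.
SAMPLE_PTX = """
.visible .entry add_one(.param .u64 p)
{
    .reg .pred %p<2>;
    .reg .b32 %r<4>;
    .reg .b64 %rd<2>;
    ld.param.u64 %rd1, [p];
    mov.u32 %r1, %tid.x;
    setp.ge.s32 %p1, %r1, 32;
    @%p1 bra DONE;
    add.s32 %r2, %r1, 1;
    st.global.u32 [%rd1], %r2;
DONE:
    ret;
}
"""


def count_ptx_opcodes(ptx: str) -> Counter:
    """Count instructions per base opcode (e.g. 'ld', 'add', 'bra')."""
    counts = Counter()
    for raw in ptx.splitlines():
        line = raw.split("//")[0].strip()  # drop trailing comments
        # Skip blanks, directives (.reg, .entry, ...), braces and labels.
        if not line or line.startswith(".") or line in ("{", "}") or line.endswith(":"):
            continue
        # Drop an optional guard predicate such as '@%p1' or '@!%p1'.
        line = re.sub(r"^@!?%\w+\s+", "", line)
        opcode = line.split()[0].rstrip(";")
        counts[opcode.split(".")[0]] += 1  # keep only the base opcode
    return counts


if __name__ == "__main__":
    for opcode, n in sorted(count_ptx_opcodes(SAMPLE_PTX).items()):
        print(f"{opcode}: {n}")
```

Counts like these are available without executing anything. How often the `bra` instruction is actually taken, however, depends on runtime values, which is where a dynamic component becomes necessary.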
The hybrid PTX Analyser was evaluated by running the hybrid analysis on several different CNNs and comparing the results to those of classic profilers. The results showed that HyPA enables significantly faster analysis, achieving speedups of up to 536% compared to the nvprof profiler. Furthermore, the hybrid approach can gather metrics beyond what static code analysis can provide, including branch efficiency. It also achieves a faster execution time than profiling the application on an actual device. To support developers and system designers in further research and development, an open-source implementation of HyPA is provided.
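Branch efficiency is a good example of a metric that static analysis alone cannot provide, because it depends on the values each thread sees at runtime. The sketch below assumes the definition commonly given for nvprof's metric of that name (non-divergent branches divided by total branches, as a percentage) and emulates one branch across warps of 32 threads. The function name and the thread-indexed predicate are assumptions for illustration and do not reflect how HyPA computes the metric.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def branch_efficiency(predicate, num_threads, warp_size=WARP_SIZE):
    """Percentage of warp-level branch executions in which all threads agree.

    `predicate` maps a thread index to the branch condition of that thread.
    A warp diverges when its threads evaluate the condition differently.
    """
    total = 0
    divergent = 0
    for start in range(0, num_threads, warp_size):
        end = min(start + warp_size, num_threads)
        outcomes = {bool(predicate(tid)) for tid in range(start, end)}
        total += 1
        if len(outcomes) > 1:
            divergent += 1
    if total == 0:
        return 100.0  # no branch executed, nothing diverged
    return 100.0 * (total - divergent) / total


if __name__ == "__main__":
    # Two warps; the condition splits the second warp in the middle.
    print(branch_efficiency(lambda tid: tid < 48, num_threads=64))
```

In this example the first warp (threads 0 to 31) takes the branch uniformly, while the second warp (threads 32 to 63) splits at thread 48, so one of two branch executions is divergent.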
The new work will be presented on Friday, September 15th (12:30 pm) at FDL 2023, which takes place from September 13th to 15th in Turin, Italy. The 26th Forum on Specification and Design Languages is an international event at which academics can exchange ideas and experiences and discuss new trends related to languages, tools, and techniques for developing software and hardware. A variety of target systems will be discussed, including cyber-physical systems, distributed systems and IoT.
DSC research associate Christopher Metz will also participate in the Ph.D. Forum at FDL 2023, which takes place on Wednesday afternoon (5 pm). The Ph.D. Forum is a poster session that gives participants the opportunity to present their ongoing research to experts working on languages, tools, and techniques for hardware and software system development. During the session, senior faculty members and researchers will give Ph.D. students feedback on their ongoing research.
Updated by: Svenja Goers
Author: Svenja Goers
Please contact us if you have any questions:
Christopher Metz
Research Associate
+49 (421) 218 - 63942
cmetz@uni-bremen.de