Publication (Doctoral theses)

Energy Consumption Reduction on High Performance Embedded Systems for Hyperspectral Imaging Cancer Detection

Madroñal Quintín, Daniel
The growing complexity of both applications and architectures, combined with the tightening of functional and non-functional system requirements, is pushing the limits of what can currently be achieved. One example is the application of Hyperspectral Imaging (HSI) to distinguish between healthy and tumor human tissue. This research line has gained importance in the past few years, as HSI is a non-invasive and non-ionizing technology capable of accurately delimiting tumor boundaries. Nevertheless, this kind of system must process a large amount of data and, depending on the field in which the application is used, the system requirements vary: in the neurosurgical case, tumor boundaries must be detected in real time to assist the specialist during surgery; in the dermatological domain, energy consumption becomes the most important requirement, since battery-powered systems are to be used. To ease the development of systems where functional and non-functional requirements need to be weighed, design strategies that automate different tasks (parallelization, code generation, task scheduling, etc.) are often used. In the literature, the most widespread approaches are those based on Y-chart design. This methodology proposes a separation of concerns, where the application is deployed onto the target architecture considering a set of user-defined constraints; additionally, to deploy these systems, multi-objective optimizations can be considered to fulfill the often opposing system constraints. In this regard, this PhD thesis proposes a Y-chart dataflow-based design methodology that aims to include energy consumption among the non-functional requirements to be optimized. This methodology has been built as an iterative optimization loop with the objective of being both application and architecture independent.
Specifically, the proposed energy-aware adaptation loop is composed of three modules: monitoring, energy consumption estimation and energy-aware decision making. First, concerning application monitoring, a dataflow-based monitoring infrastructure called PAPIFY has been built. This new tool retrieves timing and Performance Monitoring Counter (PMC) information at runtime. Thanks to the dataflow-oriented approach, the instrumentation of each part of the application is set up individually and each computational resource of the platform is linked to a specific PMC set. As a result, 1) monitoring of heterogeneous architectures is supported and 2) the application profiling is automatically adapted at runtime, considering the part of the application that each processing element is executing. Secondly, to address energy consumption estimation, a new platform-modeling methodology is proposed that focuses on building application-independent models. The approach has been designed to be applicable to architectures of different nature, e.g., multi-/many-cores, which leads to models where the energy consumption is divided into three contributions: resource-active, communication and computation. Additionally, linear models based on runtime information are built so that the energy consumption can be estimated at runtime. Finally, regarding the decision-making module, an energy-aware mechanism has been proposed that builds a loop around latency-based mapping/scheduling algorithms to optimize the system energy consumption. This loop uses the mapping/scheduling block as a black box: it characterizes the performance and energy consumption of the resulting system deployments and modifies the inputs of the block so as to test different system configurations, keeping the one with the lowest energy consumption that reaches the desired performance.
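The decision-making loop described above can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis: the `Deployment` record, the `energy_aware_search` function, the choice of the number of processing elements as the input being varied, and the toy cost functions are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Deployment:
    """Characterization of one system configuration (hypothetical record)."""
    num_pes: int       # processing elements offered to the mapper (assumed input)
    latency_s: float   # measured latency per iteration, in seconds
    energy_j: float    # estimated energy per iteration, in joules


def energy_aware_search(
    evaluate: Callable[[int], Deployment],
    candidate_pes: Iterable[int],
    max_latency_s: float,
) -> Optional[Deployment]:
    """Treat the mapper/scheduler as a black box and keep the cheapest valid result.

    `evaluate` stands for "run the latency-based mapping/scheduling with this
    input and characterize the deployment". Configurations that miss the
    performance objective are discarded; among the rest, the one with the
    lowest energy is kept. Returns None if no configuration is fast enough.
    """
    best: Optional[Deployment] = None
    for num_pes in candidate_pes:
        deployment = evaluate(num_pes)
        if deployment.latency_s > max_latency_s:
            continue  # does not reach the desired performance
        if best is None or deployment.energy_j < best.energy_j:
            best = deployment
    return best


def toy_evaluate(num_pes: int) -> Deployment:
    """Toy stand-in for the black box; the numbers are invented, not measured."""
    latency = 8.0 / num_pes + 0.1 * num_pes  # parallel work plus sync overhead
    power = 1.0 + 0.5 * num_pes              # static plus per-PE power, in watts
    return Deployment(num_pes, latency, power * latency)


if __name__ == "__main__":
    print(energy_aware_search(toy_evaluate, [1, 2, 4, 8, 16], max_latency_s=2.5))
```

In this toy setting, relaxing the latency objective lets the search settle on fewer processing elements, which is the kind of trade-off between performance and energy that the loop is meant to expose.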
This approach has been designed to be applied both at design time, where Design Space Exploration (DSE) is enhanced, and by a runtime resource manager to perform energy-based optimizations. To validate the methodology, two real implementations of the energy-aware optimization loop have been carried out: one targeting design-time DSE enhancement and the other focused on runtime energy consumption optimization. In both cases, the three aforementioned modules are combined so as to incorporate energy-awareness capabilities into different dataflow-based design frameworks. On the one hand, the design-time implementation has been included within the Parallel Real-time Embedded Executives Scheduling Method (PREESM) design framework, where the generated applications are automatically instrumented using PAPIFY to characterize them in terms of timing and energy consumption. These data are then included within the PREESM information so as to enhance its latency-based DSE process considering (or not) energy-awareness. On the other hand, the runtime version of the energy-aware optimization loop has been embedded within the Synchronous Parameterized and Interfaced Dataflow Embedded Runtime (SPiDER) runtime manager, which is the runtime counterpart of the PREESM framework. In this case, the system is iteratively improved with on-the-fly information retrieved through PAPIFY and fed to the runtime manager itself. With this awareness of its own execution, SPiDER is able to make better decisions during the mapping/scheduling process and, on top of that, to seek the most energy-efficient system configuration based on data from its current execution. To characterize the benefits of applying the proposed methodology, the HSI cancer detection processing chain has been manually deployed, as a baseline for comparison, on a many-core architecture called Multi-Purpose Processor Array (MPPA).
The main objective of this implementation has been to minimize the latency by exploiting the intrinsic parallelism of the application. As a result, speedups from 50× to 112× have been achieved when distributing the workload among the 256 Processing Elements (PEs) of the MPPA, using neurosurgical and dermatological images, respectively. After that, the Y-chart design methodology has been used to define a dataflow version of the application. In this case, on average, the maximum reachable performance has been reduced by 6 due to the limitations of this approach. Nevertheless, the systems generated with this implementation can be automatically optimized in terms of latency or energy consumption. Consequently, two scenarios have been considered to fully characterize the proposed methodology: 1) latency-based optimization and 2) energy-awareness considering a specific performance objective. In the latter, additionally, different working modes are used, weighing performance and energy consumption constraints. In the case of latency-based optimization, both the PREESM and SPiDER frameworks have been enhanced with timing profiling information retrieved via PAPIFY. Thanks to this information, speedups of up to 1.4× and 1.3× are achieved at design time and runtime, respectively. Finally, promising results have been obtained in both implementations of the energy-aware optimization loop, as they have been able to reduce the system energy consumption by up to 30.21% at both design time and runtime. In this case, PAPIFY retrieves both timing and PMC information so as to estimate the energy consumption using the model developed for the MPPA, whose estimations (for a real application that was not included during the architecture modeling) have an accuracy over 95%. Consequently, the design methodology proposed in this PhD thesis has been proven to successfully deal with automatic multi-objective optimization of a highly demanding application when deployed on a complex architecture.
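As an illustration of how a linear, PMC-based energy model with the three contributions named in the abstract (resource-active, communication and computation) could be evaluated at runtime, the following Python sketch may help. It is only a sketch under stated assumptions: the class, the coefficient values, the counter names and the accuracy definition (100% minus the relative error) are hypothetical and are not taken from the thesis or from the MPPA model.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LinearEnergyModel:
    """Hypothetical linear model; all coefficients are illustrative only."""
    active_power_w: float              # watts drawn per active resource
    comm_energy_j_per_byte: float      # joules per byte communicated
    pmc_energy_j: Mapping[str, float]  # joules per counted event, per counter

    def estimate(
        self,
        active_time_s: float,
        active_resources: int,
        bytes_moved: int,
        pmc_counts: Mapping[str, int],
    ) -> float:
        """Return the estimated energy in joules as a sum of three linear terms."""
        resource_active = self.active_power_w * active_resources * active_time_s
        communication = self.comm_energy_j_per_byte * bytes_moved
        computation = sum(
            self.pmc_energy_j[name] * count for name, count in pmc_counts.items()
        )
        return resource_active + communication + computation


def accuracy_percent(estimated_j: float, measured_j: float) -> float:
    """Assumed accuracy definition: 100% minus the relative estimation error."""
    return 100.0 * (1.0 - abs(estimated_j - measured_j) / measured_j)


# Invented coefficients and counter names, used only to exercise the sketch.
model = LinearEnergyModel(
    active_power_w=0.05,
    comm_energy_j_per_byte=2e-9,
    pmc_energy_j={"instructions": 1e-9, "cache_misses": 5e-9},
)
estimate_j = model.estimate(
    active_time_s=2.0,
    active_resources=16,
    bytes_moved=1_000_000,
    pmc_counts={"instructions": 500_000_000, "cache_misses": 10_000_000},
)
```

Because every term is linear in a quantity that can be read at runtime, an estimate of this form costs only a handful of multiplications per monitored interval.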
Publication type:
Doctoral theses
Eduardo Juárez and César Sanz