7 results on '"Moraes, Fernando Gehm"'
Search Results
2. Differentiated Communication Services for NoC-Based MPSoCs.
- Author
- Carara, Everton Alceu, Calazans, Ney Laert Vilar, and Moraes, Fernando Gehm
- Subjects
- *NETWORKS on a chip, *INFRASTRUCTURE (Economics), *ROUTING algorithms, *BANDWIDTHS, *MESSAGE passing (Computer science), *QUALITY of service
- Abstract
The adoption of Networks-on-Chip (NoCs) as the communication infrastructure for complex integrated systems is a fact, and has been promoted by the growing number of processing elements integrated in current MPSoCs. These are designed to execute several applications in parallel, with different communication requirements and distinct levels of required quality of service. To meet these restrictions, most designs customize the MPSoC at design time, using specific NoC communication services such as adaptive routing algorithms, priorities, and connections. However, MPSoCs are increasingly used in embedded systems, where new applications may be added at runtime, characterizing dynamic workload scenarios. Such scenarios require adaptability at runtime, with applications able to select the most appropriate communication service according to their respective requirements. The goal of the present work is to link the hardware level of NoCs to the MPSoC application level, proposing the development of a communication API that exposes the communication services offered by the NoC to the application developer. Executing real and synthetic applications on two different MPSoCs, using four different NoC communication services, demonstrates the efficiency of the proposed approach in meeting application requirements. [ABSTRACT FROM PUBLISHER]
- Published
- 2014
- Full Text
- View/download PDF
3. Modular and Distributed Management of Many-Core SoCs.
- Author
- Ruaro, Marcelo, Sant'Ana, Anderson, Jantsch, Axel, and Moraes, Fernando Gehm
- Subjects
- *RESOURCE management
- Abstract
Many-Core Systems-on-Chip increasingly require Dynamic Multi-objective Management (DMOM) of resources. DMOM uses different management components for objectives and resources to implement comprehensive and self-adaptive system resource management. DMOMs are challenging because they require a scalable and well-organized framework to make each component modular, allowing it to be instantiated or redesigned with a limited impact on other components. This work evaluates two state-of-the-art distributed management paradigms and, motivated by their drawbacks, proposes a new one called Management Application (MA), along with a DMOM framework based on MA. MA is a distributed application, specific for management, where each task implements a management role. This paradigm favors scalability and modularity because the management design assumes different and parallel modules, decoupled from the OS. An experiment with a task mapping case study shows that MA reduces the overhead of management resources (-61.5%), latency (-66%), and communication volume (-96%) compared to state-of-the-art per-application management. Compared to cluster-based management (CBM) implemented directly as part of the OS, MA is similar in resources and communication volume, increasing only the mapping latency (+16%). Results targeting a complete DMOM control loop addressing up to three different objectives show the scalability regarding system size and adaptation frequency compared to CBM, presenting an overall management latency reduction of 17.2% and an overall monitoring messages' latency reduction of 90.2%. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
4. A High-Level Modeling Framework for Estimating Hardware Metrics of CNN Accelerators.
- Author
- Juracy, Leonardo Rezende, Moreira, Matheus Trevisan, de Morais Amory, Alexandre, Hampel, Alexandre F., and Moraes, Fernando Gehm
- Subjects
- *CONVOLUTIONAL neural networks, *SPACE exploration
- Abstract
GPUs became the reference platform for both the training and inference phases of Convolutional Neural Networks (CNNs) because their architecture is tailored to the CNN operators. However, GPUs are power-hungry architectures. A path to enable the deployment of CNNs in energy-constrained devices is adopting hardware accelerators for the inference phase. The design space exploration of CNNs using standard approaches, such as RTL, is limited due to their complexity. Thus, designers need frameworks enabling design space exploration that deliver accurate hardware estimation metrics to deploy CNNs. This work proposes a framework to explore the CNN design space, providing power, performance, and area (PPA) estimations. The heart of the framework is a system simulator. The system simulator front-end is TensorFlow, and the back-end is performance estimations obtained from the physical synthesis of hardware accelerators, not only from components like multipliers and adders. The first set of results evaluates the CNN accuracy using integer quantization, the accelerators' PPA after physical synthesis, and the benefits of using a system simulator. These results allow a rich design space exploration, enabling the selection of the best set of CNN parameters to meet the design constraints. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
5. The power impact of hardware and software actuators on self-adaptable many-core systems.
- Author
- Martins, André Luís del Mestre, Garibotti, Rafael, Dutt, Nikil, and Moraes, Fernando Gehm
- Subjects
- *MULTICORE processors, *COMPLEMENTARY metal oxide semiconductors, *ENERGY dissipation, *RESOURCE management, *ELECTRIC power consumption
- Abstract
Many-core systems rely on the advantages of the latest Complementary Metal Oxide Semiconductor (CMOS) technologies to increase the number of cores. However, this improvement comes at the cost of higher power dissipation, which prevents full use of the chip. To continue improving performance on future many-core systems, Resource Management (RM) becomes imperative to handle multi-objective and conflicting requirements such as power, performance, resilience, among others. In this task, RM can use both hardware (e.g., dynamic voltage and frequency scaling) and software actuators (e.g., task remapping). However, the complexity of synchronizing available actuators to follow a particular goal while avoiding actuation overlapping is a remaining challenge. This paper evaluates the power impact of each actuator and provides insights that will help engineers develop appropriate resource management heuristics to improve self-adaptable many-core systems. A state-of-the-art comparison shows that no related work provides or details the same comprehensiveness of actuation methods concerning power consumption. Our proposal is validated in a many-core system described in a true clock-cycle accurate model. Regarding hardware actuators, the results show the power profiling at the core level and detail the contribution of each hardware component. Furthermore, results of software actuators evidence that task events present a more significant power impact on the ratio of active and idle cores changes. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
6. Hierarchical adaptive Multi-objective resource management for many-core systems.
- Author
- Martins, André Luís del Mestre, da Silva, Alzemiro Henrique Lucas, Rahmani, Amir M., Dutt, Nikil, and Moraes, Fernando Gehm
- Subjects
- *SCALABILITY, *MULTICORE processors, *RESOURCE management, *MULTIDISCIPLINARY design optimization, *SYSTEMS on a chip
- Abstract
The typical workload of many-core systems produces peaks and valleys of resource utilization over time. Power capping limits full system utilization in a workload peak, but also creates a power slack that allows a different resource management (RM) policy to be applied in a valley phase. Related works do not consider this workload behavior, proposing RMs with fixed goals. This work proposes a hierarchical adaptive Multi-Objective Resource Management (MORM) for many-core systems under a power cap. MORM works with dynamic workloads, which present peaks and valleys of utilization. The hierarchical approach allows clusters of processing elements (PEs) to execute applications according to different objectives simultaneously. A cluster can drive the PEs to optimize either performance or energy. MORM can dynamically shift the goals of a cluster according to the workload behavior. Comparison with a state-of-the-art RM optimized for a single objective shows that MORM achieves equivalent results in a workload valley while improving performance by up to 37.19–49.03% in a workload peak, regardless of the power cap. The comparison reveals relevant features to be considered in large many-core systems: hierarchical organization, multi-task mapping, and joint adaptability between software (remapping) and hardware (DVFS) actuation. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
7. Hierarchical energy monitoring for task mapping in many-core systems.
- Author
- Castilhos, Guilherme, Mandelli, Marcelo, Ost, Luciano, and Moraes, Fernando Gehm
- Subjects
- *ENERGY management, *MATHEMATICAL mappings, *RELIABILITY in engineering, *ENERGY consumption, *SCALABILITY
- Abstract
This work addresses a research subject with a rich literature: task mapping in NoC-based systems. Task mapping is the process of selecting a processing element to execute a given task. The number of cores in many-core systems increases the complexity of the task mapping. The main concerns in task mapping in large systems include (i) scalability; (ii) dynamic workload; and (iii) reliability. It is necessary to distribute the mapping decision across the system to ensure scalability. The workload of emerging many-core systems may be dynamic, i.e., new applications may start at any moment, leading to different mapping scenarios. Therefore, it is necessary to execute the mapping process at runtime to support a dynamic workload assignment. The workload assignment plays an important role in the many-core system reliability. Load imbalance may generate hotspot zones and, consequently, thermal implications. More recently, task mapping techniques aiming at improving system reliability have been proposed in the literature. However, such approaches rely on centralized mapping decisions, which are not scalable. To address these challenges, the main goal of this work is to propose a hierarchical runtime mapping heuristic, which provides scalability and a fair workload distribution. Distributing the workload inside the system increases the system reliability in the long term, due to the reduction of hotspot regions. The proposed mapping heuristic considers the application workload as a function of the consumed energy in the processors and NoC routers. The proposal adopts a hierarchical energy monitoring scheme, able to estimate at runtime the consumption at each processing element. The mapping uses the energy estimated by the monitoring scheme to guide the mapping decision. Results compare the proposal against a mapping heuristic whose main cost function minimizes the communication energy. Results obtained in large systems, up to 256 cores, show improvements in the workload distribution (average value 59.2%) and a reduction in the maximum energy values spent by the processors (average value 32.2%). Such results demonstrate the effectiveness of the proposal. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library