59 results on '"Sudeep Pasricha"'
Search Results
2. Interconnects for DNA, Quantum, In-Memory, and Optical Computing: Insights From a Panel Discussion
- Author
-
Amlan Ganguly, Sergi Abadal, Ishan Thakkar, Natalie Enright Jerger, Marc Riedel, Masoud Babaie, Rajeev Balasubramonian, Abu Sebastian, Sudeep Pasricha, Baris Taskin, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, and Universitat Politècnica de Catalunya. CBA - Sistemes de Comunicacions i Arquitectures de Banda Ampla
- Subjects
Integrated circuit interconnections ,Ordinadors quàntics ,DNA Computing ,Wireless communications systems ,Wireless communication ,Topology ,Photonic Interconnects ,Photonic interconnects ,Optical computing ,Electrical and Electronic Engineering ,Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC] ,In-Memory Computing ,Computers ,Quantum computers ,Deep learning ,DNA ,Quantum computing ,In-Memory computing ,Optical Computing ,DNA computing ,Comunicació sense fil, Sistemes de ,Hardware and Architecture ,Quantum Computing ,Qubit ,Wireless Interconnects ,Wireless interconnects ,Software ,Aprenentatge profund
- Abstract
The computing world is witnessing a proverbial Cambrian explosion of emerging paradigms propelled by applications such as Artificial Intelligence, Big Data, and Cybersecurity. Recent advances in technology to store digital data inside a DNA strand, manipulate quantum bits (qubits), perform logical operations with photons, and perform computations inside memory systems are ushering in the era of emerging paradigms of DNA computing, quantum computing, optical computing, and in-memory computing. In an orthogonal direction, research on interconnect design using advanced electro-optic, wireless, and microfluidic technologies has shown promising solutions to the architectural limitations of traditional von Neumann computers. In this article, experts present their comments on the role of interconnects in the emerging computing paradigms and discuss the potential use of chiplet-based architectures for the heterogeneous integration of such technologies. This work was supported in part by the US NSF CAREER Grant CNS-1553264 and the EU H2020 research and innovation programme under Grant 863337.
- Published
- 2022
- Full Text
- View/download PDF
3. Photonic Networks-on-Chip Employing Multilevel Signaling: A Cross-Layer Comparative Study
- Author
-
Venkata Sai Praneeth Karempudi, Febin Sunny, Ishan G. Thakkar, Sai Vineel Reddy Chittamuru, Mahdi Nikdast, and Sudeep Pasricha
- Subjects
FOS: Computer and information sciences ,Emerging Technologies (cs.ET) ,Hardware and Architecture ,Computer Science - Emerging Technologies ,Electrical and Electronic Engineering ,Software
- Abstract
Photonic network-on-chip (PNoC) architectures employ photonic links with dense wavelength-division multiplexing (DWDM) to enable high-throughput on-chip transfers. Unfortunately, increasing the DWDM degree (i.e., using a larger number of wavelengths) to achieve a higher aggregate datarate in photonic links, and hence higher throughput in PNoCs, requires sophisticated and costly laser sources along with extra photonic hardware. This extra hardware can introduce undesired noise to the photonic link and increase the bit-error-rate (BER), power, and area consumption of PNoCs. To mitigate these issues, the use of 4-pulse amplitude modulation (4-PAM) signaling, instead of the conventional on-off keying (OOK) signaling, can halve the number of wavelength signals utilized in photonic links for achieving the target aggregate datarate while reducing the overhead of crosstalk noise, BER, and photonic hardware. Various designs of 4-PAM modulators have been reported in the literature, for example, designs based on signal superposition (SS), an electrical digital-to-analog converter (EDAC), and an optical digital-to-analog converter (ODAC). However, it is yet to be explored how these SS, EDAC, and ODAC based 4-PAM modulators can be utilized to design DWDM-based photonic links and PNoC architectures. In this paper, we provide an extensive link-level and system-level analysis of the SS, EDAC, and ODAC types of 4-PAM modulators from prior work with regard to their applicability and utilization overheads. From our link-level and PNoC-level evaluation, we have observed that the 4-PAM EDAC based variants of photonic links and PNoCs exhibit better performance and energy-efficiency compared to the OOK, 4-PAM SS, and 4-PAM ODAC based links and PNoCs. Submitted and accepted for publication in ACM Journal on Emerging Technologies in Computing Systems.
- Published
- 2022
- Full Text
- View/download PDF
4. Surveillance mission scheduling with unmanned aerial vehicles in dynamic heterogeneous environments
- Author
-
Dylan Machovec, Howard Jay Siegel, James A. Crowder, Sudeep Pasricha, Anthony A. Maciejewski, and Ryan D. Friese
- Subjects
Hardware and Architecture ,Software ,Information Systems ,Theoretical Computer Science
- Published
- 2023
- Full Text
- View/download PDF
5. LATTE: LSTM Self-Attention based Anomaly Detection in Embedded Automotive Platforms
- Author
-
Vipin Kumar Kukkala, Sooryaa Vignesh Thiruloga, and Sudeep Pasricha
- Subjects
Scheme (programming language) ,business.industry ,Computer science ,Distributed computing ,Automotive industry ,Cyber-physical system ,Beacon ,CAN bus ,Recurrent neural network ,Hardware and Architecture ,Anomaly detection ,Visibility ,business ,computer ,Software ,computer.programming_language
- Abstract
Modern vehicles can be thought of as complex distributed embedded systems that run a variety of automotive applications with real-time constraints. Recent advances in the automotive industry towards greater autonomy are driving vehicles to be increasingly connected with various external systems (e.g., roadside beacons, other vehicles), which makes emerging vehicles highly vulnerable to cyber-attacks. Additionally, the increased complexity of automotive applications and the in-vehicle networks results in poor attack visibility, which makes detecting such attacks particularly challenging in automotive systems. In this work, we present a novel anomaly detection framework called LATTE to detect cyber-attacks in Controller Area Network (CAN) based networks within automotive platforms. Our proposed LATTE framework uses a stacked Long Short Term Memory (LSTM) predictor network with novel attention mechanisms to learn the normal operating behavior at design time. Subsequently, a novel detection scheme (also trained at design time) is used to detect various cyber-attacks (as anomalies) at runtime. We evaluate our proposed LATTE framework under different automotive attack scenarios and present a detailed comparison with the best-known prior works in this area, to demonstrate the potential of our approach.
- Published
- 2021
- Full Text
- View/download PDF
6. Electronic, Wireless, and Photonic Network-on-Chip Security: Challenges and Countermeasures
- Author
-
Sudeep Pasricha, John Jose, and Sujay Deb
- Subjects
FOS: Computer and information sciences ,Computer Science - Cryptography and Security ,Hardware and Architecture ,Hardware Architecture (cs.AR) ,Electrical and Electronic Engineering ,Computer Science - Hardware Architecture ,Cryptography and Security (cs.CR) ,Software
- Abstract
Networks-on-chips (NoCs) are an integral part of emerging manycore computing chips. They play a key role in facilitating communication among processing cores and between cores and memory. To meet the aggressive performance and energy-efficiency targets of machine learning and big data applications, NoCs have been evolving to leverage emerging paradigms such as silicon photonics and wireless communication. Increasingly, these NoC fabrics are becoming susceptible to security vulnerabilities, such as from hardware trojans that can snoop, corrupt, or disrupt information transfers on NoCs. This article surveys the landscape of security challenges and countermeasures across electronic, wireless, and photonic NoCs.
- Published
- 2022
7. A Survey on Silicon Photonics for Deep Learning
- Author
-
Febin Sunny, Sudeep Pasricha, Mahdi Nikdast, and Ebadollah Taheri
- Subjects
I.2 ,FOS: Computer and information sciences ,B.7.m ,Computer science ,C.5.4 ,Computer Science - Emerging Technologies ,02 engineering and technology ,01 natural sciences ,010309 optics ,020210 optoelectronics & photonics ,Human–computer interaction ,Hardware Architecture (cs.AR) ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,General pattern ,Electrical and Electronic Engineering ,Computer Science - Hardware Architecture ,Silicon photonics ,business.industry ,Deep learning ,Emerging Technologies (cs.ET) ,Neuromorphic engineering ,Hardware and Architecture ,Artificial intelligence ,business ,Software
- Abstract
Deep learning has led to unprecedented successes in solving some very difficult problems in domains such as computer vision, natural language processing, and general pattern recognition. These achievements are the culmination of decades-long research into better training techniques and deeper neural network models, as well as improvements in hardware platforms that are used to train and execute the deep neural network models. Many application-specific integrated circuit (ASIC) hardware accelerators for deep learning have garnered interest in recent years due to their improved performance and energy-efficiency over conventional CPU and GPU architectures. However, these accelerators are constrained by fundamental bottlenecks due to (1) the slowdown in CMOS scaling, which has limited computational and performance-per-watt capabilities of emerging electronic processors; and (2) the use of metallic interconnects for data movement, which do not scale well and are a major cause of bandwidth, latency, and energy inefficiencies in almost every contemporary processor. Silicon photonics has emerged as a promising CMOS-compatible alternative to realize a new generation of deep learning accelerators that can use light for both communication and computation. This article surveys the landscape of silicon photonics to accelerate deep learning, with a coverage of developments across design abstractions in a bottom-up manner, to convey both the capabilities and limitations of the silicon photonics paradigm in the context of deep learning acceleration.
- Published
- 2021
- Full Text
- View/download PDF
8. ARXON: A Framework for Approximate Communication Over Photonic Networks-on-Chip
- Author
-
Sudeep Pasricha, Asif Mirza, Ishan G. Thakkar, Febin Sunny, and Mahdi Nikdast
- Subjects
Silicon photonics ,Computer science ,business.industry ,02 engineering and technology ,Dissipation ,Laser ,020202 computer hardware & architecture ,Power (physics) ,law.invention ,Hardware and Architecture ,law ,Limit (music) ,0202 electrical engineering, electronic engineering, information engineering ,Electronic engineering ,Overhead (computing) ,Laser power scaling ,Electrical and Electronic Engineering ,Photonics ,business ,Software
- Abstract
The approximate computing paradigm advocates for relaxing accuracy goals in applications to improve energy-efficiency and performance. Recently, this paradigm has been explored to improve the energy-efficiency of silicon photonic networks-on-chip (PNoCs). Silicon photonic interconnects suffer from high power dissipation because of laser sources, which generate carrier wavelengths, and tuning power required for regulating photonic devices under different uncertainties. In this article, we propose a framework called AppRoXimation framework for On-chip photonic Networks (ARXON) to reduce such power dissipation overhead by enabling intelligent and aggressive approximation during communication over silicon photonic links in PNoCs. Our framework reduces laser and tuning-power overhead while intelligently approximating communication, such that application output quality is not distorted beyond an acceptable limit. Simulation results show that our framework can achieve up to 56.4% lower laser power consumption and up to 23.8% better energy-efficiency than the best-known prior work on approximate communication with silicon photonic interconnects and for the same application output quality.
- Published
- 2021
- Full Text
- View/download PDF
9. VESPA: A Framework for Optimizing Heterogeneous Sensor Placement and Orientation for Autonomous Vehicles
- Author
-
Sudeep Pasricha, Joydeep Dey, and Wes Taylor
- Subjects
050210 logistics & transportation ,0209 industrial biotechnology ,Orientation (computer vision) ,Computer science ,05 social sciences ,Real-time computing ,02 engineering and technology ,Radar detection ,Computer Science Applications ,Human-Computer Interaction ,020901 industrial engineering & automation ,Hardware and Architecture ,0502 economics and business ,Electrical and Electronic Engineering ,Design space ,Cruise control
- Abstract
In emerging autonomous vehicles, perception of the environment around the vehicle depends not only on the quality and choice of sensor type, but more importantly also on the instrumented location and orientation of each of the sensors. This article explores the synthesis of heterogeneous sensor configurations toward achieving vehicle autonomy goals. We propose a novel optimization framework called VESPA that explores the design space of the sensor placement locations and orientations to find the optimal sensor configuration for a vehicle. We demonstrate how our framework can obtain optimal sensor configurations for heterogeneous sensors deployed across two contemporary real vehicles.
- Published
- 2021
- Full Text
- View/download PDF
10. A Survey on Energy Management for Mobile and IoT Devices
- Author
-
Umit Y. Ogras, Sudeep Pasricha, Raid Ayoub, Sumit K. Mandal, and Michael Kishinevsky
- Subjects
business.industry ,Computer science ,Energy management ,Mobile computing ,Wearable computer ,02 engineering and technology ,020202 computer hardware & architecture ,Form factor (design) ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Wireless ,State (computer science) ,Electrical and Electronic Engineering ,Telecommunications ,business ,Energy harvesting ,Software ,Efficient energy use
- Abstract
Editor’s note: Mobile and IoT devices have proliferated in our daily lives. However, these miniaturized computing systems should be highly energy-efficient due to their ultrasmall form factor. Hence, energy management is of utmost importance for both mobile and IoT devices. This article presents a comprehensive survey on this topic.—Partha Pratim Pande, Washington State University
- Published
- 2020
- Full Text
- View/download PDF
11. Approximate NoC and Memory Controller Architectures for GPGPU Accelerators
- Author
-
Sudeep Pasricha and Venkata Yaswanth Raparti
- Subjects
020203 distributed computing ,Random access memory ,Hardware_MEMORYSTRUCTURES ,Computer science ,Network packet ,Locality ,02 engineering and technology ,Parallel computing ,Overlay ,Thread (computing) ,Memory controller ,Bottleneck ,Scheduling (computing) ,Computational Theory and Mathematics ,Hardware and Architecture ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,General-purpose computing on graphics processing units ,Dram
- Abstract
High interconnect bandwidth is crucial for achieving better performance in many-core GPGPU architectures that execute highly data parallel applications. The parallel warps of threads running on shader cores generate a high volume of read requests to the main memory due to the limited size of data caches at the shader cores. This leads to scenarios with rapid arrival of an even larger volume of reply data from the DRAM, which creates a bottleneck at memory controllers (MCs) that send reply packets back to the requesting cores over the network-on-chip (NoC). Coping with such high volumes of data requires intelligent memory scheduling and innovative NoC architectures. To mitigate memory bottlenecks in GPGPUs, we first propose a novel approximate memory controller architecture (AMC) that reduces the DRAM latency by opportunistically exploiting row buffer locality and bank level parallelism in memory request scheduling, and leverages approximability of the reply data from DRAM to reduce the number of reply packets injected into the NoC. To further realize high throughput and low energy communication in GPGPUs, we propose a low power, approximate NoC architecture (Dapper) that increases the utilization of the available network bandwidth by using single cycle overlay circuits for the reply traffic between MCs and shader cores. Experimental results show that Dapper and AMC together increase NoC throughput by up to 21 percent, and reduce NoC latency by up to 45.5 percent and energy consumed by the NoC and MC by up to 38.3 percent, with minimal impact on output accuracy, compared to state-of-the-art approximate NoC/MC architectures.
- Published
- 2020
- Full Text
- View/download PDF
12. Roadmap for Cybersecurity in Autonomous Vehicles
- Author
-
Vipin Kumar Kukkala, Sooryaa Vignesh Thiruloga, and Sudeep Pasricha
- Subjects
FOS: Computer and information sciences ,Human-Computer Interaction ,Computer Science - Machine Learning ,Artificial Intelligence (cs.AI) ,Computer Science - Cryptography and Security ,Hardware and Architecture ,Computer Science - Artificial Intelligence ,Electrical and Electronic Engineering ,Cryptography and Security (cs.CR) ,Machine Learning (cs.LG) ,Computer Science Applications
- Abstract
Autonomous vehicles are on the horizon and will be transforming transportation safety and comfort. These vehicles will be connected to various external systems and utilize advanced embedded systems to perceive their environment and make intelligent decisions. However, this increased connectivity makes these vehicles vulnerable to various cyber-attacks that can have catastrophic effects. Attacks on automotive systems are already on the rise in today's vehicles and are expected to become more commonplace in future autonomous vehicles. Thus, there is a need to strengthen cybersecurity in future autonomous vehicles. In this article, we discuss major automotive cyber-attacks over the past decade and present state-of-the-art solutions that leverage artificial intelligence (AI). We propose a roadmap towards building secure autonomous vehicles and highlight key open challenges that need to be addressed.
- Published
- 2022
13. Ethical Design of Computers: From Semiconductors to IoT and Artificial Intelligence
- Author
-
Sudeep Pasricha and Marilyn Wolf
- Subjects
FOS: Computer and information sciences ,Computer Science - Computers and Society ,Artificial Intelligence (cs.AI) ,Computer Science - Artificial Intelligence ,Hardware and Architecture ,Computers and Society (cs.CY) ,Hardware Architecture (cs.AR) ,Electrical and Electronic Engineering ,Computer Science - Hardware Architecture ,Software
- Abstract
Computing systems are tightly integrated today into our professional, social, and private lives. An important consequence of this growing ubiquity of computing is that it can have significant ethical implications of which computing professionals should take account. In most real-world scenarios, it is not immediately obvious how particular technical choices during the design and use of computing systems could be viewed from an ethical perspective. This article provides a perspective on the ethical challenges within semiconductor chip design, IoT applications, and the increasing use of artificial intelligence in the design processes, tools, and hardware-software stacks of these systems.
- Published
- 2023
- Full Text
- View/download PDF
14. Overcoming Security Vulnerabilities in Deep Learning-based Indoor Localization Frameworks on Mobile Devices
- Author
-
Saideep Tiku and Sudeep Pasricha
- Subjects
Spoofing attack ,Artificial neural network ,business.industry ,Computer science ,Reliability (computer networking) ,Deep learning ,Distributed computing ,Vulnerability ,020206 networking & telecommunications ,02 engineering and technology ,Hardware and Architecture ,Application domain ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Mobile device ,Software
- Abstract
Indoor localization is an emerging application domain for the navigation and tracking of people and assets. Ubiquitously available Wi-Fi signals have enabled low-cost fingerprinting-based localization solutions. Further, the rapid growth in mobile hardware capability now allows high-accuracy deep learning-based frameworks to be executed locally on mobile devices in an energy-efficient manner. However, existing deep learning-based indoor localization solutions are vulnerable to access point (AP) attacks. This article presents an analysis of the vulnerability of a convolutional neural network-based indoor localization solution to AP security compromises. Based on this analysis, we propose a novel methodology to maintain indoor localization accuracy, even in the presence of AP attacks. The proposed secured neural network framework (S-CNNLOC) is validated across a benchmark suite of paths and is found to deliver up to 10× more resiliency to malicious AP attacks compared to its unsecured counterpart.
- Published
- 2019
- Full Text
- View/download PDF
15. PortLoc: A Portable Data-Driven Indoor Localization Framework for Smartphones
- Author
-
Sudeep Pasricha and Saideep Tiku
- Subjects
Computer science ,business.industry ,Real-time computing ,02 engineering and technology ,Fingerprint recognition ,020202 computer hardware & architecture ,Data-driven ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Global Positioning System ,Electrical and Electronic Engineering ,business ,Software ,Multipath propagation
- Abstract
Editor’s note: Fingerprinting is essential for indoor navigation and localization due to its low cost, accuracy, and resiliency to multipath effects in constrained environments. This article aims to overcome the challenge of device heterogeneity and describes a portable lightweight fingerprinting framework while improving localization accuracy.—Paul Bogdan, University of Southern California
- Published
- 2019
- Full Text
- View/download PDF
16. Mobile Network-Aware Middleware Framework for Cloud Offloading: Using Reinforcement Learning to Make Reward-Based Decisions in Smartphone Applications
- Author
-
Aditya Khune and Sudeep Pasricha
- Subjects
business.industry ,Computer science ,Distributed computing ,Response time ,020206 networking & telecommunications ,Cloud computing ,02 engineering and technology ,Energy consumption ,Smartphone application ,020202 computer hardware & architecture ,Computer Science Applications ,Human-Computer Interaction ,Hardware and Architecture ,Middleware ,0202 electrical engineering, electronic engineering, information engineering ,Cellular network ,Reinforcement learning ,Electrical and Electronic Engineering ,business ,Mobile device
- Abstract
Offloading mobile computations is an innovative technique to reduce energy consumption in mobile devices and minimize application response time. In this article, we propose a middleware framework that uses reinforcement learning (RL) to make reward-based offloading decisions. Our framework allows a smartphone to consider suitable contextual information to determine when it makes sense to offload and select between available networks when offloading. We tested our framework in simulated and real environments, across various apps, to demonstrate how energy consumption can be minimized in mobile devices capable of supporting offloading to the cloud.
- Published
- 2019
- Full Text
- View/download PDF
17. Energy and Network Aware Workload Management for Geographically Distributed Data Centers
- Author
-
Ninad Hogade, Howard Jay Siegel, and Sudeep Pasricha
- Subjects
Networking and Internet Architecture (cs.NI) ,FOS: Computer and information sciences ,Queueing theory ,Control and Optimization ,Renewable Energy, Sustainability and the Environment ,business.industry ,Computer science ,Distributed computing ,Cloud computing ,Workload ,Net metering ,Computer Science - Networking and Internet Architecture ,Computational Theory and Mathematics ,Peak demand ,Computer Science - Distributed, Parallel, and Cluster Computing ,Hardware and Architecture ,Computer Science - Computer Science and Game Theory ,Server ,Data center ,Distributed, Parallel, and Cluster Computing (cs.DC) ,business ,Software ,Operating cost ,Computer Science and Game Theory (cs.GT)
- Abstract
Cloud service providers are distributing data centers geographically to minimize energy costs through intelligent workload distribution. With increasing data volumes in emerging cloud workloads, it is critical to factor in the network costs for transferring workloads across data centers. For geo-distributed data centers, many researchers have been exploring strategies for energy cost minimization and intelligent inter-data-center workload distribution separately. However, prior work does not comprehensively and simultaneously consider data center energy costs, data transfer costs, and data center queueing delay. In this paper, we propose a novel game theory-based workload management framework that takes a holistic approach to the cloud operating cost minimization problem by making intelligent scheduling decisions aware of data transfer costs and the data center queueing delay. Our framework performs intelligent workload management that considers heterogeneity in data center compute capability, cooling power, interference effects from task co-location in servers, time-of-use electricity pricing, renewable energy, net metering, peak demand pricing distribution, and network pricing. Our simulations show that the proposed game-theoretic technique can minimize the cloud operating cost more effectively than existing approaches.
- Published
- 2021
18. BPLight-CNN: A Photonics-based Backpropagation Accelerator for Deep Learning
- Author
-
Dharanidhar Dang, Rabi N. Mahapatra, Sudeep Pasricha, Sai Vineel Reddy Chittamuru, and Debashis Sahoo
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Speedup ,Computer science ,Computation ,cs.LG ,Computer Science - Emerging Technologies ,CAD ,Convolutional neural network ,Machine Learning (cs.LG) ,cs.AR ,cs.ET ,Hardware Architecture (cs.AR) ,Electrical and Electronic Engineering ,Computer Science - Hardware Architecture ,business.industry ,Deep learning ,Backpropagation ,Emerging Technologies (cs.ET) ,Computer engineering ,Hardware and Architecture ,Benchmark (computing) ,Artificial intelligence ,business ,Software ,Energy (signal processing)
- Abstract
Training deep learning networks involves continuous weight updates across the various layers of the deep network while using a backpropagation (BP) algorithm. This results in expensive computation overheads during training. Consequently, most deep learning accelerators today employ pretrained weights and focus only on improving the design of the inference phase. The recent trend is to build a complete deep learning accelerator by incorporating the training module. Such efforts require an ultra-fast chip architecture for executing the BP algorithm. In this article, we propose a novel photonics-based backpropagation accelerator for high-performance deep learning training. We present the design for a convolutional neural network (CNN), BPLight-CNN, which incorporates the silicon photonics-based backpropagation accelerator. BPLight-CNN is a first-of-its-kind photonic and memristor-based CNN architecture for end-to-end training and prediction. We evaluate BPLight-CNN using a photonic CAD framework (IPKISS) on deep learning benchmark models, including LeNet and VGG-Net. The proposed design achieves (i) at least 34× speedup, 34× improvement in computational efficiency, and 38.5× energy savings during training; and (ii) 29× speedup, 31× improvement in computational efficiency, and 38.7× improvement in energy savings during inference compared with the state-of-the-art designs. All of these comparisons are done at a 16-bit resolution, and BPLight-CNN achieves these improvements at a cost of approximately 6% lower accuracy compared with the state-of-the-art.
- Published
- 2021
19. ROBIN: A Robust Optical Binary Neural Network Accelerator
- Author
-
Sudeep Pasricha, Febin Sunny, Mahdi Nikdast, and Asif Mirza
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Silicon photonics ,Artificial neural network ,business.industry ,Computer science ,Computer Science - Emerging Technologies ,02 engineering and technology ,020202 computer hardware & architecture ,Machine Learning (cs.LG) ,020210 optoelectronics & photonics ,Emerging Technologies (cs.ET) ,Computer engineering ,Hardware and Architecture ,Hardware Architecture (cs.AR) ,0202 electrical engineering, electronic engineering, information engineering ,Key (cryptography) ,Photonics ,Latency (engineering) ,business ,Computer Science - Hardware Architecture ,Throughput (business) ,Software ,Energy (signal processing) ,Efficient energy use
- Abstract
Domain-specific neural network accelerators have garnered attention because of their improved energy efficiency and inference performance compared to CPUs and GPUs. Such accelerators are thus well suited for resource-constrained embedded systems. However, mapping sophisticated neural network models on these accelerators still entails significant energy and memory consumption, along with high inference time overhead. Binarized neural networks (BNNs), which utilize single-bit weights, represent an efficient way to implement and deploy neural network models on accelerators. In this paper, we present a novel optical-domain BNN accelerator, named ROBIN, which intelligently integrates heterogeneous microring resonator optical devices with complementary capabilities to efficiently implement the key functionalities in BNNs. We perform detailed fabrication-process variation analyses at the optical device level, explore efficient corrective tuning for these devices, and integrate circuit-level optimization to counter thermal variations. As a result, our proposed ROBIN architecture possesses the desirable traits of being robust, energy-efficient, low latency, and high throughput when executing BNN models. Our analysis shows that ROBIN can outperform the best-known optical BNN accelerators and many electronic accelerators. Specifically, our energy-efficient ROBIN design exhibits energy-per-bit values that are ∼4× lower than electronic BNN accelerators and ∼933× lower than a recently proposed photonic BNN accelerator, while a performance-efficient ROBIN design shows ∼3× and ∼25× better performance than electronic and photonic BNN accelerators, respectively.
- Published
- 2021
- Full Text
- View/download PDF
20. QuickLoc: Adaptive Deep-Learning for Fast Indoor Localization with Mobile Devices
- Author
-
Saideep Tiku, Sudeep Pasricha, and Prathmesh Kale
- Subjects
Signal Processing (eess.SP) ,FOS: Computer and information sciences ,Computer Science - Machine Learning ,Control and Optimization ,Computational complexity theory ,Computer Networks and Communications ,Computer science ,Process (engineering) ,Real-time computing ,Latency (audio) ,Machine Learning (cs.LG) ,Computer Science - Networking and Internet Architecture ,Reduction (complexity) ,Artificial Intelligence ,FOS: Electrical engineering, electronic engineering, information engineering ,Electrical Engineering and Systems Science - Signal Processing ,Baseline (configuration management) ,Networking and Internet Architecture (cs.NI) ,business.industry ,Deep learning ,Human-Computer Interaction ,Hardware and Architecture ,Artificial intelligence ,business ,Mobile device ,Energy (signal processing)
- Abstract
Indoor localization services are a crucial aspect for the realization of smart cyber-physical systems within cities of the future. Such services are poised to reinvent the process of navigation and tracking of people and assets in a variety of indoor and subterranean environments. The growing ownership of computationally capable smartphones has laid the foundations of portable fingerprinting-based indoor localization through deep learning. However, as the demand for accurate localization increases, the computational complexity of the associated deep learning models increases as well. We present an approach for reducing the computational requirements of a deep learning-based indoor localization framework while maintaining localization accuracy targets. Our proposed methodology is deployed and validated across multiple smartphones and is shown to deliver up to 42% reduction in prediction latency and 45% reduction in prediction energy as compared to the best-known baseline deep learning-based indoor localization model.
- Published
- 2021
21. BiGNoC: Accelerating Big Data Computing with Application-Specific Photonic Network-on-Chip Architectures
- Author
-
Dharanidhar Dang, Sudeep Pasricha, Sai Vineel Reddy Chittamuru, and Rabi N. Mahapatra
- Subjects
020203 distributed computing ,Multicast ,Computer science ,business.industry ,Big data ,02 engineering and technology ,Terabyte ,Bottleneck ,020202 computer hardware & architecture ,Computational Theory and Mathematics ,Computer architecture ,Hardware and Architecture ,Computer cluster ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,business ,Dissemination - Abstract
In the era of big data, high performance data analytics applications are frequently executed on large-scale cluster architectures to accomplish massive data-parallel computations. Often, these applications involve iterative machine learning algorithms to extract information and make predictions from large data sets. Multicast data dissemination is one of the major performance bottlenecks for such data analytics applications in cluster computing, as terabytes of data need to be distributed frequently from a single data source to hundreds of computing nodes. To overcome this bottleneck for big data applications, we propose BiGNoC, a manycore chip platform with a novel application-specific photonic network-on-chip (PNoC) fabric. BiGNoC is designed for big data computing and exploits multicasting in photonic waveguides. For high performance data analytics applications, BiGNoC improves throughput by up to 9.9× while reducing latency by up to 88 percent and energy-per-bit by up to 98 percent over two state-of-the-art PNoC architectures, a broadcast-optimized electrical mesh NoC architecture, and a traditional electrical mesh NoC architecture.
- Published
- 2018
22. Resilience-Aware Resource Management for Exascale Computing Systems
- Author
-
Sudeep Pasricha, Daniel Dauwe, Howard Jay Siegel, and Anthony A. Maciejewski
- Subjects
Mean time between failures ,Control and Optimization ,Computational Theory and Mathematics ,Hardware and Architecture ,Renewable Energy, Sustainability and the Environment ,Computer science ,Distributed computing ,Redundancy (engineering) ,Supercomputer ,Execution time ,Software ,Exascale computing ,Scheduling (computing) - Abstract
With the increases in complexity and number of nodes in large-scale high performance computing (HPC) systems over time, the probability of applications experiencing runtime failures has increased significantly. Projections indicate that exascale-sized systems are likely to operate with mean time between failures (MTBF) of as little as a few minutes. Several strategies have been proposed in recent years for enabling systems of these extreme sizes to be resilient against failures. This work provides a comparison of four state-of-the-art HPC resilience protocols that are being considered for use in exascale systems. We explore the behavior of each resilience protocol operating under the simulated execution of a diverse set of applications and study the performance degradation that a large-scale system experiences from the overhead associated with each resilience protocol as well as the re-computation needed to recover when a failure occurs. Using the results from these analyses, we examine how resource management on exascale systems can be improved by allowing the system to select the optimal resilience protocol depending upon each application's execution characteristics, as well as providing the system resource manager the ability to make scheduling decisions that are “resilience aware” through the use of more accurate execution time predictions.
- Published
- 2018
23. RAPID: Memory-Aware NoC for Latency Optimized GPGPU Architectures
- Author
-
Venkata Yaswanth Raparti and Sudeep Pasricha
- Subjects
Hardware and Architecture ,Control and Systems Engineering ,Network packet ,Computer science ,Parallel computing ,Overlay ,General-purpose computing on graphics processing units ,Latency (engineering) ,Network topology ,Shader ,Memory controller ,Bottleneck ,Information Systems - Abstract
The growing parallelism in most of today's applications has led to an increased demand for parallel computing in processors. General Purpose Graphics Processing Units (GPGPUs) have been used extensively to support highly parallel applications in recent years. Such GPGPUs generate huge volumes of network traffic between memory controllers (MCs) and shader cores. As a result, the network-on-chip (NoC) fabric can become a performance bottleneck, especially for memory intensive applications running on GPGPUs. Traditional mesh-based NoC topologies are not suitable for GPGPUs as they possess high network latency that leads to congestion at MCs and an increase in application execution time. In this article, we propose a novel memory-aware NoC that has two (request and reply) planes tailored to exploit the traffic characteristics in GPGPUs. The request layer consists of low-power, low-latency routers that are optimized for the many-to-few traffic pattern. In the reply layer, flits are sent on fast overlay circuits to reach their destinations in just three cycles (at 1 GHz). In addition, as traditional memory controllers are not aware of the application memory intensity that leads to higher waiting time for applications on the shader cores, we propose an enhanced memory controller that prioritizes burst packets to improve application performance on GPGPUs. Experimental results indicate that our framework yields an improvement of 4–10× in NoC latency, up to 63 percent in execution time, and up to 4× in total energy consumption compared to the state-of-the-art.
- Published
- 2018
24. Minimizing Energy Costs for Geographically Distributed Heterogeneous Data Centers
- Author
-
Eric Jonardi, Mark A. Oxley, Howard Jay Siegel, Ninad Hogade, Sudeep Pasricha, and Anthony A. Maciejewski
- Subjects
Control and Optimization ,Operations research ,Renewable Energy, Sustainability and the Environment ,business.industry ,Computer science ,Electricity pricing ,Workload ,Net metering ,Renewable energy ,Cost reduction ,Computational Theory and Mathematics ,Peak demand ,Hardware and Architecture ,Peaking power plant ,Data center ,business ,Software - Abstract
The recent proliferation and associated high electricity costs of distributed data centers have motivated researchers to study energy-cost minimization at the geo-distributed level. The development of time-of-use (TOU) electricity pricing models and renewable energy source models has provided the means for researchers to reduce these high energy costs through intelligent geographical workload distribution. However, neglecting important considerations such as data center cooling power, interference effects from task co-location in servers, net-metering, and peak demand pricing of electricity has led to sub-optimal results in prior work because these factors have a significant impact on energy costs and performance. We propose a set of workload management techniques that take a holistic approach to the energy minimization problem for geo-distributed data centers. Our approach considers detailed data center cooling power, co-location interference, TOU electricity pricing, renewable energy, net metering, and peak demand pricing distribution models. We demonstrate the value of utilizing such information by comparing against geo-distributed workload management techniques that possess varying amounts of system information. Our simulation results indicate that our best proposed technique is able to achieve a 61 percent (on average) cost reduction compared to state-of-the-art prior work.
- Published
- 2018
25. LIBRA: Thermal and Process Variation Aware Reliability Management in Photonic Networks-on-Chip
- Author
-
Sudeep Pasricha, Sai Vineel Reddy Chittamuru, and Ishan G. Thakkar
- Subjects
business.industry ,Computer science ,Thread (computing) ,Chip ,Process variation ,Resonator ,Tree traversal ,Hardware and Architecture ,Control and Systems Engineering ,Hardware_INTEGRATEDCIRCUITS ,Electronic engineering ,System on a chip ,Trimming ,Photonics ,business ,Information Systems - Abstract
Silicon nanophotonics technology is being considered for future networks-on-chip (NoCs) as it can enable high bandwidth density and lower latency with traversal of data at the speed of light. However, the operation of photonic NoCs (PNoCs) is very sensitive to on-chip temperature and process variations. These variations can create significant reliability issues for PNoCs. For example, a microring resonator (MR) may resonate at another wavelength instead of its designated wavelength due to thermal and/or process variations, which can lead to bandwidth wastage and data corruption in PNoCs. This paper proposes a novel run-time framework called LIBRA to overcome temperature- and process variation-induced reliability issues in PNoCs. The framework consists of (i) a device-level reactive MR assignment mechanism that dynamically assigns a group of MRs to reliably modulate/receive data in a waveguide based on the chip thermal and process variation characteristics; and (ii) a system-level proactive thread migration technique to avoid on-chip thermal threshold violations and reduce MR tuning/trimming power by dynamically migrating threads between cores. Our simulation results indicate that LIBRA can reliably satisfy on-chip thermal thresholds and maintain high network bandwidth while reducing total power by up to 61.3 percent, and thermal tuning/trimming power by up to 76.2 percent over state-of-the-art thermal and process variation aware solutions.
- Published
- 2018
26. Advanced Driver-Assistance Systems: A Path Toward Autonomous Vehicles
- Author
-
Jordan A. Tunnell, Sudeep Pasricha, Vipin Kumar Kukkala, and Thomas H. Bradley
- Subjects
Computer science ,business.industry ,020206 networking & telecommunications ,Ranging ,Advanced driver assistance systems ,02 engineering and technology ,Sensor fusion ,Computer Science Applications ,law.invention ,Human-Computer Interaction ,Lidar ,Software ,Hardware and Architecture ,Feature (computer vision) ,law ,0202 electrical engineering, electronic engineering, information engineering ,Systems engineering ,Key (cryptography) ,020201 artificial intelligence & image processing ,Electrical and Electronic Engineering ,Radar ,business - Abstract
Advanced driver-assistance systems (ADASs) have become a salient feature for safety in modern vehicles. They are also a key underlying technology in emerging autonomous vehicles. State-of-the-art ADASs are primarily vision based, but light detection and ranging (lidar), radio detection and ranging (radar), and other advanced-sensing technologies are also becoming popular. In this article, we present a survey of different hardware and software ADAS technologies and their capabilities and limitations. We discuss approaches used for vision-based recognition and sensor fusion in ADAS solutions. We also highlight challenges for the next generation of ADASs.
- Published
- 2018
27. Mixed-criticality scheduling on heterogeneous multicore systems powered by energy harvesting
- Author
-
Yi Xiang and Sudeep Pasricha
- Subjects
010302 applied physics ,Mixed criticality ,Multi-core processor ,Exploit ,Job shop scheduling ,Computer science ,Distributed computing ,02 engineering and technology ,Energy budget ,01 natural sciences ,020202 computer hardware & architecture ,Scheduling (computing) ,Hardware and Architecture ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Electrical and Electronic Engineering ,Performance improvement ,Energy harvesting ,Software - Abstract
In this paper, we address the scheduling problem for single-ISA heterogeneous multicore processors running hybrid mixed-criticality workloads with a limited and fluctuating energy budget provided by solar energy harvesting. The hybrid workloads consist of a set of firm-deadline timing-centric applications and a set of soft-deadline throughput-centric multithreaded applications. Our framework exploits traits of the different types of cores in heterogeneous multicore systems to service timing-centric workloads with a few big out-of-order cores, while servicing throughput-centric workloads with many smaller in-order cores clocked in the energy-efficient near-threshold computing (NTC) region. Guided by a novel timing intensity metric, our mixed-criticality scheduling framework creates an optimized schedule that minimizes overall miss penalty for a time-varying energy budget. Experimental results indicate that our framework achieves a 9.5% miss penalty reduction with the proposed timing intensity metric compared to metrics from prior work, a 13.6% performance improvement over a state-of-the-art scheduling approach for single-ISA heterogeneous platforms, and a 23.2% performance benefit from exploiting platform heterogeneity.
- Published
- 2018
28. Rate-based thermal, power, and co-location aware resource management for heterogeneous data centers
- Author
-
Sudeep Pasricha, Howard Jay Siegel, Mark A. Oxley, Gregory A. Koenig, Eric Jonardi, Patrick J. Burns, and Anthony A. Maciejewski
- Subjects
Computer Networks and Communications ,Computer science ,business.industry ,Distributed computing ,Thermal power station ,Symmetric multiprocessor system ,Workload ,02 engineering and technology ,020202 computer hardware & architecture ,Theoretical Computer Science ,Artificial Intelligence ,Hardware and Architecture ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Resource allocation ,Resource management ,Data center ,Cache ,Electricity ,business ,Software - Abstract
Today’s data centers contain large numbers of compute nodes that require substantial power, and therefore require a large amount of cooling resources to operate at a reliable temperature. The high power consumption of the computing and cooling systems produces extraordinary electricity costs, requiring some data center operators to be constrained by a specified electricity budget. In addition, the processors within these systems contain a large number of cores with shared resources (e.g., last-level cache), heavily affecting the performance of tasks that are co-located on cores and contend for these resources. This problem is only exacerbated as processors move to the many-core realm. These issues lead to interesting performance-power tradeoffs; by considering resource management in a holistic fashion, the performance of the computing system can be maximized while satisfying power and temperature constraints. In this work, the performance of the system is quantified as the total reward earned from completing tasks by their individual deadlines. By designing three resource allocation techniques, we perform a rigorous analysis on thermal, power, and co-location aware resource management using two different facility configurations, three different workload environments, and a sensitivity analysis of the power and thermal constraints.
- Published
- 2018
29. HYDRA: Heterodyne Crosstalk Mitigation With Double Microring Resonators and Data Encoding for Photonic NoCs
- Author
-
Sudeep Pasricha, Sai Vineel Reddy Chittamuru, and Ishan G. Thakkar
- Subjects
010302 applied physics ,Heterodyne ,business.industry ,Computer science ,Detector ,02 engineering and technology ,Dissipation ,01 natural sciences ,Waveguide (optics) ,020202 computer hardware & architecture ,law.invention ,Crosstalk ,Resonator ,Hardware and Architecture ,law ,Wavelength-division multiplexing ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Electronic engineering ,Electrical and Electronic Engineering ,Photonics ,business ,Waveguide ,Software ,Intermodulation - Abstract
Silicon-photonic networks on chip (PNoCs) provide high bandwidth with lower data-dependent power dissipation than traditional electrical NoCs (ENoCs); therefore, they are promising candidates to replace ENoCs in future manycore chips. PNoCs typically employ photonic waveguides with dense wavelength division multiplexing (DWDM) for signal traversal and microring resonators (MRs) for signal modulation. Unfortunately, DWDM increases susceptibility to intermodulation (IM) and off-resonance filtering effects, which reduce optical signal-to-noise ratio (OSNR) for photonic data transfers. Additionally, process variations (PVs) induce variations in the width and thickness of MRs, causing resonance wavelength shifts that further reduce OSNR and create communication errors. This paper proposes a novel cross-layer framework called HYDRA to mitigate heterodyne crosstalk due to PVs, off-resonance filtering, and IM effects in PNoCs. The framework consists of two device-level mechanisms and a circuit-level mechanism to improve heterodyne crosstalk resilience in PNoCs. Simulation results on three PNoC architectures indicate that HYDRA can improve the worst-case OSNR by up to 5.3× and significantly enhance the reliability of DWDM-based PNoC architectures.
- Published
- 2018
30. Indoor Localization with Smartphones: Harnessing the Sensor Suite in Your Pocket
- Author
-
Sudeep Pasricha, Saideep Tiku, and Christopher Langlois
- Subjects
business.industry ,Computer science ,Reliability (computer networking) ,Distributed computing ,Suite ,02 engineering and technology ,Energy consumption ,020202 computer hardware & architecture ,Computer Science Applications ,law.invention ,Variety (cybernetics) ,Human-Computer Interaction ,Bluetooth ,Hardware and Architecture ,law ,Embedded system ,0202 electrical engineering, electronic engineering, information engineering ,Global Positioning System ,020201 artificial intelligence & image processing ,Use case ,Electrical and Electronic Engineering ,business - Abstract
The need for indoor localization systems that can provide reliable access to location information in areas that are not serviced sufficiently by a global positioning system (GPS) has continued to grow. There are a wide variety of use cases for this localization data and increasing interest from industry, academia, and government agencies that has fueled research in this area. Smartphones are uniquely positioned to be a critical part of a localization solution based on the proliferation of these devices and the diverse array of sensors and radios that they contain. In this article, the capabilities of these sensors are explored along with the benefits and drawbacks of each for localization. Various methods for employing these sensors are surveyed. Many localization systems currently being explored utilize a combination of complementary methods to enhance accuracy and reliability and decrease energy consumption of the overall system. Several of these localization frameworks are also explored. Finally, we describe the major challenges that are being faced in current research on indoor localization with smartphones, as they are critical for charting the path for future advances in indoor localization.
- Published
- 2017
31. SWIFTNoC
- Author
-
Sudeep Pasricha, Sai Vineel Reddy Chittamuru, and Srinivas Desai
- Subjects
010302 applied physics ,Engineering ,Multi-core processor ,Silicon photonics ,Multicast ,business.industry ,Throughput ,02 engineering and technology ,Chip ,01 natural sciences ,020202 computer hardware & architecture ,Hardware and Architecture ,Embedded system ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Bandwidth (computing) ,Electrical and Electronic Engineering ,Photonics ,business ,Software ,Communication channel - Abstract
On-chip communication is widely considered to be one of the major performance bottlenecks in contemporary chip multiprocessors (CMPs). With recent advances in silicon nanophotonics, photonics-based network-on-chip (NoC) architectures are being considered as a viable solution to support communication in future CMPs as they can enable higher bandwidth and lower power dissipation compared to traditional electrical NoCs. In this article, we present SwiftNoC, a novel reconfigurable silicon-photonic NoC architecture that features improved multicast-enabled channel sharing, as well as dynamic re-prioritization and exchange of bandwidth between clusters of cores running multiple applications, to increase channel utilization and system performance. Experimental results show that SwiftNoC improves throughput by up to 25.4× while reducing latency by up to 72.4% and energy-per-bit by up to 95% over state-of-the-art solutions.
- Published
- 2017
32. ARTEMIS: An Aging-Aware Runtime Application Mapping Framework for 3D NoC-Based Chip Multiprocessors
- Author
-
Venkata Yaswanth Raparti, Sudeep Pasricha, and Nishit Kapadia
- Subjects
Engineering ,Computer science ,business.industry ,Reliability (computer networking) ,Hardware_PERFORMANCEANDRELIABILITY ,Chip ,Electromigration ,Die (integrated circuit) ,Power (physics) ,Hardware and Architecture ,Control and Systems Engineering ,Embedded system ,Hardware_INTEGRATEDCIRCUITS ,System on a chip ,business ,Information Systems ,Electronic circuit ,Degradation (telecommunications) - Abstract
In emerging 3D NoC-based chip multiprocessors (CMPs), aging in circuits due to bias temperature instability (BTI) stress is expected to cause gate-delay degradation that, if left unchecked, can lead to untimely failure. Simultaneously, the effects of electromigration (EM) induced aging in the on-chip wires, especially those in the 3D power delivery network (PDN), are expected to notably reduce chip lifetime. A commonly proposed solution to mitigate circuit-slowdown due to aging is to hike the supply voltage; however, this increases current-densities in the PDN due to the increased power consumption on the die, which in turn expedites PDN-aging. We thus note that mechanisms to enhance lifetime reliability in 3D NoC-based CMPs must consider circuit-aging together with PDN-aging. In this paper, we propose a novel runtime framework (ARTEMIS) for intelligent dynamic application-mapping and voltage-scaling to simultaneously manage aging in circuits and the PDN, and enhance the performance and lifetime of 3D NoC-based CMPs. We also propose an aging-enabled routing algorithm that balances the degree of aging between NoC routers and cores, thereby increasing the combined lifetime of both. Our framework also considers dark-silicon power constraints that are becoming a major design challenge in scaled technologies, particularly for 3D stacked CMPs. Our experimental results indicate that ARTEMIS enables the execution of 25 percent more applications over the chip lifetime compared to state-of-the-art prior work.
- Published
- 2017
33. A Runtime Framework for Robust Application Scheduling With Adaptive Parallelism in the Dark-Silicon Era
- Author
-
Nishit Kapadia and Sudeep Pasricha
- Subjects
Multi-core processor ,Computer science ,business.industry ,020208 electrical & electronic engineering ,Multiprocessing ,02 engineering and technology ,Integrated circuit ,Energy consumption ,Chip ,020202 computer hardware & architecture ,law.invention ,Dynamic voltage scaling ,Reliability (semiconductor) ,Hardware and Architecture ,law ,Embedded system ,Dark silicon ,0202 electrical engineering, electronic engineering, information engineering ,Electrical and Electronic Engineering ,business ,Software - Abstract
With deeper technology scaling accompanied by a worsening power wall, an increasing proportion of chip area on a chip multiprocessor (CMP) is expected to be occupied by dark silicon. At the same time, design challenges due to process variations and soft errors in integrated circuits are projected to become even more severe. It is well known that spatial variations in process parameters introduce significant unpredictability in the performance and power profiles of CMP cores. By mapping applications onto the best set of cores, process variations can potentially be used to our advantage in the dark-silicon era. In addition, the probability of occurrence of soft errors during application execution has been found to be strongly related to the supply voltage and operating frequency values, thus necessitating reliability awareness within runtime voltage scaling schemes in contemporary CMPs. In this paper, we present a novel framework that leverages the knowledge of variations on the chip to perform runtime application mapping and dynamic voltage scaling to optimize system performance and energy, while satisfying dark-silicon power constraints of the chip as well as application-specific performance and reliability constraints. Our experimental results show average savings of 10%–71% in application service times and 13%–38% in energy consumption, compared with prior work.
- Published
- 2017
34. Guest Editors’ Introduction: Design and Management of Mobile Platforms: From Smartphones to Wearable Devices
- Author
-
Umit Y. Ogras, Michael Kishinevsky, Raid Ayoub, and Sudeep Pasricha
- Subjects
Computer science ,business.industry ,Mobile computing ,Wearable computer ,Energy consumption ,System requirements ,Electric power system ,User experience design ,Hardware and Architecture ,Embedded system ,Resource management ,Electrical and Electronic Engineering ,business ,Software ,Wearable technology - Abstract
There are close to five million apps that run on more than a billion smartphones as of 2020 [1]. This number is likely to continue increasing rapidly with the technological advancements in mobile computing, smaller form-factor wearable computers, and Internet-of-Things (IoT) devices. Although form factors and specific system requirements vary, mobile platforms share common design goals, which include energy-efficiency, competitive performance, battery life, and reliability. Competitive performance requires faster operating frequency and leads to higher power consumption. In turn, power consumption increases the junction and skin temperatures, which have adverse effects on the device reliability and user experience. Therefore, highly heterogeneous systems-on-chips (SoCs) are required to achieve the performance requirements in terms of tight power consumption, energy, and cost. The design of these platforms remains challenging. Moreover, application development, let alone aggressive optimization, is notoriously difficult and time-consuming when utilizing highly specialized accelerators. The optimization problem is exacerbated by dynamic variations of application workloads and operating conditions. As a result, there is a need for novel software- and hardware-based adaptive resource management approaches that consider the platform as a whole, rather than focusing on a subset of the target system.
- Published
- 2020
35. Guest Editors’ Introduction: Emerging Networks-on-Chip – Designs, Technologies, and Applications
- Author
-
Edoardo Fusella, Jose Flich, Ian O'Connor, Sudeep Pasricha, Mahdi Nikdast, INL - Conception de Systèmes Hétérogènes (INL - CSH), Institut des Nanotechnologies de Lyon (INL), École Centrale de Lyon (ECL), Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École supérieure de Chimie Physique Electronique de Lyon (CPE)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-École Centrale de Lyon (ECL), and Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
010302 applied physics ,Computer science ,02 engineering and technology ,01 natural sciences ,020202 computer hardware & architecture ,Computer architecture ,Hardware and Architecture ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Electrical and Electronic Engineering ,Networks on chip ,[SPI.NANO]Engineering Sciences [physics]/Micro and nanotechnologies/Microelectronics ,Software ,ComputingMilieux_MISCELLANEOUS - Abstract
International audience
- Published
- 2019
36. HPC node performance and energy modeling with the co-location of applications
- Author
-
Sudeep Pasricha, Howard Jay Siegel, David A. Bader, Daniel Dauwe, Eric Jonardi, Ryan Friese, and Anthony A. Maciejewski
- Subjects
020203 distributed computing ,Multi-core processor ,Xeon ,Computer science ,Distributed computing ,Energy modeling ,02 engineering and technology ,Execution time ,020202 computer hardware & architecture ,Theoretical Computer Science ,Scheduling (computing) ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Software ,Information Systems - Abstract
Multicore processors have become an integral part of modern large-scale and high-performance parallel and distributed computing systems. Unfortunately, applications co-located on multicore processors can suffer from decreased performance and increased dynamic energy use as a result of interference in shared resources, such as memory. As this interference is difficult to characterize, assumptions about application execution time and energy usage can be misleading in the presence of co-location. Consequently, it is important to accurately characterize the performance and energy usage of applications that execute in a co-located manner on these architectures. This work investigates some of the disadvantages of co-location, and presents a methodology for building models capable of utilizing varying amounts of information about a target application and its co-located applications to make predictions about the target application’s execution time and the system’s energy use under arbitrary co-locations of a wide range of application types. The proposed methodology is validated on three different server class Intel Xeon multicore processors using eleven applications from two scientific benchmark suites. The model’s utility for scheduling is also demonstrated in a simulated large-scale high-performance computing environment through the creation of a co-location aware scheduling heuristic. This heuristic demonstrates that scheduling using information generated with the proposed modeling methodology is capable of making significant improvements over a scheduling heuristic that is oblivious to co-location interference.
- Published
- 2016
- Full Text
- View/download PDF
37. A System-Level Cosynthesis Framework for Power Delivery and On-Chip Data Networks in Application-Specific 3-D ICs
- Author
-
Nishit Kapadia and Sudeep Pasricha
- Subjects
Engineering ,business.industry ,020208 electrical & electronic engineering ,Probabilistic logic ,Multiprocessing ,Hardware_PERFORMANCEANDRELIABILITY ,02 engineering and technology ,MPSoC ,020202 computer hardware & architecture ,Power (physics) ,Network on a chip ,Hardware and Architecture ,Embedded system ,0202 electrical engineering, electronic engineering, information engineering ,System on a chip ,Electrical and Electronic Engineering ,Routing (electronic design automation) ,business ,Metaheuristic ,Software - Abstract
With increasing core counts ushering in power-constrained 3-D multiprocessor system-on-chips (MPSoCs), optimizing communication power dissipated by the 3-D network-on-chip (NoC) fabric is critical. At the same time, with increased power densities in 3-D ICs, problems of IR drops in the power delivery network (PDN) as well as thermal hot spots on the 3-D die are becoming very severe. Even though the PDN and NoC design goals are nonoverlapping, the two optimizations are interdependent. Unfortunately, designers today seldom consider the design of the PDN while designing NoCs. Moreover, for each new configuration of computation core and communication mapping on an MPSoC, the corresponding intercore communication patterns, 3-D on-chip thermal profile, as well as IR-drop distribution in the PDN can vary significantly. Based on this observation, we propose a novel design-time system-level application-specific cosynthesis framework that intelligently maps computation and communication resources on a die, for a given workload. The goal is to minimize the NoC power as well as chip-cooling power and optimize the 3-D PDN architecture, while meeting performance goals and satisfying thermal constraints, for a microfluidic cooling-based application-specific 3-D MPSoC. Our experimental results indicate that the proposed 3-D NoC-PDN cosynthesis framework is not only able to meet PDN design goals, unlike prior 3-D NoC synthesis approaches, but also provides better overall optimality, with a solution quality improvement of up to 35.4% over a probabilistic metaheuristic-based cooptimization approach proposed in prior work.
- Published
- 2016
- Full Text
- View/download PDF
38. A Hidden Markov Model based smartphone heterogeneity resilient portable indoor localization framework
- Author
-
Saideep Tiku, Qi Han, Branislav M. Notaros, and Sudeep Pasricha
- Subjects
010302 applied physics ,060102 archaeology ,Computer science ,business.industry ,Real-time computing ,06 humanities and the arts ,01 natural sciences ,Porting ,Variety (cybernetics) ,Software ,Hardware and Architecture ,Application domain ,0103 physical sciences ,Scalability ,0601 history and archaeology ,Transceiver ,Hidden Markov model ,business ,Mobile device - Abstract
Indoor localization is an emerging application domain that promises to enhance the way we navigate in various indoor environments, as well as track equipment and people. Wireless signal-based fingerprinting is one of the leading approaches for indoor localization. Using ubiquitous Wi-Fi access points and Wi-Fi transceivers in smartphones has enabled fingerprinting-based localization techniques that are scalable and low-cost. But the variety of Wi-Fi hardware modules and software stacks used in today's smartphones introduces errors when Wi-Fi based fingerprinting approaches are used across devices, which reduces localization accuracy. We propose a framework called SHERPA-HMM that enables efficient porting of indoor localization techniques across mobile devices, to maximize accuracy. An in-depth analysis of our framework shows that it can deliver up to 8× more accurate results as compared to state-of-the-art localization techniques for a variety of environments.
- Published
- 2020
- Full Text
- View/download PDF
39. Run-Time Management for Multicore Embedded Systems With Energy Harvesting
- Author
-
Sudeep Pasricha and Yi Xiang
- Subjects
Multi-core processor ,Engineering ,business.industry ,Heuristic (computer science) ,Energy management ,Workload ,Task (project management) ,Hardware and Architecture ,Embedded system ,Overhead (computing) ,Electrical and Electronic Engineering ,Heuristics ,business ,Frequency scaling ,Software - Abstract
In this paper, we propose a novel framework for runtime energy and workload management in multicore embedded systems with solar energy harvesting and a periodic hard real-time task set as the workload. Compared with prior work, our framework makes several novel contributions and possesses several advantages, including the following: 1) a semidynamic scheduling heuristic that dynamically adapts to runtime harvested power variations without losing the consistency of periodic tasks; 2) a battery–supercapacitor hybrid energy storage module for more efficient system energy management; 3) a coarse-grained core shutdown heuristic for additional energy saving; 4) energy budget planning and task allocation heuristics with process variation tolerance; 5) a novel dual-speed method specifically designed for periodic tasks to address discrete frequency levels and dynamic voltage/frequency scaling switching overhead at the core level; and 6) an extension to prepare the system for thermal issues arising at runtime during extreme environmental conditions. The experimental studies show that our framework results in a reduction in task miss rate by up to 70% and task miss penalty by up to 65% compared with the best known prior work.
- Published
- 2015
- Full Text
- View/download PDF
40. Makespan and Energy Robust Stochastic Static Resource Allocation of a Bag-of-Tasks to a Heterogeneous Computing System
- Author
-
Dalton Young, Yong Zou, Anthony A. Maciejewski, Jay Smith, Sudeep Pasricha, Jonathan Apodaca, Bhavesh Khemka, Adrian Ramirez, Shirish Bahirat, Mark A. Oxley, Luis Diego Briceno, and Howard Jay Siegel
- Subjects
Job shop scheduling ,Operations research ,business.industry ,Computer science ,Distributed computing ,Workload ,Symmetric multiprocessor system ,Energy consumption ,Energy budget ,Computational Theory and Mathematics ,Hardware and Architecture ,Signal Processing ,Resource allocation ,Data center ,Resource management ,Cache ,Heuristics ,business - Abstract
Today’s data centers face the issue of balancing electricity use and completion times of their workloads. Rising electricity costs are forcing data center operators to either operate within an electricity budget or to reduce electricity use as much as possible while still maintaining service agreements. Energy-aware resource allocation is one technique a system administrator can employ to address both problems: optimizing the workload completion time (makespan) when given an energy budget, or to minimize energy consumption subject to service guarantees (such as adhering to deadlines). In this paper, we study the problem of energy-aware static resource allocation in an environment where a collection of independent (non-communicating) tasks (“bag-of-tasks”) is assigned to a heterogeneous computing system. Computing systems often operate in environments where task execution times vary (e.g., due to cache misses or data dependent execution times). We model these execution times stochastically, using probability density functions. We want our resource allocations to be robust against these variations, where we define energy-robustness as the probability that the energy budget is not violated, and makespan-robustness as the probability a makespan deadline is not violated. We develop and analyze several heuristics for energy-aware resource allocation for both energy-constrained and deadline-constrained problems.
- Published
- 2015
- Full Text
- View/download PDF
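Note on entry 40: the abstract defines energy-robustness as the probability that the energy budget is not violated, and makespan-robustness as the probability that a makespan deadline is not violated. The sketch below shows one way such probabilities could be estimated by Monte Carlo sampling for a fixed static allocation. It is an illustration only, not the authors' method: the function name `estimate_robustness`, the Gaussian execution-time model, the constant per-task power draw, and the omission of idle energy are all assumptions introduced here.

```python
import random


def estimate_robustness(allocation, energy_budget, deadline, trials=10_000, seed=0):
    """Estimate (energy-robustness, makespan-robustness) for a static allocation.

    allocation: one list per machine; each task is a tuple
                (mean_time_s, std_time_s, power_watts).  ASSUMPTION: execution
                times are Gaussian and each task draws constant power.
    Returns the fraction of sampled runs that stay within the energy budget
    and the fraction that finish by the deadline.
    """
    rng = random.Random(seed)  # fixed seed so repeated calls agree
    energy_ok = 0
    makespan_ok = 0
    for _ in range(trials):
        total_energy = 0.0
        makespan = 0.0
        for machine_tasks in allocation:
            finish_time = 0.0
            for mean_t, std_t, power in machine_tasks:
                # Sample one execution time; clamp at zero (hypothetical model).
                t = max(0.0, rng.gauss(mean_t, std_t))
                finish_time += t           # tasks on a machine run back to back
                total_energy += power * t  # idle energy is ignored (assumption)
            makespan = max(makespan, finish_time)
        energy_ok += total_energy <= energy_budget
        makespan_ok += makespan <= deadline
    return energy_ok / trials, makespan_ok / trials


# Hypothetical two-machine allocation, used only to exercise the function.
example_allocation = [
    [(10.0, 1.0, 50.0), (5.0, 0.5, 50.0)],
    [(12.0, 2.0, 80.0)],
]
energy_rob, makespan_rob = estimate_robustness(example_allocation, 1800.0, 16.0)
```

Under these assumptions, a resource allocation heuristic could compare candidate allocations by one robustness value while treating the other as a constraint, which mirrors the energy-constrained and deadline-constrained problem variants named in the abstract.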
41. Soft and Hard Reliability-Aware Scheduling for Multicore Embedded Systems with Energy Harvesting
- Author
-
Yi Xiang and Sudeep Pasricha
- Subjects
Engineering ,Schedule ,Multi-core processor ,business.industry ,Workload ,System dynamics ,Scheduling (computing) ,Hardware and Architecture ,Control and Systems Engineering ,Embedded system ,Energy supply ,business ,Energy harvesting ,Information Systems ,Efficient energy use - Abstract
For multicore embedded systems powered by energy harvesting, it is necessary to develop intelligent resource allocation techniques that adjust the application execution strategy on-the-fly to adapt to changing energy supply from the harvesting system. To cope with the complexity of managing applications with data dependencies on such systems, we propose a hybrid design-time/run-time framework for resource allocation that takes into consideration variations in solar radiance and execution time, transient faults, and permanent faults due to aging effects. Our framework generates schedule templates at design-time with an emphasis on energy efficiency and uses lightweight online management schemes to react to run-time system dynamics. Experimental results indicate that our framework presents improvements in performance and adaptivity, with up to 23.2 percent miss rate reduction compared to prior work, 43.6 percent performance benefits from adaptive run-time workload management, and up to 24.5 percent expected system lifetime improvement with aging-aware allocation of workload partitions.
- Published
- 2015
- Full Text
- View/download PDF
42. 3-D WiRED: A Novel WIDE I/O DRAM With Energy-Efficient 3-D Bank Organization
- Author
-
Ishan G. Thakkar and Sudeep Pasricha
- Subjects
Hardware_MEMORYSTRUCTURES ,Computer science ,business.industry ,Energy consumption ,CAS latency ,Hardware and Architecture ,Universal memory ,Embedded system ,Memory architecture ,Electrical and Electronic Engineering ,Architecture ,Latency (engineering) ,business ,Software ,Dram ,Efficient energy use - Abstract
Editor’s notes: WIDE I/O DRAM is a promising 3-D memory architecture for low-power/high-performance computing. This paper proposes a new WIDE I/O DRAM architecture that reduces access latency and energy consumption at the same time, showing the possibility of further optimizing the WIDE I/O DRAM architecture and the impact of TSV usage in the memory architecture on performance and energy consumption.
- Published
- 2015
- Full Text
- View/download PDF
43. 3D-ProWiz: An Energy-Efficient and Optically-Interfaced 3D DRAM Architecture with Reduced Data Access Overhead
- Author
-
Sudeep Pasricha and Ishan G. Thakkar
- Subjects
Hardware_MEMORYSTRUCTURES ,Computer science ,business.industry ,Parallel computing ,Energy consumption ,CAS latency ,Hardware and Architecture ,Control and Systems Engineering ,Embedded system ,Memory architecture ,Overhead (computing) ,Latency (engineering) ,business ,Dram ,Random access ,Information Systems ,Efficient energy use - Abstract
This paper introduces 3D-ProWiz, which is a high-bandwidth, energy-efficient, optically-interfaced 3D DRAM architecture with fine-grained data organization and activation. 3D-ProWiz integrates sub-bank level 3D partitioning of the data array to enable fine-grained activation and greater memory parallelism. A novel method of routing the internal memory bus to individual subarrays using TSVs and fanout buffers enables 3D-ProWiz to use smaller dimension subarrays without significant area overhead. The use of TSVs at subarray-level granularity eliminates the need to use slow and power-hungry global lines, which in turn reduces the random access latency and activation-precharge energy. 3D-ProWiz yields the best latency and energy consumption values per access among other well-known 3D DRAM architectures. Experimental results with PARSEC benchmarks indicate that 3D-ProWiz achieves 41.9 percent reduction in average latency, 52 percent reduction in average power, and 80.6 percent reduction in energy-delay product (EDP) on average over DRAM architectures from prior work.
- Published
- 2015
- Full Text
- View/download PDF
44. A middleware framework for application-aware and user-specific energy optimization in smart mobile devices
- Author
-
Chris Ohlsen, Sudeep Pasricha, and Brad K. Donohoo
- Subjects
Ubiquitous computing ,Computer Networks and Communications ,business.industry ,Computer science ,Energy consumption ,Backlight ,Computer Science Applications ,Hardware and Architecture ,Embedded system ,Specific energy ,Markov decision process ,Android (operating system) ,business ,Classifier (UML) ,Mobile device ,Software ,Information Systems - Abstract
Mobile battery-operated devices are becoming an essential instrument for business, communication, and social interaction. In addition to the demand for an acceptable level of performance and a comprehensive set of features, users often desire extended battery lifetime. In fact, limited battery lifetime is one of the biggest obstacles facing the current utility and future growth of increasingly sophisticated "smart" mobile devices. This paper proposes a novel application-aware and user-interaction aware energy optimization middleware framework (AURA) for pervasive mobile devices. AURA optimizes CPU and screen backlight energy consumption while maintaining a minimum acceptable level of performance. The proposed framework employs a novel Bayesian application classifier and management strategies based on Markov Decision Processes and Q-Learning to achieve energy savings. Real-world user evaluation studies on Google Android based HTC Dream and Google Nexus One smartphones running the AURA framework demonstrate promising results, with up to 29% energy savings compared to the baseline device manager, and up to 5× savings over prior work on CPU and backlight energy co-optimization.
- Published
- 2015
- Full Text
- View/download PDF
45. Crosstalk Mitigation for High-Radix and Low-Diameter Photonic NoC Architectures
- Author
-
Sudeep Pasricha and Sai Vineel Reddy Chittamuru
- Subjects
Engineering ,Multi-core processor ,Interconnection ,business.industry ,Detector ,Crosstalk ,Resonator ,Network on a chip ,Hardware and Architecture ,Electronic engineering ,Electrical and Electronic Engineering ,Photonics ,Crossbar switch ,business ,Software - Abstract
Photonic Network-on-chip (PNoC) is a promising alternative to design low-power and high-bandwidth interconnection infrastructure for multicore chips. The micro ring resonators, which are essential building blocks for designing PNoCs are susceptible to crosstalk that can notably degrade signal-to-noise ratio (SNR), reducing reliability of PNoCs. This paper proposes two novel encoding mechanisms to improve worst-case SNR by reducing crosstalk noise in microring resonators used within high-radix and low-diameter crossbar-based PNoCs.
- Published
- 2015
- Full Text
- View/download PDF
46. Silicon Nanophotonics for Future Multicore Architectures: Opportunities and Challenges
- Author
-
Yi Xu and Sudeep Pasricha
- Subjects
Network architecture ,Multi-core processor ,Silicon ,Computer science ,Nanophotonics ,chemistry.chemical_element ,Hardware_PERFORMANCEANDRELIABILITY ,ComputerSystemsOrganization_PROCESSORARCHITECTURES ,Bandwidth allocation ,chemistry ,Hardware and Architecture ,Hardware_INTEGRATEDCIRCUITS ,Electronic engineering ,Electrical and Electronic Engineering ,Software - Abstract
This article surveys emerging silicon nanophotonic architectures, protocols, and components, for intra-chip and inter-chip communication. It also examines the challenges in silicon nanophotonics research, presents some of the more prominent emerging solutions, and provides a critical view of their advantages and limitations.
- Published
- 2014
- Full Text
- View/download PDF
47. METEOR
- Author
-
Sudeep Pasricha and Shirish Bahirat
- Subjects
Multi-core processor ,business.industry ,Computer science ,Mesh networking ,ComputerSystemsOrganization_PROCESSORARCHITECTURES ,Chip ,Network on a chip ,Hardware and Architecture ,Embedded system ,Scalability ,Hardware_INTEGRATEDCIRCUITS ,Photonics ,business ,Massively parallel ,Software ,Data transmission - Abstract
With increasing application complexity and improvements in process technology, Chip MultiProcessors (CMPs) with tens to hundreds of cores on a chip are becoming a reality. Networks-on-Chip (NoCs) have emerged as scalable communication fabrics that can support high bandwidths for these massively parallel multicore systems. However, traditional electrical NoC implementations still need to overcome the challenges of high data transfer latencies and large power consumption. On-chip photonic interconnects with high performance-per-watt characteristics have recently been proposed as an alternative to address these challenges for intra-chip communication. In this article, we explore using low-cost photonic interconnects on a chip to enhance traditional electrical NoCs. Our proposed hybrid photonic ring-mesh NoC (METEOR) utilizes a configurable photonic ring waveguide coupled to a traditional 2D electrical mesh NoC. Experimental results indicate a strong motivation to consider the proposed architecture for future CMPs, as it can provide about 5× reduction in power consumption and improved throughput and access latencies, compared to traditional electrical 2D mesh and torus NoC architectures. Compared to other previously proposed hybrid photonic NoC fabrics such as the hybrid photonic torus, Corona, and Firefly, our proposed fabric is also shown to have lower photonic area overhead, power consumption, and energy-delay product, while maintaining competitive throughput and latency.
- Published
- 2014
- Full Text
- View/download PDF
48. Guest Editorial: Special Issue on Low-Power Dependable Computing
- Author
-
Sudeep Pasricha, Man Lin, Dakai Zhu, and Muhammad Shafique
- Subjects
Control and Optimization ,Renewable Energy, Sustainability and the Environment ,business.industry ,Computer science ,Distributed computing ,Reliability (computer networking) ,Fault tolerance ,Energy consumption ,Application software ,computer.software_genre ,Software ,Computational Theory and Mathematics ,Hardware and Architecture ,Transient (computer programming) ,Compiler ,business ,computer ,Efficient energy use - Abstract
The papers in this special section focus on low power dependable computing systems (LPDC). Faults (especially transient faults that lead to soft errors) have become more common due to the miniaturization of computing systems with continuously scaled technology sizes. Thus, it is imperative for most modern computing systems to deploy one or more types of fault-tolerance techniques. Traditionally, fault tolerance has been achieved through various error reduction, detection, and recovery techniques at different levels of the hardware/software stacks (e.g., circuit, architecture, operating systems, compiler, and application software), which generally incur power and energy overheads. Given the fact that energy has become a first-class system resource (especially for battery-powered mobile and IoT devices), it is important to understand the interdependencies between system reliability and power/energy consumption, and further investigate techniques that can address their tradeoffs. This special issue on Low-Power Dependable Computing (LPDC) seeks to tackle these challenges by exploring novel and bold ideas to achieve energy-efficient reliable computations in modern computing systems.
- Published
- 2018
- Full Text
- View/download PDF
49. MultiMaKe
- Author
-
Nikil Dutt, Sudeep Pasricha, Luis Angel D. Bathen, and Yongjin Ahn
- Subjects
Computer science ,Design flow ,Multiprocessing ,computer.file_format ,Parallel computing ,Chip ,Power budget ,Scheduling (computing) ,Software pipelining ,Hardware and Architecture ,JPEG 2000 ,computer ,Software ,Scratchpad memory - Abstract
The increasing demand for low-power and high-performance multimedia embedded systems has motivated the need for effective solutions to satisfy application bandwidth and latency requirements under a tight power budget. As technology scales, it is imperative that applications are optimized to take full advantage of the underlying resources and meet both power and performance requirements. We propose MultiMaKe, an application mapping design flow capable of discovering and enabling parallelism opportunities via code transformations, efficiently distributing the computational load across resources, and minimizing unnecessary data transfers. Our approach decomposes the application's tasks into smaller units of computations called kernels, which are distributed and pipelined across the different processing resources. We exploit the ideas of inter-kernel data reuse to minimize unnecessary data transfers between kernels, early execution edges to drive performance, and kernel pipelining to increase system throughput. Our experimental results on JPEG and JPEG2000 show up to 97% off-chip memory access reduction, and up to 80% execution time reduction over standard mapping and task-level pipelining approaches.
- Published
- 2013
- Full Text
- View/download PDF
50. A Software Framework for Rapid Application-Specific Hybrid Photonic Network-on-Chip Synthesis
- Author
-
Sudeep Pasricha and Shirish Bahirat
- Subjects
Engineering ,synthesis algorithms ,Computer Networks and Communications ,lcsh:TK7800-8360 ,02 engineering and technology ,computer.software_genre ,020210 optoelectronics & photonics ,Genetic algorithm ,0202 electrical engineering, electronic engineering, information engineering ,network-on-chip ,photonic interconnects ,chip multiprocessors ,Electrical and Electronic Engineering ,business.industry ,Ant colony optimization algorithms ,lcsh:Electronics ,Particle swarm optimization ,020202 computer hardware & architecture ,Software framework ,Network on a chip ,Hardware and Architecture ,Control and Systems Engineering ,Embedded system ,Signal Processing ,Simulated annealing ,Scalability ,Heuristics ,business ,computer - Abstract
Network on Chip (NoC) architectures have emerged in recent years as scalable communication fabrics to enable high bandwidth data transfers in chip multiprocessors (CMPs). These interconnection architectures still need to conquer many challenges, e.g., significant power consumption and high data transfer latencies. Hybrid electro-photonic NoCs have been recently proposed as a solution to mitigate some of these challenges. However, with increasing application complexity, hardware dependencies, and performance variability, optimization of hybrid photonic NoCs requires traversing a massive design space. To date, prior work on software tools for rapid automated NoC synthesis have mainly focused on electrical NoCs. In this article, we propose a novel suite of software tools for effectively synthesizing hybrid photonic NoCs. We formulate and solve the synthesis problem using four search-based optimization heuristics: (1) Ant Colony Optimization (ACO); (2) Particle Swarm Optimization (PSO); (3) Genetic Algorithm (GA); and (4) Simulated Annealing (SA). Our experimental results show significant promise for the ACO and PSO based heuristics. Our novel implementation of PSO achieves an average of 64% energy-delay product improvements over GA and 53% improvement over SA; while our novel ACO implementation achieves 107% energy-delay product improvements over GA and 62% improvement over SA.
- Published
- 2016