55 results for "Sudeep Pasricha"
Search Results
2. Silicon Photonic Microring Resonators: A Comprehensive Design-Space Exploration and Optimization Under Fabrication-Process Variations
- Author
-
Asif Mirza, Febin Sunny, Peter Walsh, Karim Hassan, Sudeep Pasricha, and Mahdi Nikdast
- Subjects
Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Software
- Published
- 2022
- Full Text
- View/download PDF
3. Interconnects for DNA, Quantum, In-Memory, and Optical Computing: Insights From a Panel Discussion
- Author
-
Amlan Ganguly, Sergi Abadal, Ishan Thakkar, Natalie Enright Jerger, Marc Riedel, Masoud Babaie, Rajeev Balasubramonian, Abu Sebastian, Sudeep Pasricha, Baris Taskin, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, and Universitat Politècnica de Catalunya. CBA - Sistemes de Comunicacions i Arquitectures de Banda Ampla
- Subjects
Integrated circuit interconnections, Quantum computers, DNA computing, Wireless communication systems, Wireless communication, Topology, Photonic interconnects, Optical computing, Electrical and Electronic Engineering, Computer architecture [UPC thematic areas], In-memory computing, Computers, Deep learning, DNA, Quantum computing, Hardware and Architecture, Qubit, Wireless interconnects, Software
- Abstract
The computing world is witnessing a proverbial Cambrian explosion of emerging paradigms propelled by applications such as Artificial Intelligence, Big Data, and Cybersecurity. The recent advances in technology to store digital data inside a DNA strand, manipulate quantum bits (qubits), perform logical operations with photons, and perform computations inside memory systems are ushering in the era of emerging paradigms of DNA computing, quantum computing, optical computing, and in-memory computing. In an orthogonal direction, research on interconnect design using advanced electro-optic, wireless, and microfluidic technologies has shown promising solutions to the architectural limitations of traditional von Neumann computers. In this article, experts present their comments on the role of interconnects in the emerging computing paradigms and discuss the potential use of chiplet-based architectures for the heterogeneous integration of such technologies. This work was supported in part by the US NSF CAREER Grant CNS-1553264 and the EU H2020 research and innovation programme under Grant 863337.
- Published
- 2022
- Full Text
- View/download PDF
4. Photonic Networks-on-Chip Employing Multilevel Signaling: A Cross-Layer Comparative Study
- Author
-
Venkata Sai Praneeth Karempudi, Febin Sunny, Ishan G. Thakkar, Sai Vineel Reddy Chittamuru, Mahdi Nikdast, and Sudeep Pasricha
- Subjects
FOS: Computer and information sciences, Emerging Technologies (cs.ET), Hardware and Architecture, Computer Science - Emerging Technologies, Electrical and Electronic Engineering, Software
- Abstract
Photonic network-on-chip (PNoC) architectures employ photonic links with dense wavelength-division multiplexing (DWDM) to enable high-throughput on-chip transfers. Unfortunately, increasing the DWDM degree (i.e., using a larger number of wavelengths) to achieve a higher aggregate datarate in photonic links, and hence higher throughput in PNoCs, requires sophisticated and costly laser sources along with extra photonic hardware. This extra hardware can introduce undesired noise to the photonic link and increase the bit-error rate (BER), power, and area consumption of PNoCs. To mitigate these issues, the use of 4-pulse amplitude modulation (4-PAM) signaling, instead of the conventional on-off keying (OOK) signaling, can halve the wavelength signals utilized in photonic links for achieving the target aggregate datarate while reducing the overhead of crosstalk noise, BER, and photonic hardware. Various designs of 4-PAM modulators have been reported in the literature, for example, the signal superposition (SS), electrical digital-to-analog converter (EDAC), and optical digital-to-analog converter (ODAC) based designs. However, it is yet to be explored how these SS, EDAC, and ODAC based 4-PAM modulators can be utilized to design DWDM-based photonic links and PNoC architectures. In this paper, we provide an extensive link-level and system-level analysis of the SS, EDAC, and ODAC types of 4-PAM modulators from prior work with regard to their applicability and utilization overheads. From our link-level and PNoC-level evaluation, we have observed that the 4-PAM EDAC based variants of photonic links and PNoCs exhibit better performance and energy-efficiency compared to the OOK, 4-PAM SS, and 4-PAM ODAC based links and PNoCs. Submitted and accepted for publication in ACM Journal on Emerging Technologies in Computing Systems.
- Published
- 2022
- Full Text
- View/download PDF
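The wavelength-halving claim in the abstract above follows from symbol arithmetic: OOK carries 1 bit per symbol while 4-PAM carries log2(4) = 2, so the same aggregate datarate needs half the DWDM channels. A quick sketch (the 10 GBaud symbol rate and 640 Gb/s target are hypothetical values for illustration, not taken from the paper):

```python
import math

def wavelengths_needed(target_gbps, symbol_rate_gbaud, levels):
    """DWDM wavelength channels needed to reach a target aggregate datarate."""
    bits_per_symbol = math.log2(levels)            # 1 for OOK, 2 for 4-PAM
    per_wavelength_gbps = symbol_rate_gbaud * bits_per_symbol
    return math.ceil(target_gbps / per_wavelength_gbps)

print(wavelengths_needed(640, 10, levels=2))  # OOK:   64 wavelengths
print(wavelengths_needed(640, 10, levels=4))  # 4-PAM: 32 wavelengths
```

Halving the wavelength count is what removes the extra laser sources and photonic hardware the abstract refers to.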
5. Surveillance mission scheduling with unmanned aerial vehicles in dynamic heterogeneous environments
- Author
-
Dylan Machovec, Howard Jay Siegel, James A. Crowder, Sudeep Pasricha, Anthony A. Maciejewski, and Ryan D. Friese
- Subjects
Hardware and Architecture, Software, Information Systems, Theoretical Computer Science
- Published
- 2023
- Full Text
- View/download PDF
6. LATTE: LSTM Self-Attention based Anomaly Detection in Embedded Automotive Platforms
- Author
-
Vipin Kumar Kukkala, Sooryaa Vignesh Thiruloga, and Sudeep Pasricha
- Subjects
Computer science, Distributed computing, Automotive industry, Cyber-physical system, Beacon, CAN bus, Recurrent neural network, Hardware and Architecture, Anomaly detection, Visibility, Software
- Abstract
Modern vehicles can be thought of as complex distributed embedded systems that run a variety of automotive applications with real-time constraints. Recent advances in the automotive industry towards greater autonomy are driving vehicles to be increasingly connected with various external systems (e.g., roadside beacons, other vehicles), which makes emerging vehicles highly vulnerable to cyber-attacks. Additionally, the increased complexity of automotive applications and the in-vehicle networks results in poor attack visibility, which makes detecting such attacks particularly challenging in automotive systems. In this work, we present a novel anomaly detection framework called LATTE to detect cyber-attacks in Controller Area Network (CAN) based networks within automotive platforms. Our proposed LATTE framework uses a stacked Long Short Term Memory (LSTM) predictor network with novel attention mechanisms to learn the normal operating behavior at design time. Subsequently, a novel detection scheme (also trained at design time) is used to detect various cyber-attacks (as anomalies) at runtime. We evaluate our proposed LATTE framework under different automotive attack scenarios and present a detailed comparison with the best-known prior works in this area, to demonstrate the potential of our approach.
- Published
- 2021
- Full Text
- View/download PDF
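The detection idea in the LATTE abstract, a predictor trained on normal traffic whose large prediction errors flag anomalies at runtime, can be sketched in a few lines. This is not the paper's stacked LSTM with self-attention; the local-mean predictor, window size, and 3-sigma threshold below are stand-in assumptions for illustration only:

```python
def detect_anomalies(signal, window=5, threshold=3.0):
    """Flag samples whose prediction error exceeds `threshold` times the
    error spread observed on early (assumed attack-free) traffic."""
    half = window // 2
    errors = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        pred = sum(signal[lo:hi]) / (hi - lo)   # toy predictor: local mean
        errors.append(abs(signal[i] - pred))
    calib = errors[: len(errors) // 2]          # calibrate on early "normal" data
    mean = sum(calib) / len(calib)
    sigma = (sum((e - mean) ** 2 for e in calib) / len(calib)) ** 0.5 or 1.0
    return [e > threshold * sigma for e in errors]

normal_traffic = [1.0] * 100
normal_traffic[70] = 50.0                       # injected attack value
flags = detect_anomalies(normal_traffic)
print(flags[70], flags[10])                     # spike flagged, normal sample not
```

The paper's contribution lies in making the predictor and the runtime detection scheme far more accurate than this toy; the error-thresholding skeleton is the common idea.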
7. Electronic, Wireless, and Photonic Network-on-Chip Security: Challenges and Countermeasures
- Author
-
Sudeep Pasricha, John Jose, and Sujay Deb
- Subjects
FOS: Computer and information sciences, Computer Science - Cryptography and Security, Hardware and Architecture, Hardware Architecture (cs.AR), Electrical and Electronic Engineering, Computer Science - Hardware Architecture, Cryptography and Security (cs.CR), Software
- Abstract
Networks-on-chip (NoCs) are an integral part of emerging manycore computing chips. They play a key role in facilitating communication among processing cores and between cores and memory. To meet the aggressive performance and energy-efficiency targets of machine learning and big data applications, NoCs have been evolving to leverage emerging paradigms such as silicon photonics and wireless communication. Increasingly, these NoC fabrics are becoming susceptible to security vulnerabilities, such as hardware trojans that can snoop, corrupt, or disrupt information transfers on NoCs. This article surveys the landscape of security challenges and countermeasures across electronic, wireless, and photonic NoCs.
- Published
- 2022
8. A Survey on Silicon Photonics for Deep Learning
- Author
-
Febin Sunny, Sudeep Pasricha, Mahdi Nikdast, and Ebadollah Taheri
- Subjects
FOS: Computer and information sciences, I.2, B.7.m, C.5.4, Computer science, Computer Science - Emerging Technologies, Human–computer interaction, Hardware Architecture (cs.AR), Electrical and Electronic Engineering, Computer Science - Hardware Architecture, Silicon photonics, Deep learning, Emerging Technologies (cs.ET), Neuromorphic engineering, Hardware and Architecture, Artificial intelligence, Software
- Abstract
Deep learning has led to unprecedented successes in solving some very difficult problems in domains such as computer vision, natural language processing, and general pattern recognition. These achievements are the culmination of decades-long research into better training techniques and deeper neural network models, as well as improvements in hardware platforms that are used to train and execute the deep neural network models. Many application-specific integrated circuit (ASIC) hardware accelerators for deep learning have garnered interest in recent years due to their improved performance and energy-efficiency over conventional CPU and GPU architectures. However, these accelerators are constrained by fundamental bottlenecks due to (1) the slowdown in CMOS scaling, which has limited computational and performance-per-watt capabilities of emerging electronic processors; and (2) the use of metallic interconnects for data movement, which do not scale well and are a major cause of bandwidth, latency, and energy inefficiencies in almost every contemporary processor. Silicon photonics has emerged as a promising CMOS-compatible alternative to realize a new generation of deep learning accelerators that can use light for both communication and computation. This article surveys the landscape of silicon photonics to accelerate deep learning, with a coverage of developments across design abstractions in a bottom-up manner, to convey both the capabilities and limitations of the silicon photonics paradigm in the context of deep learning acceleration.
- Published
- 2021
- Full Text
- View/download PDF
9. ARXON: A Framework for Approximate Communication Over Photonic Networks-on-Chip
- Author
-
Sudeep Pasricha, Asif Mirza, Ishan G. Thakkar, Febin Sunny, and Mahdi Nikdast
- Subjects
Silicon photonics, Computer science, Dissipation, Laser, Power (physics), Hardware and Architecture, Electronic engineering, Overhead (computing), Laser power scaling, Electrical and Electronic Engineering, Photonics, Software
- Abstract
The approximate computing paradigm advocates relaxing accuracy goals in applications to improve energy-efficiency and performance. Recently, this paradigm has been explored to improve the energy-efficiency of silicon photonic networks-on-chip (PNoCs). Silicon photonic interconnects suffer from high power dissipation because of laser sources, which generate carrier wavelengths, and the tuning power required for regulating photonic devices under different uncertainties. In this article, we propose a framework called AppRoXimation framework for On-chip photonic Networks (ARXON) to reduce such power dissipation overhead by enabling intelligent and aggressive approximation during communication over silicon photonic links in PNoCs. Our framework reduces laser and tuning-power overhead while intelligently approximating communication, such that application output quality is not distorted beyond an acceptable limit. Simulation results show that our framework can achieve up to 56.4% lower laser power consumption and up to 23.8% better energy-efficiency than the best-known prior work on approximate communication with silicon photonic interconnects, for the same application output quality.
- Published
- 2021
- Full Text
- View/download PDF
10. A Survey on Energy Management for Mobile and IoT Devices
- Author
-
Umit Y. Ogras, Sudeep Pasricha, Raid Ayoub, Sumit K. Mandal, and Michael Kishinevsky
- Subjects
Computer science, Energy management, Mobile computing, Wearable computer, Form factor (design), Hardware and Architecture, Wireless, State (computer science), Electrical and Electronic Engineering, Telecommunications, Energy harvesting, Software, Efficient energy use
- Abstract
Editor’s notes: Mobile and IoT devices have proliferated in our daily lives. However, these miniaturized computing systems must be highly energy-efficient due to their ultrasmall form factor. Hence, energy management is of utmost importance for both mobile and IoT devices. This article presents a comprehensive survey on this topic. — Partha Pratim Pande, Washington State University
- Published
- 2020
- Full Text
- View/download PDF
11. Ethical Design of Computers: From Semiconductors to IoT and Artificial Intelligence
- Author
-
Sudeep Pasricha and Marilyn Wolf
- Subjects
FOS: Computer and information sciences, Computer Science - Computers and Society, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Hardware and Architecture, Computers and Society (cs.CY), Hardware Architecture (cs.AR), Electrical and Electronic Engineering, Computer Science - Hardware Architecture, Software
- Abstract
Computing systems are tightly integrated today into our professional, social, and private lives. An important consequence of this growing ubiquity of computing is that it can have significant ethical implications of which computing professionals should take account. In most real-world scenarios, it is not immediately obvious how particular technical choices during the design and use of computing systems could be viewed from an ethical perspective. This article provides a perspective on the ethical challenges within semiconductor chip design, IoT applications, and the increasing use of artificial intelligence in the design processes, tools, and hardware-software stacks of these systems.
- Published
- 2023
- Full Text
- View/download PDF
12. Overcoming Security Vulnerabilities in Deep Learning-based Indoor Localization Frameworks on Mobile Devices
- Author
-
Saideep Tiku and Sudeep Pasricha
- Subjects
Spoofing attack, Artificial neural network, Computer science, Reliability (computer networking), Deep learning, Distributed computing, Vulnerability, Hardware and Architecture, Application domain, Benchmark (computing), Artificial intelligence, Mobile device, Software
- Abstract
Indoor localization is an emerging application domain for the navigation and tracking of people and assets. Ubiquitously available Wi-Fi signals have enabled low-cost fingerprinting-based localization solutions. Further, the rapid growth in mobile hardware capability now allows high-accuracy deep learning-based frameworks to be executed locally on mobile devices in an energy-efficient manner. However, existing deep learning-based indoor localization solutions are vulnerable to access point (AP) attacks. This article presents an analysis of the vulnerability of a convolutional neural network-based indoor localization solution to AP security compromises. Based on this analysis, we propose a novel methodology to maintain indoor localization accuracy, even in the presence of AP attacks. The proposed secured neural network framework (S-CNNLOC) is validated across a benchmark suite of paths and is found to deliver up to 10× more resiliency to malicious AP attacks compared to its unsecured counterpart.
- Published
- 2019
- Full Text
- View/download PDF
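For readers unfamiliar with fingerprinting-based localization, the matching step that underlies such frameworks can be illustrated with a toy nearest-neighbor version (the paper itself uses a convolutional neural network; the locations and RSSI values below are invented for illustration):

```python
radio_map = {                      # hypothetical offline survey: location -> RSSI per AP (dBm)
    "lobby":   [-40, -70, -85],
    "office":  [-75, -45, -60],
    "hallway": [-60, -60, -50],
}

def locate(observed_rssi):
    """Return the surveyed location whose fingerprint is closest (Euclidean)."""
    def sq_dist(fp):
        return sum((a - b) ** 2 for a, b in zip(fp, observed_rssi))
    return min(radio_map, key=lambda loc: sq_dist(radio_map[loc]))

print(locate([-42, -68, -86]))  # closest to the "lobby" fingerprint
```

An attacker who compromises an AP can shift one column of the observed vector, which is exactly the failure mode the S-CNNLOC work hardens against.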
13. PortLoc: A Portable Data-Driven Indoor Localization Framework for Smartphones
- Author
-
Sudeep Pasricha and Saideep Tiku
- Subjects
Computer science, Real-time computing, Fingerprint recognition, Data-driven, Hardware and Architecture, Global Positioning System, Electrical and Electronic Engineering, Software, Multipath propagation
- Abstract
Editor’s note: Fingerprinting is essential for indoor navigation and localization due to its low cost, accuracy, and resiliency to multipath effects in constrained environments. This article aims to overcome the challenge of device heterogeneity and describes a portable lightweight fingerprinting framework while improving localization accuracy. — Paul Bogdan, University of Southern California
- Published
- 2019
- Full Text
- View/download PDF
14. Energy and Network Aware Workload Management for Geographically Distributed Data Centers
- Author
-
Ninad Hogade, Howard Jay Siegel, and Sudeep Pasricha
- Subjects
Networking and Internet Architecture (cs.NI), FOS: Computer and information sciences, Queueing theory, Control and Optimization, Renewable Energy, Sustainability and the Environment, Computer science, Distributed computing, Cloud computing, Workload, Net metering, Computer Science - Networking and Internet Architecture, Computational Theory and Mathematics, Peak demand, Computer Science - Distributed, Parallel, and Cluster Computing, Hardware and Architecture, Computer Science - Computer Science and Game Theory, Server, Data center, Distributed, Parallel, and Cluster Computing (cs.DC), Software, Operating cost, Computer Science and Game Theory (cs.GT)
- Abstract
Cloud service providers are distributing data centers geographically to minimize energy costs through intelligent workload distribution. With increasing data volumes in emerging cloud workloads, it is critical to factor in the network costs for transferring workloads across data centers. For geo-distributed data centers, many researchers have been exploring strategies for energy cost minimization and intelligent inter-data-center workload distribution separately. However, prior work does not comprehensively and simultaneously consider data center energy costs, data transfer costs, and data center queueing delay. In this paper, we propose a novel game theory-based workload management framework that takes a holistic approach to the cloud operating cost minimization problem by making intelligent scheduling decisions aware of data transfer costs and the data center queueing delay. Our framework performs intelligent workload management that considers heterogeneity in data center compute capability, cooling power, interference effects from task co-location in servers, time-of-use electricity pricing, renewable energy, net metering, peak demand pricing distribution, and network pricing. Our simulations show that the proposed game-theoretic technique can minimize the cloud operating cost more effectively than existing approaches.
- Published
- 2021
15. BPLight-CNN: A Photonics-based Backpropagation Accelerator for Deep Learning
- Author
-
Dharanidhar Dang, Rabi N. Mahapatra, Sudeep Pasricha, Sai Vineel Reddy Chittamuru, and Debashis Sahoo
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Speedup, Computer science, Computation, Computer Science - Emerging Technologies, CAD, Convolutional neural network, Machine Learning (cs.LG), Hardware Architecture (cs.AR), Electrical and Electronic Engineering, Computer Science - Hardware Architecture, Deep learning, Backpropagation, Emerging Technologies (cs.ET), Computer engineering, Hardware and Architecture, Benchmark (computing), Artificial intelligence, Software, Energy (signal processing)
- Abstract
Training deep learning networks involves continuous weight updates across the various layers of the deep network while using a backpropagation (BP) algorithm. This results in expensive computation overheads during training. Consequently, most deep learning accelerators today employ pretrained weights and focus only on improving the design of the inference phase. The recent trend is to build a complete deep learning accelerator by incorporating the training module. Such efforts require an ultra-fast chip architecture for executing the BP algorithm. In this article, we propose a novel photonics-based backpropagation accelerator for high-performance deep learning training. We present the design for a convolutional neural network (CNN), BPLight-CNN, which incorporates the silicon photonics-based backpropagation accelerator. BPLight-CNN is a first-of-its-kind photonic and memristor-based CNN architecture for end-to-end training and prediction. We evaluate BPLight-CNN using a photonic CAD framework (IPKISS) on deep learning benchmark models, including LeNet and VGG-Net. The proposed design achieves (i) at least 34× speedup, 34× improvement in computational efficiency, and 38.5× energy savings during training; and (ii) 29× speedup, 31× improvement in computational efficiency, and 38.7× improvement in energy savings during inference compared with the state-of-the-art designs. All of these comparisons are done at a 16-bit resolution, and BPLight-CNN achieves these improvements at a cost of approximately 6% lower accuracy compared with the state-of-the-art.
- Published
- 2021
16. ROBIN: A Robust Optical Binary Neural Network Accelerator
- Author
-
Sudeep Pasricha, Febin Sunny, Mahdi Nikdast, and Asif Mirza
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Silicon photonics, Artificial neural network, Computer science, Computer Science - Emerging Technologies, Machine Learning (cs.LG), Emerging Technologies (cs.ET), Computer engineering, Hardware and Architecture, Hardware Architecture (cs.AR), Photonics, Latency (engineering), Computer Science - Hardware Architecture, Throughput (business), Software, Energy (signal processing), Efficient energy use
- Abstract
Domain-specific neural network accelerators have garnered attention because of their improved energy efficiency and inference performance compared to CPUs and GPUs. Such accelerators are thus well suited for resource-constrained embedded systems. However, mapping sophisticated neural network models on these accelerators still entails significant energy and memory consumption, along with high inference time overhead. Binarized neural networks (BNNs), which utilize single-bit weights, represent an efficient way to implement and deploy neural network models on accelerators. In this paper, we present a novel optical-domain BNN accelerator, named ROBIN, which intelligently integrates heterogeneous microring resonator optical devices with complementary capabilities to efficiently implement the key functionalities in BNNs. We perform detailed fabrication-process variation analyses at the optical device level, explore efficient corrective tuning for these devices, and integrate circuit-level optimization to counter thermal variations. As a result, our proposed ROBIN architecture possesses the desirable traits of being robust, energy-efficient, low-latency, and high-throughput when executing BNN models. Our analysis shows that ROBIN can outperform the best-known optical BNN accelerators and many electronic accelerators. Specifically, our energy-efficient ROBIN design exhibits energy-per-bit values that are ∼4× lower than electronic BNN accelerators and ∼933× lower than a recently proposed photonic BNN accelerator, while a performance-efficient ROBIN design shows ∼3× and ∼25× better performance than electronic and photonic BNN accelerators, respectively.
- Published
- 2021
- Full Text
- View/download PDF
17. Exploiting Process Variations to Secure Photonic NoC Architectures from Snooping Attacks
- Author
-
Sudeep Pasricha, Sairam Sri Vatsavai, Sai Vineel Reddy Chittamuru, Varun Bhat, and Ishan G. Thakkar
- Subjects
FOS: Computer and information sciences, Hardware security module, Authentication, Multicast, Computer science, Computer Science - Emerging Technologies, Computer Graphics and Computer-Aided Design, Process variation, Emerging Technologies (cs.ET), Hardware Trojan, Wavelength-division multiplexing, Hardware Architecture (cs.AR), Electrical and Electronic Engineering, Unicast, Photonics, Computer Science - Hardware Architecture, Software, Computer network
- Abstract
The compact size and high wavelength-selectivity of microring resonators (MRs) enable photonic networks-on-chip (PNoCs) to utilize dense wavelength-division multiplexing (DWDM) in their photonic waveguides and, as a result, attain high-bandwidth on-chip data transfers. Unfortunately, a Hardware Trojan in a PNoC can manipulate the electrical driving circuit of its MRs to cause the MRs to snoop data from the neighboring wavelength channels in a shared photonic waveguide, which introduces a serious security threat. This paper presents a framework that utilizes process variation-based authentication signatures along with architecture-level enhancements to protect against data-snooping Hardware Trojans during unicast as well as multicast transfers in PNoCs. Evaluation results indicate that our framework can improve hardware security across various PNoC architectures with minimal overheads of up to 14.2% in average latency and up to 14.6% in energy-delay-product (EDP). Pre-print accepted in IEEE TCAD on July 16, 2020.
- Published
- 2020
18. Resilience-Aware Resource Management for Exascale Computing Systems
- Author
-
Sudeep Pasricha, Daniel Dauwe, Howard Jay Siegel, and Anthony A. Maciejewski
- Subjects
Mean time between failures, Control and Optimization, Computational Theory and Mathematics, Hardware and Architecture, Renewable Energy, Sustainability and the Environment, Computer science, Distributed computing, Redundancy (engineering), Supercomputer, Execution time, Software, Exascale computing, Scheduling (computing)
- Abstract
With the increases in complexity and number of nodes in large-scale high performance computing (HPC) systems over time, the probability of applications experiencing runtime failures has increased significantly. Projections indicate that exascale-sized systems are likely to operate with mean time between failures (MTBF) of as little as a few minutes. Several strategies have been proposed in recent years for enabling systems of these extreme sizes to be resilient against failures. This work provides a comparison of four state-of-the-art HPC resilience protocols that are being considered for use in exascale systems. We explore the behavior of each resilience protocol operating under the simulated execution of a diverse set of applications and study the performance degradation that a large-scale system experiences from the overhead associated with each resilience protocol as well as the re-computation needed to recover when a failure occurs. Using the results from these analyses, we examine how resource management on exascale systems can be improved by allowing the system to select the optimal resilience protocol depending upon each application's execution characteristics, as well as providing the system resource manager the ability to make scheduling decisions that are “resilience aware” through the use of more accurate execution time predictions.
- Published
- 2018
- Full Text
- View/download PDF
19. Minimizing Energy Costs for Geographically Distributed Heterogeneous Data Centers
- Author
-
Eric Jonardi, Mark A. Oxley, Howard Jay Siegel, Ninad Hogade, Sudeep Pasricha, and Anthony A. Maciejewski
- Subjects
Control and Optimization, Operations research, Renewable Energy, Sustainability and the Environment, Computer science, Electricity pricing, Workload, Net metering, Renewable energy, Cost reduction, Computational Theory and Mathematics, Peak demand, Hardware and Architecture, Peaking power plant, Data center, Software
- Abstract
The recent proliferation and associated high electricity costs of distributed data centers have motivated researchers to study energy-cost minimization at the geo-distributed level. The development of time-of-use (TOU) electricity pricing models and renewable energy source models has provided the means for researchers to reduce these high energy costs through intelligent geographical workload distribution. However, neglecting important considerations such as data center cooling power, interference effects from task co-location in servers, net-metering, and peak demand pricing of electricity has led to sub-optimal results in prior work because these factors have a significant impact on energy costs and performance. We propose a set of workload management techniques that take a holistic approach to the energy minimization problem for geo-distributed data centers. Our approach considers detailed data center cooling power, co-location interference, TOU electricity pricing, renewable energy, net metering, and peak demand pricing distribution models. We demonstrate the value of utilizing such information by comparing against geo-distributed workload management techniques that possess varying amounts of system information. Our simulation results indicate that our best proposed technique is able to achieve a 61 percent (on average) cost reduction compared to state-of-the-art prior work.
- Published
- 2018
- Full Text
- View/download PDF
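The abstract's point that peak demand pricing matters alongside time-of-use energy prices can be seen with a toy two-site cost comparison (all prices and loads below are invented, not taken from the paper):

```python
def site_cost(energy_kwh, peak_kw, price_per_kwh, demand_charge_per_kw):
    """Billing-period cost at one site: TOU energy charge plus a peak-demand charge."""
    return energy_kwh * price_per_kwh + peak_kw * demand_charge_per_kw

# Site A has the cheaper energy price but a steep demand charge; site B the opposite.
cost_a = site_cost(1000, 200, price_per_kwh=0.09, demand_charge_per_kw=15.0)
cost_b = site_cost(1000, 200, price_per_kwh=0.14, demand_charge_per_kw=5.0)
print(round(cost_a, 2), round(cost_b, 2))  # site B wins despite pricier energy
```

A workload manager that looks only at energy price would pick site A and pay more, which is why the paper's holistic model includes demand charges, cooling, co-location interference, and net metering.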
20. DyPhase: A Dynamic Phase Change Memory Architecture With Symmetric Write Latency and Restorable Endurance
- Author
-
Sudeep Pasricha and Ishan G. Thakkar
- Subjects
Random access memory, Dynamic random-access memory, Computer science, Parallel computing, Computer Graphics and Computer-Aided Design, Phase-change memory, Memory management, Embedded system, Electrical and Electronic Engineering, Latency (engineering), Software, DRAM
- Abstract
A major challenge for the widespread adoption of phase change memory (PCM) as main memory is its asymmetric write latency. Generally, for a PCM, the latency of a SET operation (i.e., an operation that writes “1”) is 2–5 times longer than the latency of a RESET operation (i.e., an operation that writes “0”). For this reason, the average write latency of a PCM system is limited by the high-latency SET operations. This paper presents a novel PCM architecture called DyPhase, which uses partial-SET operations instead of the conventional SET operations to introduce a symmetry in write latency, thereby increasing write performance and throughput. However, use of partial-SET decreases data retention time. As a remedy to this problem, DyPhase employs novel distributed refresh operations in PCM that leverage the available power budget to periodically rewrite the stored data with minimal performance overhead. Unfortunately, the use of periodic refresh operations increases the write rate of the memory, which in turn accelerates memory degradation and decreases its lifetime. DyPhase overcomes this shortcoming by utilizing a proactive in-situ self-annealing (PISA) technique that periodically heals degraded memory cells, resulting in decelerated degradation and increased memory lifetime. Experiments with PARSEC benchmarks indicate that our DyPhase architecture-based hybrid dynamic random access memory (DRAM)–PCM memory system, when enabled with PISA, yields orders of magnitude higher lifetime, 8.3% less CPI, and 44.3% less EDP on average over other hybrid DRAM–PCM memory systems that utilize PCM architectures from prior works.
- Published
- 2018
- Full Text
- View/download PDF
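The write-latency asymmetry that motivates the DyPhase abstract above can be illustrated with a back-of-the-envelope model. The timing values below are hypothetical round numbers chosen only to reflect the stated 2–5× SET/RESET ratio; they are not figures from the paper:

```python
# Hypothetical PCM timing parameters (illustrative only, not from the paper).
T_RESET = 50    # ns, write a "0"
T_SET   = 150   # ns, conventional SET: 2-5x slower than RESET
T_PSET  = 55    # ns, partial-SET: near-symmetric with RESET

def word_write_latency(t_zero, t_one):
    """A word write completes when its slowest bit completes; with mixed
    data, that is the slower of the 0-write and 1-write latencies."""
    return max(t_zero, t_one)

conventional = word_write_latency(T_RESET, T_SET)   # SET-limited
dyphase      = word_write_latency(T_RESET, T_PSET)  # near-symmetric

speedup = conventional / dyphase
print(f"conventional: {conventional} ns, partial-SET: {dyphase} ns, "
      f"write-latency reduction: {speedup:.2f}x")
```

Because a word write must wait for its slowest bit, equalizing the 0- and 1-write latencies (rather than merely improving an average) is what recovers write performance.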
21. Advanced Driver-Assistance Systems: A Path Toward Autonomous Vehicles
- Author
-
Jordan A. Tunnell, Sudeep Pasricha, Vipin Kumar Kukkala, and Thomas H. Bradley
- Subjects
Computer science, Ranging, Advanced driver assistance systems, Sensor fusion, Computer Science Applications, Human-Computer Interaction, Lidar, Software, Hardware and Architecture, Systems engineering, Electrical and Electronic Engineering, Radar - Abstract
Advanced driver-assistance systems (ADASs) have become a salient feature for safety in modern vehicles. They are also a key underlying technology in emerging autonomous vehicles. State-of-the-art ADASs are primarily vision based, but light detection and ranging (lidar), radio detection and ranging (radar), and other advanced-sensing technologies are also becoming popular. In this article, we present a survey of different hardware and software ADAS technologies and their capabilities and limitations. We discuss approaches used for vision-based recognition and sensor fusion in ADAS solutions. We also highlight challenges for the next generation of ADASs.
- Published
- 2018
- Full Text
- View/download PDF
22. Mixed-criticality scheduling on heterogeneous multicore systems powered by energy harvesting
- Author
-
Yi Xiang and Sudeep Pasricha
- Subjects
Mixed criticality, Multi-core processor, Job shop scheduling, Computer science, Distributed computing, Energy budget, Scheduling (computing), Hardware and Architecture, Electrical and Electronic Engineering, Performance improvement, Energy harvesting, Software - Abstract
In this paper, we address the scheduling problem for single-ISA heterogeneous multicore processors running hybrid mixed-criticality workloads with a limited and fluctuating energy budget provided by solar energy harvesting. The hybrid workloads consist of a set of firm-deadline timing-centric applications and a set of soft-deadline throughput-centric multithreaded applications. Our framework exploits traits of the different types of cores in heterogeneous multicore systems to service timing-centric workloads with a few big out-of-order cores, while servicing throughput-centric workloads with many smaller in-order cores clocked in the energy-efficient near-threshold computing (NTC) region. Guided by a novel timing intensity metric, our mixed-criticality scheduling framework creates an optimized schedule that minimizes overall miss penalty for a time-varying energy budget. Experimental results indicate that our framework achieves a 9.5% miss penalty reduction with the proposed timing intensity metric compared to metrics from prior work, a 13.6% performance improvement over a state-of-the-art scheduling approach for single-ISA heterogeneous platforms, and a 23.2% performance benefit from exploiting platform heterogeneity.
- Published
- 2018
- Full Text
- View/download PDF
23. Rate-based thermal, power, and co-location aware resource management for heterogeneous data centers
- Author
-
Sudeep Pasricha, Howard Jay Siegel, Mark A. Oxley, Gregory A. Koenig, Eric Jonardi, Patrick J. Burns, and Anthony A. Maciejewski
- Subjects
Computer Networks and Communications, Computer science, Distributed computing, Symmetric multiprocessor system, Workload, Theoretical Computer Science, Artificial Intelligence, Hardware and Architecture, Resource allocation, Resource management, Data center, Cache, Electricity, Software - Abstract
Today’s data centers contain large numbers of compute nodes that require substantial power, and therefore require a large amount of cooling resources to operate at a reliable temperature. The high power consumption of the computing and cooling systems produces extraordinary electricity costs, requiring some data center operators to be constrained by a specified electricity budget. In addition, the processors within these systems contain a large number of cores with shared resources (e.g., last-level cache), heavily affecting the performance of tasks that are co-located on cores and contend for these resources. This problem is only exacerbated as processors move to the many-core realm. These issues lead to interesting performance-power tradeoffs; by considering resource management in a holistic fashion, the performance of the computing system can be maximized while satisfying power and temperature constraints. In this work, the performance of the system is quantified as the total reward earned from completing tasks by their individual deadlines. By designing three resource allocation techniques, we perform a rigorous analysis on thermal, power, and co-location aware resource management using two different facility configurations, three different workload environments, and a sensitivity analysis of the power and thermal constraints.
- Published
- 2018
- Full Text
- View/download PDF
24. HYDRA: Heterodyne Crosstalk Mitigation With Double Microring Resonators and Data Encoding for Photonic NoCs
- Author
-
Sudeep Pasricha, Sai Vineel Reddy Chittamuru, and Ishan G. Thakkar
- Subjects
Heterodyne, Computer science, Detector, Dissipation, Waveguide (optics), Crosstalk, Resonator, Hardware and Architecture, Wavelength-division multiplexing, Electronic engineering, Electrical and Electronic Engineering, Photonics, Software, Intermodulation - Abstract
Silicon-photonic networks on chip (PNoCs) provide high bandwidth with lower data-dependent power dissipation than traditional electrical NoCs (ENoCs); therefore, they are promising candidates to replace ENoCs in future manycore chips. PNoCs typically employ photonic waveguides with dense wavelength division multiplexing (DWDM) for signal traversal and microring resonators (MRs) for signal modulation. Unfortunately, DWDM increases susceptibility to intermodulation (IM) and off-resonance filtering effects, which reduce the optical signal-to-noise ratio (OSNR) of photonic data transfers. Additionally, process variations (PVs) induce variations in the width and thickness of MRs, causing resonance wavelength shifts that further reduce OSNR and create communication errors. This paper proposes a novel cross-layer framework called HYDRA to mitigate heterodyne crosstalk due to PVs, off-resonance filtering, and IM effects in PNoCs. The framework consists of two device-level mechanisms and a circuit-level mechanism to improve heterodyne crosstalk resilience in PNoCs. Simulation results on three PNoC architectures indicate that HYDRA can improve the worst-case OSNR by up to 5.3× and significantly enhance the reliability of DWDM-based PNoC architectures.
- Published
- 2018
- Full Text
- View/download PDF
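The "up to 5.3×" worst-case OSNR improvement in the HYDRA abstract above is a linear power ratio; a short sketch shows how such a ratio translates into dB. The signal and crosstalk power values here are hypothetical, chosen only to illustrate the conversion:

```python
import math

def osnr_db(signal_power, noise_power):
    """Optical signal-to-noise ratio expressed in dB."""
    return 10 * math.log10(signal_power / noise_power)

# Hypothetical powers (mW) for one DWDM channel; illustrative only.
signal = 1.0
crosstalk_noise = 0.05          # heterodyne crosstalk from neighboring MRs

baseline = osnr_db(signal, crosstalk_noise)
# Suppose mitigation cuts crosstalk noise by a factor of 5.3, mirroring the
# worst-case improvement reported in the abstract.
mitigated = osnr_db(signal, crosstalk_noise / 5.3)

print(f"baseline OSNR: {baseline:.1f} dB, mitigated: {mitigated:.1f} dB")
```

A 5.3× linear improvement corresponds to roughly a 7.2 dB gain, regardless of the absolute power levels.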
25. SWIFTNoC
- Author
-
Sudeep Pasricha, Sai Vineel Reddy Chittamuru, and Srinivas Desai
- Subjects
Engineering, Multi-core processor, Silicon photonics, Multicast, Throughput, Chip, Hardware and Architecture, Embedded system, Bandwidth (computing), Electrical and Electronic Engineering, Photonics, Software, Communication channel - Abstract
On-chip communication is widely considered to be one of the major performance bottlenecks in contemporary chip multiprocessors (CMPs). With recent advances in silicon nanophotonics, photonics-based network-on-chip (NoC) architectures are being considered as a viable solution to support communication in future CMPs as they can enable higher bandwidth and lower power dissipation compared to traditional electrical NoCs. In this article, we present SwiftNoC, a novel reconfigurable silicon-photonic NoC architecture that features improved multicast-enabled channel sharing, as well as dynamic re-prioritization and exchange of bandwidth between clusters of cores running multiple applications, to increase channel utilization and system performance. Experimental results show that SwiftNoC improves throughput by up to 25.4× while reducing latency by up to 72.4% and energy-per-bit by up to 95% over state-of-the-art solutions.
- Published
- 2017
- Full Text
- View/download PDF
26. A Runtime Framework for Robust Application Scheduling With Adaptive Parallelism in the Dark-Silicon Era
- Author
-
Nishit Kapadia and Sudeep Pasricha
- Subjects
Multi-core processor, Computer science, Multiprocessing, Integrated circuit, Energy consumption, Chip, Dynamic voltage scaling, Reliability (semiconductor), Hardware and Architecture, Dark silicon, Electrical and Electronic Engineering, Software - Abstract
With deeper technology scaling accompanied by a worsening power wall, an increasing proportion of chip area on a chip multiprocessor (CMP) is expected to be occupied by dark silicon. At the same time, design challenges due to process variations and soft errors in integrated circuits are projected to become even more severe. It is well known that spatial variations in process parameters introduce significant unpredictability in the performance and power profiles of CMP cores. By mapping applications onto the best set of cores, process variations can potentially be used to our advantage in the dark-silicon era. In addition, the probability of occurrence of soft errors during application execution has been found to be strongly related to the supply voltage and operating frequency values, thus necessitating reliability awareness within runtime voltage scaling schemes in contemporary CMPs. In this paper, we present a novel framework that leverages the knowledge of variations on the chip to perform runtime application mapping and dynamic voltage scaling to optimize system performance and energy, while satisfying dark-silicon power constraints of the chip as well as application-specific performance and reliability constraints. Our experimental results show average savings of 10%–71% in application service times and 13%–38% in energy consumption, compared with prior work.
- Published
- 2017
- Full Text
- View/download PDF
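The central idea in the abstract above, exploiting per-core variation profiles when mapping applications, can be sketched as a greedy selection under a dark-silicon power budget. The core profiles, the frequency-per-watt heuristic, and the budget below are hypothetical simplifications; the paper's framework additionally handles reliability constraints and dynamic voltage scaling:

```python
# (core_id, max_freq_GHz, power_W) under process variation; illustrative only.
CORES = [(0, 2.1, 1.2), (1, 1.7, 0.8), (2, 2.4, 1.9), (3, 1.9, 0.9)]

def map_app(cores, needed, power_budget):
    """Pick `needed` cores, preferring high frequency per watt, while
    keeping total power within the dark-silicon budget."""
    chosen, used = [], 0.0
    for cid, f, p in sorted(cores, key=lambda c: c[1] / c[2], reverse=True):
        if len(chosen) < needed and used + p <= power_budget:
            chosen.append(cid)
            used += p
    return chosen, used

chosen, used = map_app(CORES, needed=2, power_budget=2.0)
print("cores:", chosen, "power used:", used)
```

Note how variation works "to our advantage" here: the nominally fastest core (core 2) loses out to cores with better frequency-per-watt once the power budget is considered.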
27. Guest Editors’ Introduction: Design and Management of Mobile Platforms: From Smartphones to Wearable Devices
- Author
-
Umit Y. Ogras, Michael Kishinevsky, Raid Ayoub, and Sudeep Pasricha
- Subjects
Computer science, Mobile computing, Wearable computer, Energy consumption, System requirements, Electric power system, User experience design, Hardware and Architecture, Embedded system, Resource management, Electrical and Electronic Engineering, Software, Wearable technology - Abstract
There are close to five million apps that run on more than a billion smartphones as of 2020 [1]. This number is likely to keep growing rapidly with technological advancements in mobile computing, smaller form-factor wearable computers, and Internet-of-Things (IoT) devices. Although form factors and specific system requirements vary, mobile platforms share common design goals, including energy efficiency, competitive performance, battery life, and reliability. Competitive performance requires faster operating frequencies and leads to higher power consumption. In turn, power consumption increases the junction and skin temperatures, which have adverse effects on device reliability and user experience. Therefore, highly heterogeneous systems-on-chip (SoCs) are required to achieve performance targets within tight power, energy, and cost constraints. The design of these platforms remains challenging. Moreover, application development, let alone aggressive optimization, is notoriously difficult and time-consuming when utilizing highly specialized accelerators. The optimization problem is exacerbated by dynamic variations of application workloads and operating conditions. As a result, there is a need for novel software- and hardware-based adaptive resource management approaches that consider the platform as a whole, rather than focusing on a subset of the target system.
- Published
- 2020
- Full Text
- View/download PDF
28. SHERPA: A Lightweight Smartphone Heterogeneity Resilient Portable Indoor Localization Framework
- Author
-
Sudeep Pasricha, Saideep Tiku, Qi Han, and Branislav M. Notaros
- Subjects
Noise measurement, Computer science, Fingerprint recognition, Porting, Software, Application domain, Embedded system, Scalability, Transceiver, Mobile device - Abstract
Indoor localization is an emerging application domain that promises to enhance the way we navigate in various indoor environments, as well as track equipment and people. Wireless signal-based fingerprinting is one of the leading approaches for indoor localization. Using ubiquitous Wi-Fi access points and the Wi-Fi transceivers in smartphones has enabled fingerprinting-based localization techniques that are scalable and low-cost. But the variety of Wi-Fi hardware modules and software stacks used in today's smartphones introduces errors when using Wi-Fi-based fingerprinting approaches across devices, which reduces localization accuracy. We propose a framework called SHERPA that enables efficient porting of indoor localization techniques across mobile devices, to maximize accuracy. An in-depth analysis of our framework shows that it can deliver up to 8× more accurate results as compared to state-of-the-art localization techniques for a variety of environments.
- Published
- 2019
- Full Text
- View/download PDF
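As a minimal sketch of the fingerprinting approach the SHERPA abstract above builds on, the snippet below matches an observed RSSI vector against a survey database using k-nearest neighbors. The access points, RSSI values, and choice of k are all hypothetical, and SHERPA's actual contribution (resilience to device heterogeneity) is not modeled:

```python
def knn_locate(db, observed, k=3):
    """db: list of (location, rssi_vector) survey fingerprints.
    Returns the centroid of the k entries closest to the observed vector."""
    ranked = sorted(db, key=lambda e: sum((a - b) ** 2
                                          for a, b in zip(e[1], observed)))
    nearest = [loc for loc, _ in ranked[:k]]
    n = len(nearest)
    return tuple(sum(c) / n for c in zip(*nearest))

# Fingerprint database: (x, y) location -> RSSI from 3 access points (dBm).
fingerprints = [
    ((0.0, 0.0), (-40, -70, -80)),
    ((1.0, 0.0), (-50, -60, -80)),
    ((0.0, 1.0), (-50, -75, -70)),
    ((5.0, 5.0), (-85, -50, -45)),
]

estimate = knn_locate(fingerprints, (-45, -68, -78), k=3)
print("estimated position:", estimate)
```

Device heterogeneity matters precisely because two phones standing at the same spot report systematically shifted RSSI vectors, which skews the distance ranking this sketch relies on.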
29. Guest Editors’ Introduction: Emerging Networks-on-Chip – Designs, Technologies, and Applications
- Author
-
Edoardo Fusella, Jose Flich, Ian O'Connor, Sudeep Pasricha, Mahdi Nikdast, INL - Conception de Systèmes Hétérogènes (INL - CSH), Institut des Nanotechnologies de Lyon (INL), École Centrale de Lyon (ECL), Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École supérieure de Chimie Physique Electronique de Lyon (CPE)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-École Centrale de Lyon (ECL), and Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Computer science, Computer architecture, Hardware and Architecture, Electrical and Electronic Engineering, Networks on chip, Microelectronics, Software - Abstract
International audience
- Published
- 2019
- Full Text
- View/download PDF
30. HPC node performance and energy modeling with the co-location of applications
- Author
-
Sudeep Pasricha, Howard Jay Siegel, David A. Bader, Daniel Dauwe, Eric Jonardi, Ryan Friese, and Anthony A. Maciejewski
- Subjects
Multi-core processor, Xeon, Computer science, Distributed computing, Energy modeling, Execution time, Theoretical Computer Science, Scheduling (computing), Hardware and Architecture, Software, Information Systems - Abstract
Multicore processors have become an integral part of modern large-scale and high-performance parallel and distributed computing systems. Unfortunately, applications co-located on multicore processors can suffer from decreased performance and increased dynamic energy use as a result of interference in shared resources, such as memory. As this interference is difficult to characterize, assumptions about application execution time and energy usage can be misleading in the presence of co-location. Consequently, it is important to accurately characterize the performance and energy usage of applications that execute in a co-located manner on these architectures. This work investigates some of the disadvantages of co-location, and presents a methodology for building models capable of utilizing varying amounts of information about a target application and its co-located applications to make predictions about the target application’s execution time and the system’s energy use under arbitrary co-locations of a wide range of application types. The proposed methodology is validated on three different server class Intel Xeon multicore processors using eleven applications from two scientific benchmark suites. The model’s utility for scheduling is also demonstrated in a simulated large-scale high-performance computing environment through the creation of a co-location aware scheduling heuristic. This heuristic demonstrates that scheduling using information generated with the proposed modeling methodology is capable of making significant improvements over a scheduling heuristic that is oblivious to co-location interference.
- Published
- 2016
- Full Text
- View/download PDF
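A toy version of the kind of co-location model the abstract above describes: predict a target application's execution time from per-co-runner interference coefficients fit offline. The application names and coefficients are hypothetical, and the paper's models are considerably richer (they use varying amounts of information about the target and its co-runners):

```python
# Solo execution times (seconds) and shared-resource pressure coefficients,
# as might be measured offline. All names and numbers are hypothetical.
SOLO_TIME = {"lu": 10.0, "fft": 6.0, "stream": 8.0}
PRESSURE  = {"lu": 0.05, "fft": 0.02, "stream": 0.30}

def predict_time(target, co_runners):
    """Additive interference model: each co-runner inflates the target's
    solo execution time in proportion to its pressure coefficient."""
    slowdown = 1.0 + sum(PRESSURE[c] for c in co_runners)
    return SOLO_TIME[target] * slowdown

alone = predict_time("lu", [])
packed = predict_time("lu", ["stream", "fft"])
print(f"lu alone: {alone:.2f}s, co-located: {packed:.2f}s")
```

A co-location-aware scheduler can use such predictions to avoid pairing memory-intensive applications (here, "stream") with victims that are sensitive to cache contention.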
31. A System-Level Cosynthesis Framework for Power Delivery and On-Chip Data Networks in Application-Specific 3-D ICs
- Author
-
Nishit Kapadia and Sudeep Pasricha
- Subjects
Engineering, Probabilistic logic, Multiprocessing, MPSoC, Power (physics), Network on a chip, Hardware and Architecture, Embedded system, System on a chip, Electrical and Electronic Engineering, Routing (electronic design automation), Metaheuristic, Software - Abstract
With increasing core counts ushering in power-constrained 3-D multiprocessor system-on-chips (MPSoCs), optimizing communication power dissipated by the 3-D network-on-chip (NoC) fabric is critical. At the same time, with increased power densities in 3-D ICs, problems of IR drops in the power delivery network (PDN) as well as thermal hot spots on the 3-D die are becoming very severe. Even though the PDN and NoC design goals are nonoverlapping, both the optimizations are interdependent. Unfortunately, designers today seldom consider the design of the PDN, while designing NoCs. Moreover, for each new configuration of computation core and communication mapping on an MPSoC, the corresponding intercore communication patterns, 3-D on-chip thermal profile, as well as IR-drop distribution in the PDN can vary significantly. Based on this observation, we propose a novel design-time system-level application-specific cosynthesis framework that intelligently maps computation and communication resources on a die, for a given workload. The goal is to minimize the NoC power as well as chip-cooling power and optimize the 3-D PDN architecture; while meeting performance goals and satisfying thermal constraints, for a microfluidic cooling-based application-specific 3-D MPSoC. Our experimental results indicate that the proposed 3-D NoC-PDN cosynthesis framework is not only able to meet PDN design goals unlike prior 3-D NoC synthesis approaches, but also provides better overall optimality with the solution quality improvement of up to 35.4% over a probabilistic metaheuristic-based cooptimization approach proposed in prior work.
- Published
- 2016
- Full Text
- View/download PDF
32. A Hidden Markov Model based smartphone heterogeneity resilient portable indoor localization framework
- Author
-
Saideep Tiku, Qi Han, Branislav M. Notaros, and Sudeep Pasricha
- Subjects
Computer science, Real-time computing, Porting, Software, Hardware and Architecture, Application domain, Scalability, Transceiver, Hidden Markov model, Mobile device - Abstract
Indoor localization is an emerging application domain that promises to enhance the way we navigate in various indoor environments, as well as track equipment and people. Wireless signal-based fingerprinting is one of the leading approaches for indoor localization. Using ubiquitous Wi-Fi access points and the Wi-Fi transceivers in smartphones has enabled fingerprinting-based localization techniques that are scalable and low-cost. But the variety of Wi-Fi hardware modules and software stacks used in today's smartphones introduces errors when using Wi-Fi-based fingerprinting approaches across devices, which reduces localization accuracy. We propose a framework called SHERPA-HMM that enables efficient porting of indoor localization techniques across mobile devices, to maximize accuracy. An in-depth analysis of our framework shows that it can deliver up to 8× more accurate results as compared to state-of-the-art localization techniques for a variety of environments.
- Published
- 2020
- Full Text
- View/download PDF
33. Introduction to HCW 2018
- Author
-
Sudeep Pasricha and Alexey Lastovetsky
- Subjects
Software, Computer architecture, Grid computing, Computer science, Heterogeneous network - Published
- 2018
- Full Text
- View/download PDF
34. Run-Time Management for Multicore Embedded Systems With Energy Harvesting
- Author
-
Sudeep Pasricha and Yi Xiang
- Subjects
Multi-core processor, Engineering, Heuristic (computer science), Energy management, Workload, Task (project management), Hardware and Architecture, Embedded system, Overhead (computing), Electrical and Electronic Engineering, Heuristics, Frequency scaling, Software - Abstract
In this paper, we propose a novel framework for runtime energy and workload management in multicore embedded systems with solar energy harvesting and a periodic hard real-time task set as the workload. Compared with prior work, our framework makes several novel contributions and possesses several advantages, including the following: 1) a semidynamic scheduling heuristic that dynamically adapts to runtime harvested power variations without losing the consistency of periodic tasks; 2) a battery–supercapacitor hybrid energy storage module for more efficient system energy management; 3) a coarse-grained core shutdown heuristic for additional energy saving; 4) energy budget planning and task allocation heuristics with process variation tolerance; 5) a novel dual-speed method specifically designed for periodic tasks to address discrete frequency levels and dynamic voltage/frequency scaling switching overhead at the core level; and 6) an extension to prepare the system for thermal issues arising at runtime during extreme environmental conditions. The experimental studies show that our framework results in a reduction in task miss rate by up to 70% and task miss penalty by up to 65% compared with the best known prior work.
- Published
- 2015
- Full Text
- View/download PDF
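The "dual-speed method" mentioned in the abstract above addresses the fact that cores expose only discrete DVFS levels. A common way to emulate an ideal in-between frequency for a periodic task is to split its period between the two nearest supported levels; the sketch below illustrates that split with hypothetical numbers and omits the paper's treatment of switching overhead:

```python
def dual_speed_split(cycles, period, levels):
    """Return (f_lo, f_hi, t_hi): run t_hi seconds at f_hi and the rest of
    the period at f_lo so that exactly `cycles` complete by the deadline."""
    ideal = cycles / period                     # required average frequency
    f_lo = max((f for f in levels if f <= ideal), default=min(levels))
    f_hi = min(f for f in levels if f >= ideal)
    if f_lo == f_hi:                            # ideal matches a real level
        return f_lo, f_hi, 0.0
    t_hi = (cycles - f_lo * period) / (f_hi - f_lo)
    return f_lo, f_hi, t_hi

# Hypothetical task: 1.3e9 cycles every second; core supports 1.0 / 1.5 GHz.
f_lo, f_hi, t_hi = dual_speed_split(1.3e9, 1.0, [1.0e9, 1.5e9])
done = f_hi * t_hi + f_lo * (1.0 - t_hi)        # cycles completed per period
print(f"run {t_hi:.2f}s at {f_hi/1e9:.1f} GHz, rest at {f_lo/1e9:.1f} GHz")
```

Running mostly at the lower level and only briefly at the higher one meets the deadline while avoiding the energy cost of clocking the whole period at the higher frequency.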
35. 3-D WiRED: A Novel WIDE I/O DRAM With Energy-Efficient 3-D Bank Organization
- Author
-
Ishan G. Thakkar and Sudeep Pasricha
- Subjects
Computer science, Energy consumption, CAS latency, Hardware and Architecture, Universal memory, Embedded system, Memory architecture, Electrical and Electronic Engineering, Latency (engineering), Software, DRAM, Efficient energy use - Abstract
Editor’s notes: WIDE I/O DRAM is a promising 3-D memory architecture for low-power/high-performance computing. This paper proposes a new WIDE I/O DRAM architecture that reduces access latency and energy consumption at the same time, showing both the potential for further optimization of the WIDE I/O DRAM architecture and the impact of TSV usage in the memory architecture on performance and energy consumption.
- Published
- 2015
- Full Text
- View/download PDF
36. A middleware framework for application-aware and user-specific energy optimization in smart mobile devices
- Author
-
Chris Ohlsen, Sudeep Pasricha, and Brad K. Donohoo
- Subjects
Ubiquitous computing, Computer Networks and Communications, Computer science, Energy consumption, Backlight, Computer Science Applications, Hardware and Architecture, Embedded system, Markov decision process, Android (operating system), Mobile device, Software, Information Systems - Abstract
Mobile battery-operated devices are becoming an essential instrument for business, communication, and social interaction. In addition to the demand for an acceptable level of performance and a comprehensive set of features, users often desire extended battery lifetime. In fact, limited battery lifetime is one of the biggest obstacles facing the current utility and future growth of increasingly sophisticated "smart" mobile devices. This paper proposes a novel application-aware and user-interaction-aware energy optimization middleware framework (AURA) for pervasive mobile devices. AURA optimizes CPU and screen backlight energy consumption while maintaining a minimum acceptable level of performance. The proposed framework employs a novel Bayesian application classifier and management strategies based on Markov Decision Processes and Q-Learning to achieve energy savings. Real-world user evaluation studies on Google Android-based HTC Dream and Google Nexus One smartphones running the AURA framework demonstrate promising results, with up to 29% energy savings compared to the baseline device manager, and up to 5× savings over prior work on CPU and backlight energy co-optimization.
- Published
- 2015
- Full Text
- View/download PDF
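The AURA abstract above mentions management strategies based on Markov Decision Processes and Q-Learning. The snippet below is a minimal tabular Q-learning sketch in that spirit, choosing a CPU frequency to trade energy against performance; the states, action set, and reward shape are hypothetical and far simpler than AURA's:

```python
import random

random.seed(0)
FREQS = [0.6, 1.0, 1.4]                  # hypothetical GHz action set
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1        # learning rate, discount, epsilon

# Q-table over (workload state, frequency) pairs.
Q = {(load, f): 0.0 for load in ("low", "high") for f in FREQS}

def reward(load, f):
    # Penalize energy (grows roughly with f^2) and poor performance:
    # running slowly under high load incurs a large penalty.
    perf_penalty = 2.0 if (load == "high" and f < 1.0) else 0.0
    return -(f ** 2) * 0.5 - perf_penalty

def step(load):
    if random.random() < EPS:
        f = random.choice(FREQS)                        # explore
    else:
        f = max(FREQS, key=lambda a: Q[(load, a)])      # exploit
    r = reward(load, f)
    best_next = max(Q[(load, a)] for a in FREQS)        # same-state next step
    Q[(load, f)] += ALPHA * (r + GAMMA * best_next - Q[(load, f)])

for _ in range(2000):
    step(random.choice(("low", "high")))

policy = {s: max(FREQS, key=lambda a: Q[(s, a)]) for s in ("low", "high")}
print("learned frequency policy:", policy)
```

With this reward shape the learned policy settles on the lowest frequency when load is low (saving energy) and a mid frequency when load is high (avoiding the performance penalty), which is the qualitative behavior such middleware aims for.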
37. Crosstalk Mitigation for High-Radix and Low-Diameter Photonic NoC Architectures
- Author
-
Sudeep Pasricha and Sai Vineel Reddy Chittamuru
- Subjects
Engineering, Multi-core processor, Interconnection, Detector, Crosstalk, Resonator, Network on a chip, Hardware and Architecture, Electronic engineering, Electrical and Electronic Engineering, Photonics, Crossbar switch, Software - Abstract
Photonic network-on-chip (PNoC) architectures are a promising alternative for designing low-power and high-bandwidth interconnection infrastructure for multicore chips. The microring resonators that are essential building blocks of PNoCs are susceptible to crosstalk, which can notably degrade the signal-to-noise ratio (SNR) and reduce PNoC reliability. This paper proposes two novel encoding mechanisms to improve worst-case SNR by reducing crosstalk noise in microring resonators used within high-radix and low-diameter crossbar-based PNoCs.
- Published
- 2015
- Full Text
- View/download PDF
38. Message from the IEEE TrustCom/BigDataSE/ICESS 2017 General Chairs
- Author
-
Prasant Mohapatra, Sudeep Pasricha, Xiangjian He, Ravi Sandhu, Daniel Mosse, Jie Lu, Song Guo, and Beniamino Di Martino
- Subjects
Computer Networks and Communications, Information Systems and Management, Multimedia, Computer science, Information Systems, Safety, Risk, Reliability and Quality, Software - Published
- 2017
39. Silicon Nanophotonics for Future Multicore Architectures: Opportunities and Challenges
- Author
-
Yi Xu and Sudeep Pasricha
- Subjects
Network architecture, Multi-core processor, Silicon, Computer science, Nanophotonics, Bandwidth allocation, Hardware and Architecture, Electronic engineering, Electrical and Electronic Engineering, Software - Abstract
This article surveys emerging silicon nanophotonic architectures, protocols, and components, for intra-chip and inter-chip communication. It also examines the challenges in silicon nanophotonics research, presents some of the more prominent emerging solutions, and provides a critical view of their advantages and limitations.
- Published
- 2014
- Full Text
- View/download PDF
40. METEOR
- Author
-
Sudeep Pasricha and Shirish Bahirat
- Subjects
Multi-core processor, Computer science, Mesh networking, Chip, Network on a chip, Hardware and Architecture, Embedded system, Scalability, Photonics, Massively parallel, Software, Data transmission - Abstract
With increasing application complexity and improvements in process technology, Chip MultiProcessors (CMPs) with tens to hundreds of cores on a chip are becoming a reality. Networks-on-Chip (NoCs) have emerged as scalable communication fabrics that can support high bandwidths for these massively parallel multicore systems. However, traditional electrical NoC implementations still need to overcome the challenges of high data transfer latencies and large power consumption. On-chip photonic interconnects with high performance-per-watt characteristics have recently been proposed as an alternative to address these challenges for intra-chip communication. In this article, we explore using low-cost photonic interconnects on a chip to enhance traditional electrical NoCs. Our proposed hybrid photonic ring-mesh NoC (METEOR) utilizes a configurable photonic ring waveguide coupled to a traditional 2D electrical mesh NoC. Experimental results indicate a strong motivation to consider the proposed architecture for future CMPs, as it can provide about 5× reduction in power consumption and improved throughput and access latencies, compared to traditional electrical 2D mesh and torus NoC architectures. Compared to other previously proposed hybrid photonic NoC fabrics such as the hybrid photonic torus, Corona, and Firefly, our proposed fabric is also shown to have lower photonic area overhead, power consumption, and energy-delay product, while maintaining competitive throughput and latency.
- Published
- 2014
- Full Text
- View/download PDF
41. Guest Editorial: Special Issue on Low-Power Dependable Computing
- Author
-
Sudeep Pasricha, Man Lin, Dakai Zhu, and Muhammad Shafique
- Subjects
Control and Optimization ,Renewable Energy, Sustainability and the Environment ,business.industry ,Computer science ,Distributed computing ,Reliability (computer networking) ,Fault tolerance ,Energy consumption ,Application software ,computer.software_genre ,Software ,Computational Theory and Mathematics ,Hardware and Architecture ,Transient (computer programming) ,Compiler ,business ,computer ,Efficient energy use - Abstract
The papers in this special section focus on low power dependable computing systems (LPDC). Faults (especially transient faults that lead to soft errors) have become more common due to the miniaturization of computing systems with continuously scaled technology sizes. Thus, it is imperative for most modern computing systems to deploy one or more types of fault-tolerance techniques. Traditionally, fault tolerance has been achieved through various error reduction, detection, and recovery techniques at different levels of the hardware/software stack (e.g., circuit, architecture, operating systems, compiler, and application software), which generally incur power and energy overheads. Given that energy has become a first-class system resource (especially for battery-powered mobile and IoT devices), it is important to understand the interdependencies between system reliability and power/energy consumption, and further investigate techniques that can address their tradeoffs. This special issue on Low-Power Dependable Computing (LPDC) seeks to tackle these challenges by exploring novel and bold ideas to achieve energy-efficient reliable computations in modern computing systems.
- Published
- 2018
- Full Text
- View/download PDF
42. MultiMaKe
- Author
-
Nikil Dutt, Sudeep Pasricha, Luis Angel D. Bathen, and Yongjin Ahn
- Subjects
Computer science ,Design flow ,Multiprocessing ,computer.file_format ,Parallel computing ,Chip ,Power budget ,Scheduling (computing) ,Software pipelining ,Hardware and Architecture ,JPEG 2000 ,computer ,Software ,Scratchpad memory - Abstract
The increasing demand for low-power and high-performance multimedia embedded systems has motivated the need for effective solutions to satisfy application bandwidth and latency requirements under a tight power budget. As technology scales, it is imperative that applications are optimized to take full advantage of the underlying resources and meet both power and performance requirements. We propose MultiMaKe, an application mapping design flow capable of discovering and enabling parallelism opportunities via code transformations, efficiently distributing the computational load across resources, and minimizing unnecessary data transfers. Our approach decomposes the application's tasks into smaller units of computations called kernels, which are distributed and pipelined across the different processing resources. We exploit the ideas of inter-kernel data reuse to minimize unnecessary data transfers between kernels, early execution edges to drive performance, and kernel pipelining to increase system throughput. Our experimental results on JPEG and JPEG2000 show up to 97% off-chip memory access reduction, and up to 80% execution time reduction over standard mapping and task-level pipelining approaches.
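The kernel-pipelining idea in the abstract can be illustrated with a simple timing model (this is an intuition sketch, not MultiMaKe's mapping algorithm): once kernels are pipelined across processing resources, steady-state throughput is bounded by the slowest kernel rather than by the sum of all kernel latencies.

```python
# Illustrative pipeline timing model. stage_times holds the per-kernel latency
# of each pipeline stage (hypothetical values); n_items is the number of data
# items (e.g., image blocks) pushed through the pipeline.

def pipeline_time(stage_times, n_items):
    """Total time to process n_items through a linear pipeline."""
    fill = sum(stage_times)                    # latency until the first item exits
    steady = (n_items - 1) * max(stage_times)  # remaining items, paced by slowest stage
    return fill + steady

# Example: three kernels with latencies 3, 5, and 2 time units, 10 items.
print(pipeline_time([3.0, 5.0, 2.0], 10))  # 10 + 9*5 = 55.0
```

Balancing kernel granularity so no single stage dominates `max(stage_times)` is what makes decomposition into smaller kernels pay off.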
- Published
- 2013
- Full Text
- View/download PDF
43. A framework for low power synthesis of interconnection networks-on-chip with multiple voltage islands
- Author
-
Nishit Kapadia and Sudeep Pasricha
- Subjects
Engineering ,Interconnection ,Hardware_MEMORYSTRUCTURES ,business.industry ,Hardware_PERFORMANCEANDRELIABILITY ,Dissipation ,Chip ,Power (physics) ,Set (abstract data type) ,Core (game theory) ,Network on a chip ,Hardware and Architecture ,Embedded system ,Hardware_INTEGRATEDCIRCUITS ,Electrical and Electronic Engineering ,business ,Heuristics ,Software - Abstract
The problem of voltage-island (VI)-aware Network-on-Chip (NoC) design is extremely challenging, especially with the increasing core counts in today's power-hungry Chip Multiprocessors (CMPs). In this paper, we propose a novel framework for automating the synthesis of regular NoCs with VIs, to satisfy application performance constraints while minimizing chip power dissipation. Our proposed framework uses a set of novel algorithms and heuristics to generate solutions that reduce network traffic by up to 62%, communication power by up to 32%, and total chip power dissipation by up to 13%, compared to the best known prior work that solves the same problem.
- Published
- 2012
- Full Text
- View/download PDF
44. Deadline and energy constrained dynamic resource allocation in a heterogeneous computing environment
- Author
-
Jonathan Apodaca, Howard Jay Siegel, B. Dalton Young, Anthony A. Maciejewski, Yong Zou, Sudeep Pasricha, Bhavesh Khemka, Jay Smith, Adrian Ramirez, Shirish Bahirat, and Luis Diego Briceno
- Subjects
Heuristic ,Computer science ,Distributed computing ,Symmetric multiprocessor system ,Energy consumption ,Theoretical Computer Science ,Task (project management) ,Hardware and Architecture ,Resource allocation ,Heuristics ,Frequency scaling ,Software ,Energy (signal processing) ,Information Systems - Abstract
Energy-efficient resource allocation within clusters and data centers is important because of the growing cost of energy. We study the problem of energy-constrained dynamic allocation of tasks to a heterogeneous cluster computing environment. Our goal is to complete as many tasks as possible by their individual deadlines and within the system energy constraint, given that task execution times are uncertain and the system is oversubscribed at times. We use Dynamic Voltage and Frequency Scaling (DVFS) to balance the energy consumption and execution time of each task. We design and evaluate (via simulation) a set of heuristics and filtering mechanisms for making allocations in our system. We show that the appropriate choice of filtering mechanisms improves performance more than the choice of heuristic (among the heuristics we tested).
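The DVFS tradeoff the abstract describes can be sketched concretely: lowering the voltage/frequency pair saves energy but stretches execution time, so an allocator picks the lowest-energy operating point that still meets the task's deadline. The model and P-state values below are illustrative assumptions, not the paper's heuristics or measured data.

```python
# Illustrative DVFS selection. Energy per task scales roughly as V^2 * f * t
# (capacitance folded into the constant), and execution time scales as 1/f.

def pick_frequency(levels, work_cycles, deadline_s):
    """Choose the lowest-energy (voltage, frequency) level meeting the deadline.

    levels: list of (voltage_v, frequency_hz) P-states (hypothetical values).
    work_cycles: cycles the task needs.
    deadline_s: seconds available before the task's deadline.
    Returns the chosen (voltage, frequency), or None if no level is feasible.
    """
    best, best_energy = None, float("inf")
    for v, f in levels:
        t = work_cycles / f          # execution time at this frequency
        if t > deadline_s:
            continue                 # would miss the deadline; infeasible
        energy = v * v * f * t       # relative dynamic energy for the task
        if energy < best_energy:
            best_energy, best = energy, (v, f)
    return best

# Three hypothetical P-states; a 2e9-cycle task with a 1.5 s deadline.
levels = [(0.9, 1.0e9), (1.1, 1.5e9), (1.3, 2.0e9)]
print(pick_frequency(levels, 2.0e9, 1.5))
```

Note that since `f * t` equals the fixed cycle count, energy here reduces to `V^2 * work_cycles`: the slowest feasible (lowest-voltage) level always wins, which is exactly why oversubscription forces harder choices.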
- Published
- 2012
- Full Text
- View/download PDF
45. A Multi-Granularity Power Modeling Methodology for Embedded Processors
- Author
-
Sudeep Pasricha, Nikil Dutt, Fadi J. Kurdahi, and Young-Hwan Park
- Subjects
Instruction set ,Electronic system-level design and verification ,Instruction set simulator ,Computer architecture ,Hardware and Architecture ,Computer science ,Design flow ,Multiprocessing ,Granularity ,OpenRISC ,Electrical and Electronic Engineering ,Software ,Power optimization - Abstract
With power becoming a major constraint for multiprocessor embedded systems, it is becoming important for designers to characterize and model processor power dissipation. It is critical for these processor power models to be usable across various modeling abstractions in an electronic system level (ESL) design flow, to guide early design decisions. In this paper, we propose a unified processor power modeling methodology for the creation of power models at multiple granularity levels that can be quickly mapped to an ESL design flow. Our experimental results based on applying the proposed methodology on the OpenRISC and MIPS processors demonstrate the usefulness of having multiple power models. The generated models range from very high-level two-state and architectural/instruction set simulator models that can be used in transaction level models, to extremely detailed cycle-accurate models that enable early exploration of power optimization techniques. These models offer a designer tremendous flexibility to trade off estimation accuracy with estimation/simulation effort.
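The coarsest granularity the abstract mentions, a two-state model, can be sketched in a few lines: the processor is either active or idle, each state with one characterized power number. The power values below are assumed placeholders, not the paper's OpenRISC/MIPS characterization data.

```python
# Minimal two-state processor power model: energy is the time-weighted sum of
# a fixed active power and a fixed idle power. Values are hypothetical.

ACTIVE_MW = 250.0  # assumed active-state power (milliwatts)
IDLE_MW = 40.0     # assumed idle-state power (milliwatts)

def two_state_energy_mj(active_s, idle_s):
    """Energy in millijoules for an interval split into active and idle time."""
    return ACTIVE_MW * active_s + IDLE_MW * idle_s

# One second of execution at 80% activity.
print(two_state_energy_mj(0.8, 0.2))
```

Finer-grained models in the methodology replace the single active-power constant with per-instruction or per-cycle estimates, trading simulation effort for accuracy.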
- Published
- 2011
- Full Text
- View/download PDF
46. Guest Editors' Introduction: Silicon Nanophotonics for Future Multicore Architectures
- Author
-
Sudeep Pasricha and Yi Xu
- Subjects
Network architecture ,Multi-core processor ,Computer architecture ,Hardware and Architecture ,Computer science ,Nanophotonics ,Special section ,High bandwidth ,Parallel computing ,ComputerSystemsOrganization_PROCESSORARCHITECTURES ,Electrical and Electronic Engineering ,Chip ,Software - Abstract
The articles in this special section discuss the applications and services provided by silicon nanophotonics for multicore network architectures. The need for high-performance and energy-efficient communication between processing cores has never been more critical. The increase in core counts in emerging chip multiprocessors (CMPs) has put more pressure on the communication fabric to support many more streams of high bandwidth data transfers than ever before. An important consequence of this trend is that chip power and performance are now beginning to be dominated not by processor cores but by the components that facilitate transport of data between processors and to memory.
- Published
- 2014
- Full Text
- View/download PDF
47. Evaluating Carbon Nanotube Global Interconnects for Chip Multiprocessor Applications
- Author
-
Nikil Dutt, Sudeep Pasricha, and Fadi J. Kurdahi
- Subjects
Interconnection ,Electronic system-level design and verification ,Computer science ,Hardware_PERFORMANCEANDRELIABILITY ,Energy consumption ,Integrated circuit design ,Integrated circuit ,Chip ,law.invention ,Quantum capacitance ,Hardware and Architecture ,law ,Hardware_INTEGRATEDCIRCUITS ,Electronic engineering ,Miniaturization ,Electrical and Electronic Engineering ,Software - Abstract
In ultra-deep submicrometer (UDSM) technologies, the current paradigm of using copper (Cu) interconnects for on-chip global communication is rapidly becoming a serious performance bottleneck. In this paper, we perform a system-level evaluation of Carbon Nanotube (CNT) interconnect alternatives that may replace conventional Cu interconnects. Our analysis explores the impact of using CNT global interconnects on the performance and energy consumption of several multi-core chip multiprocessor (CMP) applications. Results from our analysis indicate that with improvements in fabrication technology, CNT-based global interconnects can significantly outperform Cu-based global interconnects.
- Published
- 2010
- Full Text
- View/download PDF
48. CAPPS: A Framework for Power–Performance Tradeoffs in Bus-Matrix-Based On-Chip Communication Architecture Synthesis
- Author
-
Young-Hwan Park, Sudeep Pasricha, Fadi J. Kurdahi, and Nikil Dutt
- Subjects
Engineering ,Space technology ,Speedup ,business.industry ,Multiprocessing ,Energy consumption ,Network on a chip ,Hardware and Architecture ,High-level synthesis ,Embedded system ,System on a chip ,Electrical and Electronic Engineering ,business ,Software architecture ,Software - Abstract
On-chip communication architectures have a significant impact on the power consumption and performance of emerging chip multiprocessor (CMP) applications. However, customization of such architectures for an application requires the exploration of a large design space. Designers need tools to rapidly explore and evaluate relevant communication architecture configurations exhibiting diverse power and performance characteristics. In this paper, we present CAPPS, an automated framework for fast system-level, application-specific power-performance tradeoffs in bus matrix communication architecture synthesis. Our study makes two specific contributions. First, we develop energy models for system-level exploration of bus matrix communication architectures. Second, we incorporate these models into a bus matrix synthesis flow that enables designers to efficiently explore the power-performance design space of different bus matrix configurations. Experimental results show that our energy macromodels incur less than 5% average cycle energy error across 180-65 nm technology libraries. Our early system-level power estimation approach also shows a significant speedup ranging from 1000 to 2000× when compared with detailed gate-level power estimation. Furthermore, on applying our synthesis framework to three industrial networking CMP applications, a tradeoff space that exhibits up to 20% variation in power and up to 40% variation in performance is generated, demonstrating the usefulness of our approach.
- Published
- 2010
- Full Text
- View/download PDF
49. Adaptive Scratch Pad Memory Management for Dynamic Behavior of Multimedia Applications
- Author
-
Nikil Dutt, Sudeep Pasricha, Yunheung Paek, Minwook Ahn, Ilya Issenin, and Doosan Cho
- Subjects
Dynamic random-access memory ,Hardware_MEMORYSTRUCTURES ,Memory hierarchy ,Multimedia ,Computer science ,business.industry ,Optimizing compiler ,Multiprocessing ,computer.software_genre ,Computer Graphics and Computer-Aided Design ,law.invention ,Memory management ,Data access ,Computer architecture ,Memory ordering ,law ,Embedded system ,Compiler ,Electrical and Electronic Engineering ,business ,computer ,Software - Abstract
Exploiting runtime memory access traces can be a complementary approach to compiler optimizations for the energy reduction in memory hierarchy. This is particularly important for emerging multimedia applications since they usually have input-sensitive runtime behavior which results in dynamic and/or irregular memory access patterns. These types of applications are normally hard to optimize by static compiler optimizations. The reason is that their behavior stays unknown until runtime and may even change during computation. To tackle this problem, we propose an integrated approach of software [compiler and operating system (OS)] and hardware (data access record table) techniques to exploit data reusability of multimedia applications in Multiprocessor Systems on Chip. Guided by compiler analysis for generating scratch pad data layouts and hardware components for tracking dynamic memory accesses, the scratch pad data layout adapts to an input data pattern with the help of a runtime scratch pad memory manager incorporated in the OS. The runtime data placement strategy presented in this paper provides efficient scratch pad utilization for the dynamic applications. The goal is to minimize the amount of accesses to the main memory over the entire runtime of the system, which leads to a reduction in the energy consumption of the system. Our experimental results show that our approach is able to significantly improve the energy consumption of multimedia applications with dynamic memory access behavior over an existing compiler technique and an alternative hardware technique.
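The core idea behind the hardware access-record table and runtime scratch pad manager described above can be sketched as a frequency-driven placement policy (an illustrative simplification, not the paper's actual layout algorithm): count how often each data block is touched at runtime, then keep the hottest blocks in the small scratch pad so that main-memory accesses are minimized.

```python
from collections import Counter

# Illustrative runtime scratch-pad placement. Block size and capacity are
# hypothetical parameters; access_trace stands in for the addresses a data
# access record table would capture at runtime.

def place_in_scratchpad(access_trace, block_size, spm_capacity):
    """Return the set of block ids selected for scratch pad residence."""
    counts = Counter(addr // block_size for addr in access_trace)
    budget = spm_capacity // block_size          # how many blocks fit
    return {blk for blk, _ in counts.most_common(budget)}

# Example: ten runtime accesses, 64 B blocks, a 128 B scratch pad
# (room for the two most frequently accessed blocks).
trace = [0, 4, 8, 64, 64, 68, 128, 64, 4, 0]
print(place_in_scratchpad(trace, 64, 128))
```

Because the counts come from the observed trace rather than static analysis, the placement automatically adapts when a different input produces a different access pattern, which is the property the abstract targets for input-sensitive multimedia workloads.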
- Published
- 2009
- Full Text
- View/download PDF
50. Fast exploration of bus-based communication architectures at the CCATB abstraction
- Author
-
Nikil Dutt, Mohamed Ben-Romdhane, and Sudeep Pasricha
- Subjects
Communication design ,Speedup ,Computer science ,business.industry ,Distributed computing ,Design flow ,Space exploration ,Abstraction layer ,Hardware and Architecture ,Embedded system ,Transaction-level modeling ,System on a chip ,business ,Software ,Abstraction (linguistics) - Abstract
Currently, system-on-chip (SoC) designs are becoming increasingly complex, with more and more components being integrated into a single SoC design. Communication between these components is increasingly dominating critical system paths and frequently becomes the source of performance bottlenecks. It, therefore, becomes imperative for designers to explore the communication space early in the design flow. Traditionally, system designers have used Pin-Accurate Bus Cycle Accurate (PA-BCA) models for early communication space exploration. These models capture all of the bus signals and strictly maintain cycle accuracy, which is useful for reliable performance exploration but results in slow simulation speeds for complex designs, even when they are modeled using high-level languages. Recently, there have been several efforts to use the Transaction-Level Modeling (TLM) paradigm for improving simulation performance in BCA models. However, these transaction-based BCA (T-BCA) models capture a lot of details that can be eliminated when exploring communication architectures. In this paper, we extend the TLM approach and propose a new transaction-based modeling abstraction level (CCATB) to explore the communication design space. Our abstraction level bridges the gap between the TLM and BCA levels, and yields an average performance speedup of 120% over PA-BCA and 67% over T-BCA models. The CCATB models are not only faster to simulate, but also extremely accurate and take less time to model compared to both T-BCA and PA-BCA models. We describe the mechanisms that produce the speedup in CCATB models and also analyze how the achieved simulation speedup scales with design complexity. To demonstrate the effectiveness of using CCATB for exploration, we present communication space exploration case studies from the broadband communication and multimedia application domains.
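The source of the simulation speedup the abstract reports can be illustrated with a toy event-count model (the actual CCATB models are SystemC-based; this is only an intuition sketch with assumed numbers): a cycle-accurate model pays one simulator event per bus cycle of a transfer, while a transaction-level model amortizes that cost into one event per burst.

```python
# Toy contrast between cycle-accurate and transaction-level simulation cost,
# measured in simulator events. Burst counts and lengths are hypothetical.

def events_cycle_accurate(bursts, beats_per_burst):
    """Cycle-accurate: every beat of every burst is a separate simulator event."""
    return bursts * beats_per_burst

def events_transaction_level(bursts, beats_per_burst):
    """Transaction-level: one event models the whole burst, with timing annotated."""
    return bursts * 1

ca = events_cycle_accurate(1000, 8)    # e.g., 1000 eight-beat bursts
tlm = events_transaction_level(1000, 8)
print(ca // tlm)  # event-count ratio between the two abstractions
```

Real speedups are smaller than this raw event ratio because transaction-level models still spend time keeping enough timing detail to stay accurate, which is exactly the gap CCATB is positioned to bridge.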
- Published
- 2008
- Full Text
- View/download PDF