10 results on '"Kastner, Ryan"'
Search Results
2. Tailor: Altering Skip Connections for Resource-Efficient Inference
- Author
-
Weng, Olivia, Marcano, Gabriel, Loncar, Vladimir, Khodamoradi, Alireza, G, Abarajithan, Sheybani, Nojan, Meza, Andres, Koushanfar, Farinaz, Denolf, Kristof, Duarte, Javier Mauricio, and Kastner, Ryan
- Subjects
Information and Computing Sciences, Engineering, Electrical Engineering, Machine Learning, Electrical and Electronic Engineering, Computer Hardware, Electronics, sensors and digital hardware, Distributed computing and systems software
- Abstract
Deep neural networks use skip connections to improve training convergence. However, these skip connections are costly in hardware, requiring extra buffers and increasing on- and off-chip memory utilization and bandwidth requirements. In this paper, we show that skip connections can be optimized for hardware when tackled with a hardware-software codesign approach. We argue that while a network’s skip connections are needed for the network to learn, they can later be removed or shortened to provide a more hardware efficient implementation with minimal to no accuracy loss. We introduce Tailor, a codesign tool whose hardware-aware training algorithm gradually removes or shortens a fully trained network’s skip connections to lower their hardware cost. Tailor improves resource utilization by up to 34% for BRAMs, 13% for FFs, and 16% for LUTs for on-chip, dataflow-style architectures. Tailor increases performance by 30% and reduces memory bandwidth by 45% for a 2D processing element array architecture.
- Published
- 2023
3. Memory-Based High-Level Synthesis Optimizations Security Exploration on the Power Side-Channel
- Author
-
Zhang, Lu, Mu, Dejun, Hu, Wei, Tai, Yu, Blackstone, Jeremy, and Kastner, Ryan
- Subjects
Optimization, Hardware, Ciphers, Tools, Random access memory, Design space exploration, hardware security, high-level synthesis, power side-channel evaluation, Electrical and Electronic Engineering, Computer Hardware, Computer Hardware & Architecture
- Abstract
High-level synthesis (HLS) allows hardware designers to think algorithmically and not worry about low-level, cycle-by-cycle details. This provides the ability to quickly explore the architectural design space and tradeoffs between resource utilization and performance. Unfortunately, security evaluation is not a standard part of the HLS design flow. In this article, we aim to understand the effects of memory-based HLS optimizations on power side-channel leakage. We use Xilinx Vivado HLS to develop different cryptographic cores, implement them on a Spartan-6 FPGA, and collect power traces. We evaluate the designs with respect to resource utilization, performance, and information leakage through power consumption. We have two important observations and contributions. First, the choice of resource optimization directive results in different levels of side-channel vulnerabilities. Second, the partitioning optimization directive can greatly compromise the hardware cryptographic system through power side-channel leakage due to the deployment of memory control logic. We describe an evaluation procedure for power side-channel leakage and use it to make best-effort recommendations about how to design more secure architectures in the cryptographic domain.
- Published
- 2020
4. Quantitative Analysis of Timing Channel Security in Cryptographic Hardware Design
- Author
-
Mao, Baolei, Hu, Wei, Althoff, Alric, Matai, Janarbek, Tai, Yu, Mu, Dejun, Sherwood, Timothy, and Kastner, Ryan
- Subjects
Hardware security, cryptographic function, timing channel, information flow, security metric, Electrical and Electronic Engineering, Computer Hardware, Computer Hardware & Architecture
- Abstract
Cryptographic cores are known to leak information about their private key due to runtime variations, and there are many well-known attacks that can exploit this timing channel. In this paper, we study how information theoretic measures can quantify the amount of key leakage that can be exacted from runtime measurements. We develop and analyze 22 Rivest-Shamir-Adleman (RSA) hardware designs - each with unique performance optimizations, timing channel mitigation techniques, or discretization/randomization countermeasures. We demonstrate the effectiveness of information theoretic measures for quantifying timing leakage through correlation analysis of information theoretic measurements and attack results. Experimental results show that mutual information is a promising technique for quantifying timing leakage for RSA, advanced encryption standard, and elliptic curve cryptography ciphers, i.e., the mutual information correlates to being able to successfully guess the value of the private key. This is an important step toward a hardware security metric which allows designers to reason about security alongside traditional hardware design metrics like area, performance, and power.
- Published
- 2018
5. Hardware Accelerated Alignment Algorithm for Optical Labeled Genomes
- Author
-
Meng, Pingfan, Jacobsen, Matthew, Kimura, Motoki, Dergachev, Vladimir, Anantharaman, Thomas, Requa, Michael, and Kastner, Ryan
- Subjects
Bioengineering, Genetics, Human Genome, Generic health relevance, Design, Performance, Experimentation, Genome, acceleration, de novo assembly, GPU, FPGA, Electrical and Electronic Engineering, Computer Hardware
- Abstract
De novo assembly is a widely used methodology in bioinformatics. However, the conventional short-read-based de novo assembly is incapable of reliably reconstructing the large-scale structures of human genomes. Recently, a novel optical label-based technology has enabled reliable large-scale de novo assembly. Despite its advantage in large-scale genome analysis, this new technology requires a more computationally intensive alignment algorithm than its conventional counterpart. For example, the runtime of reconstructing a human genome is on the order of 10,000 hours on a sequential CPU. Therefore, in order to practically apply this new technology in genome research, accelerated approaches are desirable. In this article, we present three different accelerated approaches, multicore CPU, GPU, and FPGA. Against the sequential software baseline, our multicore CPU design achieved an 8.4× speedup, while the GPU and FPGA designs achieved 13.6× and 115× speedups, respectively. We also discuss the details of the design space exploration of this new assembly algorithm on these three different devices. Finally, we compare these devices in performance, optimization techniques, prices, and design efforts.
- Published
- 2016
6. RIFFA 2.1: A Reusable Integration Framework for FPGA Accelerators
- Author
-
Jacobsen, Matthew, Richmond, Dustin, Hogains, Matthew, and Kastner, Ryan
- Subjects
FPGA, Communication, Framework, Performance, communication, synchronization, integration, framework, Electrical and Electronic Engineering, Computer Hardware
- Published
- 2015
7. Gate-Level Information Flow Tracking for Security Lattices
- Author
-
Hu, Wei, Mu, Dejun, Oberg, Jason, Mao, Baolei, Tiwari, Mohit, Sherwood, Timothy, and Kastner, Ryan
- Subjects
Security, Design, Verification, High-assurance system, hardware security, gate-level information flow tracking, multilevel security, security lattice, formal method, Computer Software, Computer Hardware, Design Practice & Management
- Abstract
High-assurance systems found in safety-critical infrastructures are facing steadily increasing cyber threats. These critical systems require rigorous guarantees in information flow security to prevent confidential information from leaking to an unclassified domain and the root of trust from being violated by an untrusted party. To enforce bit-tight information flow control, gate-level information flow tracking (GLIFT) has recently been proposed to precisely measure and manage all digital information flows in the underlying hardware, including implicit flows through hardware-specific timing channels. However, existing work in this realm either restricts to two-level security labels or essentially targets two-input primitive gates and several simple multilevel security lattices. This article provides a general way to expand the GLIFT method for multilevel security. Specifically, it formalizes tracking logic for an arbitrary Boolean gate under finite security lattices, presents a precise tracking logic generation method for eliminating false positives in GLIFT logic created in a constructive manner, and illustrates application scenarios of GLIFT for enforcing multilevel information flow security. Experimental results show various trade-offs in precision and performance of GLIFT logic created using different methods. It also reveals the area and performance overheads that should be expected when expanding GLIFT for multilevel security.
- Published
- 2014
8. Leveraging Gate-Level Properties to Identify Hardware Timing Channels
- Author
-
Oberg, Jason, Meiklejohn, Sarah, Sherwood, Timothy, and Kastner, Ryan
- Subjects
Hardware security, information flow tracking, testing, timing channels, Electrical and Electronic Engineering, Computer Hardware, Computer Hardware & Architecture
- Abstract
Modern embedded computing systems such as medical devices, airplanes, and automobiles continue to dominate some of the most critical aspects of our lives. In such systems, the movement of information throughout a device must be tightly controlled to prevent violations of privacy or integrity. Unfortunately, bounding the flow of information can often present a significant challenge, as information can flow through channels that are difficult to detect, such as timing channels. As has been demonstrated by recent research in hardware security, information flow tracking techniques deployed at the hardware or gate level show promise at identifying these 'timing flows' but provide no formal statements about this claim nor mechanisms for separating out timing information from other types of flows. In this paper, we first prove that gate-level information flow tracking can in fact detect timing flows. In addition, we work to identify these timing flows separately from other flows by presenting a framework for identifying a different type of flow that we call functional flows. By using this framework to either confirm or rule out the existence of such flows, we leverage the previous work in hardware information flow tracking to effectively isolate timing flows. To show the effectiveness of this model, we demonstrate its usage on three practical examples: a shared bus (I²C), a cache in a MIPS-based processor, and an RSA encryption core, all of which were written in Verilog/VHDL and then simulated in a variety of scenarios. In each scenario, we demonstrate how our framework can be used to identify timing and functional flows and also analyze our model's overhead. © 1982-2012 IEEE.
- Published
- 2014
9. Networks on Chip with Provable Security Properties
- Author
-
Wassel, Hassan MG, Gao, Ying, Oberg, Jason K, Huffmire, Ted, Kastner, Ryan, Chong, Frederic T, and Sherwood, Timothy
- Subjects
Electrical and Electronic Engineering, Computer Hardware, Computer Hardware & Architecture
- Abstract
In systems where a lack of safety or security guarantees can be catastrophic or even fatal, noninterference is used to separate domains handling critical (or confidential) information from those processing normal (or unclassified) data for purposes of fault containment and ease of verification. This article introduces SurfNoC, an on-chip network that significantly reduces the latency incurred by strict temporal partitioning. By carefully scheduling the network into waves that flow across the interconnect, data from different domains carried by these waves are strictly noninterfering while avoiding the significant overheads associated with cycle-by-cycle time multiplexing. The authors describe the scheduling policy and router microarchitecture changes required, and evaluate the information-flow security of a synthesizable implementation through gate-level information flow analysis. When comparing their approach for varying numbers of domains and network sizes, they find that in many cases SurfNoC can reduce the latency overhead of implementing cycle-level noninterference by up to 85 percent. © 2014 IEEE.
- Published
- 2014
10. A New Hardware Design Scheme of Symbol Synchronization for an Underwater Acoustic Receiver
- Author
-
Chen, Lan, Li, Ying, Benson, Bridget, and Kastner, Ryan
- Subjects
Noise, Frequency-shift keying, Synchronizer, business.industry, Computer science, Electronic engineering, Time domain, business, Underwater acoustics, Field-programmable gate array, Computer hardware, Synchronization, Communication channel
- Abstract
Conventional time-domain symbol synchronizers use a static predefined threshold to evaluate the level of environmental noise and extract the synchronization signal, but this approach is less feasible in an underwater acoustic channel because the large delay and noise variance make the threshold definition inaccurate. The paper provides a new hardware scheme using real-time orthogonal sliding correlation as an adaptive threshold to build a symbol synchronizer suitable for an underwater FSK acoustic receiver. The design has been implemented and evaluated on an FPGA board. Both the simulation and real-data experimental results show that the design provides accurate synchronization while consuming 9% of the resource slices.
- Published
- 2011