Author: "Aninda Manocha" / Language: undetermined - Searchworks@Jio Institute Digital Library Search Results

1. Graphfire: Synergizing Fetch, Insertion, and Replacement Policies for Graph Analytics

Author: Aninda Manocha, Juan Luis Aragón, and Margaret Martonosi
Subjects: Computational Theory and Mathematics, Hardware and Architecture, Software, Theoretical Computer Science
Published: 2023
Full Text: View/download PDF

2. The Implications of Page Size Management on Graph Analytics

Author: Aninda Manocha, Zi Yan, Esin Tureci, Juan Luis Aragón, David Nellans, and Margaret Martonosi
Published: 2022
Full Text: View/download PDF

3. GraphAttack

Author: Tyler Sorensen, Aninda Manocha, Opeoluwa Matthews, Margaret Martonosi, Juan L. Aragón, and Esin Tureci
Subjects: Multi-core processor, Speedup, Memory hierarchy, Computer science, Data parallelism, Parallel computing, Software framework, Hardware and Architecture, Scalability, Graph (abstract data type), Queue, Software, Information Systems
Abstract: Graph structures are a natural representation of important and pervasive data. While graph applications have significant parallelism, their characteristic pointer indirect loads to neighbor data hinder scalability to large datasets on multicore systems. A scalable and efficient system must tolerate latency while leveraging data parallelism across millions of vertices. Modern Out-of-Order (OoO) cores inherently tolerate a fraction of long latencies, but become clogged when running severely memory-bound applications. Combined with large power/area footprints, this limits their parallel scaling potential and, consequently, the gains that existing software frameworks can achieve. Conversely, accelerator and memory hierarchy designs provide performant hardware specializations, but cannot support diverse application demands. To address these shortcomings, we present GraphAttack, a hardware-software data supply approach that accelerates graph applications on in-order multicore architectures. GraphAttack proposes compiler passes to (1) identify idiomatic long-latency loads and (2) slice programs along these loads into data Producer/Consumer threads to map onto pairs of parallel cores. Each pair shares a communication queue; the Producer asynchronously issues long-latency loads, whose results are buffered in the queue and used by the Consumer. This scheme drastically increases memory-level parallelism (MLP) to mitigate latency bottlenecks. In equal-area comparisons, GraphAttack outperforms OoO cores, do-all parallelism, prefetching, and prior decoupling approaches, achieving a 2.87× speedup and 8.61× gain in energy efficiency across a range of graph applications. These improvements scale; GraphAttack achieves a 3× speedup over 64 parallel cores. Lastly, it has pragmatic design principles; it enhances in-order architectures that are gaining increasing open-source support.
Published: 2021
Full Text: View/download PDF

4. Bayesian Optimization for Efficient Accelerator Synthesis

Author: Atefeh Mehrabi, Benjamin C. Lee, Aninda Manocha, and Daniel J. Sorin
Subjects: Computer science, Design space exploration, Bayesian probability, Bayesian optimization, Resource (project management), Hardware and Architecture, High-level synthesis, Embedded system, Code (cryptography), Field-programmable gate array, Software, Information Systems, Electronic circuit
Abstract: Accelerator design is expensive due to the effort required to understand an algorithm and optimize the design. Architects have embraced two technologies to reduce costs. High-level synthesis automatically generates hardware from code. Reconfigurable fabrics instantiate accelerators while avoiding fabrication costs for custom circuits. We further reduce design effort with statistical learning. We build an automated framework, called Prospector, that uses Bayesian techniques to optimize synthesis directives, reducing execution latency and resource usage in field-programmable gate arrays. We show that, within a given amount of time, designs discovered by Prospector are closer to Pareto-efficient designs than those found by prior approaches. Prospector permits new studies for heterogeneous accelerators.
Published: 2020
Full Text: View/download PDF

5. AutoSVA: Democratizing Formal Verification of RTL Module Interactions

Author: Margaret Martonosi, David Wentzlaff, Aninda Manocha, and Marcelo Orenes-Vera
Subjects: Computer and information sciences, Programming language, Computer science, Liveness, SystemVerilog, Learning curve, Hardware Architecture (cs.AR), Effective method, Electronic design automation, Control logic, Formal verification
Abstract: Modern SoC design relies on the ability to separately verify IP blocks relative to their own specifications. Formal verification (FV) using SystemVerilog Assertions (SVA) is an effective method to exhaustively verify blocks at the unit level. Unfortunately, FV has a steep learning curve and requires engineering effort that discourages hardware designers from using it during RTL module development. We propose AutoSVA, a framework to automatically generate FV testbenches that verify liveness and safety of control logic involved in module interactions. We demonstrate AutoSVA’s effectiveness and efficiency on deadlock-critical modules of widely used open-source hardware projects.
Published: 2021
Full Text: View/download PDF

6. A simulator and compiler framework for agile hardware-software co-design evaluation and exploration

Author: Margaret Martonosi, Juan L. Aragón, Aninda Manocha, Tyler Sorensen, Marcelo Orenes-Vera, and Esin Tureci
Subjects: Applied physics, Computer science, Engineering and technology, Software prototyping, Modular design, Natural sciences, Computer hardware & architecture, Physical sciences, Electrical engineering, electronic engineering, information engineering, Programming paradigm, Compiler, Simulation, Agile software development
Abstract: As Moore's Law has slowed and Dennard Scaling has ended, architects are increasingly turning to heterogeneous parallelism and hardware-software co-design. These trends present new challenges for simulation-based performance assessments that are central to early-stage architectural exploration. Simulators must be lightweight to support heterogeneous combinations of general-purpose cores and specialized processing units. They must also support agile exploration of hardware-software co-design, i.e. changes in the programming model, compiler, ISA, and specialized hardware. To meet these challenges, we describe our compiler and simulator pair: DEC++ and MosaicSim. Together, they provide a lightweight, modular simulator for heterogeneous systems, offering accuracy and agility designed specifically for hardware-software co-design explorations. The simulator and corresponding compiler were developed as part of the DECADES project, a multi-team effort to design and tape out a new heterogeneous architecture. We will present two case-studies in important data-science applications where DEC++ and MosaicSim enable straightforward design space explorations for emerging full-stack systems.
Published: 2020
Full Text: View/download PDF

7. MosaicSim: A Lightweight, Modular Simulator for Heterogeneous Systems

Author: Juan L. Aragón, Margaret Martonosi, Tyler Sorensen, Luca P. Carloni, Opeoluwa Matthews, Marcelo Orenes-Vera, Tae Jun Ham, Davide Giri, Aninda Manocha, and Esin Tureci
Subjects: Applied physics, Multi-core processor, Dennard scaling, Computer science, Engineering and technology, Modular design, Natural sciences, Modularity, Toolchain, Computer hardware & architecture, Physical sciences, Electrical engineering, electronic engineering, information engineering, Programming paradigm, Compiler, Simulation, Agile software development
Abstract: As Moore's Law has slowed and Dennard Scaling has ended, architects are increasingly turning to heterogeneous parallelism and domain-specific hardware-software co-designs. These trends present new challenges for simulation-based performance assessments that are central to early-stage architectural exploration. Simulators must be lightweight to support rich heterogeneous combinations of general purpose cores and specialized processing units. They must also support agile exploration of hardware-software co-design, i.e. changes in the programming model, compiler, ISA, and specialized hardware. To meet these challenges, we introduce MosaicSim, a lightweight, modular simulator for heterogeneous systems, offering accuracy and agility designed specifically for hardware-software co-design explorations. By integrating the LLVM toolchain, MosaicSim enables efficient modeling of instruction dependencies and flexible additions across the stack. Its modularity also allows the composition and integration of different hardware components. We first demonstrate that MosaicSim captures architectural bottlenecks in applications, and accurately models both scaling trends in a multicore setting and accelerator behavior. We then present two case-studies where MosaicSim enables straightforward design space explorations for emerging systems, i.e. data science application acceleration and heterogeneous parallel architectures.
Published: 2020
Full Text: View/download PDF

8. Prospector: Synthesizing Efficient Accelerators via Statistical Learning

Author: Aninda Manocha, Daniel J. Sorin, Atefeh Mehrabi, and Benjamin C. Lee
Subjects: Applied physics, Design space exploration, Computer science, Latency (audio), Engineering and technology, Natural sciences, Computer hardware & architecture, Resource (project management), Computer architecture, High-level synthesis, Physical sciences, Electrical engineering, electronic engineering, information engineering, Code (cryptography), Field-programmable gate array, Electronic circuit
Abstract: Accelerator design is expensive due to the effort required to understand an algorithm and optimize the design. Architects have embraced two technologies to reduce costs. High-level synthesis automatically generates hardware from code. Reconfigurable fabrics instantiate accelerators while avoiding fabrication costs for custom circuits. We further reduce design effort with statistical learning. We build an automated framework, called Prospector, that uses Bayesian techniques to optimize synthesis directives, reducing execution latency and resource usage in field-programmable gate arrays. We show that, within a given amount of time, designs discovered by Prospector are closer to Pareto-efficient designs than those found by prior approaches.
Published: 2020
Full Text: View/download PDF

8 results on '"Aninda Manocha"'
