268 results for "William J. Dally"
Search Results
2. A 0.190-pJ/bit 25.2-Gb/s/wire Inverter-Based AC-Coupled Transceiver for Short-Reach Die-to-Die Interfaces in 5-nm CMOS.
3. GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture.
4. A Novel High-Efficiency Three-Phase Multilevel PV Inverter With Reduced DC-Link Capacitance.
5. A 95.6-TOPS/W Deep Learning Inference Accelerator With Per-Vector Scaled 4-bit Quantization in 5 nm.
6. A 0.297-pJ/Bit 50.4-Gb/s/Wire Inverter-Based Short-Reach Simultaneous Bi-Directional Transceiver for Die-to-Die Interface in 5-nm CMOS.
7. Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training.
8. Frontier vs the Exascale Report: Why so long? and Are We Really There Yet?
9. A 0.190-pJ/bit 25.2-Gb/s/wire Inverter-Based AC-Coupled Transceiver for Short-Reach Die-to-Die Interfaces in 5-nm CMOS.
10. LNS-Madam: Low-Precision Training in Logarithmic Number System Using Multiplicative Weight Update.
11. Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network.
12. SatIn: Hardware for Boolean Satisfiability Inference.
13. SPAA'21 Panel Paper: Architecture-Friendly Algorithms versus Algorithm-Friendly Architectures.
14. Optimal Operation of a Plug-in Hybrid Vehicle with Battery Thermal and Degradation Model.
15. SpArch: Efficient Architecture for Sparse Matrix Multiplication.
16. On the model of computation: point.
17. Evolution of the Graphics Processing Unit (GPU).
18. OP-VENT: A Low-Cost, Easily Assembled, Open-Source Medical Ventilator.
19. Simba: scaling deep-learning inference with chiplet-based architecture.
20. A 17-95.6 TOPS/W Deep Learning Inference Accelerator with Per-Vector Scaled 4-bit Quantization for Transformers in 5nm.
21. A 0.297-pJ/bit 50.4-Gb/s/wire Inverter-Based Short-Reach Simultaneous Bidirectional Transceiver for Die-to-Die Interface in 5nm CMOS.
22. A Fine-Grained GALS SoC with Pausible Adaptive Clocking in 16 nm FinFET.
23. A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology.
24. Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture.
25. MAGNet: A Modular Accelerator Generator for Neural Networks.
26. A 2-to-20 GHz Multi-Phase Clock Generator with Phase Interpolators Using Injection-Locked Oscillation Buffers for High-Speed IOs in 16nm FinFET.
27. Darwin-WGA: A Co-processor Provides Increased Sensitivity in Whole Genome Alignments with High Speedup.
28. Accelerating Chip Design With Machine Learning.
29. A 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm.
30. Domain-specific hardware accelerators.
31. Energy Efficient On-Demand Dynamic Branch Prediction Models.
32. Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training.
33. BaM: A Case for Enabling Fine-grain High Throughput GPU-Orchestrated Access to Storage.
34. VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference.
35. Darwin: A Genomics Co-processor Provides up to 15,000X Acceleration on Long Read Assembly.
36. Hardware-Enabled Artificial Intelligence.
37. Ground-referenced signaling for intra-chip and short-reach chip-to-chip interconnects.
38. Bandwidth-efficient deep learning.
39. Darwin: A Genomics Coprocessor.
40. A 1.17-pJ/b, 25-Gb/s/pin Ground-Referenced Single-Ended Serial Link for Off- and On-Package Communication Using a Process- and Temperature-Adaptive Voltage Regulator.
41. VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference.
42. PatchNet - Short-range Template Matching for Efficient Video Processing.
43. Fine-grained DRAM: energy-efficient DRAM for extreme bandwidth systems.
44. Exploring the Granularity of Sparsity in Convolutional Neural Networks.
45. SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks.
46. Architecting an Energy-Efficient DRAM System for GPUs.
47. A 1.17pJ/b 25Gb/s/pin ground-referenced single-ended serial link for off- and on-package communication in 16nm CMOS using a process- and temperature-adaptive voltage regulator.
48. Darwin: A Genomics Co-processor Provides up to 15,000X Acceleration on Long Read Assembly.
49. SpArch: Efficient Architecture for Sparse Matrix Multiplication.
50. Optimal Operation of a Plug-In Hybrid Vehicle.
Discovery Service for Jio Institute Digital Library