Your search for "Jia, Zhihao" returned 492 results in total

Search Constraints

You searched for: Author "Jia, Zhihao"
Search Results

1. Communication Bounds for the Distributed Experts Problem

2. A System for Microserving of LLMs

3. SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference

4. MagicPIG: LSH Sampling for Efficient LLM Generation

5. TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention

6. Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs (Extended Version)

7. GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism

8. SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices

9. Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs

10. Mirage: A Multi-Level Superoptimizer for Tensor Programs

11. Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances

12. FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning

13. Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

14. Accelerating Retrieval-Augmented Language Model Serving with Speculation

15. Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

17. Drone-NeRF: Efficient NeRF Based 3D Scene Reconstruction for Large-Scale Drone Survey

18. Quarl: A Learning-Based Quantum Circuit Optimizer

19. SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification

20. Quark: A Gradient-Free Quantum Learning Framework for Classification Tasks

21. OLLIE: Derivation-based Tensor Program Optimizer

22. Research on Control System of Three-Phase Isolated AC/DC Converter

23. BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs

24. Optimizing Mixture of Experts using Dynamic Recompilations

27. Bamboo: Making Preemptible Instances Resilient for Affordable Training of Large DNNs

28. Quartz: Superoptimization of Quantum Circuits (Extended Version)

29. TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs

30. Quanto: Optimizing Quantum Circuits with Automatic Generation of Circuit Identities

31. Collage: Seamless Integration of Deep Learning Backends with Automatic Placement

32. TOD: GPU-accelerated Outlier Detection via Tensor Operations

33. GradSign: Model Performance Inference with Theoretical Insights

34. Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads

35. Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

38. IOS: Inter-Operator Scheduler for CNN Acceleration

44. Redundancy-Free Computation Graphs for Graph Neural Networks

49. Beyond Data and Model Parallelism for Deep Neural Networks

50. Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks
