47 results for "Cong Hao"
Search Results
2. Exploring and Exploiting Runtime Reconfigurable Floating Point Precision in Scientific Computing: a Case Study for Solving PDEs.
3. Residual-INR: Communication Efficient On-Device Learning Using Implicit Neural Representation.
4. ICGMM: CXL-enabled Memory Expansion with Intelligent Caching Using Gaussian Mixture Model.
5. Understanding the Performance and Estimating the Cost of LLM Fine-Tuning.
6. HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond.
7. LightningSimV2: Faster and Scalable Simulation for High-Level Synthesis via Graph Compilation and Optimization.
8. GNNBuilder: An Automated Framework for Generic Graph Neural Network Accelerator Generation, Simulation, and Optimization.
9. LightningSim: Fast and Accurate Trace-Based Simulation for High-Level Synthesis.
10. Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts.
11. DGNN-Booster: A Generic FPGA Accelerator Framework For Dynamic Graph Neural Network Inference.
12. Gamora: Graph Learning based Symbolic Reasoning for Large-Scale Boolean Networks.
13. INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order Gradient Computations in Implicit Neural Representation Processing.
14. Rapid-INR: Storage Efficient CPU-free DNN Training Using Implicit Neural Representation.
15. FlowGNN: A Dataflow Architecture for Universal Graph Neural Network Inference via Multi-Queue Streaming.
16. GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration.
17. Enabling Flexibility for Sparse Tensor Acceleration via Heterogeneity.
18. Bottleneck Analysis of Dynamic Graph Neural Network Inference on CPU and GPU.
19. Robotic Computing on FPGAs: Current Progress, Research Challenges, and Opportunities.
20. High-Level Synthesis Performance Prediction using GNNs: Benchmarking, Modeling, and Advancing.
21. M3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design.
22. Efficient Machine Learning, Compilers, and Optimizations for Embedded Systems.
23. RT-DNAS: Real-time Constrained Differentiable Neural Architecture Search for 3D Cardiac Cine MRI Segmentation.
24. Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices.
25. Hybrid Graph Models for Logic Optimization via Spatio-Temporal Information.
26. H2H: Heterogeneous Model to Heterogeneous System Mapping with Computation and Communication Awareness.
27. Unsupervised Learning for Combinatorial Optimization with Principled Objective Relaxation.
28. Software/Hardware Co-design for Multi-modal Multi-task Learning in Autonomous Systems.
29. On-FPGA Training with Ultra Memory Reduction: A Low-Precision Tensor Method.
30. 3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration.
31. Adversarial Graph Augmentation to Improve Graph Contrastive Learning.
32. Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Co-design.
33. IronMan: GNN-assisted Design Space Exploration in High-Level Synthesis via Reinforcement Learning.
34. WinoCNN: Kernel Sharing Winograd Systolic Array for Efficient Convolutional Neural Network Acceleration on FPGAs.
35. Generic Neural Architecture Search via Regression.
36. Program-to-Circuit: Exploiting GNNs for Program Representation and Circuit Translation.
37. ScaleHLS: Scalable High-Level Synthesis through MLIR.
38. MELOPPR: Software/Hardware Co-design for Memory-efficient Low-latency Personalized PageRank.
39. Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices.
40. AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs.
41. EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions.
42. VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization.
43. A Bi-Directional Co-Design Approach to Enable Deep Learning on IoT Devices.
44. NAIS: Neural Architecture and Implementation Search and its Applications in Autonomous Driving.
45. FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge.
46. SkyNet: A Champion Model for DAC-SDC on Low Power Object Detection.
47. SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems.