Author search for "Qi, Zhongang": 132 results in total

Search Results

1. DOGE: Towards Versatile Visual Document Grounding and Referring

2. mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA

3. Taming Rectified Flow for Inversion and Editing

4. E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding

5. CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities

6. SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses

7. How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval?

8. EA-VTR: Event-Aware Video-Text Retrieval

9. PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM

11. SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model

12. RecDCL: Dual Contrastive Learning for Recommendation

13. EA-VTR: Event-Aware Video-Text Retrieval

14. PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

15. CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models

16. StyleAdapter: A Unified Stylized Image Generation Model

17. Towards Unseen Triples: Effective Text-Image-joint Learning for Scene Graph Generation

18. Sticker820K: Empowering Interactive Retrieval with Stickers

19. SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation

20. MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

21. LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation

22. T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

23. Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval

24. Weakly-Supervised Temporal Action Localization by Progressive Complementary Learning

25. Do we really need temporal convolutions in action segmentation?

26. Accelerating the Training of Video Super-Resolution Models

27. CREATE: A Benchmark for Chinese Short Video Retrieval and Title Generation

28. From Heatmaps to Structural Explanations of Image Classifiers

29. Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution

30. Stochastic Block-ADMM for Training Deep Networks

31. Open-book Video Captioning with Retrieve-Copy-Generate Network

32. A Generic Object Re-identification System for Short Videos

33. Visualizing Point Cloud Classifiers by Curvature Smoothing

34. Visualizing Deep Networks by Optimizing with Integrated Gradients

35. Interactive Naming for Explaining Deep Neural Networks: A Formative Study

36. PointConv: Deep Convolutional Networks on 3D Point Clouds

37. Deep Air Learning: Interpolation, Prediction, and Feature Analysis of Fine-grained Air Quality

38. Embedding Deep Networks into Visual Explanations
