Author search for "Li, Houqiang": 1,429 results in total

Search Results

1. P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task

2. From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning

3. AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding

4. LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation

5. Scaling up Multimodal Pre-training for Sign Language Understanding

6. SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection

7. SEDS: Semantically Enhanced Dual-Stream Encoder for Sign Language Retrieval

8. Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis

9. RoFIR: Robust Fisheye Image Rectification Framework Impervious to Optical Center Deviation

10. Text-Animator: Controllable Visual Text Video Generation

11. Prediction and Reference Quality Adaptation for Learned Video Compression

12. Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition

13. Semi-Supervised Spoken Language Glossification

14. Mini Honor of Kings: A Lightweight Environment for Multi-Agent Reinforcement Learning

15. TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

16. MASA: Motion-aware Masked Autoencoder with Semantic Alignment for Sign Language Recognition

17. EG4D: Explicit Generation of 4D Object without Score Distillation

18. Learning Generalizable Human Motion Generator with Reinforcement Learning

19. Progressive Multi-modal Conditional Prompt Tuning

20. TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding

21. HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation

22. GaussNav: Gaussian Splatting for Visual Navigation

23. Cross-Lingual Transfer for Natural Language Inference via Multilingual Prompt Translator

24. Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction

25. Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval

26. Asymmetric Feature Fusion for Image Retrieval

27. Structure Similarity Preservation Learning for Asymmetric Image Retrieval

28. DeepEraser: Deep Iterative Context Mining for Generic Text Eraser

29. Sinkhorn Distance Minimization for Knowledge Distillation

30. Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation

32. Spatial Decomposition and Temporal Fusion based Inter Prediction for Learned Video Compression

33. Passive Non-Line-of-Sight Imaging with Light Transport Modulation

34. TinySAM: Pushing the Envelope for Efficient Segment Anything Model

35. DanZero+: Dominating the GuanDan Game through Reinforcement Learning

36. Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs

37. DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding

38. PersonMAE: Person Re-Identification Pre-Training with Masked AutoEncoders

39. Progressive Recurrent Network for Shadow Removal

40. State Sequences Prediction via Fourier Transform for Representation Learning

41. I$^2$MD: 3D Action Representation Learning with Inter- and Intra-modal Mutual Distillation

42. Accelerate Presolve in Large-Scale Linear Programming via Reinforcement Learning

43. MSight: An Edge-Cloud Infrastructure-based Perception System for Connected Automated Vehicles

44. Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning

45. Sign Language Translation with Iterative Prototype

46. UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding

47. SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning

48. Text-Only Training for Visual Storytelling

49. DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory

50. Masked Motion Predictors are Strong 3D Action Representation Learners
