Search

Your search keyword '"Zhang, Xiangyu"' showing total 206 results

Search Constraints

You searched for: Author "Zhang, Xiangyu"; Topic: computer science - computer vision and pattern recognition

Search Results

1. Continual LLaVA: Continual Instruction Tuning in Large Vision-Language Models

2. Reconstructive Visual Instruction Tuning

3. General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

4. Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving

5. XNN: Paradigm Shift in Mitigating Identity Leakage within Cloud-Enabled Deep Learning

6. UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening

7. DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

8. Self-supervised Adversarial Training of Monocular Depth Estimation against Physical-World Attacks

9. Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?

10. Focus Anywhere for Fine-grained Multi-page Document Understanding

11. Self-Supervised Visual Preference Alignment

12. OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

13. BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks

14. SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control

15. LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning

16. Small Language Model Meets with Reinforced Vision Vocabulary

17. Slot-guided Volumetric Object Radiance Fields

18. Bootstrap Masked Visual Modeling via Hard Patches Mining

19. Compound Text-Guided Prompt Tuning via Image-Adaptive Cues

20. Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

21. Merlin: Empowering Multimodal LLMs with Foresight Minds

22. ADriver-I: A General World Model for Autonomous Driving

23. LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation

24. Unidirectional brain-computer interface: Artificial neural network encoding natural images to fMRI response in the visual cortex

25. DreamLLM: Synergistic Multimodal Comprehension and Creation

26. Language Prompt for Autonomous Driving

27. RevColV2: Exploring Disentangled Representations in Masked Image Modeling

28. Far3D: Expanding the Horizon for Surround-view 3D Object Detection

29. SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers

30. ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning

31. GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping

32. OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation

33. MOTRv3: Release-Fetch Supervision for End-to-End Multi-Object Tracking

34. Fusion is Not Enough: Single Modal Attacks on Fusion Models for 3D Object Detection

35. Self-supervised Learning by View Synthesis

36. Align-DETR: Improving DETR with Simple IoU-aware BCE loss

37. Detecting Backdoors in Pre-trained Encoders

38. Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

39. VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking

40. Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception

41. Referring Multi-Object Tracking

42. Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining

43. Adversarial Training of Self-supervised Monocular Depth Estimation against Physical-World Attacks

44. Cross Modal Transformer: Towards Fast and Robust 3D Object Detection

45. Understanding Imbalanced Semantic Segmentation Through Neural Collapse

46. Reversible Column Networks

47. Twin-S: A Digital Twin for Skull-base Surgery

48. MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception

49. MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors

50. Towards 3D Object Detection with 2D Supervision
