Search

Your search for the descriptor "Computer Science - Computer Vision and Pattern Recognition" returned 358,600 results.

Search Results

1. T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs

2. AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos

3. DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation

4. Free-form Generation Enhances Challenging Clothed Human Modeling

5. Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark

6. VLSBench: Unveiling Visual Leakage in Multimodal Safety

7. On Domain-Specific Post-Training for Multimodal Large Language Models

8. SIMS: Simulating Human-Scene Interactions with Real World Script Planning

9. Quantifying the synthetic and real domain gap in aerial scene understanding

10. $C^{3}$-NeRF: Modeling Multiple Scenes via Conditional-cum-Continual Neural Radiance Fields

11. GuardSplat: Robust and Efficient Watermarking for 3D Gaussian Splatting

12. FlowCLAS: Enhancing Normalizing Flow Via Contrastive Learning For Anomaly Segmentation

13. SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection

14. Towards Class-wise Robustness Analysis

15. A Visual-inertial Localization Algorithm using Opportunistic Visual Beacons and Dead-Reckoning for GNSS-Denied Large-scale Applications

16. Feedback-driven object detection and iterative model improvement

17. SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens

18. Gaussian multi-target filtering with target dynamics driven by a stochastic differential equation

19. MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks

20. LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos

21. PerLA: Perceptive 3D Language Assistant

22. Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models

23. LaVIDE: A Language-Vision Discriminator for Detecting Changes in Satellite Image with Map References

24. DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering

25. A Comprehensive Content Verification System for ensuring Digital Integrity in the Age of Deep Fakes

26. A Multi-Loss Strategy for Vehicle Trajectory Prediction: Combining Off-Road, Diversity, and Directional Consistency Losses

27. Real-Time Anomaly Detection in Video Streams

28. JetFormer: An Autoregressive Generative Model of Raw Images and Text

29. MonoPP: Metric-Scaled Self-Supervised Monocular Depth Estimation by Planar-Parallax Geometry in Automotive Applications

30. The Streetscape Application Services Stack (SASS): Towards a Distributed Sensing Architecture for Urban Applications

31. Forensics Adapter: Adapting CLIP for Generalizable Face Forgery Detection

32. Explaining the Impact of Training on Vision Models via Activation Clustering

33. Gated-Attention Feature-Fusion Based Framework for Poverty Prediction

34. SURE-VQA: Systematic Understanding of Robustness Evaluation in Medical VQA Tasks

35. Multimodal Whole Slide Foundation Model for Pathology

36. TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting

37. Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing

38. CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation

39. Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings

40. GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding

41. FairDD: Fair Dataset Distillation via Synchronized Matching

42. Tortho-Gaussian: Splatting True Digital Orthophoto Maps

43. Self-Supervised Denoiser Framework

44. Gaussian Splashing: Direct Volumetric Rendering Underwater

45. LDA-AQU: Adaptive Query-guided Upsampling via Local Deformable Attention

46. A Comprehensive Framework for Automated Segmentation of Perivascular Spaces in Brain MRI with the nnU-Net

47. Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding

48. Contextual Checkerboard Denoise -- A Novel Neural Network-Based Approach for Classification-Aware OCT Image Denoising

49. ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration

50. SkelMamba: A State Space Model for Efficient Skeleton Action Recognition of Neurological Disorders
