Search

Your search for "Computer Science - Multimedia" returned 5,061 results

Search Constraints

You searched for: Descriptor "Computer Science - Multimedia"; Topic: computer science - computer vision and pattern recognition

Search Results

1. HiSC4D: Human-centered interaction and 4D Scene Capture in Large-scale Space Using Wearable IMUs and LiDAR

2. Question-Answering Dense Video Events

3. 3D-GP-LMVIC: Learning-based Multi-View Image Coding with 3D Gaussian Geometric Priors

4. SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing

5. Make Graph-based Referring Expression Comprehension Great Again through Expression-guided Dynamic Gating and Regression

6. Estimating Indoor Scene Depth Maps from Ultrasonic Echoes

7. LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture

8. ExpLLM: Towards Chain of Thought for Facial Expression Recognition

9. PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation

10. Low-Resolution Object Recognition with Cross-Resolution Relational Contrastive Distillation

11. FrameCorr: Adaptive, Autoencoder-based Neural Compression for Video Reconstruction in Resource and Timing Constrained Network Settings

12. Unveiling Deep Shadows: A Survey on Image and Video Shadow Detection, Removal, and Generation in the Era of Deep Learning

13. Towards Real-World Adverse Weather Image Restoration: Enhancing Clearness and Semantics with Vision-Language Models

14. Low-Resolution Face Recognition via Adaptable Instance-Relation Distillation

15. PRoGS: Progressive Rendering of Gaussian Splats

16. Coral Model Generation from Single Images for Virtual Reality Applications

17. Interpretable Convolutional SyncNet

18. Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition

19. Comparative Analysis of Modality Fusion Approaches for Audio-Visual Person Identification and Verification

20. Digit Recognition using Multimodal Spiking Neural Networks

21. Multi-scale Multi-instance Visual Sound Localization and Segmentation

22. LAR-IQA: A Lightweight, Accurate, and Robust No-Reference Image Quality Assessment Model

23. MSLIQA: Enhancing Learning Representations for Image Quality Assessment through Multi-Scale Learning

24. See or Guess: Counterfactually Regularized Image Captioning

25. Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input

26. A Simple Baseline with Single-encoder for Referring Image Segmentation

27. Hand1000: Generating Realistic Hands from Text with Only 1,000 Images

28. Alfie: Democratising RGBA Image Generation With No $$$

29. LapisGS: Layered Progressive 3D Gaussian Splatting for Adaptive Streaming

30. Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization

31. HABD: A Houma Alliance Book Ancient Handwritten Character Recognition Database

32. SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding

33. Localization of Synthetic Manipulations in Western Blot Images

34. An Open, Cross-Platform, Web-Based Metaverse Using WebXR and A-Frame

35. Riemann-based Multi-scale Attention Reasoning Network for Text-3D Retrieval

36. SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting

37. DreamCinema: Cinematic Transfer with Free Camera and 3D Character

38. MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model

39. MultiMed: Massively Multimodal and Multitask Medical Understanding

40. MCDubber: Multimodal Context-Aware Expressive Video Dubbing

41. AIM 2024 Challenge on Compressed Video Quality Assessment: Methods and Results

42. Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound

43. SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition

44. Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation

45. Webcam-based Pupil Diameter Prediction Benefits from Upscaling

46. Perceptual Depth Quality Assessment of Stereoscopic Omnidirectional Images

47. Sliced Maximal Information Coefficient: A Training-Free Approach for Image Quality Assessment Enhancement

48. Harmonizing Attention: Training-free Texture-aware Geometry Transfer

49. Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation

50. Enhancing Modal Fusion by Alignment and Label Matching for Multimodal Emotion Recognition
