Search

Your search for "Visual Question Answering" returned 455 results

Search Constraints

Descriptor: "Visual Question Answering"
Language: English

Search Results

1. Multi-stage reasoning on introspecting and revising bias for visual question answering.

2. Learning to enhance areal video captioning with visual question answering.

3. DCF-VQA: Counterfactual Structure Based on Multi-Feature Enhancement

4. Prompting Large Language Models with Knowledge-Injection for Knowledge-Based Visual Question Answering

5. Integrating IoT and visual question answering in smart cities: Enhancing educational outcomes

6. Vision transformer-based visual language understanding of the construction process

7. Multimodal attention-driven visual question answering for Malayalam.

8. HRVQA: A Visual Question Answering benchmark for high-resolution aerial images.

9. ViCLEVR: a visual reasoning dataset and hybrid multimodal fusion model for visual question answering in Vietnamese.

10. Sign-based image criteria for social interaction visual question answering.

11. Vision transformer-based visual language understanding of the construction process.

12. Advancing surgical VQA with scene graph knowledge.

13. DCF-VQA: COUNTERFACTUAL STRUCTURE BASED ON MULTI-FEATURE ENHANCEMENT.

14. Learning a Mixture of Conditional Gating Blocks for Visual Question Answering.

15. Enhancing machine vision: the impact of a novel innovative technology on video question-answering.

16. TRANS-VQA: Fully Transformer-Based Image Question-Answering Model Using Question-guided Vision Attention.

17. EarthVQANet: Multi-task visual question answering for remote sensing image understanding.

18. Knowledge-aware image understanding with multi-level visual representation enhancement for visual question answering.

19. A focus fusion attention mechanism integrated with image captions for knowledge graph-based visual question answering.

20. Dual modality prompt learning for visual question-grounded answering in robotic surgery

21. Graph neural networks for visual question answering: a systematic review.

22. Learning the Meanings of Function Words From Grounded Language Using a Visual Question Answering Model.

23. Diagram Perception Networks for Textbook Question Answering via Joint Optimization.

24. Relation-Aware Image Captioning with Hybrid-Attention for Explainable Visual Question Answering.

25. Dual modality prompt learning for visual question-grounded answering in robotic surgery.

26. IMCN: Improved modular co-attention networks for visual question answering.

27. Relational reasoning and adaptive fusion for visual question answering.

28. Survey of Multimodal Medical Question Answering.

29. Collaborative Modality Fusion for Mitigating Language Bias in Visual Question Answering.

30. Cross-modality Multiple Relations Learning for Knowledge-based Visual Question Answering.

31. Knowledge enhancement and scene understanding for knowledge-based visual question answering.

32. Improving VQA via Dual-Level Feature Embedding Network.

33. Toward Unsupervised Visual Reasoning: Do Off-the-Shelf Features Know How to Reason?

34. Survey of Multimodal Medical Question Answering

36. Debiased Visual Question Answering via the perspective of question types.

37. VL-Few: Vision Language Alignment for Multimodal Few-Shot Meta Learning.

38. OECA-Net: A co-attention network for visual question answering based on OCR scene text feature enhancement.

39. VL-Meta: Vision-Language Models for Multimodal Meta-Learning.

40. Correlation Information Bottleneck: Towards Adapting Pretrained Multimodal Models for Robust Visual Question Answering.

41. Self-supervised knowledge distillation in counterfactual learning for VQA.

42. MSGeN: Multimodal Selective Generation Network for Grounded Explanations.

43. Semi-Supervised Implicit Augmentation for Data-Scarce VQA.

44. Multimodal Bi-direction Guided Attention Networks for Visual Question Answering.

45. A visual questioning answering approach to enhance robot localization in indoor environments.

46. Design of knowledge incorporated VQA based on spatial GCNN with structured sentence embedding and linking algorithm.

47. ConfigILM: A general purpose configurable library for combining image and language models for visual question answering

48. The Potential of a Visual Dialogue Agent In a Tandem Automated Audio Description System for Videos.

49. Multi-modal co-attention relation networks for visual question answering.

50. VQAPT: A New visual question answering model for personality traits in social media images.
