Fusion of Multimodal Embeddings for Ad-Hoc Video Search
- Source :
- 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), ICCV Workshops
- Publication Year :
- 2019
Abstract
- The challenge of Ad-Hoc Video Search (AVS) originates from free-form (i.e., no pre-defined vocabulary) and free-style (i.e., natural language) query descriptions. Bridging the semantic gap between AVS queries and videos is highly difficult, as evidenced by the low retrieval accuracy in AVS benchmarking at TRECVID. In this paper, we study a new method to fuse multimodal embeddings that have been derived from completely disjoint datasets. The method is tested on two datasets for two distinct tasks: on MSR-VTT for unique video retrieval and on V3C1 for multiple-video retrieval.
- Subjects :
- Vocabulary
Information retrieval
Information storage and retrieval
Computer science
Deep learning
Feature extraction
TRECVID
Visualization
Artificial intelligence
Natural language
Semantic gap
Details
- ISBN :
- 978-1-72815-023-9
- ISBNs :
- 9781728150239
- Database :
- OpenAIRE
- Journal :
- 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
- Accession number :
- edsair.doi.dedup.....0e44f0fdd1056bb67f7ed1f879b73ef4
- Full Text :
- https://doi.org/10.1109/iccvw.2019.00233