Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval
- Source :
- CVPR
- Publication Year :
- 2020
- Publisher :
- IEEE, 2020.
Abstract
- ImageNet pre-training has long been considered crucial by the fine-grained sketch-based image retrieval (FG-SBIR) community due to the lack of large sketch-photo paired datasets for FG-SBIR training. In this paper, we propose a self-supervised alternative for representation pre-training. Specifically, we consider the jigsaw puzzle game of recomposing images from shuffled parts. We identify two key facets of jigsaw task design that are required for effective FG-SBIR pre-training. The first is formulating the puzzle in a mixed-modality fashion. The second is framing the optimisation as permutation matrix inference via Sinkhorn iterations, which we show is more effective than the common classifier formulation of jigsaw self-supervision. Experiments show that this self-supervised pre-training strategy significantly outperforms the standard ImageNet-based pipeline across all four product-level FG-SBIR benchmarks. Interestingly, it also leads to improved cross-category generalisation across both pre-train/fine-tune and fine-tune/testing stages.
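The abstract's "permutation matrix inference via Sinkhorn iterations" can be illustrated with a minimal, generic sketch. This is not the authors' implementation: the function names (`sinkhorn`, `hard_assignment`), the temperature `tau`, the iteration count, and the toy score matrix are all assumptions made for illustration. The idea shown is only the standard Sinkhorn normalisation, which alternately rescales rows and columns of a score matrix until it is approximately doubly stochastic, a soft relaxation of a permutation matrix.

```python
import math


def _logsumexp(values):
    # Numerically stable log(sum(exp(v))) for a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sinkhorn(scores, n_iters=20, tau=1.0):
    """Relax a square score matrix towards a doubly stochastic matrix.

    scores[i][j] is a hypothetical affinity between shuffled patch i and
    puzzle position j. tau and n_iters are illustrative defaults.
    """
    n = len(scores)
    # Work in log space so that large scores do not overflow.
    log_p = [[s / tau for s in row] for row in scores]
    for _ in range(n_iters):
        # Row normalisation: each row sums to 1 after exponentiation.
        for i in range(n):
            lse = _logsumexp(log_p[i])
            log_p[i] = [v - lse for v in log_p[i]]
        # Column normalisation: each column sums to 1 after exponentiation.
        for j in range(n):
            lse = _logsumexp([log_p[i][j] for i in range(n)])
            for i in range(n):
                log_p[i][j] -= lse
    return [[math.exp(v) for v in row] for row in log_p]


def hard_assignment(soft):
    # Simple row-wise argmax rounding (an assumption; it does not
    # guarantee a valid permutation for ambiguous inputs).
    return [max(range(len(row)), key=row.__getitem__) for row in soft]
```

For a toy input such as `[[5, 0, 0], [0, 0, 5], [0, 5, 0]]`, `sinkhorn` returns a matrix whose rows and columns each sum to roughly 1, and `hard_assignment` recovers the position with the highest mass in each row. In contrast, the classifier formulation mentioned in the abstract would predict a single label from a fixed set of permutations.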
- Subjects :
- business.industry
Computer science
Feature extraction
Inference
02 engineering and technology
010501 environmental sciences
Machine learning
computer.software_genre
01 natural sciences
Sketch
Jigsaw
Modal
0202 electrical engineering, electronic engineering, information engineering
Task analysis
020201 artificial intelligence & image processing
Artificial intelligence
business
Image retrieval
computer
0105 earth and related environmental sciences
Details
- Database :
- OpenAIRE
- Journal :
- 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- Accession number :
- edsair.doi...........da463a03f8bc591ca4a54241543e992e
- Full Text :
- https://doi.org/10.1109/cvpr42600.2020.01036