On Good Practices for Task-Specific Distillation of Large Pretrained Visual Models
- Source: Published in Transactions on Machine Learning Research (TMLR), 2024
- Publication Year: 2024
Abstract
- Large pretrained visual models exhibit remarkable generalization across diverse recognition tasks. Yet, real-world applications often demand compact models tailored to specific problems. Variants of knowledge distillation have been devised for such a purpose, enabling task-specific compact models (the students) to learn from a generic large pretrained one (the teacher). In this paper, we show that the excellent robustness and versatility of recent pretrained models challenge common practices established in the literature, calling for a new set of optimal guidelines for task-specific distillation. To address the lack of samples in downstream tasks, we also show that a variant of Mixup based on stable diffusion complements standard data augmentation. This strategy eliminates the need for engineered text prompts and improves distillation of generic models into streamlined specialized networks.
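- Note: the record above only summarizes the setting; the paper's exact recipe is not reproduced here. As a minimal, illustrative sketch of the generic teacher-student distillation objective the abstract refers to (a compact student matching a large pretrained teacher on task data), the snippet below combines a temperature-scaled KL term with the hard-label task loss. The function name, temperature `T`, and mixing weight `alpha` are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft-target distillation plus the downstream hard-label loss.

    T (temperature) and alpha (mixing weight) are illustrative defaults,
    not values from the paper.
    """
    # Soft targets: KL between the temperature-scaled teacher distribution
    # and the student's temperature-scaled log-probabilities.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # standard T^2 factor keeps gradient magnitudes comparable
    # Hard targets: ordinary cross-entropy on the task labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Usage sketch: the large pretrained teacher is frozen; only the
# compact student is trained on the downstream task.
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = distillation_loss(student(x), teacher_logits, y)
```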
- Subjects: Computer Science - Computer Vision and Pattern Recognition
Details
- Database: arXiv
- Journal: Transactions on Machine Learning Research (TMLR), 2024
- Publication Type: Report
- Accession number: edsarx.2402.11305
- Document Type: Working Paper