Generating Cooking Recipes from Cooking Videos Using Deep Learning Considering Previous Process with Video Encoding
- Source :
- APPIS
- Publication Year :
- 2020
- Publisher :
- ACM, 2020.
- Abstract :
- Research on generating natural language captions for visual data such as images and videos has produced considerable results with deep learning methods and has attracted attention in recent years. In this research, we aim to generate recipe sentences from cooking videos acquired from YouTube. We treat the task as image captioning. There are two aspects to consider in doing so. First, we believe that the semantics of each process should be taken into account to improve the captioning's accuracy. Second, data processing, that is, obtaining images from each process using visual processing methods such as object detection, is important. We propose a captioning model in which a sentence vector is embedded to account for the consistency of the recipe. From the differences between the generated recipes and the reference recipe, we calculate recipe scores. We use three metrics from previous studies on image captioning to evaluate the model, and we compare the scores with those of baseline models.
- Subjects :
- Closed captioning
Computer science
business.industry
Deep learning
InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL
Recipe
ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
02 engineering and technology
010501 environmental sciences
Semantics
computer.software_genre
01 natural sciences
Object detection
Visual processing
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Artificial intelligence
business
computer
Natural language
Natural language processing
Sentence
0105 earth and related environmental sciences
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 3rd International Conference on Applications of Intelligent Systems
- Accession number :
- edsair.doi...........61aad8c3b8ed04f1d89f29644293ffa7
- Full Text :
- https://doi.org/10.1145/3378184.3378217