151. Leveraging Universal Sentence Encoder to Predict Movie Genre
- Author
Nikhil Kumar, Siraz Naorem, Sanjay Kumar, and Aditya Dev
- Subjects
Property (philosophy), Computer science, business.industry, Data classification, Object (computer science), computer.software_genre, Domain (software engineering), Encoding (memory), Artificial intelligence, Sequential model, business, Encoder, computer, Sentence, Natural language processing
- Abstract
Multi-label text classification (MLTC) is the problem of classifying textual data against multiple labels or tags. Many real-world scenarios call for assigning labels to an object such that the labels describe its properties. In practice, an object often has more than one such property and therefore needs multiple labels. For textual data, one such scenario is labeling a movie with its genres, since a plot usually needs more than one genre to describe it. This makes movie genre prediction a frequent choice of task for multi-label classification in the literature. This paper presents an in-depth analysis of solving the movie genre prediction problem with a sequential model that uses the Universal Sentence Encoder (USE) for text encoding and label powerset (LP) as the problem transformation approach. It also compares the model's performance across different optimizers. The best result is an F1-score of 0.69 and an accuracy of 0.89 with the Adam optimizer, which is equal to or better than results reported in other literature in the same domain.
- Published
2021
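
The label powerset (LP) transformation named in the abstract can be illustrated with a minimal sketch. LP treats each distinct combination of labels as one class, which turns a multi-label problem into an ordinary multi-class one. The function names and the toy genre data below are hypothetical and chosen only for illustration; this is not the authors' code, and the USE text encoding and the sequential model are left out.

```python
from typing import Dict, FrozenSet, List, Sequence


def label_powerset_fit(label_sets: Sequence[Sequence[str]]) -> Dict[FrozenSet[str], int]:
    """Assign one class id to every distinct label combination, in order of first appearance."""
    combo_to_class: Dict[FrozenSet[str], int] = {}
    for labels in label_sets:
        combo = frozenset(labels)  # order of labels within a sample does not matter
        if combo not in combo_to_class:
            combo_to_class[combo] = len(combo_to_class)
    return combo_to_class


def label_powerset_transform(
    label_sets: Sequence[Sequence[str]],
    combo_to_class: Dict[FrozenSet[str], int],
) -> List[int]:
    """Replace each sample's label set with its single class id."""
    return [combo_to_class[frozenset(labels)] for labels in label_sets]


def label_powerset_inverse(
    class_ids: Sequence[int],
    combo_to_class: Dict[FrozenSet[str], int],
) -> List[List[str]]:
    """Map predicted class ids back to sorted lists of labels."""
    class_to_combo = {cid: combo for combo, cid in combo_to_class.items()}
    return [sorted(class_to_combo[cid]) for cid in class_ids]


# Hypothetical toy data: one list of genres per movie.
genres = [
    ["Action", "Comedy"],
    ["Drama"],
    ["Comedy", "Action"],  # same combination as the first movie
    ["Drama", "Romance"],
]

mapping = label_powerset_fit(genres)
classes = label_powerset_transform(genres, mapping)
decoded = label_powerset_inverse(classes, mapping)

print(classes)
print(decoded)
```

A single-label classifier can then be trained on `classes`, and its predictions decoded back into genre sets. One known trade-off of LP is that the number of classes grows with the number of distinct label combinations in the training data, and combinations never seen in training cannot be predicted.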