151. Detecting Drinking-Related Contents on Social Media by Classifying Heterogeneous Data Types
- Author
Lixia Yao, Omar ElTayeby, Wenwen Dou, Malak Abdullah, Todd Eaglin, and David Burlinson
- Subjects
Contextual image classification, Computer science, 030508 substance abuse, Binge drinking, Data science, Data type, Support vector machine, 03 medical and health sciences, 0302 clinical medicine, Identity (object-oriented programming), Social media, 030212 general & internal medicine, 0305 other medical science, ComputingMilieux_MISCELLANEOUS, Predictive modelling
- Abstract
Binge drinking is a common health problem at colleges and universities in the US. College students often post drinking-related texts and images on social media to project a socially desirable identity. Some public health and clinical research scholars have manually surveyed different social media sites to understand students' behavior patterns. In this paper, we investigate the feasibility of mining the heterogeneous data scattered across social media to identify drinking-related content, which is the first step towards unleashing the potential of social media for the automatic detection of binge-drinking users. We use state-of-the-art algorithms such as Support Vector Machines and neural networks to classify posts as drinking or non-drinking; these posts contain not only text but also images and videos. Our results show that, by combining heterogeneous data types, we are able to identify drinking-related posts with an overall accuracy of 82%. Prediction models based on text data are more reliable than the other two models, built on image and video data, for predicting drinking-related content.
- Published
- 2017