Word2vec convolutional neural networks for classification of news articles and tweets
- Source :
- PLoS ONE, Vol 14, Iss 8, p e0220976 (2019)
- Publication Year :
- 2019
- Publisher :
- Public Library of Science (PLoS), 2019.
Abstract
- Big web data from sources such as online news and Twitter are good resources for investigating deep learning. However, collected news articles and tweets almost certainly contain data that is unnecessary for learning, and this hinders accurate learning. This paper explores the performance of word2vec Convolutional Neural Networks (CNNs) in classifying news articles and tweets as related or unrelated. Using the two word embedding algorithms of word2vec, Continuous Bag-of-Words (CBOW) and Skip-gram, we constructed a CNN with the CBOW model and a CNN with the Skip-gram model. We measured the classification accuracy of the CNN with CBOW, the CNN with Skip-gram, and a CNN without word2vec on real news articles and tweets. The experimental results indicated that word2vec significantly improved the accuracy of the classification model. The accuracy of the CBOW model was higher and more stable than that of the Skip-gram model. The CBOW model performed better on news articles, and the Skip-gram model performed better on tweets. Overall, CNNs with word2vec models were more effective on news articles than on tweets because news articles are typically more uniform than tweets.
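The abstract contrasts the two word2vec training schemes without showing how they differ. The following is a minimal illustrative sketch, not the authors' code: CBOW predicts a center word from its surrounding context, while Skip-gram predicts each context word from the center word. The function names, the window size, and the sample sentence are all assumptions made for illustration.

```python
from typing import List, Tuple


def context_window(tokens: List[str], i: int, window: int) -> List[str]:
    """Return the words within `window` positions of index i, excluding tokens[i]."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW-style examples: (context words, center word to predict)."""
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram-style examples: (center word, one context word to predict)."""
    return [(tokens[i], ctx)
            for i in range(len(tokens))
            for ctx in context_window(tokens, i, window)]


# Hypothetical sample sentence, chosen only for illustration.
tokens = "storm warning issued for coast".split()

for context, target in cbow_pairs(tokens, window=1):
    print("CBOW     ", context, "->", target)

for center, ctx in skipgram_pairs(tokens, window=1):
    print("Skip-gram", center, "->", ctx)
```

CBOW yields one training example per word position, whereas Skip-gram yields one example per (center, context) pair, so the same text produces more Skip-gram examples. The sketch covers only how training examples are formed; it does not train embeddings or build the CNN classifier described in the abstract.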
- Subjects :
- Word embedding
Computer science
Social Sciences
02 engineering and technology
010501 environmental sciences
computer.software_genre
01 natural sciences
Convolutional neural network
Machine Learning
Sociology
0202 electrical engineering, electronic engineering, information engineering
Word2vec
Grammar
Multidisciplinary
Artificial neural network
Applied Mathematics
Simulation and Modeling
Social Communication
Semantics
Social Networks
Physical Sciences
Medicine
020201 artificial intelligence & image processing
Information Technology
Network Analysis
Algorithms
Natural language processing
Research Article
Computer and Information Sciences
Neural Networks
Science
Twitter
Research and Analysis Methods
Deep Learning
Artificial Intelligence
Word Embedding
Humans
Syntax
Natural Language Processing
0105 earth and related environmental sciences
Information Dissemination
business.industry
Deep learning
Biology and Life Sciences
Linguistics
Communications
Artificial intelligence
business
Social Media
computer
Mathematics
Neuroscience
Details
- ISSN :
- 19326203
- Volume :
- 14
- Database :
- OpenAIRE
- Journal :
- PLOS ONE
- Accession number :
- edsair.doi.dedup.....1e1616695dd57445facd6951484627e1