Mixture Model and MDSDCA for Textual Data
- Source :
- Cooperative Design, Visualization, and Engineering. Lecture Notes in Computer Science, 5738, pp.240-244, 2009, ⟨10.1007/978-3-642-04265-2_35⟩, ISBN: 9783642042645, CDVE
- Publication Year :
- 2009
- Publisher :
- HAL CCSD, 2009.
Abstract
- E-mailing has become an essential component of cooperation in business. Consequently, the large number of messages, whether manually produced or automatically generated, can rapidly cause information overload for users. Many research projects have examined this issue, but surprisingly few have tackled the files attached to e-mails, which in many cases carry a substantial part of the semantics of the message. This paper considers this specific topic and focuses on the clustering and visualization of attached files. Relying on the multinomial mixture model, we use the Classification EM algorithm (CEM) to cluster the set of files, and MDSDCA to visualize the resulting classes of documents. Like the Multidimensional Scaling method, MDSDCA aims to optimize the stress criterion; it does so with an algorithm based on the Difference of Convex functions. As MDSDCA is iterative, we propose an initialization approach to avoid starting from random values. Experiments are carried out on simulated and textual data.
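The clustering step named in the abstract can be illustrated with a minimal sketch of Classification EM for a multinomial mixture over term-count vectors. This is not the authors' implementation: the function name `cem_multinomial`, the `init` parameter, and the additive smoothing constant `alpha` are assumptions introduced here for illustration only.

```python
import math
import random


def cem_multinomial(counts, k, n_iter=50, alpha=1.0, init=None, seed=0):
    """Hard-assignment (classification) EM for a multinomial mixture.

    counts: list of equal-length lists of non-negative term counts.
    init:   optional starting labels (hypothetical convenience parameter).
    alpha:  additive smoothing, an assumption to avoid log(0).
    Returns one cluster label in range(k) per document.
    """
    rng = random.Random(seed)
    n, d = len(counts), len(counts[0])
    if init is not None:
        labels = list(init)
    else:
        labels = [rng.randrange(k) for _ in range(n)]  # random start partition

    for _ in range(n_iter):
        # M-step: cluster proportions and smoothed term probabilities.
        sizes = [0] * k
        totals = [[alpha] * d for _ in range(k)]
        for row, z in zip(counts, labels):
            sizes[z] += 1
            for j, c in enumerate(row):
                totals[z][j] += c
        log_pi = [math.log((s + alpha) / (n + k * alpha)) for s in sizes]
        log_theta = []
        for t in totals:
            norm = sum(t)
            log_theta.append([math.log(v / norm) for v in t])

        # E-step and C-step: score each cluster by its log posterior (up to
        # a constant) and assign the document to the best-scoring cluster.
        new_labels = []
        for row in counts:
            scores = [
                log_pi[c] + sum(x * lt for x, lt in zip(row, log_theta[c]))
                for c in range(k)
            ]
            new_labels.append(max(range(k), key=scores.__getitem__))

        if new_labels == labels:  # partition is stable, so stop
            break
        labels = new_labels
    return labels


# Toy data: three documents using terms 0-1, three using terms 2-3.
docs = [[5, 5, 0, 0]] * 3 + [[0, 0, 5, 5]] * 3
print(cem_multinomial(docs, k=2, init=[0, 0, 1, 0, 1, 1]))
# -> [0, 0, 0, 1, 1, 1]
```

Unlike standard EM, the classification variant replaces soft posterior weights with a hard partition at every iteration, which is why convergence can be detected simply by checking that the labels no longer change.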
- Subjects :
- 021103 operations research
Theoretical computer science
Computer science
0211 other engineering and technologies
Initialization
02 engineering and technology
computer.software_genre
Mixture model
01 natural sciences
Visualization
Set (abstract data type)
010104 statistics & probability
Component (UML)
Expectation–maximization algorithm
[INFO]Computer Science [cs]
Data mining
Multidimensional scaling
0101 mathematics
Cluster analysis
computer
ComputingMilieux_MISCELLANEOUS
Details
- Language :
- English
- ISBN :
- 978-3-642-04264-5
- ISBNs :
- 9783642042645
- Database :
- OpenAIRE
- Journal :
- Cooperative Design, Visualization, and Engineering. Lecture Notes in Computer Science, 5738, pp.240-244, 2009, ⟨10.1007/978-3-642-04265-2_35⟩, ISBN: 9783642042645, CDVE
- Accession number :
- edsair.doi.dedup.....b3191dce3c4ab0597af07abcfd6f47c9
- Full Text :
- https://doi.org/10.1007/978-3-642-04265-2_35