Author: "Mohsen Pourvali" / Topic: computer science - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Mohsen Pourvali"' showing total 6 results

1. Topic Models and Fusion Methods: a Union to Improve Text Clustering and Cluster Labeling

Author: Hosna Omidvarborna, Mohsen Pourvali, and Salvatore Orlando
Subjects: Statistics and Probability, Topic model, Topic structure, Computer Networks and Communications, Computer science, Text Mining, media_common.quotation_subject, Cluster Labeling, text mining, 02 engineering and technology, lcsh:Technology, document clustering, Text mining, Artificial Intelligence, Document Clustering, 0202 electrical engineering, electronic engineering, information engineering, Quality (business), Cluster analysis, media_common, Document Enriching, cluster labeling, Information retrieval, Settore INF/01 - Informatica, business.industry, lcsh:T, 05 social sciences, IJIMAI, 020207 software engineering, document enriching, Document clustering, Sensor fusion, Computer Science Applications, Signal Processing, Cluster labeling, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Computer Vision and Pattern Recognition, 0509 other social sciences, 050904 information & library sciences, business
Abstract: Topic modeling algorithms are statistical methods that aim to discover the topics running through text documents. Topic models are popular in machine learning and text mining because they can infer the latent topic structure of a corpus. In this paper, we present a document enrichment approach that uses state-of-the-art topic models and data fusion methods to enrich the documents of a collection, with the aim of improving the quality of text clustering and cluster labeling. We propose a bi-vector space model in which every document of the corpus is represented by two vectors: one generated by the fusion-based topic modeling approach, and one that is simply the traditional vector model. Our experiments on various datasets show that using a combination of topic modeling and fusion methods to create document vectors can significantly improve the quality of document clustering results.
Published: 2019

2. Detecting Covariate Drift with Explanations

Author: Steffen Castle, Robert Schwarzenberg, and Mohsen Pourvali
Subjects: Drift detection, business.industry, Computer science, Data domain, Evaluation data, A domain, Inference, computer.software_genre, Domain (software engineering), Covariate, Artificial intelligence, Data mining, business, computer, Natural language processing
Abstract: Detecting when there is a domain drift between training and inference data is important for any model evaluated on data collected in real time. Many current data drift detection methods only utilize input features to detect domain drift. While effective, these methods disregard the model’s evaluation of the data, which may be a significant source of information about the data domain. We propose to use information from the model in the form of explanations, specifically gradient times input, in order to utilize this information. Following the framework of Rabanser et al. [11], we combine these explanations with two-sample tests in order to detect a shift in distribution between training and evaluation data. Promising initial experiments show that explanations provide useful information for detecting shift, which potentially improves upon the current state-of-the-art.
Published: 2021
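
The idea in this abstract, computing gradient-times-input explanations and comparing their distributions with two-sample tests, can be illustrated with a minimal sketch. Everything concrete below is an assumption for illustration and not taken from the paper: the model is a hypothetical logistic regression with made-up weights `w` and `b`, the two-sample test is a per-dimension Kolmogorov-Smirnov test with an asymptotic p-value, and a Bonferroni correction combines the dimensions.

```python
import bisect
import math
import random


def grad_times_input(x, w, b):
    """Gradient-times-input attributions for a logistic model sigmoid(w.x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    # d sigmoid(z) / d x_i = p * (1 - p) * w_i; multiply by the input x_i.
    return [xi * p * (1.0 - p) * wi for wi, xi in zip(w, x)]


def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect.bisect_right(a, v) / len(a) - bisect.bisect_right(b, v) / len(b))
        for v in a + b
    )


def ks_p_value(d, n, m):
    """Asymptotic two-sided p-value (Kolmogorov series); approximate for small samples."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:  # the series converges slowly here and the p-value is close to 1
        return 1.0
    q = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101)
    )
    return min(1.0, max(0.0, q))


def detect_drift(train, evaluation, w, b, alpha=0.05):
    """Flag drift if any attribution dimension differs after Bonferroni correction."""
    train_attr = [grad_times_input(x, w, b) for x in train]
    eval_attr = [grad_times_input(x, w, b) for x in evaluation]
    dims = len(w)
    p_values = []
    for i in range(dims):
        col_a = [row[i] for row in train_attr]
        col_b = [row[i] for row in eval_attr]
        d = ks_statistic(col_a, col_b)
        p_values.append(ks_p_value(d, len(col_a), len(col_b)))
    return min(p_values) < alpha / dims


rng = random.Random(0)
w, b = [1.0, -2.0], 0.0  # hypothetical model weights, not from the paper
train = [[rng.gauss(0.0, 1.0) for _ in w] for _ in range(200)]
shifted = [[rng.gauss(1.5, 1.0) for _ in w] for _ in range(200)]
print(detect_drift(train, shifted, w, b))      # shifted evaluation data
print(detect_drift(train, list(train), w, b))  # identical data, no drift
```

The test is run on the attribution vectors rather than on the raw inputs, which is what lets the model's view of the data enter the comparison. With a real network, the analytic gradient above would be replaced by automatic differentiation.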

3. Path-Based Visual Explanation

Author: Yao Meng, Yucheng Jin, Mohsen Pourvali, Lei Wang, Changjian Hu, Masha Gorkovenko, and Chen Sheng
Subjects: Black box (phreaking), Computer science, business.industry, Ranging, Special Interest Group, computer.software_genre, Human–computer interaction, Path (graph theory), Case-based reasoning, Artificial intelligence, Visual interface, business, computer, Natural language processing
Abstract: The ability to explain the behavior of a black-box Machine Learning (ML) model to people is becoming essential due to the wide usage of ML applications in critical areas ranging from medicine to commerce. Case-Based Reasoning (CBR) has received special interest among methods of providing explanations for model decisions because it can easily be paired with a black box to provide a post-hoc explanation framework. In this paper, we propose a CBR-based method that not only explains a model decision but also provides recommendations to the user in an easily understandable visual interface. Our evaluation of the method in a user study shows interesting results.
Published: 2020

4. Enriching Documents by Linking Salient Entities and Lexical-Semantic Expansion

Author: Mohsen Pourvali and Salvatore Orlando
Subjects: 68p20, Document clustering, document enriching, Software, Information Systems, Artificial Intelligence, Computer science, Science, 02 engineering and technology, computer.software_genre, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Semantic expansion, Settore INF/01 - Informatica, business.industry, QA75.5-76.95, 68u15, Salient, Electronic computers. Computer science, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, Natural language processing
Abstract: This paper explores a multi-strategy technique that aims at enriching text documents for improving clustering quality. We use a combination of entity linking and document summarization in order to determine the identity of the most salient entities mentioned in texts. To effectively enrich documents without introducing noise, we limit ourselves to the text fragments mentioning the salient entities, in turn, belonging to a knowledge base like Wikipedia, while the actual enrichment of text fragments is carried out using WordNet. To feed clustering algorithms, we investigate different document representations obtained using several combinations of document enrichment and feature extraction. This allows us to exploit ensemble clustering, by combining multiple clustering results obtained using different document representations. Our experiments indicate that our novel enriching strategies, combined with ensemble clustering, can improve the quality of classical text clustering when applied to text corpora like The British Broadcasting Corporation (BBC) NEWS.
Published: 2018

5. Improving clustering quality by automatic text summarization

Author: Salvatore Orlando, Mohsen Pourvali, and Mehrad Gharagozloo
Subjects: Information retrieval, Settore INF/01 - Informatica, Computer science, Process (engineering), media_common.quotation_subject, Computer Science (all), Theoretical Computer Science, Document clustering, computer.software_genre, Automatic summarization, Field (computer science), ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Graph (abstract data type), Quality (business), Data mining, Noise (video), Cluster analysis, computer, media_common
Abstract: Automatic text summarization is the process of reducing the size of a text document to create a summary that retains the most important points of the original document. It can thus be applied to summarize the original document by decreasing the importance of, or removing, part of the content. The contribution of this paper in this field is twofold. First, we show that text summarization can improve the performance of classical text clustering algorithms, in particular by reducing noise coming from long documents that can negatively affect clustering results. Moreover, the clustering quality can be used to quantitatively evaluate different summarization methods. In this regard, we propose a new graph-based summarization technique for keyphrase extraction, and use the Classic4 and BBC NEWS datasets to evaluate the improvement in clustering quality obtained using text summarization.
Published: 2015

6. A new graph based text segmentation using Wikipedia for automatic text summarization

Author: Mohsen Pourvali and Ph.D. Mohammad
Subjects: Information retrieval, General Computer Science, Computer science, business.industry, Text segmentation, Text graph, computer.software_genre, Automatic summarization, Text mining, Knowledge base, Multi-document summarization, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Artificial intelligence, business, tf–idf, computer, Natural language processing
Abstract: The technology of automatic document summarization is maturing and may provide a solution to the information overload problem. Nowadays, document summarization plays an important role in information retrieval. With a large volume of documents, presenting the user with a summary of each document greatly facilitates the task of finding the desired documents. Document summarization is the process of automatically creating a compressed version of a given document that provides useful information to users, and multi-document summarization aims to produce a summary delivering the majority of the information content from a set of documents about an explicit or implicit main topic. In this paper, based on the input text, we use the knowledge base of Wikipedia and the words of the main text to create independent graphs. We then determine the importance of each graph and identify the sentences whose topics have high importance. Finally, we extract the sentences with high importance. The experimental results on open benchmark datasets from DUC01 and DUC02 show that our proposed approach can improve performance compared to state-of-the-art summarization approaches.
Published: 2012
