A Probabilistic Approach for Extractive Summarization Based on Clustering Cum Graph Ranking Method

Authors :
Amreen Ahmad
Tanvir Ahmad
Sarfaraz Masood
Mohd. Khizir Siddiqui
Basma Abd El-Rahiem
Pawel Plawiak
Fahad Alblehai
Source :
IEEE Access, Vol 12, Pp 70464-70479 (2024)
Publication Year :
2024
Publisher :
IEEE, 2024.

Abstract

Online information has grown tremendously in the Internet age, creating a need to extract relevant content from the plethora of available information. Researchers widely use automatic text summarization techniques to extract useful and relevant information from large volumes of text. Summaries produced by automatic text summarization often suffer from problems of diversity and information coverage. Earlier researchers have used graph-based approaches for ranking and optimization. This research work introduces a probabilistic approach named ClusRank for summary extraction, comprising a two-stage sentence selection model that first clusters and then ranks sentences. The first stage clusters sentences using a proposed overlapping clustering algorithm on the weighted network; the second stage selects salient sentences using the introduced probabilistic approach. In the analysis of real-world networks, community structure development is essential because it provides strategic insights that help decision-makers make well-informed choices. Furthermore, methodologically rigorous community detection algorithms are required because discontinuous, overlapping, and nested community patterns occur in such networks. In this research work, an algorithm is presented for detecting overlapping communities based on the concept of rough sets and granular information on links. The sentence selection algorithm, based on the budgeted maximum coverage approach, supports the assumption that larger subtopics in a document are more important than smaller subtopics. The performance of the proposed probabilistic ClusRank is validated on the DUC 2001, DUC 2002, DUC 2004, and DUC 2006 data sets.
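
The sentence selection idea in the abstract can be illustrated with a minimal sketch of greedy budgeted maximum coverage. This is not the ClusRank algorithm from the paper; it is a generic greedy heuristic, and every name here (`greedy_budgeted_coverage`, the sentence ids, the subtopic labels, the weights, and the word budget) is hypothetical. It assumes each sentence covers a set of subtopics, each subtopic has a weight (larger subtopics weigh more), and each sentence has a positive cost such as its word count.

```python
from typing import Dict, List, Set


def greedy_budgeted_coverage(
    sentences: Dict[str, Set[str]],
    costs: Dict[str, int],
    weights: Dict[str, float],
    budget: int,
) -> List[str]:
    """Greedily pick sentences by newly covered weight per unit cost.

    Illustrative only: all inputs are hypothetical, and costs must be positive.
    """
    selected: List[str] = []
    covered: Set[str] = set()
    remaining = budget
    candidates = set(sentences)
    while candidates:
        best, best_ratio = None, 0.0
        for sid in sorted(candidates):  # sorted so ties break deterministically
            if costs[sid] > remaining:
                continue  # sentence does not fit in the remaining budget
            # Marginal gain: weight of subtopics not yet covered by the summary.
            gain = sum(weights[t] for t in sentences[sid] - covered)
            ratio = gain / costs[sid]
            if ratio > best_ratio:
                best, best_ratio = sid, ratio
        if best is None:
            break  # nothing affordable adds new coverage
        selected.append(best)
        covered |= sentences[best]
        remaining -= costs[best]
        candidates.remove(best)
    return selected


# Hypothetical toy input: three sentences, four subtopics, a 15-word budget.
sentences = {"s1": {"a", "b"}, "s2": {"b", "c"}, "s3": {"d"}}
costs = {"s1": 10, "s2": 5, "s3": 8}
weights = {"a": 3.0, "b": 2.0, "c": 1.0, "d": 1.0}
summary = greedy_budgeted_coverage(sentences, costs, weights, budget=15)
print(summary)
```

In this toy input, weighting subtopics by size steers the selection toward sentences that cover the larger subtopics first, which mirrors the assumption stated in the abstract. The full approximation algorithm for budgeted maximum coverage also compares the greedy result with the best single affordable item; that step is omitted here for brevity.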

Details

Language :
English
ISSN :
21693536
Volume :
12
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.177b7e5071c44c5fbab63407741e2378
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2024.3392252