American Journal of Computer Science and Engineering Survey Open Access

  • ISSN: 2349-7238

Abstract

A Novel Approach in Automated Bengali Text Summarizing by Statistical and Sentence Similarity Method

Md. Sadek Hossain Asif

The world is moving ever faster thanks to technology. Information is now stored largely in the cloud rather than in hard-copy documents or on compact discs. Summarization is therefore an attractive way to keep that information short and concise. Manual summarization is tedious, so data scientists are looking for an automated process that produces human-quality summaries. In this paper, we work with two approaches to Bengali text: a statistical approach and a sentence similarity approach. The first builds the summary from word frequencies, using probability theory, while the second measures the similarity between sentences using the Python NLTK corpora and WordNet modules. In tests on several inputs, we observe that the sentence similarity approach gives much better results than the statistical approach, although it needs slightly more time. Sentence similarity can therefore be considered the better of the two approaches to automatic text summarization. We chose Python as the programming language for its advantages, such as the open-source NLTK library, the Brown Corpus, the WordNet database, and its ease of integration.
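As an illustration of the statistical approach described above, the following is a minimal sketch of frequency-based extractive summarization. It is not the paper's implementation: the abstract does not give the scoring formula, so the sketch assumes each sentence is scored by the average relative frequency (empirical probability) of its words. The function names `split_sentences`, `tokenize`, and `summarize` are hypothetical, and tokenization is plain whitespace splitting with no stop-word removal or stemming.

```python
from collections import Counter
import re

# Punctuation stripped from token edges, including the Bengali danda (।).
PUNCT = "।.,!?;:\"'()"


def split_sentences(text):
    # Split after a danda or a Latin sentence terminator followed by whitespace.
    parts = re.split(r"(?<=[।.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # Whitespace tokenization keeps Bengali vowel signs attached to their words.
    tokens = (w.strip(PUNCT).lower() for w in sentence.split())
    return [t for t in tokens if t]


def summarize(text, n_sentences=2):
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    counts = Counter(t for tokens in tokenized for t in tokens)
    total = sum(counts.values())

    # Assumed scoring: mean empirical probability of the words in a sentence.
    scores = [
        sum(counts[t] / total for t in tokens) / len(tokens) if tokens else 0.0
        for tokens in tokenized
    ]

    # Take the highest-scoring sentences, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Calling `summarize(text, n_sentences=3)` on a Bengali passage returns the three highest-scoring sentences in their original order. A sentence similarity variant would replace the frequency score with a pairwise similarity measure, for example one built on WordNet as the abstract mentions.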