From web to SMS: A text summarization of Wikipedia pages with character limitation

Authors

Fendji J, Aminatou B

DOI:

https://doi.org/10.4108/eai.11-6-2020.165277

Keywords:

Character-limitation summarization, SMS, LSA, TextRank, ROUGE, TACOS, Wikipedia

Abstract

Wikipedia is one of the main sources of information on the Web. However, this content can be hard to reach, especially from a basic telephone that has no browsing capability and only a GSM connection; on such devices, SMS is the only means of text-based communication. Because SMS messages are limited in length, a Wikipedia page cannot always be sent this way. This work therefore addresses text summarization under a character limit. Two extractive approaches, LSA and TextRank, are combined to produce the summaries, which are evaluated using ROUGE metrics. Since ROUGE metrics do not account for character limits, a new threshold, the Threshold of Acceptability for Character-Oriented Summaries (TACOS), is proposed for interpreting ROUGE scores. The evaluation showed that the approach is relevant for pages of at most 2000 characters. The system was tested using the RapidSMS SMS simulator, without a GSM gateway, to simulate deployment in a real environment. To the best of our knowledge, this is the first work to tackle text summarization under a character limit.
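
To make the idea of a character-limited extractive summary concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not say how LSA and TextRank are combined, so the scorer below is a plain word-frequency placeholder standing in for those scores. The function names (`split_sentences`, `score_sentences`, `summarize`) and the default budget of 160 characters (the length of a single GSM 7-bit SMS) are assumptions made for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentences(sentences):
    # Placeholder scorer (average word frequency).
    # ASSUMPTION: stands in for the LSA/TextRank scores used in the paper.
    words = [re.findall(r"\w+", s.lower()) for s in sentences]
    freq = Counter(w for ws in words for w in ws)
    return [sum(freq[w] for w in ws) / len(ws) if ws else 0.0 for ws in words]


def summarize(text, max_chars=160):
    # Greedily keep the best-scored sentences that still fit the budget,
    # then emit them in their original order.
    sentences = split_sentences(text)
    scores = score_sentences(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in ranked:
        # One extra character for the joining space after the first sentence.
        cost = len(sentences[i]) + (1 if chosen else 0)
        if used + cost <= max_chars:
            chosen.append(i)
            used += cost
    return " ".join(sentences[i] for i in sorted(chosen))
```

Whole sentences are kept or dropped, so the result never exceeds `max_chars` and never ends mid-sentence; if no sentence fits the budget, the summary is empty.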

Published

11-06-2020

How to Cite

1. Fendji J, Aminatou B. From web to SMS: A text summarization of Wikipedia pages with character limitation. EAI Endorsed Trans Creat Tech [Internet]. 2020 Jun. 11 [cited 2024 Nov. 24];7(24):e5. Available from: https://publications.eai.eu/index.php/ct/article/view/1443