From web to SMS: A text summarization of Wikipedia pages with character limitation
DOI: https://doi.org/10.4108/eai.11-6-2020.165277
Keywords: Character-limitation summarization, SMS, LSA, TextRank, ROUGE, TACOS, Wikipedia
Abstract
Wikipedia is one of the main sources of information on the Web. However, access to its content can be difficult, especially from a basic telephone with no browsing capability and only a GSM network connection. In that setting, SMS is the only means of text-based communication. Because an SMS is limited in its number of characters, a Wikipedia page cannot always be sent this way. This work raises the issue of text summarization under a character limit. To address it, two extractive approaches are combined: the LSA and TextRank algorithms. The generated summaries are evaluated using ROUGE metrics. Since ROUGE metrics do not account for a character limit, a new threshold, the Threshold of Acceptability for Character-Oriented Summaries (TACOS), is proposed for interpreting ROUGE scores. The evaluation showed the relevance of the approach for pages of at most 2000 characters. The system was tested with the RapidSMS SMS simulator, without a GSM gateway, to simulate deployment in a real environment. To the best of our knowledge, this is the first work to tackle text summarization with a character limit.
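To make the idea of character-limited extractive summarization concrete, the following is a minimal sketch, not the method of the paper. It assumes a simplified TextRank-style sentence ranking only (the LSA component and the TACOS threshold are not reproduced), a naive regex sentence splitter, and a greedy selection that keeps the highest-ranked sentences fitting within a character budget. All function names and the `max_chars` parameter are hypothetical and chosen for illustration.

```python
import math
import re


def split_sentences(text):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    # Word overlap normalized by the log of each sentence's distinct-word count,
    # in the spirit of the TextRank sentence similarity.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def rank_sentences(sentences, damping=0.85, iterations=30):
    # Weighted PageRank over the sentence similarity graph (fixed iteration count).
    n = len(sentences)
    weights = [
        [similarity(a, b) if i != j else 0.0 for j, b in enumerate(sentences)]
        for i, a in enumerate(sentences)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, max_chars):
    # Greedily keep the best-ranked sentences that still fit in the budget,
    # then emit them in their original order.
    sentences = split_sentences(text)
    scores = rank_sentences(sentences)
    order = sorted(range(len(sentences)), key=lambda i: -scores[i])
    chosen, used = [], 0
    for i in order:
        cost = len(sentences[i]) + (1 if chosen else 0)  # +1 for the joining space
        if used + cost <= max_chars:
            chosen.append(i)
            used += cost
    return " ".join(sentences[i] for i in sorted(chosen))
```

A call such as `summarize(page_text, 160)` would target the 160-character capacity of a single SMS in the GSM 7-bit encoding; the budget actually used in the paper is not stated in this abstract, so that value is only an example. Because the selection is greedy, the result always respects the limit but is not guaranteed to be the best-scoring subset of sentences.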
License
This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 3.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.