Google Maps Data Analysis of Clothing Brands in South Punjab, Pakistan

DOI:

https://doi.org/10.4108/eetsis.v10i3.2677

Keywords:

Sentiment Analysis, Google Maps Data, Clothing Brands, Logistic Regression, Support Vector Machine

Abstract

The Internet is a popular, first-hand source of data about products and services. Before buying a product, people try to gain quick insight by scanning online reviews of it. However, searching for a product, collecting all the relevant information, and reaching a decision is a tedious task that needs to be automated. Decision-support systems based on this kind of text analysis are not widely available, and they remain out of reach for major cities of South Punjab such as Bahawalpur, Multan, and Rahim Yar Khan. This work addresses that gap. It assesses the popularity of clothing brands in these three cities by applying sentiment analysis to organic feedback from their potential customers and prioritizing the brands accordingly. The study combines quantitative and qualitative research to examine online reviews from Google Maps. Two machine learning techniques, Logistic Regression (LR) and Support Vector Machine (SVM), are applied to the Google Maps review data using n-gram feature extraction. SVM with the combined unigram, bigram, and trigram (uni-bi-trigram) feature extraction method performed best, achieving an average accuracy of 80.93%.
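
To make the uni-bi-trigram idea concrete, the following is a minimal Python sketch of how unigram, bigram, and trigram counts could be extracted from a single review before being passed to a classifier such as LR or SVM. It is an illustration only, not the authors' pipeline: the function names, the regex-based tokenizer, and the sample review are all assumptions, and the abstract does not specify the preprocessing or feature weighting actually used.

```python
import re
from collections import Counter


def tokenize(review: str) -> list:
    """Lowercase a review and keep alphabetic word tokens (assumed scheme)."""
    return re.findall(r"[a-z']+", review.lower())


def ngram_features(review: str, max_n: int = 3) -> Counter:
    """Count every n-gram of length 1..max_n in one review.

    With max_n=3 this yields combined unigram, bigram, and trigram
    counts, i.e. a "uni-bi-trigram" feature set.
    """
    tokens = tokenize(review)
    features = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            features[" ".join(tokens[i:i + n])] += 1
    return features


if __name__ == "__main__":
    # Hypothetical review text, not taken from the study's dataset.
    sample = "Good quality fabric and good prices"
    for gram, count in sorted(ngram_features(sample).items()):
        print(gram, count)
```

In practice, per-review counts like these would be assembled into a document-term matrix over the whole corpus; longer n-grams capture short phrases that single words miss, at the cost of a much larger and sparser feature space.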

Published

13-01-2023

How to Cite

Ahmad M, Jawad K, Alvi MB, Alvi M. Google Maps Data Analysis of Clothing Brands in South Punjab, Pakistan. EAI Endorsed Scal Inf Syst [Internet]. 2023 Jan. 13 [cited 2024 Mar. 28];10(3):e10. Available from: https://publications.eai.eu/index.php/sis/article/view/2677