A Review on Recent Arabic Information Retrieval Techniques
DOI:
https://doi.org/10.4108/eetiot.v8i3.2276
Keywords:
Information Retrieval, Arabic, Natural language processing, Indexing, Ranking, Evaluation
Abstract
Information retrieval is an important field that aims to provide documents relevant to a user's information need, expressed through a query. Arabic is a challenging language that has recently gained much attention in the information retrieval domain. To overcome the problems related to its complexity, many studies and techniques have been presented, most of which address the stemming problem. This paper presents an overview of the Arabic information retrieval process, including various text processing techniques, ranking approaches, evaluation measures, and some important information retrieval models. Finally, the paper presents some recent related studies and approaches in different Arabic information retrieval fields.
License
Copyright (c) 2022 EAI Endorsed Transactions on Internet of Things
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 3.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.