ANTCorpus stands for "Arabic News Texts Corpus". It is a research project that aims to collect texts from different sources of the web by incrementing the amount of data progressively.
The acronym ANT can remind the ants' work:
"Every ant should contribute to build the nest progressively".
The files of ANT Corpus are subject to the following citation license:
By downloading ANT Corpus, you agree to cite at least one of our papers describing ANT Corpus (refer to the section below) and/or refer the project's main page in any kind of material you produce where ANT Corpus was used to conduct search or experimentation, whether be it a research paper, dissertation, article, poster, presentation, or documentation.
✅ By using this data, you have agreed to the citation licence.
Publications
📄 A. Chouigui, O. Ben Khiroun and B. Elayeb. An Arabic Multi-source News Corpus: Experimenting on Single-document Extractive Summarization.
In Arabian Journal for Science and Engineering (AJSE 2021), 46(08), 1-14, DOI : 10.1007/s13369-020-05258-z , February 2021.
📄 A. Chouigui, O. Ben Khiroun and B. Elayeb. ANT Corpus : An Arabic News Text Collection for Textual Classification.
In proceedings of the 14th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 2017), pp. 135-142, Hammamet, Tunisia, October 30 - November 3, 2017.
📄 A. Chouigui, O. Ben Khiroun and B. Elayeb. A TF-IDF and Co-occurrence Based Approach for Events Extraction from Arabic News Corpus.
In proceedings of the 23rd International Conference on Natural Language & Information Systems (NLDB 2018), pp. 272-280, Paris, France, 13-15 June 2018.
📄 A. Chouigui, O. Ben Khiroun and B. Elayeb. Related Terms Extraction from Arabic News Corpus using Word Embedding.
In: OTM Conferences & Workshops: Proceedings of the 7th International Workshop on Methods, Evaluation, Tools and Applications for the Creation and Consumption of Structured Data for the e-Society (Meta4eS'18), Springer, LNCS, pp. 1-11, Valletta, Malta, 22-26 October 2018.