Hybrid Segmentation Prototype for Arabic Text-Based Documents: Towards Plagiarism Detection

Sonia Alouane-Ksouri, Minyar Sassi Hidri
DOI: 10.4018/ijssmet.2015010104

Abstract

This work contributes to the analysis of Arabic text-based documents for plagiarism detection. The analysis follows the triadic computation model of document similarity. The authors propose a hybrid segmentation prototype for Arabic text-based documents that chains several processing steps to compute the similarity rate between the documents of an Arabic corpus. It combines two segmentation systems with a morphological analysis to produce a matrix representation suited to triadic similarity computation at three abstraction levels: documents, sentences and words.
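To make the three abstraction levels concrete, the following is a minimal sketch of how a corpus might be turned into a document-by-sentence matrix and a sentence-by-word matrix. It is not the authors' prototype: the function names, the punctuation sets and the whitespace-based word splitting are all assumptions made for illustration, and the morphological analysis step is left out entirely.

```python
import re

# Assumed sentence delimiters: ., !, ?, Arabic question mark, Arabic full stop.
SENTENCE_END = re.compile("[.!?\u061F\u06D4]+")
# Assumed intra-sentence punctuation to strip from word edges.
WORD_PUNCT = "\u060C\u061B,;:"


def split_sentences(text):
    """Naive sentence segmentation on end-of-sentence punctuation."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def split_words(sentence):
    """Naive word segmentation on whitespace (no morphological analysis)."""
    words = (w.strip(WORD_PUNCT) for w in sentence.split())
    return [w for w in words if w]


def build_matrices(documents):
    """Return (doc_sent, sent_word, sentences, vocab) for a list of texts."""
    sentences, owner = [], []
    for d, text in enumerate(documents):
        for s in split_sentences(text):
            sentences.append(s)
            owner.append(d)  # index of the document this sentence came from

    vocab = sorted({w for s in sentences for w in split_words(s)})
    column = {w: j for j, w in enumerate(vocab)}

    # doc_sent[d][i] = 1 if sentence i belongs to document d.
    doc_sent = [[0] * len(sentences) for _ in documents]
    for i, d in enumerate(owner):
        doc_sent[d][i] = 1

    # sent_word[i][j] = number of times word j occurs in sentence i.
    sent_word = [[0] * len(vocab) for _ in sentences]
    for i, s in enumerate(sentences):
        for w in split_words(s):
            sent_word[i][column[w]] += 1

    return doc_sent, sent_word, sentences, vocab
```

A similarity model could then compare documents, sentences and words through these two matrices; how the triadic model actually does so is not described in this excerpt.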

Particularities of Arabic Text

To delimit this field of application, we give a brief overview of the particularities of Arabic text: it is read and written from right to left; it is typically written without short-vowel marks and with sparse punctuation; its words are agglutinative; and word order within the sentence is irregular.
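One practical consequence of the missing vowel marks is that the same word can appear both vocalized and unvocalized. As an illustration only (this normalization step is an assumption, not something the excerpt attributes to the authors), the sketch below removes combining marks so that both spellings compare equal. The helper name `strip_diacritics` is hypothetical.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks (e.g. Arabic short-vowel signs) from text."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


# kaf + fatha, ta + fatha, ba + fatha (a fully vocalized three-letter word)
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"
# the same three letters without vowel marks
unvocalized = "\u0643\u062A\u0628"

print(strip_diacritics(vocalized) == unvocalized)
```

Note that this removes every combining character, not only Arabic vowel signs, which may be too aggressive for mixed-script text.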
