Languages whose symbols are written in a conjoined and/or flowing manner, such as Arabic, Persian, and Cyrillic.
Published in Chapter:
A Metaheuristic Algorithm for OCR Baseline Detection of Arabic Languages
F. Daneshfar (University of Kurdistan, Iran), W. Fathy (University of Kurdistan, Iran), and B. Alaqeband (University of Kurdistan, Iran)
Copyright: © 2015
Pages: 28
DOI: 10.4018/978-1-4666-7258-1.ch023
Abstract
Preprocessing is a very important part of Optical Character Recognition (OCR) systems for cursive languages. Baseline detection, one of the main preprocessing steps, therefore plays a fundamental role in OCR systems, and improving it can reduce word recognition errors. In this chapter, a metaheuristic- and mathematical-based algorithm is proposed that improves the baseline detection process relative to well-known baseline detection algorithms. The most important advantages of the proposed method are simplicity, high processing speed, and reliability. The solution is tested on the IFN/ENIT database, a well-known and widely used database. However, the proposed solution is applicable to any standard database for cursive-language OCR.