A minimally distinctive unit in a writing system. A grapheme is the abstraction of a symbol. Graphemes can be letters, ligatures, numerical digits, or punctuation marks.
Published in Chapter:
Application of the Cluster Analysis in Computational Paleography
Loránd Lehel Tóth (Budapest University of Technology and Economics, Hungary), Raymond Eliza Ivan Pardede (Budapest University of Technology and Economics, Hungary), György András Jeney (Budapest University of Technology and Economics, Hungary), Ferenc Kovács (Budapest University of Technology and Economics, Hungary), and Gábor Hosszú (Budapest University of Technology and Economics, Hungary)
Copyright: © 2016
Pages: 19
DOI: 10.4018/978-1-4666-9479-8.ch020
Abstract
This chapter presents a method for determining which version of a script was used to create a script relic of unknown origin. The glyphs, which belong to graphemes as models, are realized in the relics as symbols. Some groups of glyphs may change shape over time (shapeshifting), which produces different versions of a script that use different glyphs to express the same grapheme. These glyph variants can be identified from extant relics, mainly from historical abecedaries that serve as references. Our algorithm determines whether an abecedary is related to the symbols of a relic of unknown origin by means of the canonical decomposition of the glyphs and symbols. From this decomposition an aggregated value called a fingerprint is created, which is unique to each relic. The fingerprints are then evaluated by a clustering technique using various metrics. Comparative evaluations show that the Minkowski metric provides the most interpretable clustering structure. The results of the evaluations, conclusions, and future work are also presented.
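As a rough illustration of the final step described in the abstract, the sketch below compares fingerprints with the Minkowski metric and groups them by distance. It is not the chapter's implementation. The abstract does not say how a fingerprint is encoded, which order p is used, or which clustering technique is applied, so the numeric vectors, the value p = 3, the threshold, the single-linkage grouping, and all names (`minkowski`, `single_linkage`, `abecedary_A`, `relic_X`) are assumptions made only for this example.

```python
from itertools import combinations


def minkowski(u, v, p=3.0):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def single_linkage(points, threshold, p=3.0):
    """Group items linked by a chain of distances <= threshold.

    This equals cutting a single-linkage dendrogram at `threshold`.
    It is a stand-in technique, not the one used in the chapter.
    """
    names = list(points)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if minkowski(points[a], points[b], p) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical fingerprints: made-up numbers, not data from the chapter.
fingerprints = {
    "abecedary_A": [1.0, 0.0, 2.0],
    "abecedary_B": [0.0, 3.0, 0.0],
    "relic_X": [1.0, 0.5, 2.0],
}

clusters = single_linkage(fingerprints, threshold=1.0)
print(clusters)
```

With these invented values, `relic_X` lands in the same group as `abecedary_A`, which is the kind of relationship the method is meant to reveal. Changing p changes how strongly large per-component differences dominate the distance: p = 1 gives the Manhattan metric and p = 2 the Euclidean metric.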