Semantic Network Model for Sign Language Comprehension

Xinchen Kang, Dengfeng Yao, Minghu Jiang, Yunlong Huang, Fanshu Li
DOI: 10.4018/IJCINI.309991

Abstract

In this study, the authors propose a computational cognitive model for sign language (SL) perception and comprehension, with detailed algorithmic descriptions based on cognitive functionalities in human language processing. The semantic network model (SNM), which represents semantic relations between concepts, is used as a form of knowledge representation. The proposed model is applied to the comprehension of classifier predicates in SL. The spreading activation search method is initiated by labeling a set of source nodes (e.g., concepts in the semantic network) with weights or “activation” and then iteratively propagating, or “spreading,” that activation out to other nodes linked to the source nodes. The results demonstrate that the proposed search method improves the performance of SL comprehension in the SNM.
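The following is a minimal sketch of the spreading-activation search described above, assuming the semantic network is a simple weighted graph; the decay factor, firing threshold, and concept names are illustrative assumptions, not the paper's actual parameters:

```python
from collections import defaultdict

def spreading_activation(graph, sources, decay=0.8, threshold=0.1, max_iters=5):
    """Propagate activation from labeled source nodes through a semantic network.

    graph: dict mapping node -> list of (neighbor, link_weight) pairs.
    sources: dict mapping source node -> initial activation weight.
    """
    activation = defaultdict(float, sources)
    frontier = dict(sources)
    for _ in range(max_iters):
        next_frontier = defaultdict(float)
        for node, act in frontier.items():
            for neighbor, weight in graph.get(node, []):
                spread = act * weight * decay
                if spread >= threshold:  # prune negligible activation
                    next_frontier[neighbor] += spread
        if not next_frontier:
            break
        for node, act in next_frontier.items():
            activation[node] += act
        frontier = next_frontier
    return dict(activation)

# Toy semantic network: concepts linked by weighted semantic relations.
network = {
    "VEHICLE": [("CAR", 0.9), ("MOVE", 0.6)],
    "CAR":     [("ROAD", 0.7)],
    "MOVE":    [("PATH", 0.5)],
}
print(spreading_activation(network, {"VEHICLE": 1.0}))
```

Activation fans out from the labeled source (“VEHICLE”) and attenuates with each hop, so nodes semantically closer to the source end up with higher weights.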

Introduction

Sign language (SL) comprehension is a fundamental task for computational linguists. Two types of algorithms have been proposed: (1) rule-based methods (Supalla, 1982) and (2) statistical methods (Bauer & Heinz, 2000; Huenerfauth, 2005). Rule-based methods lack the capability to plan the elements of an entire scene (Liddell, 2003), and modeling infinite natural language input through finite rules, especially minor rules, barely meets the requirements of SL processing (Yao et al., 2017). Statistical methods are therefore the preferred type of algorithm for SL comprehension. Statistical models can readily be applied to spoken languages, given their abundant data resources in the digitalized Internet age. However, the raw and annotated corpora of SLs are insufficient because collecting and annotating SL videos is tedious and difficult. Data sparsity consequently remains the most serious problem when applying statistical models to SLs. For example, the real-time factor (RTF) of an SL video corpus is 100; that is, an hour of corpus requires at least 100 hours of annotation (Dreuw et al., 2008b).

Simulating SL comprehension using traditional statistical models and machine-learning methods is thus difficult. Reliable methods must therefore be developed for establishing a signer’s 3-D model (i.e., a mathematical representation, built with specialized software, of the moving trajectories of a signer in space) for SL corpus building, together with technologies for automatically annotating a large-scale SL video corpus. Unlike spoken language, which is “a set of values that change with the passage of time” (Huenerfauth, 2005), SL lacks a widely used writing system and thus cannot be saved in any form of written text.

Natural language-processing systems rely on texts to process spoken languages. Such a system records only the written text that corresponds to the speech stream and depends only on the literacy of the user. An SL system, by contrast, must integrate information from multiple modalities, such as hand shape, hand location, hand movement, hand orientation, head tilting, shoulder tilting, eye gazing, body gestures, and facial expressions. This considerable information from multiple channels conveys linguistic meaning, and the multi-modal nature of SL makes it difficult to encode SLs as a linear, single-channel character string. Although SLs do have writing systems, such as the SignWriting system (Sutton, 2010), ASL-phabet (Supalla et al., 2008), and HamNoSys (Prillwitz et al., 1989), these systems have a limited number of users (Johnston, 2004).
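As an illustration of why a linear string is a poor fit, a single time slice of a sign can be thought of as a record over parallel channels. The sketch below draws its field names from the channels listed above; the representation itself is an assumption for illustration, not the paper’s encoding:

```python
from dataclasses import dataclass

@dataclass
class SignFrame:
    """One time slice of a sign, sampled across parallel articulation channels.

    A spoken word reduces to a single character string, but a sign carries
    all of these channels simultaneously; flattening them into one linear,
    single-channel string discards co-occurring linguistic information.
    """
    hand_shape: str        # e.g., a handshape label (hypothetical values below)
    hand_location: str     # position relative to the signer's body
    hand_movement: str     # trajectory of the hand through space
    hand_orientation: str  # direction the palm faces
    head_tilt: str
    shoulder_tilt: str
    eye_gaze: str
    facial_expression: str

frame = SignFrame("flat-B", "chest", "arc-forward", "palm-down",
                  "neutral", "neutral", "addressee", "neutral")
```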

Because of this multi-modal nature, many linguistic details are lost when SL is translated into a corresponding writing system. SLs may instead be understood by directly matching the visual-spatial characteristics of SL with semantic units in the brain, rather than using written text as an interpreting medium. Semantic units are generally used for processing natural languages; these units, or nodes, contain information that serves as a form of knowledge representation (Geva et al., 2000). Such direct matching also represents the most natural way of comprehending SLs in the brain. From this perspective, the authors present a computational cognitive model for SL comprehension based on the cognitive functionalities of the human brain combined with a knowledge representation theory from artificial intelligence (Shuklin, 2001).
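A hedged sketch of how such semantic units might be stored as network nodes, with visual-spatial features attached so that SL input can be matched to concepts directly rather than through written text. The node contents, feature labels, and the overlap-based matching rule are illustrative assumptions, not the paper’s implementation:

```python
from dataclasses import dataclass, field

@dataclass
class SemanticNode:
    """A semantic unit: a concept plus the information it carries."""
    concept: str
    visual_features: set = field(default_factory=set)  # visual-spatial cues matched against SL input
    relations: dict = field(default_factory=dict)      # semantic relation -> related concept

def match_input(nodes, observed_features):
    """Pick the node whose stored features best overlap the observed SL input."""
    return max(nodes, key=lambda n: len(n.visual_features & observed_features))

cup = SemanticNode("CUP", {"curved-C-handshape", "small-arc"}, {"is-a": "CONTAINER"})
table = SemanticNode("TABLE", {"flat-B-handshape", "horizontal-plane"}, {"is-a": "FURNITURE"})

# Observed visual-spatial input is matched directly to a concept node.
print(match_input([cup, table], {"flat-B-handshape", "horizontal-plane"}).concept)  # TABLE
```

The matched node could then serve as a source node for the spreading-activation search sketched earlier.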
