Introduction And Background
According to Singh et al. (2000), “Systems in which human users speak to a computer in order to achieve a goal are called spoken dialogue systems (SDS)”. In recent years, task-oriented spoken dialogue systems, which help users complete tasks more efficiently through spoken interaction, have been deployed on a variety of devices (Lison & Kennington, 2016). Many well-known products of this type exist, such as Apple’s Siri, Microsoft’s Cortana, and Baidu’s Duer (Hoy, 2018). As a critical component of an SDS, spoken language understanding (SLU) parses users’ queries and converts them into structured representations that machines can handle. The result of SLU is passed on to the SDS to update the dialogue state and choose the next appropriate action, so the performance of SLU is critical to the overall system (Tur & De Mori, 2011).
As SLU has become a focus of research communities, Shared Task 4 at NLPCC 2018, “Spoken Language Understanding in Task-Oriented Dialogue Systems,” provides a platform for evaluation. It aims to parse users’ queries across the multiple rounds of a session and convert them into structured representations that machines can handle. To understand queries expressed in spoken language, the task comprises two subtasks, intent detection (ID) and slot filling (SF), which respectively recognize the intent of a query and extract the associated arguments, or slots, needed to achieve the user’s goal.
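To make the two subtasks concrete, the sketch below shows the kind of structured output ID and SF jointly produce for a single query. The query, intent name, and slot labels are illustrative assumptions, not examples from the NLPCC 2018 data; SF is shown in the common BIO tagging scheme.

```python
# Hypothetical query/parse pair (not from the NLPCC 2018 data) illustrating
# the structured output of spoken language understanding (SLU).
query = "play a song by Jay Chou"

parse = {
    "intent": "music.play",           # intent detection (ID): one label per query
    "slots": {"singer": "Jay Chou"},  # slot filling (SF): extracted arguments
}

# SF as sequence tagging: one BIO tag per token of the query.
tags = ["O", "O", "O", "O", "B-singer", "I-singer"]
```

Framing SF this way, with one tag per token, is what allows sequence models such as CRFs and RNNs to be applied directly.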
There is a large body of literature on ID and SF, and much of it treats the subtasks in a pipeline framework: first the intent is classified, and then the semantic slots are extracted. ID is usually framed as a semantic utterance classification (SUC) problem, to which many popular classifiers have been applied, such as support vector machines (SVMs) (Fan et al., 2008), maximum entropy (Chelba et al., 2003), and RNN models (Ravuri & Stolcke, 2015). Similarly, SF can be treated as a sequence tagging problem, customarily solved with traditional approaches such as hidden Markov models (HMMs) (Pieraccini et al., 1992), conditional random fields (CRFs) (Raymond & Riccardi, 2007), and various RNN models (Mesnil et al., 2015; Vu et al., 2016; Huang et al., 2015). However, pipeline systems not only take more time to process tasks but also cannot model the interaction between the subtasks.
To simplify the SLU system and exploit the information shared between ID and SF to improve both subtasks, a growing number of joint models have been proposed in recent years (Liu & Lane, 2016; Zhang & Wang, 2016; Wen et al., 2017). Their ability to model the correlations between the subtasks helps them achieve competitive performance on the ATIS dataset.
Motivated by the inherent ability of bidirectional RNNs to capture the past and future features of a sequence, this paper proposes a joint model that uses a Bi-LSTM to learn a representation of each word in the Chinese query text and then shares these representations between the ID and SF tasks. With a joint loss function, the two tasks can interact with and promote each other through the shared representations. Experimental results demonstrate that the joint model outperforms separate models for each task.
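The text states that the two tasks are trained with a joint loss but does not give its form here. A common choice, sketched below, is a weighted sum of the intent cross-entropy and the mean per-token slot cross-entropy; the weight `alpha` and the function names are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def cross_entropy(probs, label):
    """Negative log-likelihood of the gold label under a probability vector."""
    return -np.log(probs[label])

def joint_loss(intent_probs, intent_label, slot_probs, slot_labels, alpha=0.5):
    """One plausible joint objective: a weighted sum of the utterance-level
    intent loss and the average per-token slot-tagging loss, so gradients
    from both tasks flow into the shared representations."""
    l_intent = cross_entropy(intent_probs, intent_label)
    l_slot = np.mean([cross_entropy(p, y) for p, y in zip(slot_probs, slot_labels)])
    return alpha * l_intent + (1 - alpha) * l_slot
```

Because both terms depend on the same shared word representations, minimizing this single scalar lets each task's supervision signal refine features used by the other.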
The main contributions of this paper are:
- Adaptation of a joint model based on an attention mechanism, Bi-LSTM, and CRF for intent determination and slot filling;
- An analysis of how intent determination and slot filling can benefit from the contextual information of the Chinese queries within a session.
Backbone Algorithm And Model
This paper proposes an Attention-based Bi-LSTM-CRF Joint Model (ABLCJ) to handle both tasks simultaneously. The backbone algorithm and the structure of the ABLCJ model are shown in Figure 1. As the figure shows, the model is composed of three sub-modules, shown with gray backgrounds: the Bi-LSTM module at the bottom is responsible for feature extraction, the module at the upper left detects intents, and the module at the upper right performs slot filling.
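The three-sub-module layout can be sketched structurally as follows. To keep the sketch dependency-free, simple linear maps stand in for the Bi-LSTM encoder and for the CRF decoder (which need a deep-learning framework and transition scores); the class name, dimensions, and attention pooling shown here are assumptions for illustration, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class ABLCJSketch:
    """Structural sketch of the three sub-modules: a shared encoder
    (stand-in for the Bi-LSTM) feeding an attention-pooled intent head
    and a per-token slot head (stand-in for the CRF emission scores)."""

    def __init__(self, emb_dim, hidden, n_intents, n_slots):
        self.W_enc = rng.normal(size=(emb_dim, hidden)) * 0.1   # shared encoder
        self.w_att = rng.normal(size=hidden) * 0.1              # attention scorer
        self.W_int = rng.normal(size=(hidden, n_intents)) * 0.1 # intent head
        self.W_slot = rng.normal(size=(hidden, n_slots)) * 0.1  # slot head

    def forward(self, embeddings):  # embeddings: (seq_len, emb_dim)
        h = np.tanh(embeddings @ self.W_enc)       # shared word representations
        att = softmax(h @ self.w_att, axis=0)      # attention weights over tokens
        intent = softmax(att @ h @ self.W_int)     # utterance-level intent distribution
        slots = softmax(h @ self.W_slot, axis=-1)  # per-token slot distributions
        return intent, slots
```

Note that both heads read the same shared representations `h`, which is what lets a joint loss couple the two subtasks during training.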