Given a source language sentence f consisting of J words and a target language sentence e consisting of I words, the alignment a between e and f is defined as a subset of the Cartesian product of the word positions:

$$a \subseteq \{(j, i) : j = 1, \dots, J;\ i = 1, \dots, I\}$$
The alignment a connects words in the target language sentence to words in the source language sentence. Each element of a is a connection between the word at position i in the target language sentence and a word at position j in the source language sentence. Figure 1 illustrates a word alignment for an English-Vietnamese sentence pair. The Vietnamese word tôi is aligned to the English word me because they are translations of one another. Similarly, the Vietnamese word vượt is aligned to the English word passed, and so on.
Figure 1: An example of a word alignment for an English-Vietnamese sentence pair. The English and Vietnamese words are listed vertically and horizontally, respectively. The dark grey cells indicate the correspondences between the words in the two languages.
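To make the definition concrete, the following minimal Python sketch represents an alignment as a set of position pairs (j, i), with j indexing the source sentence and i the target sentence, both 1-based. The two-word fragments, the choice of Vietnamese as the source side, and the helper names `is_valid_alignment`, `alignment_matrix`, and `aligned_pairs` are assumptions made for illustration only; they are not the full sentence pair of Figure 1.

```python
from itertools import product


def is_valid_alignment(a, J, I):
    """Return True if a is a subset of {1..J} x {1..I} (1-based positions)."""
    all_positions = set(product(range(1, J + 1), range(1, I + 1)))
    return set(a) <= all_positions


def alignment_matrix(a, J, I):
    """Build an I x J grid of 0/1 cells.

    Rows correspond to target positions i and columns to source positions j,
    so a cell holding 1 plays the role of a dark grey cell in the figure.
    """
    return [[1 if (j, i) in a else 0 for j in range(1, J + 1)]
            for i in range(1, I + 1)]


def aligned_pairs(a, f, e):
    """List the (source word, target word) pairs connected by a."""
    return [(f[j - 1], e[i - 1]) for (j, i) in sorted(a)]


# Toy fragment (assumed): Vietnamese as source f, English as target e.
f = ["vượt", "tôi"]
e = ["passed", "me"]
a = {(1, 1), (2, 2)}  # vượt-passed, tôi-me

if __name__ == "__main__":
    print(is_valid_alignment(a, len(f), len(e)))
    for row in alignment_matrix(a, len(f), len(e)):
        print(row)
    print(aligned_pairs(a, f, e))
```

Storing the alignment as a set of pairs mirrors the subset-of-a-Cartesian-product definition directly, while the grid form is convenient for visualizing it in the style of the figure.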