Background
Exploring large collections of video data is non-trivial. When a user initiates a search, formulating the query (i.e., the search criterion) can be quite difficult.
Most search systems are based on textual metadata, which leads to several problems when searching for visual content. Generally, users lack information about which keywords best represent the content they are interested in. In fact, different users tend to use different words to describe the same visual content. This lack of systematization in choosing query words can significantly affect the search results (De Rooij et al., 2008).
Modern systems have addressed these shortcomings by automatically detecting visual concepts derived from visual properties such as color, texture, and shape. However, performing a query still requires a minimum of knowledge about the concept vocabulary, which is not appropriate for non-expert users (Zavesky & Chang, 2008).
Fully automated approaches combine descriptors from multiple modalities (textual metadata, visual properties, and visual concepts). Despite these advances, formulating a query over such features remains a difficult task for a user interested in a specific video (De Rooij & Worring, 2010).
Once the search results are returned, exploration can proceed in many different directions, depending on the query type and the user's intention. Several visualization techniques have been proposed to assist users in exploring result sets (De Rooij et al., 2008; De Rooij & Worring, 2010; Zavesky & Chang, 2008; Zavesky et al., 2008).
These methods often employ dimensionality reduction algorithms to map the high-dimensional feature space of visual properties onto a fixed display. Afterwards, a display strategy is applied to produce user-browsable content (Zavesky et al., 2008).
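As an illustration only (not the specific method of the cited systems), this mapping step can be sketched as a PCA projection of feature vectors to two dimensions, followed by snapping each projected point to a cell of a fixed display grid. The function name, grid size, and random features below are all hypothetical:

```python
import numpy as np

def project_to_display(features, grid_size=4):
    """Map high-dimensional feature vectors to cells of a fixed
    grid_size x grid_size display: PCA (via SVD) to 2-D, then snap."""
    X = features - features.mean(axis=0)           # center the data
    # PCA via SVD: the first two right singular vectors span the 2-D plane
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    coords = X @ Vt[:2].T                          # (n_items, 2) projection
    # normalize each axis to [0, 1] and snap to integer grid cells
    mins, maxs = coords.min(axis=0), coords.max(axis=0)
    norm = (coords - mins) / np.where(maxs > mins, maxs - mins, 1)
    cells = np.minimum((norm * grid_size).astype(int), grid_size - 1)
    return cells

# Hypothetical example: 20 videos described by 64-dim visual features
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 64))
cells = project_to_display(features)
print(cells.shape)  # one (row, column) display cell per video
```

Real systems typically pair such a projection with a display strategy that resolves collisions (several items snapped to the same cell) and keeps visually similar items in neighboring cells.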