1. Introduction
The problem of feature selection (Guyon & Elisseeff, 2003; Wang, Ye, Wang, & Yu, 2019) has been widely investigated due to its importance for pattern recognition and image processing systems. This problem can be formulated as follows: identify an optimal feature subset which provides the best tradeoff between its size and relevance for a given task. The identified features not only provide an effective solution for the task, but also provide a dimensionally-reduced view of the underlying data.
Supervised knowledge (e.g., labels or pair-wise relationships) associated with the data can significantly improve the performance of feature selection methods (Chandrashekar & Sahin, 2014). However, existing supervised feature selection methods face an enormous challenge: the generation of reliable supervised knowledge cannot keep up with the rapid growth of newly emerging concepts and multimedia data. In practice, it is costly to annotate sufficient training data for new concepts in a timely manner, and impractical to retrain the feature selection model whenever a new concept emerges. As illustrated in Figure 1, traditional methods perform well on seen concepts, which have correct guidance, but they may easily fail on unseen concepts that have never been observed, such as the newly invented product “quadrotor”. Therefore, the problem of Zero-shot Feature Selection (ZSFS), i.e., building a feature selection model that generalizes well to unseen concepts with limited training data of seen concepts, deserves great attention. However, few studies have considered this problem.
The major challenge in the ZSFS problem is how to deduce knowledge of unseen concepts from seen concepts. In fact, the primary reason why existing studies fail to handle unseen concepts is that they only consider the discrimination among seen concepts (like the 0/1-form class labels illustrated in Figure 1), so that little knowledge can be deduced for unseen concepts. To address this, as illustrated in Figure 2, we adopt class-semantic descriptions (i.e., attributes) as supervision for feature selection. This idea is inspired by the recent development of Zero-shot Learning (ZSL) (Farhadi, Endres, Hoiem, & Forsyth, 2009; Guo, Ding, Han, & Gao, 2017), which has demonstrated that the capacity to infer attributes allows us to describe, compare, or even categorize unseen objects.
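To make the contrast with 0/1-form labels concrete, the following toy sketch shows how a shared attribute space can place an unseen concept next to seen ones. It is only an illustration of the general ZSL idea, not the method proposed here: the attribute names, the attribute values, and the helper `nearest_concept` are all hypothetical.

```python
import numpy as np

# Hypothetical binary attributes: [has_wings, has_rotors, has_wheels].
# These values are made up for illustration and come from no dataset.
attribute_table = {
    "airplane": np.array([1.0, 0.0, 1.0]),   # seen concept
    "car": np.array([0.0, 0.0, 1.0]),        # seen concept
    "quadrotor": np.array([0.0, 1.0, 0.0]),  # unseen concept, described only by attributes
}


def nearest_concept(predicted_attributes, table):
    """Return the concept whose attribute vector is closest (Euclidean) to the prediction."""
    return min(table, key=lambda name: np.linalg.norm(table[name] - predicted_attributes))


# Attribute scores that some model (assumed, not shown) inferred for a new sample.
predicted = np.array([0.1, 0.9, 0.2])
concept = nearest_concept(predicted, attribute_table)
```

With one-hot labels over the seen concepts alone, "quadrotor" would have no representation at all; in the attribute space it is simply another point that can be compared with the seen concepts.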
An attendant problem is how to identify reliable discriminative features with attributes that might be inaccurate and noisy (Jayaraman & Grauman, 2014). To alleviate this, we further propose a novel loss function (named center-characteristic loss) that encourages the selected features to capture the central characteristics of seen concepts. Theoretically, this loss function is a variant of the center loss (Wen, Zhang, Li, & Qiao, 2016), which has proven effective for learning discriminative and generalized features for categorizing unseen objects.
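For reference, the standard center loss of Wen et al. (2016) penalizes the squared distance between each sample's feature vector and the center of its class. The sketch below computes that standard quantity with NumPy on toy data. It is not the center-characteristic loss itself, whose exact form is not given in this section; the function names and the data are assumptions made for illustration.

```python
import numpy as np


def class_centers(features, labels, num_classes):
    """Mean feature vector of each class (assumes every class has at least one sample)."""
    return np.stack([features[labels == k].mean(axis=0) for k in range(num_classes)])


def center_loss(features, labels, centers):
    """Standard center loss: half the summed squared distance to each sample's class center."""
    diffs = features - centers[labels]
    return 0.5 * float(np.sum(diffs ** 2))


# Toy data: four 2-D samples from two seen classes.
features = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]])
labels = np.array([0, 0, 1, 1])

centers = class_centers(features, labels, num_classes=2)
loss = center_loss(features, labels, centers)
```

A small value means the samples of each class sit close to their class center, which is the compactness property that the center loss rewards.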
Figure 1. The Zero-shot Feature Selection Problem
Figure 2. Overview of the proposed method
We evaluate the proposed method on several real-world datasets, including SUN, aPY and CIFAR10. Note that the attributes of CIFAR10 are automatically generated from a public Wikipedia text corpus (Shaoul, 2010) by a well-known NLP tool (Huang, Socher, Manning, & Ng, 2012). The experimental evidence shows that our method generalizes well to unseen concepts with either manually or automatically generated attributes.
We summarize our main contributions as follows: