1. Introduction
In recent years, researchers have witnessed the rapid development of online Knowledge Sharing Communities (KSCs), such as blogs, discussion boards, and question answering (Q&A) communities, in which users can share experiences and exchange ideas with others. These sharing activities generate a large amount of knowledge and attract many expert users from each domain to participate. As a result, more and more people turn to KSCs for problem solving. Some researchers have found that Q&A content is usually the largest portion of content in KSCs (Cong et al., 2008; Feng et al., 2006). However, compared with the large number of questions, expert users remain a scarce resource in KSCs. Consequently, many questions go without satisfactory answers due to the lack of relevant experts. Thus, finding experts for an answer-lacking question becomes an important problem to address.
The problem of expert finding for answer-lacking questions has been well studied. Some traditional works leverage content-based approaches (e.g., Balog et al., 2006; Liu et al., 2005), which use language models to rank user authority based on the textual distribution of the question over each user's historical records. However, these approaches are usually computationally intensive and hardly applicable to large-scale data sets. Moreover, some novel KSCs are based on multimedia content, where the textual information contained in questions is often not rich enough for building language models (Yeh & Darrell, 2008). Therefore, most state-of-the-art works leverage question categories as query inputs to find experts (e.g., Bouguessa et al., 2008; Jurczyk & Agichtein, 2007; Kao et al., 2010). A drawback of these works is that each question can be classified into only one category. In practice, it is often difficult to select the best category for a question because a question is usually related to multiple categories. For example, an inexperienced user cannot easily decide between “Mobile Device” and “Market” for the question “Where can I buy the new Nokia mobile phone N9?” As a result, conventional category-based expert finding approaches may perform poorly on multiple-category questions.
A recent trend in KSCs is to allow users to add text tags to their questions, as in Tianya Wenda (http://www.douban.com). In these web sites, users can use tags as descriptive labels to annotate the content they post. For example, a user could add tags such as “N9”, “Mobile Market”, and “Where” to the question above. Expert users can then check the tags of a given question to decide whether to answer it. Compared with the full textual content of a question, user-generated tags are simpler query inputs and can be utilized for expert finding on large-scale data sets. Moreover, user-generated tags capture richer information about user needs than category information and can thus facilitate expert finding. However, because tags are generated by users rather than by the system, they are usually ambiguous and irregular. Therefore, how to leverage these user-generated tags for expert finding becomes a great challenge. The following motivating examples intuitively illustrate these challenges.
- Motivating Example 1. Joy posts a question about the Sony video game console “Play Station” and adds the tag “PS” to annotate it. However, in many contexts, the tag “PS” may also refer to the Adobe software “Photo Shop”.
- Motivating Example 2. Kate wants to buy a new mobile phone and posts a question with the tags “Mobile Phone” and “Market”. However, Kate's latent information needs concern “Discount” and “Trustable store”.
- Motivating Example 3. Joy posts a question about computer devices with the tag “PC”, whereas Kate may add the tag “Laptop” to the same question. In fact, different tags may represent similar meanings.