Introduction
Authorship detection has applications in many fields, including forensic evidence, plagiarism detection, email filtering, and web information management (Chen, 2010; Love, 2002; Rafailidis, Nanopoulos, & Manolopoulos, 2010). In recent years, web information management has attracted growing interest. The World Wide Web provides a powerful publication platform on which a large number of images are created for different purposes. For instance, online shopping websites such as eBay enable sellers and buyers worldwide to trade a broad variety of goods and services. Products are mostly presented through pictures displayed on the website, so that customers can see visual details that often play an important role in their purchase decisions. eBay sellers, especially power sellers, often add their own touch to product image design in order to attract buyers’ attention. For example, some sellers add a frame and/or promotional text to their listing images (Figure 1), and some embed the name of their store as a watermark or logo. We assume that many eBay sellers have developed, over time, distinctive editing styles (logos, backgrounds, and so on) that they apply to their product images. Such editing styles are highly repetitive within one seller’s images but mostly distinct across sellers. We detect and encode these editing styles for each seller using visual features extracted from product images, and in turn use the encoded styles to automatically predict the ownership of unlabeled images. Through a collaborative effort between the authors’ institution and eBay, the output of this study will be used as an additional clue in eBay’s seller profiling system to detect and prevent account takeover and related fraudulent behavior.
Figure 1. Examples of product images from online shopping websites such as eBay
There are several challenges in this study. First, we found that a seller may use more than one image editing style in composing the product images posted on eBay. For example, in Figure 1, the images in the first two rows all belong to the same seller, “6ubuy6”, yet they have visibly different editing styles. Therefore, a simple image averaging technique applied to all images of one seller will not work in the presence of multiple editing styles. Second, the same product image can be used by multiple sellers, each of whom re-edits it according to their own style; clustering based on global visual features will therefore generate a large number of false positives because of the common features extracted from the product itself. In this paper, we present a new edge-based clustering method that divides all images of one seller into groups, each of which corresponds to one editing style of that seller in terms of image edge maps (Abdel-Mottaleb, 2000). The clustering algorithm is based on the similarity of image edge maps. After clustering, each image group can be summarized to find its common pattern.
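To make the idea of grouping one seller's images by edge-map similarity more concrete, the following is a minimal Python sketch. It is not the method proposed in the paper: the gradient-threshold edge detector, the Jaccard overlap used as the similarity measure, the greedy grouping rule, and all function names and threshold values (`edge_map`, `edge_similarity`, `cluster_by_edges`, `common_pattern`, `thresh`, `sim_thresh`, `keep`) are illustrative assumptions. It also assumes that all images are grayscale arrays of the same size.

```python
import numpy as np

def edge_map(img, thresh=0.2):
    """Binary edge map from a grayscale image.

    Assumption: a thresholded gradient magnitude stands in for a real
    edge detector; the threshold is relative to the strongest gradient.
    """
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    peak = mag.max()
    if peak == 0:
        return np.zeros(img.shape, dtype=bool)
    return mag / peak > thresh

def edge_similarity(a, b):
    """Jaccard overlap of two binary edge maps (assumed similarity measure)."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return np.logical_and(a, b).sum() / union

def cluster_by_edges(images, sim_thresh=0.5):
    """Greedy grouping: an image joins the first group whose representative
    edge map (the group's first image) is similar enough; otherwise it
    starts a new group. Returns a list of lists of image indices."""
    groups = []  # each entry: (representative edge map, member indices)
    for idx, img in enumerate(images):
        em = edge_map(img)
        for rep, members in groups:
            if edge_similarity(rep, em) >= sim_thresh:
                members.append(idx)
                break
        else:
            groups.append((em, [idx]))
    return [members for _, members in groups]

def common_pattern(images, members, keep=0.8):
    """Summarize a group: keep edge pixels that occur in at least a
    `keep` fraction of the group's images."""
    stack = np.stack([edge_map(images[i]) for i in members])
    return stack.mean(axis=0) >= keep
```

Under these assumptions, images that share a frame or banner but show different products overlap strongly in their edge maps and fall into the same group, while the product-specific edges are dropped when the group is summarized, because they do not recur across the group's images.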