Amazon Product Dataset Community Detection Metrics and Algorithms

Chaitali Choudhary, Inder Singh, Soly Mathew Biju, Manoj Kumar
DOI: 10.4018/978-1-6684-8696-2.ch009

Abstract

Community detection in social network analysis is crucial for understanding network structure and organization. It helps identify cohesive groups of nodes, allowing for targeted analysis and interventions. Girvan-Newman, Walktrap, and Louvain are popular algorithms used for community detection. Girvan-Newman focuses on betweenness centrality, Walktrap uses random walks, and Louvain optimizes modularity. Experimental results show that the label propagation algorithm (LPA) is efficient in extracting community structures. LPA has linear time complexity and does not require prior specification of the number of communities. However, it focuses on characterizing the number of communities rather than labeling them. K-clique performs well when the number of communities is known in advance. Louvain excels in modularity and community identification. Overall, community detection algorithms are essential for understanding network structures and functional units.

1 Introduction

Complex systems are composed of subsystems that, in turn, integrate other subsystems in a hierarchical framework. As early as 1962, Herbert A. Simon observed that hierarchical organisation is essential to the growth and development of complex systems. Many complex systems may be described as graphs or networks, in which nodes represent the basic building blocks and links represent the relationships between them. Subsystems in a network appear as subgraphs with dense internal links but sparse external links (Lancichinetti & Fortunato, 2009). These subgraphs are referred to as communities, and they are present in many networked systems. Communities explain the internal organisation of a network and imply the presence of certain connections between nodes that may not be immediately apparent through direct empirical testing. Communities may include collections of related websites, linked groups of people in social networks, biochemical pathways in metabolic networks, and so on. For these reasons, detecting communities has become a fundamental subject in network research. People use the internet constantly in their daily lives, and the many individual users who interact in intricate ways form a complex network.

Community structure is one of the most essential properties of complex networks (Zhang et al., 2019; Chen et al., 2019; Li et al., 2020; Khan et al., 2016; Zhao et al., 2019; Zhao et al., 2021; Zhu et al., 2020; Zhao et al., 2017). Community identification in complex networks has gained attention in several disciplines of study. Connections between nodes in the same community are typically numerous, while those between nodes in different communities are few. An edge in an unweighted network does not capture the strength of the connection between two nodes; it only indicates that a link exists. However, edges in real networks often differ in strength, as with the number of transactions between buyers and sellers in a commodities trading network or the number of citations between authors in a citation network. As a result, research on weighted networks has practical applications.

Conventional community detection methods consider only the relationships between nodes and their immediate neighbours and ignore the links between nodes and their second-order neighbours, which lowers detection accuracy. In a real social network, a person’s friends’ friends are more likely to be that person’s friends than other individuals are. Similarly, a node’s second-order neighbours may influence which community it belongs to. When similarity matrices are clustered directly, high-dimensional similarity matrices cannot represent the main features of the network topology. This study therefore offers a highly accurate, deep-learning-based approach to community identification in weighted networks. The method first obtains the high-dimensional network similarity matrix, which contains the similarity information of the nodes’ second-order neighbours, and then performs dimension reduction on it. Finally, it performs a cluster analysis on the resulting low-dimensional feature matrix to obtain a precise network community structure. The main contributions of this article are as follows:

  1. The data are preprocessed by comparing the similarities between nodes and the items purchased together with them. The processed matrix captures the similarity between nodes.

  2. Several community detection techniques are applied to a simple Amazon co-purchase data set, and fundamental evaluation parameters are computed.

  3. Many experiments were run on Amazon data sets of various sizes. According to the experimental results, the amortised approach proposed in this study may provide a more exact community structure than traditional methodologies.
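The preview does not give the exact similarity measure used in the preprocessing step, so the following is only a minimal sketch of one way a similarity matrix could fold in second-order neighbours. It assumes an undirected weighted graph and defines the similarity as S = W + alpha * W², with the diagonal set to zero. The function name `second_order_similarity`, the damping factor `alpha`, and the toy co-purchase counts are all illustrative assumptions, not the authors' method. The dimension-reduction and clustering stages are not shown.

```python
def second_order_similarity(n, weighted_edges, alpha=0.5):
    """Return an n x n matrix S = W + alpha * (W @ W) with a zero diagonal.

    W is the symmetric weighted adjacency matrix built from weighted_edges.
    alpha is a hypothetical damping factor for two-step connections.
    """
    # Build the symmetric weighted adjacency matrix W.
    w = [[0.0] * n for _ in range(n)]
    for u, v, weight in weighted_edges:
        w[u][v] = weight
        w[v][u] = weight

    # Combine direct links with paths of length two through any node k.
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-similarity is left at zero in this sketch
            two_step = sum(w[i][k] * w[k][j] for k in range(n))
            s[i][j] = w[i][j] + alpha * two_step
    return s


# Toy co-purchase graph (assumed data): products 0 and 1 were bought
# together 3 times, products 1 and 2 twice; 0 and 2 never directly.
edges = [(0, 1, 3.0), (1, 2, 2.0)]
S = second_order_similarity(3, edges)
```

In this toy graph, products 0 and 2 share no direct edge, yet `S[0][2]` is non-zero because both are linked to product 1. This is the kind of second-order information the introduction argues conventional methods ignore.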

Girvan-Newman, Walktrap, and Louvain are three popular community detection algorithms used in social network analysis.
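Modularity, the quantity that Louvain optimizes, can be computed directly from an edge list and a community assignment. The sketch below uses the standard Newman-Girvan definition, Q = Σ_c [L_c / m − (d_c / 2m)²], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. It assumes an unweighted, undirected graph without self-loops. The function name `modularity` and the toy graph are illustrative and are not taken from the chapter.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition of an undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping each node to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in community c
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Toy graph (assumed data): two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(toy_edges, two_groups)  # 5/14, about 0.357
q_merged = modularity(toy_edges, one_group)  # 0.0
```

Splitting the toy graph into its two triangles scores higher than placing every node in one community, which is the signal a modularity-optimizing method such as Louvain follows when it merges or separates groups.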
