Article Preview
TopIntroduction
Network Science has been an actively researched area for analysis and visualization of characteristics of large complex real-world networks. Several metrics (a.k.a. measures) have been used for network analysis. Of these, centrality of the nodes constituting the network (a measure of the importance of the nodes with respect to the rest of the nodes) is one of the key metrics. The centrality-based ranking of nodes in real-world networks is purely a link-statistic based approach and does not depend on any offline information such as reputation of the particular entity representing the node, etc (Newman, 2010). The widely-used centrality metrics are: Degree, Betweeness, Eigenvector and Closeness, all of which are studied in this paper.
We model large real-world network as a graph of vertices and edges (vertices represent the nodes - entities like players, countries, airports, books, etc constituting the real-world network, and edges represent the interactions between these entities in the form of competitions played against, airline connections, books related to a particular domain, etc). The currently available application software (e.g., (Gephi, 2014 & Pajek, 2014) as well as literature for network analysis and visualization focus on several graph theoretic properties like diameter, clustering coefficient, path length, components, communities, etc. To the best of our knowledge, we could not find any work on determining minimum connected dominating sets (CDS) of these real-world networks. A CDS (Cormen et al., 2009) of a graph is a subset of the vertices (and the edges connecting these vertices) such that every vertex in the graph is either in the CDS or is connected to a node in the CDS (in the latter case, the node is said to be covered by a node in the CDS).
A CDS forms the connected backbone of a network graph and could be used as the underlying communication topology for broadcasting throughout the network with the minimum number of forward transmissions (Meghanathan, 2012a). Only the nodes constituting the CDS need to forward a message (to the other nodes constituting the CDS) and the rest of the nodes in the network could be mere receivers. Accordingly, the primary objective of CDS-related research has been to find CDS of smaller size. Unfortunately, the problem of finding a minimum connected dominating set (CDS with the fewest number of constituent nodes) in both weighted and unit-weight graphs (undirected or directed) is a NP-hard problem (Cormen et al., 2009). Several heuristics (Gu et al, 2012; Meghanathan, 2012a, etc) have been proposed to find approximations to minimum-sized CDS. A common theme among all of these heuristics is to use a metric (like node degree) as the determining factor to include a node as part of the CDS. CDS construction starts with the node having the maximum value for the underlying metric and the nodes connected to this starting node are considered candidate nodes for inclusion to the CDS (as long as these nodes bring in at least one node that is yet to be covered). Among such candidate nodes, the node that covers the largest number of uncovered nodes is included to the CDS; this procedure is continued until all the nodes in the network graph are covered by at least one node in the CDS.
In this paper, we study the use of centrality metrics to determine CDS on large complex network graphs. Our objective is to identify one or more centrality metrics that could be used to determine CDS of smaller size in real-world networks. In this direction, we implement algorithms to determine some of the commonly used centrality metrics (Degree centrality, Eigenvector centrality, Betweeness centrality and Closeness centrality) and run them on six real-world network graphs of different sizes (ranging from 34 nodes to 332 nodes). Results obtained indicate the Betweeness centrality to be the most suited metric to determine smaller sized CDS (with respect to both the number of nodes constituting the CDS as well as the number of edges between these CDS nodes) for large complex network graphs. The Betweeness centrality scheme ranks a node based on the number of shortest paths that the node is part of between any source-destination pair and the number of such source-destination pairs on whose shortest paths the node is located.