Article Preview
TopIntroduction
A wireless sensor network is a distributed system of smart sensor nodes that collect data about the ambient environment and propagate it to one or more control centers called the sinks or base stations, where the end-user can access the data. A typical sensor node has limited computing capability and memory capacity, and operates with very limited battery power. The transmission range of a sensor node is the distance until which the signals emanating from the node can propagate and be received with appreciable signal strength. The wireless sensor network has limited bandwidth and is shared among the nodes that are within a common transmission range. The sink is normally fixed and located far away from the sensor nodes. Direct communication between the sensor nodes and the sink is expensive in terms of energy consumption and bandwidth usage. This motivates the need for data gathering algorithms that can be run at the sensor nodes to combine data into a small set of meaningful information, which is a representative of the network condition and can be transmitted to the sink, leading to significant energy and bandwidth savings. Throughout this paper, the terms ‘data gathering’, ‘data fusion’ and ‘data aggregation’ are used interchangeably. They mean the same.
In this paper, we consider a wireless sensor network where data is periodically reported from the sensor nodes to the sink. Data collection and transmission proceeds in rounds, where each node has a packet of information in each round of communication. The data packet from the sensor nodes are gathered and aggregated with the packets of peer sensor nodes and only one data packet is sent per round from the sensor network to the sink. If each sensor node directly transmits the data to the sink that is typically located far away from the network field, the total energy consumed per round will be very high. Data aggregation helps to reduce the uncorrelated noise in several signals and produce a more accurate signal that is representative of the network condition and that can be transmitted to the sink.
Various data aggregation algorithms have been proposed in the literature. Well-known among these are the Low-Energy Adaptive Clustering Hierarchy, LEACH, (Heinelman et al., 2000) and the Power-Efficient Gathering in Sensor Information Systems, PEGASIS, (Lindsey et al., 2002) algorithms. Both algorithms operate through discrete rounds of data collection. In LEACH, each round has the following two phases: a set-up phase, when clusters of sensor nodes are formed with the assignment of a cluster head for each cluster and a steady-state phase, when the data collected from the sensor nodes in each cluster is transmitted to the sink through direct transmission from the cluster heads. In PEGASIS, each round involves the formation of a chain of the sensor nodes in the field, starting from the sensor node that is farthest from the sink. The chain of sensor nodes is formed using a greedy heuristic based on the distance between the sensor nodes. The original PEGASIS algorithm resulted in huge delay as data moves across the complete chain of sensor nodes before getting transmitted to the sink. For Code Division Multiple Access, CDMA, (Viterbi, 1995) systems, PEGASIS has been later improved using a chain-based binary scheme to minimize the delay incurred and to reduce the energy*delay metric (Lindsey et al., 2001). In the chain-based binary approach, a round comprises of log N levels, where N is the number of nodes in the network. For data gathering in each round, each node transmits to a close neighbor in a given level of the hierarchy. Nodes that receive data at a given level are the only nodes that rise to the next level. At the top level, there will be only one node that will remain as the leader and it will transmit the aggregated message to the sink.