Introduction
Recent advances in information and communications technology (ICT) have led to significant progress in the design of devices that combine wireless communication, processing and storage capabilities with diverse sensing and actuation functionalities in a single unit that is compact, economical, autonomous and destined to become ubiquitous. This revolution appears in the form of dense, distributed, large-scale, self-organized wireless sensor networks (WSN) that carry out various tasks of great societal interest, such as environmental monitoring and surveillance, or the monitoring and management of large-scale industrial infrastructures.
The HYDROBIONETS project is a characteristic example of such an infrastructure for water resource management. Specifically, it aims to develop a real-time microbiological wireless networked control system for water desalination and treatment plants, providing the fundamental design principles of a wireless BioMEM network (WBN) with distributed multi-sensing and multi-actuation capabilities.
The HYDROBIONETS infrastructure focuses on monitoring the complete water cycle in large-scale water treatment and desalination plants via the deployment of a WBN. Distinct sensors of the WBN measure critical microbiological and electrochemical parameters in the water at different stages of the treatment process. This distributed, autonomous sensing is further exploited to reason intelligently over the data by supporting advanced operations such as querying, high-level analysis, and alerting.
At the core of the HYDROBIONETS system, which carries out those operations, is an efficient data management and processing module. This module comprises distinct collaborating computational nodes, which monitor and control several physical entities and dynamic phenomena. The sensor data and metadata, which the sensors produce as streams, can be either processed in real time or stored for further exploitation. The data can be raw (as produced by the sensors) or aggregated (computed at the node level). To accommodate the requirements of our industrial paradigm, we focus on the design and development of a set of tools for high-level analysis of the collected data. These tools operate on the available data, use a set of policies that govern the “normal” operation of the sensors and the data values they report, and employ an appropriate statistical analysis in a coherent manner in order to: (i) account for the underlying uncertainty of the recorded data, (ii) detect extreme events (e.g., the presence of highly contaminating substances) and provide specific alerts depending on the severity of the event, and (iii) guarantee the validity of the detected extreme events by identifying and observing pairs of distinct sensors that are highly correlated.
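To make goals (ii) and (iii) more concrete, the following minimal Python sketch grades a reading by severity and lists highly correlated sensor pairs. It is an illustration under stated assumptions, not the HYDROBIONETS implementation: the z-score rule, the thresholds `warn_z`, `critical_z` and `min_r`, the use of the Pearson coefficient, and the sensor names are all hypothetical choices made here for the example.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def severity(value, baseline, warn_z=3.0, critical_z=5.0):
    """Grade a reading by its z-score against a window of 'normal' readings.

    The thresholds are illustrative assumptions, not values from the article.
    """
    z = abs(value - mean(baseline)) / stdev(baseline)
    if z >= critical_z:
        return "critical"
    if z >= warn_z:
        return "warning"
    return "normal"


def correlated_pairs(streams, min_r=0.9):
    """Return the sensor pairs whose streams have |r| >= min_r."""
    names = sorted(streams)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(streams[a], streams[b])) >= min_r:
                pairs.append((a, b))
    return pairs


# Hypothetical readings; sensor names and values are made up for illustration.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9]
streams = {
    "conductivity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "chloride": [2.0, 4.0, 6.0, 8.0, 10.1],
    "temperature": [5.0, 1.0, 4.0, 2.0, 3.0],
}

levels = [severity(v, baseline) for v in (10.1, 10.6, 11.0)]
pairs = correlated_pairs(streams)
```

In this sketch, an extreme reading on one sensor could be cross-checked against its correlated partner before an alert is raised, which is one possible way to read the validation step in (iii).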
Concerning the first issue, we are motivated by the fact that typical WSN nodes deployed for monitoring industrial infrastructures do not handle any quality aspect of the data from physical devices. Instead, they interface with a high-level representation and reconstruction of the sensed physical world. As a result, data processing must additionally cope with the inherent data uncertainty: stream data may be incomplete, imprecise or even misleading, which hinders accurate and reliable decision making.
Uncertainty-aware data management (Aggarwal, 2009) presents numerous challenges in terms of collecting, modeling, representing, querying, indexing and mining the data. Uncertainty has recently been recognized as an additional source of valuable information for data analysis, one that should be preserved. In contrast to existing data management systems, our approach therefore incorporates a dedicated submodule to handle the inherent data uncertainty. More specifically, a spreadsheet-based approach is employed to identify, quantify, and combine the uncertainty arising from the most dominant potential sources.
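The text above does not specify how the individual uncertainties are combined. As a rough illustration only, the sketch below follows one common spreadsheet-style scheme: each input is shifted by its standard uncertainty, the resulting change in the output is recorded, and the changes are combined in quadrature. This assumes independent inputs. The measurement model `corrected_reading`, its parameters, and all numeric values are hypothetical.

```python
import math


def combined_uncertainty(model, values, uncertainties):
    """Combine input uncertainties by perturbing one input at a time.

    Each input is shifted by its standard uncertainty; the change in the
    model output is that input's contribution. Contributions are combined
    as a root sum of squares, which assumes the inputs are independent.
    """
    y0 = model(**values)
    contributions = {}
    for name, u in uncertainties.items():
        shifted = dict(values)
        shifted[name] = values[name] + u
        contributions[name] = model(**shifted) - y0
    u_c = math.sqrt(sum(c * c for c in contributions.values()))
    return y0, u_c, contributions


def corrected_reading(raw, gain, offset):
    """Hypothetical measurement model: a raw reading with gain and offset."""
    return gain * raw + offset


# Hypothetical values and standard uncertainties, for illustration only.
values = {"raw": 100.0, "gain": 1.0, "offset": 0.0}
uncertainties = {"raw": 2.0, "gain": 0.01, "offset": 2.0}

y, u_c, contributions = combined_uncertainty(
    corrected_reading, values, uncertainties
)
# Sort sources by the size of their contribution to see which dominate.
dominant = sorted(contributions, key=lambda k: abs(contributions[k]), reverse=True)
```

Laying the contributions out one column per source, as a spreadsheet would, makes it easy to see which sources dominate the combined value and which can be neglected.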