Automatic Audio and Video Event Recognition in an Intelligent Resource Management System

Daniel Stein, Barbara Krausz, Jobst Löffler, Robin Marterer, Rolf Bardeli, Michael Stadtschnitzer, Jochen Schwenninger
DOI: 10.4018/ijiscram.2013100101

Abstract

Event recognition systems have high potential to support crisis management and emergency response. In large-scale scenarios, however, the sheer number of audio and video channels means that the material can only be processed adequately by automatic means. In this article, the authors focus on automatic audio and video event recognition: detecting abnormalities both in train noise and in surveillance videos, and conducting automatic speech recognition on firefighter communication. All components are integrated into an overall intelligent resource management system. The authors elaborate on the challenges posed by real-life data and on the solutions they applied. The overall system, based on an Event-Driven Service-Oriented Architecture, has been implemented and partly integrated into the end users' infrastructures. It has been running continuously for more than two years, collecting data for research purposes.

Introduction

Event recognition systems have high potential to support crisis management and emergency response. For example, mass events such as sports events, concerts and festivals are very popular and attract ever-growing numbers of visitors, leading to high pedestrian densities in public spaces and train stations. Despite all precautions, critical situations such as congestion and extremely high pedestrian densities occur rather frequently and can lead to deadly stampedes and crowd disasters.

In such situations, the quality of decision making depends heavily on the situational awareness of the emergency management team and on the availability of a common operational picture. The officer-in-charge relies on various information sources and on smoothly running communication chains. Information channels can be part of human-human interaction between personnel directly involved in the operation, or they can be multimodal input/output channels of human-machine interaction.

In this article, we address the question of whether automatic event processing and real-time event recognition, implemented in an Intelligent Resource Management (IRM) system, can leverage audio and video channels for improved decision making. Our assumption is that this approach can help end users in their domains to make better decisions and to cope with information overload. Following the action research perspective (Burstein & Gregor, 1999), this article mainly investigates the internal validity of the methods, and their respective objectivity and reliability, using data from real-world scenarios. The significance and external validation of the overall system from an end user perspective, using expert interviews with firefighter personnel, is not the main focus of this article but has been thoroughly investigated by Pottebaum (2012). Consequently, we focus on:

  • Keyword spotting in public safety network communications, by means of automatic speech recognition (ASR);

  • Abnormal event detection in train noise, which, coupled with Global Positioning System (GPS) information, can help identify infrastructure weaknesses early;

  • Abnormal event detection in video surveillance, for example to detect early warning signs of a stampede.

This article is organized as follows: First, we describe the architecture of the IRM system. The next section discusses the application of robust ASR in public safety networks. In particular, it describes the generation of an appropriate speech corpus and the improvement of recognition rates by optimizing acoustic and language models. Our experiments show a marked improvement in ASR accuracy for the Terrestrial Trunked Radio (TETRA) audio channel used in firefighting operations. The following section then details our approach to abnormal event detection in audio streams and discusses the results in relation to the application area of public transport management. Finally, we describe methodologies and system approaches to abnormal event detection for the purpose of video surveillance. The special focus here is on detecting abnormal behavior in large crowds and preventing panic situations during large public events using an event-driven system approach.

Intelligent Resource Management System Architecture

The IRM system helps actors in crisis management and emergency response to coordinate resources (e.g. persons, vehicles, equipment) in order to improve the current situation. The IRM system is a framework for IT components that (i) gathers data from several sources, (ii) applies complex calculations to incoming data, (iii) stores data semantically and provides retrieval functionality, (iv) routes data between all other components, and (v) provides an interface to the user. The IRM system explicitly relies on heterogeneous, multimodal input data and combines that data in order to identify more complex information.
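
The five roles listed above can be illustrated with a small publish/subscribe sketch in Python. This is not the authors' implementation: the class names, topic names (`radio.transcript`, `alert.keyword`), watch words and sample messages are all hypothetical, and the "complex calculation" is reduced to a plain word match on an already transcribed radio message. It is only meant to show how an event-driven design lets gathering, processing, storage, routing and presentation stay decoupled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    topic: str     # hypothetical topic name, e.g. "radio.transcript"
    payload: dict  # free-form event data


class EventBus:
    """Routes events between components (role iv)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Event], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[Event], None]) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, event: Event) -> None:
        # Deliver the event to every handler registered for its topic.
        for handler in self._subscribers.get(event.topic, []):
            handler(event)


# Hypothetical watch list; a real system would use domain vocabulary.
WATCH_WORDS = {"smoke", "backup", "mayday"}


def run_demo():
    bus = EventBus()
    store: List[Event] = []  # stands in for the semantic store (role iii)
    shown: List[str] = []    # stands in for the user interface (role v)

    def detect_keywords(event: Event) -> None:
        # Processing step (role ii): match transcript words against the list.
        words = set(event.payload["text"].lower().split())
        hits = sorted(words & WATCH_WORDS)
        if hits:
            bus.publish(Event("alert.keyword",
                              {"source": event.payload["source"],
                               "keywords": hits}))

    def show_alert(event: Event) -> None:
        shown.append("ALERT from {}: {}".format(
            event.payload["source"], ", ".join(event.payload["keywords"])))

    bus.subscribe("radio.transcript", detect_keywords)
    bus.subscribe("alert.keyword", store.append)
    bus.subscribe("alert.keyword", show_alert)

    # Data gathering (role i): two made-up transcribed radio messages.
    bus.publish(Event("radio.transcript",
                      {"source": "unit-1",
                       "text": "Smoke on second floor request backup"}))
    bus.publish(Event("radio.transcript",
                      {"source": "unit-2",
                       "text": "All clear at entrance"}))
    return store, shown


if __name__ == "__main__":
    for line in run_demo()[1]:
        print(line)
```

Because components only share topic names, a further detector (for instance for train noise or surveillance video) could be attached by subscribing to its own input topic and publishing alerts, without changing the storage or interface handlers.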
