Introduction
From user groups and online forums to Facebook, Twitter, Instagram, and YouTube, social media platforms have become ubiquitous. Their use is particularly prevalent during emergencies. For instance, the Federal Emergency Management Agency (FEMA) wrote in its 2013 National Preparedness Report (Maron, 2013) that during and immediately following Hurricane Sandy in 2012, “users sent more than 20 million Sandy-related Twitter posts, or tweets, despite the loss of cell phone service during the peak of the storm.” Such huge amounts of user-generated data contributed by disaster-affected communities have become an important source of big crisis data for disaster response (Castillo, 2016; Reuter & Kaufhold, 2018), and at the same time have been used by the public at large to make sense of unfolding events (Stieglitz, Bunker, Mirbabaie, & Ehnis, 2018). Many research and practical studies have demonstrated the value of social media data in disseminating warning and response information, enhancing situational awareness, facilitating the allocation of resources, informing disaster risk reduction strategies and risk assessments (Watson, Finn, & Wadhwa, 2017; Reuter, Hughes, & Kaufhold, 2018; National Research Council, 2013), and fostering community resilience (Zhang, Drake, Li, Zobel, & Cowell, 2015). Despite these benefits, the sheer volume of the data still precludes large emergency organizations from using it routinely (Meier, 2013).
Manually sifting through voluminous streaming data to filter useful information in real time is infeasible. Machine learning techniques show promising results in automating the identification of useful, relevant, and trustworthy information in big crisis data (Qadir et al., 2016), despite many practical challenges (Mendoza, Poblete, & Castillo, 2010). Many works have successfully used supervised learning algorithms to automatically classify tweets (Caragea, Squicciarini, Stehle, Neppalli, & Tapia, 2014; Imran, Elbassuoni, Castillo, Diaz, & Meier, 2013). Supervised algorithms require labeled training data to learn classifiers, which can then be used to label new data of the same type (the test data). The labels generated for the test data are usually accurate when the training and test data are drawn from the same distribution.
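To make the supervised setup concrete, the following minimal sketch (not drawn from any of the cited works) trains a multinomial Naive Bayes text classifier on a handful of hypothetical labeled tweets and then labels a new, unseen tweet. The tweet texts and the "relevant"/"irrelevant" labels are invented for illustration:

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Train a multinomial Naive Bayes classifier.
    docs: list of token lists; labels: parallel list of class labels."""
    classes = set(labels)
    priors = {c: labels.count(c) / len(labels) for c in classes}
    counts = {c: Counter() for c in classes}
    for toks, y in zip(docs, labels):
        counts[y].update(toks)
    vocab = {t for toks in docs for t in toks}
    return priors, counts, vocab

def predict_nb(model, toks):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    priors, counts, vocab = model
    best, best_lp = None, float("-inf")
    for c, prior in priors.items():
        total = sum(counts[c].values())
        lp = math.log(prior)
        for t in toks:
            if t in vocab:  # ignore out-of-vocabulary tokens
                lp += math.log((counts[c][t] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best

# Hypothetical labeled tweets from a "source" disaster.
train = [("flood water rising downtown".split(), "relevant"),
         ("bridge closed due to flood".split(), "relevant"),
         ("great coffee this morning".split(), "irrelevant"),
         ("watching a movie tonight".split(), "irrelevant")]
model = train_nb([d for d, _ in train], [y for _, y in train])
print(predict_nb(model, "flood water near the bridge".split()))  # prints "relevant"
```

Because the test tweet shares vocabulary with the training tweets (i.e., it comes from the same distribution), the classifier labels it correctly; the next paragraph describes what happens when that assumption breaks.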
The requirements above give rise to two main challenges when machine learning algorithms are used to classify user-generated tweets about emerging disasters such as floods, hurricanes, and terrorist attacks. First, labeled data is not readily available for an emerging “target” disaster for which a classifier is needed to help disaster response teams identify relevant tweets, and ultimately information useful for situational awareness. Labeling data is an expensive and time-consuming process, and thus does not provide a real-time solution for disaster response. Labeled data from a prior “source” disaster can potentially be used to learn a supervised classifier for the target disaster (Starbird, Palen, Hughes, & Vieweg, 2010). However, a second challenge is posed by the fact that data from the source disaster and data from the target disaster may not share the same distribution (or characteristics), so a classifier learned on the source may not perform well on the target.
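One simple, illustrative symptom of this source/target mismatch (a heuristic sketch, not a method from the literature cited here) is low overlap between the token vocabularies of the two disasters' tweets: a classifier trained on source-disaster features will then encounter mostly unseen features on the target. The tweet texts below are invented:

```python
def vocab_overlap(source_docs, target_docs):
    """Jaccard overlap between the token vocabularies of two tweet sets.
    A low value hints that a classifier trained on the source will see
    mostly out-of-vocabulary tokens on the target."""
    src = {t for d in source_docs for t in d.lower().split()}
    tgt = {t for d in target_docs for t in d.lower().split()}
    return len(src & tgt) / len(src | tgt)

# Hypothetical tweets: source = a flood, target = an earthquake.
source = ["flood water rising fast", "evacuate before the flood"]
target = ["earthquake shook the city", "aftershock felt downtown"]
print(round(vocab_overlap(source, target), 2))  # prints 0.08
```

In practice the distribution shift goes well beyond surface vocabulary (topics, named entities, hashtag conventions all differ), but mismatched features are the most immediate reason a source-trained classifier degrades on the target.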