Shaping the Future of Healthcare With BERT in Clinical Text Analytics


Copyright: © 2024 | Pages: 20
DOI: 10.4018/979-8-3693-3629-8.ch012

Abstract

Over the last two decades, electronic health records (EHRs) have evolved into a crucial repository for patient health data, encompassing both structured and unstructured information. The objective of EHRs is to enhance patient care and also to serve as a tool for reducing costs, managing population health, and supporting clinical research. Natural language processing (NLP) has emerged as a valuable tool for analyzing narrative EHR data, particularly in named entity recognition (NER) tasks. However, traditional NLP methodologies encounter challenges in analyzing biomedical text due to variations in word distributions. Recent advancements in NLP, specifically bidirectional encoder representations from transformers (BERT), offer promising solutions. BERT uses a masked language modeling objective and a bidirectional transformer encoder architecture to learn deep contextual representations of words. This work provides an overview of the BERT algorithm and its architecture, and details its variants, such as BioBERT and ClinicalBERT, for various clinical text classification applications.

1. Introduction

A. Background

Electronic health records (EHRs) have become increasingly popular over the past two decades for systematically storing patients’ records (Adler-Milstein et al., 2017; Meystre et al., 2017). EHR data can be of two types: structured or unstructured. Vital signs and demographic records are usually stored as structured EHR data, while textual reports from patients, caregivers, and healthcare institutions constitute unstructured EHR data (Casto & Layman, 2013). The primary objective of EHRs is to support quality patient care, but the records also serve the purposes of population health management, cost reduction, and clinical research (Köpcke & Prokosch, 2014). Clinical research data, though limited by sample size and scope, can be augmented with secondary EHR data to facilitate increased patient recruitment and provide access to a broader range of clinical information for research purposes (Sarwar et al., 2022; Shah & Khan, 2020).

Natural language processing (NLP) has proven effective in the analysis of EHR data (Friedman & Hripcsak, 1999; Friedman et al., 2004; Ohno-Machado, 2011), particularly for named entity recognition (NER), in which clinical diagnoses, medications, and other entities of interest are identified in text (Nadkarni et al., 2011).
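To make the NER task concrete, the sketch below shows a hypothetical clinical sentence annotated in the widely used BIO (begin/inside/outside) scheme; the sentence and the entity labels (PROBLEM, DRUG, DOSAGE) are illustrative examples, not drawn from any specific annotation guideline.

```python
# A hypothetical clinical sentence annotated in the BIO scheme commonly used
# for NER: B- marks the beginning of an entity, I- its continuation,
# and O a token outside any entity. The labels here are illustrative.
tokens = ["Patient", "denies", "chest", "pain", "but", "reports",
          "taking", "metformin", "500", "mg", "daily", "."]
labels = ["O", "O", "B-PROBLEM", "I-PROBLEM", "O", "O",
          "O", "B-DRUG", "B-DOSAGE", "I-DOSAGE", "O", "O"]

# Group BIO-tagged tokens back into (entity_text, entity_type) spans.
def bio_to_spans(tokens, labels):
    spans, current, current_type = [], [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [tok], lab[2:]
        elif lab.startswith("I-") and current:
            current.append(tok)
        else:
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        spans.append((" ".join(current), current_type))
    return spans

print(bio_to_spans(tokens, labels))
# [('chest pain', 'PROBLEM'), ('metformin', 'DRUG'), ('500 mg', 'DOSAGE')]
```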

Dictionary-based and rule-based methods are the traditional approaches to clinical named entity recognition. The performance of a dictionary-based method depends on the size of the dictionary and the vocabulary it defines (Akhondi et al., 2015). In a rule-based approach, rules are defined based on textual patterns. The limitation of both approaches is that they suffer from out-of-vocabulary words, which occur frequently in the medical domain.
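A minimal sketch of the dictionary-based approach is shown below (the drug names in the lookup table are hypothetical examples). Because the matcher can only tag surface forms it already knows, any term absent from the dictionary is silently missed, which is exactly the out-of-vocabulary weakness described above.

```python
# A minimal dictionary-based entity matcher: a token is tagged only if its
# exact surface form appears in the lookup table, so unseen spellings,
# abbreviations, or newly approved drugs (out-of-vocabulary terms) are missed.
DRUG_DICT = {"metformin", "lisinopril", "warfarin"}  # hypothetical vocabulary

def dictionary_ner(tokens):
    return [(tok, "DRUG") for tok in tokens if tok.lower() in DRUG_DICT]

print(dictionary_ner("Started metformin and apixaban today".split()))
# [('metformin', 'DRUG')] -- 'apixaban' is missed because it is not in the dictionary
```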

Deep learning approaches have helped to overcome the limitations of these traditional approaches, as they can discover hidden features and textual patterns automatically and effectively. Recent deep learning models applied to NLP prominently include the conditional random field (CRF) and long short-term memory (LSTM) networks. These techniques have demonstrated enhanced accuracy in biomedical named entity recognition (NER), relation extraction (RE), and question answering (QA) in the field of biomedical text mining.

The authors in (Zhu et al., 2018) applied a CNN model for biomedical named entity recognition, using the CNN for feature extraction. However, the model did not take into consideration the contextual relationships among words in a sentence. In (Korvigo et al., 2018), a combination of CNN and LSTM was proposed to categorize chemical entities. In (Luo et al., 2017; Ren et al., 2018; Sahu & Anand, 2016; Tong et al., 2018), the authors made use of LSTMs for biomedical named entity recognition. The LSTM can capture information across a sequence of words, but it fails to capture full contextual meaning because it processes text in only one direction. Bidirectional LSTM models were subsequently proposed (Lyu et al., 2017; Yang et al., 2018) that read the input in both directions and extract suitable features to improve contextual information recognition.
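A minimal PyTorch sketch of such a bidirectional LSTM tagger is given below; the vocabulary size, embedding dimension, and tag count are placeholders, not values from the cited papers. The `bidirectional=True` flag is what lets the model read each sentence both left-to-right and right-to-left.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """A minimal bidirectional LSTM sequence tagger for NER.

    Each token is embedded, passed through a BiLSTM that reads the
    sentence in both directions, and projected onto per-token tag
    scores. All sizes below are placeholders.
    """
    def __init__(self, vocab_size=5000, embed_dim=100,
                 hidden_dim=128, num_tags=9):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        # 2 * hidden_dim: forward and backward hidden states are concatenated.
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):            # (batch, seq_len)
        x = self.embedding(token_ids)        # (batch, seq_len, embed_dim)
        x, _ = self.lstm(x)                  # (batch, seq_len, 2*hidden_dim)
        return self.classifier(x)            # (batch, seq_len, num_tags)

model = BiLSTMTagger()
dummy = torch.randint(0, 5000, (1, 12))      # one sentence of 12 token ids
print(model(dummy).shape)                    # torch.Size([1, 12, 9])
```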

CNNs demonstrated better speed and quality for converting text into character embeddings when conditional random fields (CRFs) were used as the output model. It was demonstrated that the CNN model captured intra-word correlations better, whereas the LSTM was better at defining the relationships among words (Zhai et al., 2018). Two channels and sentence-level embeddings were integrated into the LSTM-CRF model to form SC-LSTM-CRF (Li & Jiang, 2017). This removed the model’s limitation of losing essential hidden features while allowing it to utilize contextual information from sentence-level embeddings for improved performance.

However, there are challenges in applying these methods due to the disparity in word distributions between general and biomedical datasets. The key limitation of traditional NLP algorithms is the lack of contextual meaning when linguistic constructs are spread across sentences or involve words distant from each other. This task of mapping contextual meaning onto linguistic constructs is complex and crucial in EHR data analysis (Groves et al., 2016; Hripcsak & Albers, 2012).

BERT overcomes the limitation discussed above by introducing bidirectionality. Through its bidirectional transformer encodings, BERT establishes context within a sentence and can resolve the meaning of unclear or vague language in text.
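The short sketch below illustrates this masked-language-model behaviour, assuming the Hugging Face `transformers` library and the publicly released `emilyalsentzer/Bio_ClinicalBERT` checkpoint (any BERT-family model can be substituted). Both the left context ("diabetic") and the right context ("blood glucose") inform the prediction for the masked token, which a unidirectional model could not exploit.

```python
# A sketch of BERT's masked-language-model behaviour, assuming the Hugging
# Face `transformers` library and the publicly released
# emilyalsentzer/Bio_ClinicalBERT checkpoint (swap in any BERT variant).
from transformers import pipeline

fill = pipeline("fill-mask", model="emilyalsentzer/Bio_ClinicalBERT")

# Context on BOTH sides of [MASK] shapes the prediction.
for pred in fill("The diabetic patient was prescribed [MASK] to control blood glucose."):
    print(f"{pred['token_str']:>12s}  {pred['score']:.3f}")
```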
