Shaping the Future of Healthcare With BERT in Clinical Text Analytics


Copyright: © 2024 | Pages: 20
DOI: 10.4018/979-8-3693-3629-8.ch012

Abstract

Over the last two decades, electronic health records (EHRs) have evolved into a crucial repository for patient health data, encompassing both structured and unstructured information. The objective of EHRs is to enhance patient care and also to serve as a tool for reducing costs, managing population health, and supporting clinical research. Natural language processing (NLP) has emerged as a valuable tool for analyzing narrative EHR data, particularly in named entity recognition (NER) tasks. However, traditional NLP methodologies encounter challenges in analyzing biomedical text due to variations in word distributions. Recent advancements in NLP, specifically bidirectional encoder representations from transformers (BERT), offer promising solutions. BERT uses a masked language modeling objective and a bidirectional transformer encoder architecture to learn deep contextual representations of words. This work provides an overview of the BERT algorithm and its architecture, and details its variants, such as BioBERT and ClinicalBERT, for various clinical text classification applications.

1. Introduction

A. Background

Electronic health records (EHRs) have become increasingly popular over the past two decades for systematically storing patients’ records (Adler-Milstein et al., 2017; Meystre et al., 2017). EHR data can be of two types: structured or unstructured. Vital signs and demographic records are usually stored as structured EHR data, while textual reports from patients, caregivers, and healthcare institutions constitute unstructured EHR data (Casto & Layman, 2013). The primary objective of EHRs is to support quality patient care, but the records also serve the purposes of population health management, cost reduction, and clinical research (Köpcke & Prokosch, 2014). Clinical research data, though limited by sample size and scope, can be augmented with secondary EHR data to facilitate increased patient recruitment and provide access to a broader range of clinical information for research purposes (Sarwar et al., 2022; Shah & Khan, 2020).

Natural language processing (NLP) has proven effective in the analysis of EHR data (Friedman & Hripcsak, 1999; Friedman et al., 2004; Ohno-Machado, 2011), particularly for named entity recognition (NER), in which clinical diagnoses, medications, and other entities of interest are identified in text (Nadkarni et al., 2011).
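To make the NER task concrete, the sketch below shows a hypothetical clinical sentence annotated in the widely used BIO (begin/inside/outside) scheme; the sentence and the entity labels (PROBLEM, DRUG, DOSAGE) are illustrative examples, not drawn from any specific annotation guideline.

```python
# A hypothetical clinical sentence annotated in the BIO scheme commonly used
# for NER: B- marks the beginning of an entity, I- its continuation,
# and O a token outside any entity. The labels here are illustrative.
tokens = ["Patient", "denies", "chest", "pain", "but", "reports",
          "taking", "metformin", "500", "mg", "daily", "."]
labels = ["O", "O", "B-PROBLEM", "I-PROBLEM", "O", "O",
          "O", "B-DRUG", "B-DOSAGE", "I-DOSAGE", "O", "O"]

# Group BIO-tagged tokens back into (entity_text, entity_type) spans.
def bio_to_spans(tokens, labels):
    spans, current, current_type = [], [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [tok], lab[2:]
        elif lab.startswith("I-") and current:
            current.append(tok)
        else:
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        spans.append((" ".join(current), current_type))
    return spans

print(bio_to_spans(tokens, labels))
# [('chest pain', 'PROBLEM'), ('metformin', 'DRUG'), ('500 mg', 'DOSAGE')]
```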

Dictionary-based and rule-based methods are the traditional approaches to clinical named entity recognition. The performance of a dictionary-based method depends on the size of the dictionary and the vocabulary it defines (Akhondi et al., 2015). In a rule-based approach, rules are defined based on textual patterns. The limitation of both approaches is that they suffer from out-of-vocabulary words, which occur frequently in the medical domain.
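A minimal sketch of the dictionary-based approach is shown below (the drug names in the lookup table are hypothetical examples). Because the matcher can only tag surface forms it already knows, any term absent from the dictionary is silently missed, which is exactly the out-of-vocabulary weakness described above.

```python
# A minimal dictionary-based entity matcher: a token is tagged only if its
# exact surface form appears in the lookup table, so unseen spellings,
# abbreviations, or newly approved drugs (out-of-vocabulary terms) are missed.
DRUG_DICT = {"metformin", "lisinopril", "warfarin"}  # hypothetical vocabulary

def dictionary_ner(tokens):
    return [(tok, "DRUG") for tok in tokens if tok.lower() in DRUG_DICT]

print(dictionary_ner("Started metformin and apixaban today".split()))
# [('metformin', 'DRUG')] -- 'apixaban' is missed because it is not in the dictionary
```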

Deep learning approaches have helped to overcome the limitations of these traditional approaches, as they can discover hidden features and textual patterns automatically and effectively. Recent deep learning models applied to NLP prominently include the conditional random field (CRF) and long short-term memory (LSTM) networks. These techniques have demonstrated enhanced accuracy in biomedical named entity recognition (NER), relation extraction (RE), and question answering (QA) in the field of biomedical text mining.

The authors in (Zhu et al., 2018) applied a CNN model for biomedical named entity recognition, using the CNN for feature extraction. However, the model did not take into consideration the contextual relationships among words in a sentence. In (Korvigo et al., 2018), a combination of CNN and LSTM was proposed to categorize chemical entities. In (Luo et al., 2017; Ren et al., 2018; Sahu & Anand, 2016; Tong et al., 2018), the authors made use of LSTMs for biomedical named entity recognition. The LSTM can capture information across a sequence of words, but it fails to capture full contextual meaning because it processes text in only one direction. Bidirectional LSTM models were subsequently proposed (Lyu et al., 2017; Yang et al., 2018) that read the input in both directions and extract suitable features to improve contextual information recognition.
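A minimal PyTorch sketch of such a bidirectional LSTM tagger is given below; the vocabulary size, embedding dimension, and tag count are placeholders, not values from the cited papers. The `bidirectional=True` flag is what lets the model read each sentence both left-to-right and right-to-left.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """A minimal bidirectional LSTM sequence tagger for NER.

    Each token is embedded, passed through a BiLSTM that reads the
    sentence in both directions, and projected onto per-token tag
    scores. All sizes below are placeholders.
    """
    def __init__(self, vocab_size=5000, embed_dim=100,
                 hidden_dim=128, num_tags=9):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        # 2 * hidden_dim: forward and backward hidden states are concatenated.
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):            # (batch, seq_len)
        x = self.embedding(token_ids)        # (batch, seq_len, embed_dim)
        x, _ = self.lstm(x)                  # (batch, seq_len, 2*hidden_dim)
        return self.classifier(x)            # (batch, seq_len, num_tags)

model = BiLSTMTagger()
dummy = torch.randint(0, 5000, (1, 12))      # one sentence of 12 token ids
print(model(dummy).shape)                    # torch.Size([1, 12, 9])
```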

CNNs demonstrated better speed and quality for converting text into character embeddings when conditional random fields (CRFs) were used as the output model. It was demonstrated that the CNN model captured intra-word correlations better, whereas the LSTM was better at defining the relationships among words (Zhai et al., 2018). Two channels and sentence-level embeddings were integrated into the LSTM-CRF model to form SC-LSTM-CRF (Li & Jiang, 2017). This removed the model’s limitation of losing essential hidden features while allowing it to utilize contextual information from sentence-level embeddings for improved performance.

However, there are challenges in applying these methods due to the disparity in word distributions between general and biomedical datasets. The key limitation of traditional NLP algorithms is the lack of contextual meaning when linguistic constructs are spread across sentences or involve words distant from each other. This task of mapping contextual meaning onto linguistic constructs is complex and crucial in EHR data analysis (Groves et al., 2016; Hripcsak & Albers, 2012).

BERT overcomes the limitation discussed above by introducing bidirectionality. Through its bidirectional transformer encodings, BERT establishes context within a sentence and can resolve the meaning of unclear or vague language in text.
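The short sketch below illustrates this masked-language-model behaviour, assuming the Hugging Face `transformers` library and the publicly released `emilyalsentzer/Bio_ClinicalBERT` checkpoint (any BERT-family model can be substituted). Both the left context ("diabetic") and the right context ("blood glucose") inform the prediction for the masked token, which a unidirectional model could not exploit.

```python
# A sketch of BERT's masked-language-model behaviour, assuming the Hugging
# Face `transformers` library and the publicly released
# emilyalsentzer/Bio_ClinicalBERT checkpoint (swap in any BERT variant).
from transformers import pipeline

fill = pipeline("fill-mask", model="emilyalsentzer/Bio_ClinicalBERT")

# Context on BOTH sides of [MASK] shapes the prediction.
for pred in fill("The diabetic patient was prescribed [MASK] to control blood glucose."):
    print(f"{pred['token_str']:>12s}  {pred['score']:.3f}")
```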
