Learning Disease Causality Knowledge From the Web of Health Data

Hong Qing Yu, Stephan Reiff-Marganiec
Copyright: © 2022 | Pages: 19
DOI: 10.4018/IJSWIS.297145

Abstract

Health information has become especially valuable for protecting public health during the current coronavirus situation. Knowledge-based information systems can play a crucial role in helping individuals to practise risk assessment and remote diagnosis. We introduce a novel approach to causality-focused knowledge learning that is robust and transparent. With it, the machine gains causality and probability knowledge that it can later use for inference (thinking) and accurate prediction. In addition, hidden knowledge can be discovered beyond the existing understanding of the diseases. The whole approach is built on a Causal Probability Description Logic Framework that combines Natural Language Processing (NLP), Causality Analysis and extended Knowledge Graph (KG) technologies. The experimental work processed 801 diseases in total (from the UK NHS website, linked with DBpedia datasets). As a result, the machine efficiently learnt comprehensive health causal knowledge and relations among diseases, symptoms, and other facts.

Introduction

The development of Artificial Intelligence (AI) technologies has made daily life much easier than before. For instance, location-based mobile applications help us to find the nearest parking space to a favourite restaurant. In the business domain, Business Intelligence (BI) services assist us in making correct business decisions. In the healthcare domain, AI technologies have started to show their strength in detecting diseases at an early stage to minimize the risk of further development. We expect AI technologies to become one of the most critical future development areas for enhancing human healthcare. As a result, symptom- and lifestyle-based disease research and pre-diagnosis applications have started to show great potential for facilitating intelligent self-healthcare systems. However, many research problems remain in the current trend of fast AI implementation, which applies cutting-edge technologies such as Deep Learning algorithms and Natural Language Processing (NLP). Some key issues are (but are not limited to): data trust/quality (European Union Agency for Fundamental Rights, 2019), security (Pardeep, Masud, Gaba et al., 2021), transparent prediction (Knight, 2017), and, importantly, Causal Analysis (Vorhies, 2019). In contrast to other general domain applications, these open issues are crucial in the healthcare domain. For instance, consider the claims that 'children eating breakfast will avoid teen obesity' (Warner, 2008) and that 'eating yoghurt would reduce the chance of growing precancerous adenomas by 19%, but only in men' (Zheng, Wu, Song, Ogino, Fuchs & Chan, 2019). Both studies only explained associations/correlations discovered from data observations; they offer no evidence for the possible reasons 'why'. Recently, causality research in the ML community has shown that the causal machine learning approach can improve the accuracy of medical diagnosis (Richens, Lee & Johri, 2020).
Therefore, knowledge extraction and modelling should be considered an important step for enhancing ML outcomes and expandability, rather than focusing only on raw data engineering. As the Semantic Web/Knowledge Graph research community grows, knowledge data and their semantic representations are becoming available in the semantic cloud. We believe there are enough semantic resources to support causality inference and transparent probability calculations in collaboration with ML algorithms. For example, we can build a Semantic Knowledge Base (SKB) representing the relations of symptoms, affected anatomical structures, most affected groups (age, gender, location), lifestyle effects and drug side effects to a particular group of diseases. The research work presented in this paper is motivated by such ideas and case studies.
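
To make the idea of a Semantic Knowledge Base more concrete, the following is a minimal sketch of how relations between diseases, symptoms and anatomical structures could be held as subject-predicate-object triples. It is an illustration only and not the framework described in this paper: the class name `SemanticKnowledgeBase`, the predicates `hasSymptom` and `affectsAnatomy`, and the handful of example triples are all assumptions chosen for readability.

```python
from collections import defaultdict


class SemanticKnowledgeBase:
    """Toy in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps each subject to the set of (predicate, object) pairs stated about it.
        self._facts = defaultdict(set)

    def add(self, subject, predicate, obj):
        """Record one triple, e.g. (disease, hasSymptom, symptom)."""
        self._facts[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """Return all objects linked to `subject` through `predicate`, sorted."""
        pairs = self._facts.get(subject, set())
        return sorted(o for p, o in pairs if p == predicate)

    def subjects(self, predicate, obj):
        """Return all subjects that have the pair (predicate, obj), sorted."""
        return sorted(s for s, pairs in self._facts.items() if (predicate, obj) in pairs)


# Illustrative triples; the predicate names are assumptions, not a real vocabulary.
skb = SemanticKnowledgeBase()
skb.add("Influenza", "hasSymptom", "Fever")
skb.add("Influenza", "hasSymptom", "Cough")
skb.add("Influenza", "affectsAnatomy", "RespiratorySystem")
skb.add("CommonCold", "hasSymptom", "Cough")

influenza_symptoms = skb.objects("Influenza", "hasSymptom")
diseases_with_cough = skb.subjects("hasSymptom", "Cough")
print(influenza_symptoms)
print(diseases_with_cough)
```

Further relation types mentioned above, such as most affected groups or drug side effects, would simply be additional predicates in this sketch. A real system would more likely use an RDF store and a shared vocabulary than a Python dictionary.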

The distinct contribution of this paper is a novel semantic modelling framework that generates causality and probability graphs from healthcare information on the Web. The resulting causality knowledge graph data will support more advanced knowledge-based data analysis to address trust, transparency and causality analysis issues. In addition, this paper extends and explains in more detail the early research outcomes published in (Yu, 2020; Yu, 2021). The major extension merges two separate research methodologies to provide a more inclusive view of the proposed framework, and adds more data evaluations.

Section 2 reviews the current ML methods and applications for disease recommendation and discusses their critical limitations. Section 3 illustrates the proposed framework and its components. Section 4 demonstrates the benefits of applying the proposed framework in the healthcare AI research domain through our experimental and evaluation results. The final section draws conclusions and outlines future work.
