Introduction
The newly discovered coronavirus is the cause of the COVID epidemic, which has caused widespread illness, particularly lung-related disease. The virus emerged in December 2019 and spread to every corner of the world. It has disrupted normal human activity in health, economics, business, education, entertainment, and other areas. Electronic Adoption (E-Adoption) of emerging technologies such as AI, machine learning, and deep learning has the potential to identify the disease efficiently and automatically. The symptoms of the disease include prolonged fever, cough, headache, and respiratory problems. Although the virus affects people of all ages, children have been observed to be more resistant than adults. Many people infected with COVID do not experience any symptoms. Computed Tomography (CT) scanning is one of the efficient methods for detecting the disease. Even when patients show no symptoms of COVID, a CT scan helps to identify the presence of the virus. In recent months, many researchers have produced high-quality machine learning models to detect the virus. In this paper, we use instance transfer in a machine learning model to detect the virus.
Medical data is growing rapidly day by day. A successful machine learning model needs a huge amount of data for training. Sufficient training data exists, but it is distributed across multiple sources (Field et al., 2021) (Alfred and Obit, 2021). One simple approach is to centralize all the data and train a single machine learning model. However, this centralization is not possible because of a few regulatory obstacles and medical issues. Without centralization, it is very difficult to make use of the complete data, which could help develop precise machine learning models.
A traditional approach is to introduce a trusted central agent (Vinanzi et al., 2021) that receives all the private and sensitive data and trains a machine learning model (Huang et al., 2019). This approach is still vulnerable to data leakage.
A good alternative is to train a separate machine learning model at each data source. A model trained on the largest dataset appears to perform better than a model trained on fewer samples (Futoma et al., 2020). Hence, good model performance is obtained when a sufficiently large amount of data is available (Kino et al., 2021).
Federated learning is a recently emerging technology that makes full use of distributed data (Ahmed et al., 2022). Locally developed machine learning models are first trained on the medical data recorded by medical devices and doctors' diagnoses. The end results, or parameter values, of each locally trained model are then made publicly available (Liaqat et al., 2017). Any other data source at any location can access these end values to train, test, and validate its own models (Chandiramani et al., 2019). This prevents sensitive data from being leaked or sold by unauthorized persons.
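To make the exchange of parameter values concrete, the following is a minimal sketch of one aggregation round, assuming a simple linear model and a sample-weighted average of the local parameters (the common federated averaging rule). The function names `train_local` and `aggregate`, the learning rate, and the toy data are illustrative assumptions and are not taken from this paper.

```python
from typing import List, Sequence, Tuple


def train_local(xs: Sequence[float], ys: Sequence[float],
                lr: float = 0.05, epochs: int = 2000) -> Tuple[List[float], int]:
    """Fit y ~ w*x + b on one site's private data with gradient descent.

    Only the parameters [w, b] and the sample count leave the site;
    the raw (x, y) records are never shared.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2.0 / n * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b], n


def aggregate(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Combine local parameters into a global model (sample-weighted mean)."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(params[i] * n for params, n in updates) / total
            for i in range(dim)]


if __name__ == "__main__":
    # Hypothetical private datasets held by two hospitals.
    site_a = train_local([1.0, 2.0, 3.0], [2.1, 3.9, 6.2])
    site_b = train_local([1.0, 2.0, 3.0, 4.0], [1.8, 4.1, 5.9, 8.1])
    print("global parameters:", aggregate([site_a, site_b]))
```

Weighting by sample count gives sites with more data a larger influence on the shared model. A practical system would repeat this round many times, with each site resuming training from the latest global parameters.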
Federated learning prevents raw data from being shared among data centers, but sensitive information such as population statistics (mean, variance) can still be leaked during the exchange of end values (Palanivinayagam and Nagarajan, 2020) (Shankar et al., 2020). Hence, this paper proposes a privacy-preserving methodology based on the concept of exchanging dummy values among the local nodes.
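As an illustration of how dummy values can hide individual end values, the sketch below uses pairwise additive masking, a well-known secure aggregation idea. It is an assumption made for illustration only and is not necessarily the scheme proposed in this paper. Each pair of nodes shares a random dummy value that one node adds and the other subtracts, so the dummies cancel in the sum while each published vector looks random. The names `mask_updates` and `sum_updates` and the dummy range are hypothetical.

```python
import random
from typing import Dict, List


def mask_updates(updates: Dict[str, List[float]],
                 seed: int = 0) -> Dict[str, List[float]]:
    """Return copies of the updates with pairwise dummy values applied.

    In a real deployment each pair of nodes would agree on its dummy
    values privately; here one function simulates all nodes for brevity.
    """
    rng = random.Random(seed)
    names = sorted(updates)
    masked = {name: list(vec) for name, vec in updates.items()}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for k in range(len(masked[a])):
                dummy = rng.uniform(-100.0, 100.0)
                masked[a][k] += dummy  # node a adds the shared dummy
                masked[b][k] -= dummy  # node b subtracts the same dummy
    return masked


def sum_updates(updates: Dict[str, List[float]]) -> List[float]:
    """Element-wise sum over all nodes' vectors."""
    dim = len(next(iter(updates.values())))
    return [sum(vec[k] for vec in updates.values()) for k in range(dim)]


if __name__ == "__main__":
    # Hypothetical local statistics (for example, mean and variance).
    raw = {"node_a": [36.9, 0.4], "node_b": [37.4, 0.9], "node_c": [38.1, 1.3]}
    masked = mask_updates(raw)
    print("published by node_a:", masked["node_a"])
    print("sum of masked values:", sum_updates(masked))
    print("sum of raw values:   ", sum_updates(raw))
```

An observer who sees only the published vectors learns the aggregate but not any single node's statistics, provided the pairwise dummies stay secret between the nodes that share them.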