Nitty-Gritty of Deep Reinforcement Learning for the Healthcare Sector

Copyright: © 2023 | Pages: 17
DOI: 10.4018/979-8-3693-0876-9.ch016

Abstract

Deep reinforcement learning (DRL) is an emerging area of machine learning that focuses on maximizing rewards. It combines reinforcement learning with deep learning, using a series of algorithms to enable an agent to learn how to make decisions in a complex environment. DRL is a subset of artificial intelligence that makes decisions based on the environment and the rewards associated with each action. The goal of DRL is to maximize the long-term reward of an agent; to do this, the agent must use a combination of deep learning, reinforcement learning, and other AI techniques to learn which actions will lead to the highest reward. DRL is used to solve a variety of problems, from playing video games to controlling robots, and is also applied in autonomous driving, robotics, and financial trading. It is a powerful tool for solving complex problems, has been used in a variety of research projects, and has the potential to revolutionize the way we interact with machines and the environment.

2. Reinforcement Learning

Deep reinforcement learning is a type of learning that stands somewhere between supervised and unsupervised learning (Tom et al., 2013). Because it does not depend entirely on a set of labeled training data, it cannot be classified as supervised learning; it is not unsupervised learning either, because the agent is continually searching for the actions that maximize its reward. To attain the final goal of the algorithm, the agent must take correct actions in a variety of situations (Vincent et al., 2018).

To understand the concept more clearly, consider a situation where a person wants to play chess against their own computer system (Miguel et al., 2020). There are two situations to consider:

  • If we proceed with supervised learning, we must have a relevant labeled dataset.

  • If we train the machine merely to replicate human behavior, it will never be superior to humans, because it only mimics their play.

So, if we cannot use a supervised learning technique and we cannot simply train the machine to replicate human behavior, is there any way to make the machine play the entire game by itself? The answer is yes: we can use reinforcement learning. Reinforcement learning is a feedback-based machine learning technique in which the agent explores the environment and receives rewards based on the actions it performs. Every good action earns the agent a positive reward, and every bad action earns it a negative reward.
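As an illustration of this feedback loop, the sketch below (not taken from the chapter) shows an agent acting in a toy environment and accumulating reward; the Environment class, its actions and step methods, and the "good_move"/"bad_move" actions are hypothetical stand-ins for whatever game or simulator the agent actually interacts with.

```python
import random

class Environment:
    """Hypothetical stand-in for a game such as chess (greatly simplified)."""
    def __init__(self):
        self.turns_left = 10                    # episode ends after 10 moves

    def actions(self):
        return ["good_move", "bad_move"]

    def step(self, action):
        """Apply an action; return (reward, done)."""
        self.turns_left -= 1
        reward = 1 if action == "good_move" else -1   # positive vs. negative reward
        return reward, self.turns_left == 0

# The feedback loop: the agent acts, the environment responds with a reward.
env = Environment()
total_reward, done = 0, False
while not done:
    action = random.choice(env.actions())       # a real agent would choose more wisely
    reward, done = env.step(action)
    total_reward += reward
print("cumulative reward:", total_reward)
```

Here the agent picks actions at random; the whole point of reinforcement learning is to replace that random choice with a learned policy that maximizes the cumulative reward.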

The main goal of the learning algorithm is to maximize the cumulative reward at the end. Reinforcement learning uses trial and error to solve multi-step problems. Deep reinforcement learning goes a step further: it uses multiple layers of an artificial neural network, loosely modeled on the human brain, to make decisions the way humans do, and sometimes better than humans (Pragati et al., 2023). Some terminology used in reinforcement learning is explained below:

  • Agent: The entity that takes actions and thereby changes (or affects) the environment.

  • Action: The set of all possible moves an agent can perform in the environment.

  • Reward: A crucial signal that depends on the move the agent makes within the environment. It comes in two forms: a positive reward (encouragement) for good actions and a negative reward (punishment) for bad actions.

  • Policy (π): The rule that decides which action is better in a given state in order to maximize the reward.

  • Q-value: Also known as the action value. The Q-value Q(s, a) is a measure of the overall expected reward if the agent is in state s, takes action a, and then acts until the end of the episode according to some policy π (Ethem et al., 2004); a sketch of how these values can be learned is given after this list.

  • Model-based and Model-free Learning Algorithms: A model-based algorithm uses the transition and reward functions of the environment to estimate the optimal solution (Michael et al., 2019). Model-based learning is employed when we have thorough knowledge of the environment and of how it responds to particular actions; the environment is well known to the agent, which makes the approach well suited to static, fixed environments. A model-free algorithm, by contrast, learns directly from sampled experience without an explicit model of the environment.
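To make these terms concrete, the following is a minimal sketch of tabular Q-learning, a model-free method: the Q-table holds the action values, the epsilon-greedy rule plays the role of the policy π, and the update is driven by the reward signal. The small one-dimensional grid world, the step function, and the hyperparameter values (alpha, gamma, epsilon) are illustrative assumptions, not taken from the chapter.

```python
import random
from collections import defaultdict

# Illustrative 1-D grid world: states 0..4, reaching state 4 gives reward +1.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                       # move left or move right

def step(state, action):
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

alpha, gamma, epsilon = 0.1, 0.9, 0.1    # learning rate, discount factor, exploration rate
Q = defaultdict(float)                   # Q[(state, action)] -> expected return

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy policy: usually exploit the best known action, occasionally explore.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward, done = step(state, action)
        # Q-learning update: move Q(s, a) toward reward + gamma * max_a' Q(s', a').
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# Print the greedy action learned for each state.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)})
```

A deep reinforcement learning method such as a deep Q-network replaces this table with a neural network that approximates Q(s, a), which is what allows the same idea to scale to very large state spaces such as chess positions.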
