Intelligent Systems, Technologies and Applications
ISBN: 9789811539145
Publisher: Springer Singapore


3 Background of Reinforcement Learning

Information extraction in the form of concept extraction, diagnostic inference, and named-entity recognition is important both for medical research and for automated preliminary medical care. Our objective is to extract concepts from semi-structured data such as the discharge summaries in the I2B2 dataset. Deep learning, in the form of reinforcement learning and Bi-LSTM models, is used to achieve this objective.

Active learning involves selecting highly informative samples from an un-annotated dataset in order to reduce both the cost and the time of annotation. This selection problem can be formulated as a reinforcement learning problem in which the states and actions are modeled as a Markov decision process (MDP). An MDP must satisfy the Markov property, which states that the next state depends only on the current state and on no earlier state. In reinforcement learning (RL), an agent is the decision maker; it selects the best possible action for moving from the current state to the next state by maximizing the reward. The major difference between supervised learning and RL is that in the former there is a labeled dataset against which the model can be evaluated, whereas in the latter the agent learns from its own experience. The agent explores the possible paths to the destination and, in doing so, learns the best path from origin to destination. Learning in RL is driven by a reward mechanism, so how well the agent learns depends on the design of the reward function. There are two types of RL, positive and negative; a minimal sketch of reward-driven learning follows the two definitions below.

Positive RL: when a behavior is followed by a reward, the behavior gains strength and its frequency increases.

Negative RL: the agent learns to avoid the attainment of a negative condition, and the behavior that avoids it is thereby strengthened.
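
To make the reward mechanism concrete, the following is a minimal sketch of tabular Q-learning on a hypothetical chain environment. The environment (five states in a row, origin at state 0, destination at state 4), the reward values, and all hyperparameters are illustrative assumptions, not part of this chapter's pipeline; reaching the destination yields a positive reward, while every intermediate step carries a small penalty the agent learns to avoid.

import random

# Hypothetical toy environment: states 0..4 laid out in a chain.
# The agent starts at state 0 (origin) and must reach state 4
# (destination). Reaching the destination yields +1 (positive
# reinforcement); every other step costs -0.01 (a negative
# condition the agent learns to avoid).
N_STATES, ACTIONS = 5, [-1, +1]          # move left / move right
alpha, gamma, epsilon = 0.1, 0.9, 0.2    # assumed hyperparameters

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(500):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy selection: mostly exploit, sometimes explore.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[(s, a)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else -0.01
        # Q-learning update: move Q(s, a) toward the reward plus the
        # discounted best value of the next state. Only the current
        # state matters, reflecting the Markov property.
        best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# Greedy policy per non-terminal state; after training it moves right.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)})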

RL can be applied to any kind of problem in which the source and the destination are known in advance. The mapping from states to best actions can be approximated by deep neural networks, which is why deep RL has an advantage over the traditional tabular approach. Internally, any RL method works as an MDP.
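
As a sketch of what approximating this mapping with a deep network can look like, the explicit Q-table above can be replaced by a small value network. The architecture, layer sizes, and the 4-dimensional state features below are illustrative assumptions following a common deep Q-learning pattern, not this chapter's exact model.

import torch
import torch.nn as nn

# Hypothetical Q-network: maps a state feature vector to one
# Q-value per action, replacing the explicit Q-table.
class QNetwork(nn.Module):
    def __init__(self, state_dim: int = 4, n_actions: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 32),
            nn.ReLU(),
            nn.Linear(32, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

q_net = QNetwork()
state = torch.randn(1, 4)               # assumed 4-dimensional state features
q_values = q_net(state)                 # one estimated value per action
action = int(q_values.argmax(dim=1))    # greedy action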

The goal of any MDP is to find the best policy, that is, the one that reaches the destination state with maximum reward. Markov chain Monte Carlo (MCMC) is an advancement over the plain Markov chain. The Monte Carlo approach calculates the return only at the end of each episode, and the algorithm uses the previous steps to generate the future steps; this strategy considerably speeds up the process.
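
The end-of-episode calculation that the Monte Carlo approach relies on can be sketched as follows; the episode's reward sequence and the discount factor are illustrative assumptions.

# Monte Carlo evaluation: returns are computed only once the episode
# has finished, by working backwards through the collected rewards.
def discounted_returns(rewards, gamma=0.9):
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g   # G_t = r_t + gamma * G_{t+1}
        returns.append(g)
    return list(reversed(returns))

# Example episode: two small step penalties, then +1 on reaching the goal.
print(discounted_returns([-0.01, -0.01, 1.0]))
# -> [0.791, 0.89, 1.0]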


