Deep Learning with Applications Using Python by Navin Kumar Manaswi
Author: Navin Kumar Manaswi
Language: eng
Format: epub, pdf
Publisher: Apress, Berkeley, CA
Figure 9-3 shows how the nodes of the hidden layer are connected to the nodes of the input layer.
Figure 9-3. The connections
In an RNN, the gradients (which are essential for tuning the weights and biases) are computed during training by backpropagation. If the sequences are quite long, these gradients either vanish (the product of many values smaller than 1) or explode (the product of many values larger than 1), causing the model to train very slowly or unstably.
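The effect of repeated multiplication can be illustrated with a minimal sketch. The function name `product_of_factors` and the factors 0.9 and 1.1 are hypothetical values chosen only for illustration; a real RNN multiplies by matrices of partial derivatives rather than by a single constant.

```python
def product_of_factors(factor, steps):
    """Multiply `factor` by itself `steps` times.

    This stands in for the chain of per-time-step gradient factors that
    backpropagation multiplies together over a long sequence.
    """
    result = 1.0
    for _ in range(steps):
        result *= factor
    return result


# Assumed illustrative values: 100 time steps, factors slightly below/above 1.
vanishing = product_of_factors(0.9, 100)  # shrinks towards zero
exploding = product_of_factors(1.1, 100)  # grows very large

print(f"factor 0.9 over 100 steps: {vanishing:.3e}")
print(f"factor 1.1 over 100 steps: {exploding:.3e}")
```

Even factors that are close to 1 produce a product that is either negligible or very large after 100 steps, which is why long sequences are hard for a standard RNN.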
The Concept of LSTM
Long short-term memory (LSTM) is a modified RNN architecture that tackles the problem of vanishing and exploding gradients, which makes it possible to train over long sequences and retain memory. All RNNs have feedback loops in the recurrent layer. The feedback loops help keep information in “memory” over time. However, it can be difficult to train standard RNNs on problems that require learning long-term temporal dependencies, because the gradient of the loss function decays exponentially with time (a phenomenon known as the vanishing gradient problem). That is why the RNN is modified to include a memory cell that can maintain information for long periods of time. The modified RNN is better known as LSTM. In an LSTM, a set of gates controls when information enters the memory, when it is output, and when it is forgotten, which mitigates the vanishing gradient problem.
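As a rough sketch of how a gated memory cell can hold information, the following code implements one step of the standard LSTM formulation (forget, input, and output gates) for scalar values. It is not taken from the book: the function `lstm_step`, the parameter dictionary `params`, and all weight and bias values are assumptions picked by hand to show the behavior, not trained values.

```python
import math


def sigmoid(z):
    """Logistic function; squashes z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell (standard formulation, hypothetical names)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # keep part of the old memory, add part of the new
    h = o * math.tanh(c)     # expose part of the memory as the output
    return h, c


# Hand-picked (assumed) parameters: the forget gate stays almost fully open
# and the input gate almost fully closed, so the cell keeps what it holds.
params = {
    "wf": 0.0, "uf": 0.0, "bf": 10.0,
    "wi": 0.0, "ui": 0.0, "bi": -10.0,
    "wo": 0.0, "uo": 0.0, "bo": 0.0,
    "wg": 0.0, "ug": 0.0, "bg": 0.0,
}

h, c = 0.0, 1.0  # start with the value 1.0 stored in the memory cell
for _ in range(50):
    h, c = lstm_step(0.5, h, c, params)

print(f"cell state after 50 steps: {c:.4f}")
```

Because the cell state is updated additively and scaled by a forget gate close to 1, the stored value barely decays over 50 steps, in contrast to the repeated shrinking seen in a standard RNN.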
The recurrent connections add state or memory to the network and allow it to learn and harness the ordered nature of observations within input sequences. The internal memory means that the outputs of the network are conditional on the recent context in the input sequence, not just on what has just been presented as input to the network.
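The dependence on context can be seen in a small sketch of a plain recurrent unit. The function `run_rnn` and the weights `w_x` and `w_h` are hypothetical and chosen by hand; the two input sequences end with the same value but start differently.

```python
import math


def run_rnn(sequence, w_x=1.0, w_h=0.8):
    """Run a scalar recurrent unit over `sequence` and return the final state.

    The weights w_x and w_h are assumed values for illustration only.
    """
    h = 0.0
    for x in sequence:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_x * x + w_h * h)
    return h


# Both sequences present 0.5 as their last input, but the history differs.
out_a = run_rnn([1.0, 0.0, 0.5])
out_b = run_rnn([-1.0, 0.0, 0.5])

print(f"final output after positive history: {out_a:.3f}")
print(f"final output after negative history: {out_b:.3f}")
```

The final outputs differ even though the last input is identical, because the state carried over the recurrent connection encodes what came before.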
