Artificial Vision and Language Processing for Robotics by Álvaro Morena Alberola, Gonzalo Molina Gallego & Unai Garay Maestre
Language: eng
Format: epub
Publisher: Packt Publishing Pvt Ltd
Published: 2019-04-26T16:00:00+00:00
Building Your First CNN
Note
For this chapter, we are still going to use Keras on top of TensorFlow as the backend, as mentioned in Chapter 2, Introduction to Computer Vision of this book. We will also continue to use Google Colab to train our network.
Keras is a very good library for implementing convolutional layers: it abstracts away the low-level details, so the user does not have to implement the layers by hand.
In Chapter 2, Introduction to Computer Vision, we imported the Dense, Dropout, and BatchNormalization layers by using the keras.layers package, and to declare convolutional layers of two dimensions, we are going to use the same package:
from keras.layers import Conv2D
The Conv2D layer is added just like any other layer: first declare a Sequential model, which was explained in Chapter 2, Introduction to Computer Vision of this book, and then add Conv2D to it:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', strides=(2,2), input_shape=input_shape))
For the first layer, the input shape has to be specified, but after that, it is no longer needed.
The first parameter that must be specified is the number of filters that the network is going to learn in that layer. As mentioned before, the earlier layers learn fewer filters, while the layers deeper in the network learn more.
The second parameter that must be specified is the kernel size, which is the size of the filter applied to the input data. Usually, a kernel size of 3x3 is set, or even 2x2, but a larger kernel is sometimes chosen when the image is large.
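To make the role of the kernel concrete, here is a minimal NumPy sketch (not from the book) of sliding a single 3x3 kernel over a small image with no padding and stride 1; the example kernel and image values are illustrative:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    sum the element-wise products at each position."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel applied to a 5x5 image whose right half is bright
image = np.array([[0, 0, 0, 9, 9]] * 5, dtype=float)
edge_kernel = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]], dtype=float)
response = conv2d_valid(image, edge_kernel)
print(response.shape)  # (3, 3): a 3x3 kernel shrinks a 5x5 input by 2 per axis
```

The strong negative responses appear exactly where the dark-to-bright edge lies, which is the kind of pattern a learned filter detects.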
The third parameter is padding, which is set to "valid" by default. We will set it to "same" so that the convolution itself preserves the spatial size of the input; any down-sampling then comes only from the strides, which makes the behavior easier to reason about.
The fourth parameter is strides, which is set to (1, 1) by default. We will set it to (2, 2), which halves the width and height of the output; the two numbers specify the stride along the x and y axes, respectively.
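The interaction of kernel size, padding, and strides can be captured in the standard output-size formula; here is a small sketch (following TensorFlow's "same"/"valid" conventions) showing why padding='same' with strides=(2, 2) halves each dimension:

```python
from math import ceil, floor

def conv_output_size(n, kernel, stride, padding):
    """Spatial output size of a convolution along one axis:
    'same' pads so that output = ceil(n / stride);
    'valid' uses no padding: floor((n - kernel) / stride) + 1."""
    if padding == 'same':
        return ceil(n / stride)
    return floor((n - kernel) / stride) + 1

# With padding='same' and strides=(2, 2), each dimension is halved:
print(conv_output_size(28, 3, 2, 'same'))   # 14
# With padding='valid' and stride 1, a 3x3 kernel shrinks the input:
print(conv_output_size(28, 3, 1, 'valid'))  # 26
```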
After the first layer, we will apply the same methodology as was mentioned in Chapter 2, Introduction to Computer Vision:
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.2))
As a reminder, the BatchNormalization layer is used to normalize the inputs of each layer, which helps the network converge faster and may give better results overall.
The Activation layer is where the activation function is applied. An activation function is a non-linear function applied to the weighted sum of a neuron's inputs plus a bias, deciding how strongly the neuron fires; for example, ReLU outputs the value itself when it is positive and 0 otherwise.
The Dropout layer helps the network avoid overfitting, which is when the accuracy of the training set is much higher than the accuracy of the validation set, by switching off a percentage of neurons.
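Conceptually, the "switching off" that Dropout performs at training time can be sketched in NumPy as inverted dropout (an illustrative sketch, not Keras' internal implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate):
    """Inverted dropout: zero out roughly `rate` of the units at training
    time and scale the survivors so the expected activation is unchanged."""
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

x = np.ones(10000)
y = dropout(x, 0.2)
print((y == 0).mean())  # roughly 0.2 of the units are switched off
```

Because each forward pass drops a different random subset of neurons, no single neuron can be relied on exclusively, which is what discourages overfitting.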
We could apply more sets of layers like this, varying the parameters, depending on the size of the problem to solve.
The last layers remain the same as those of the dense neural networks from Chapter 2, with the output layer sized according to the problem.
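Putting the pieces of this section together, here is a minimal sketch of a complete model, assuming 28x28 grayscale inputs and 10 output classes (both illustrative choices, not taken from the book):

```python
from keras.models import Sequential
from keras.layers import (Conv2D, BatchNormalization, Activation,
                          Dropout, Flatten, Dense)

input_shape = (28, 28, 1)  # assumed: 28x28 grayscale images

model = Sequential()
# First convolutional block; input_shape is only needed here
model.add(Conv2D(32, kernel_size=(3, 3), padding='same',
                 strides=(2, 2), input_shape=input_shape))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.2))

# A second block with more filters, since deeper layers learn more of them
model.add(Conv2D(64, kernel_size=(3, 3), padding='same', strides=(2, 2)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.2))

# Dense head, as in the networks from Chapter 2 (10 classes assumed)
model.add(Flatten())
model.add(Dense(10, activation='softmax'))

model.summary()
```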