Convolutional Neural Network (CNN)
What is a CNN?
A Convolutional Neural Network (CNN or ConvNet) is a type of deep learning neural network. The name comes from the mathematical convolution operation, which is used in place of general matrix multiplication in at least one layer.
With the inclusion of convolutional layers, the CNN design more closely resembles a living organism’s vision processing. Research by neurophysiologists David H. Hubel and Torsten Wiesel on the visual cortexes of cats and monkeys showed that individual neurons respond to small regions of the visual field. They also determined that the receptive fields of individual neurons overlap and vary in size.
CNNs were then developed with this concept in mind. For example, a Convolution Layer will not look at a whole image of 100 x 100 pixels at once; instead, it will look at a window of, say, 10 x 10 pixels. The CNN slides this window across the image one pixel at a time, moving left to right and then one pixel down. This process repeats until the entire image has been scanned. Each window the CNN processes mimics the firing of a neuron in the visual cortex.
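To make the scanning concrete, here is a minimal sketch in plain Python of a window sliding over an image one pixel at a time. The function name `slide_window` and the all-ones image and window are our own illustrative choices, not part of any library. Like most deep learning libraries, the sketch multiplies each window by the weights and sums the result, without flipping the kernel.

```python
def slide_window(image, kernel):
    """Slide `kernel` over `image` one pixel at a time (stride 1, no padding).

    Both arguments are lists of rows. Returns one weighted sum per window position.
    """
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1
    output = []
    for top in range(out_h):          # move the window down one pixel at a time
        row = []
        for left in range(out_w):     # move the window right one pixel at a time
            total = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A 100 x 100 image of ones scanned with a 10 x 10 window of ones.
image = [[1.0] * 100 for _ in range(100)]
kernel = [[1.0] * 10 for _ in range(10)]
feature_map = slide_window(image, kernel)
print(len(feature_map), len(feature_map[0]))  # 91 91
print(feature_map[0][0])                      # 100.0
```

A 10 x 10 window fits into a 100 x 100 image at 91 positions in each direction, so the layer produces a 91 x 91 grid of outputs rather than 100 x 100.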
The outputs from the nodes in the Convolution Layer are then passed into the Pooling Layer. In this layer, each node's output is combined with the outputs of adjacent nodes, for example by keeping only the largest value in each small neighborhood, which yields a smaller summary of the layer's input. The Pooling process emulates the overlapping fields in the visual cortex.
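As an illustration, the sketch below implements max pooling, one common kind of Pooling Layer. The text above does not say which pooling operation is meant, so max pooling with a 2 x 2 block is an assumption on our part, and `max_pool` is a hypothetical helper name.

```python
def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest value in each size x size block."""
    pooled = []
    for top in range(0, len(feature_map) - size + 1, size):
        row = []
        for left in range(0, len(feature_map[0]) - size + 1, size):
            block = [feature_map[top + i][left + j]
                     for i in range(size) for j in range(size)]
            row.append(max(block))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]
print(max_pool(feature_map))  # [[4, 2], [2, 7]]
```

Each 2 x 2 group of neighboring outputs is reduced to a single number, so the 4 x 4 input becomes a 2 x 2 output.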
The output from the Pooling Layer can then be passed into subsequent Convolution Layers, each followed by a Pooling Layer. This will refine the data from the image into smaller and smaller segments. After moving through all the paired Convolution and Pooling Layers, a Fully-connected Layer is used to combine the output from all the nodes. This is the layer that will use statistics and weights to give the probabilities for each possible output.
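The final step can be sketched as follows: every input value contributes to a score for each possible output, and the scores are turned into probabilities. We assume a softmax for that last conversion, which is a common choice but is not stated above. The feature values, weights, and the name `dense_softmax` are made up for illustration.

```python
import math


def dense_softmax(features, weights, biases):
    """Combine every input feature into one score per class, then normalise to probabilities."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    peak = max(scores)                        # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


features = [0.5, 1.0, -0.5]                   # flattened output of the last Pooling Layer (made up)
weights = [[1.0, 0.0, 1.0],                   # one row of weights per possible output (made up)
           [0.0, 1.0, 0.0]]
biases = [0.0, 0.0]
probabilities = dense_softmax(features, weights, biases)
print(probabilities)                          # two values that sum to 1
```

In a trained network the weights are learned from the data; here they are fixed by hand only to show the arithmetic.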
How to build a basic CNN?
Now that we understand the basic layers of a CNN, we can build out the structure. In this example, we will use Python 3 and the Keras library. Python is the most popular language for AI, and Keras is one of the more popular APIs, which uses a TensorFlow backend. We will not go over processing the dataset, but we can assume that we have a set of images of the same size, each of which falls into one of two classes.
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import img_to_array, load_img
from keras import layers, models, optimizers
from keras import backend as K
from sklearn.model_selection import train_test_split
The first step will be to initialize the model. The Sequential model in Keras is a linear stack of layers, and it will have the functions to be able to add and hold the different layers, configure the training process, perform the training, and evaluate input.
model = models.Sequential()
Now we can start adding the layers. Since this example is for an image, and images are 2-dimensional, we will want to add a Conv2D layer. This will then need to be followed by a Pooling layer.
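model.add(layers.Conv2D(filters, kernel_size, input_shape=(img_width, img_height, 3)))  # filters, kernel_size: placeholders, values not given here
model.add(layers.MaxPooling2D())  # assumed pooling type; pool size left at the Keras default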
Additional sets of layers can be added, and the parameters can be altered in each layer. Additional layers can help increase accuracy. However, the more layers added, the more complex the CNN, and the time it takes to train the CNN increases with its complexity. Eventually a point is reached where additional layers no longer improve the test error, though that is a much more complex topic. For this example, we will just copy the above to create three layers.
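To get a feel for how stacking layers changes the size and complexity of the network, the sketch below tracks the output shape and the number of trainable weights through a stack of convolution and pooling blocks. The 150 x 150 RGB input, the filter counts, the 3 x 3 kernels, and the 2 x 2 pooling are hypothetical settings chosen for illustration, and `conv_pool_stack` is our own helper, not a Keras function.

```python
def conv_pool_stack(height, width, channels, blocks):
    """Track the output shape and weight count through convolution + pooling blocks.

    Each block is (filters, kernel, pool): a stride-1, unpadded convolution with a
    square kernel, followed by non-overlapping pooling with a square window.
    """
    summary = []
    for filters, kernel, pool in blocks:
        params = (kernel * kernel * channels + 1) * filters       # weights plus one bias per filter
        height, width = height - kernel + 1, width - kernel + 1   # convolution shrinks the map
        height, width = height // pool, width // pool             # pooling shrinks it again
        channels = filters
        summary.append(((height, width, channels), params))
    return summary


# Hypothetical settings: 150 x 150 RGB input and three blocks.
for shape, params in conv_pool_stack(150, 150, 3, [(32, 3, 2), (32, 3, 2), (64, 3, 2)]):
    print(shape, params)
# (74, 74, 32) 896
# (36, 36, 32) 9248
# (17, 17, 64) 18496
```

Under these assumptions, each block shrinks the image while adding more weights to train, which is one way to see why deeper networks take longer to train.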
Now that we have multiple layers, we can flatten the output from our layers and compile our model.
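As a rough picture of what these two steps involve: flattening unrolls the stacked feature maps into one long vector, and compiling a Keras model attaches a loss function and an optimizer to it. The sketch below shows flattening and one loss that would suit a two-class problem, binary cross-entropy. The choice of loss is our assumption, and both helper functions are written by hand for illustration rather than taken from Keras.

```python
import math


def flatten(feature_maps):
    """Unroll a height x width x channels nested list into one flat vector."""
    return [value for row in feature_maps for pixel in row for value in pixel]


def binary_cross_entropy(label, probability):
    """Loss for a two-class problem: small when `probability` agrees with `label` (0 or 1)."""
    return -(label * math.log(probability) + (1 - label) * math.log(1 - probability))


feature_maps = [[[0.1, 0.2], [0.3, 0.4]],
                [[0.5, 0.6], [0.7, 0.8]]]     # a made-up 2 x 2 x 2 output
vector = flatten(feature_maps)
print(len(vector))                            # 8

# A confident correct prediction is penalised less than a hesitant one.
print(binary_cross_entropy(1, 0.9) < binary_cross_entropy(1, 0.6))  # True
```

During training, the optimizer adjusts the weights to make this loss smaller across the whole data set.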
You now have a CNN with three convolutional layers that is ready for you to train with your data set.
Where can you use a CNN?
The first application of a CNN which should come to mind is image recognition. CNNs were designed to replicate how the visual cortex functions, so this is the most logical use for the network. CNNs can be trained to look for specific objects in an image. If you had a set of images containing pictures of cats and dogs, the network could be trained to identify if the subject of the image was a cat or dog and sort it into the correct subset. With the labeled images, a search feature could be enhanced to return more accurate results on any unlabeled images.
To take this a step further, CNNs can be employed for facial recognition. Multiple smartphone manufacturers have applied a CNN to create the various Face Unlock features for their devices, and Facebook uses CNNs to create the suggested labels in photo albums. CNNs can even analyze the features of a human face in enough detail to predict the emotion displayed, regardless of the individual in the image.
Most people will also have encountered CNNs in their mobile banking apps, where they are used for character recognition in mobile check deposits. The CNN has been trained to identify the data on the check and verify the amount field. The same principle is applied to the digitization of documents in the legal, insurance, and medical fields.
CNNs have more uses in the medical field beyond digitizing written medical records. Many advances in the medical field can be attributed to the use of CNNs to process the data from MRI or CT scans. In November of 2018, Frontier Science released a report on the process of using a CNN to predict the likelihood of mild cognitive impairment progressing to Alzheimer’s[i] with a 73.04% accuracy. These same concepts have been applied to predict cancer and assist with the diagnosis of multiple different diseases.