Understanding How Capsule Networks Work

What are Capsule Networks?

Before diving into capsule networks, it helps to know what a capsule is and why capsule networks are needed when Convolutional Neural Networks (CNNs) already handle the same kinds of tasks. A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity, such as an object or one of its parts. Each lower-level capsule uses transformation matrices to predict the instantiation parameters of the higher-level capsules.

The outputs of the lower-level capsules are sent to the higher-level capsules. The activity vector of a higher-level capsule is computed from the predictions coming from the lower-level capsules, and each prediction is weighted by how well it agrees with that capsule's output, with agreement measured by a scalar product.
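As a rough sketch of the prediction step described above, the NumPy snippet below multiplies each lower-level output vector by one transformation matrix per higher-level capsule and then takes a weighted sum of the predictions. The capsule counts, vector sizes, variable names, and the uniform weights are illustrative assumptions, not values from this article.

```python
import numpy as np

rng = np.random.default_rng(0)

num_lower, num_upper = 3, 2   # hypothetical capsule counts
dim_lower, dim_upper = 4, 5   # hypothetical vector sizes

# Output vectors of the lower-level capsules, one row per capsule.
u = rng.normal(size=(num_lower, dim_lower))

# One transformation matrix per (lower capsule, upper capsule) pair.
W = rng.normal(size=(num_lower, num_upper, dim_upper, dim_lower))

# Prediction of lower capsule i for upper capsule j: u_hat[i, j] = W[i, j] @ u[i]
u_hat = np.einsum("ijab,ib->ija", W, u)

# Weights c[i, j] (uniform here, an assumption) say how much of
# capsule i's prediction is passed to upper capsule j.
c = np.full((num_lower, num_upper), 1.0 / num_upper)

# Total input of each upper capsule: weighted sum of the predictions.
s = np.einsum("ij,ija->ja", c, u_hat)
```

In a trained network the matrices in `W` would be learned, and the weights in `c` would come from the routing procedure rather than being fixed.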

The problem with CNNs is that they do not explicitly take into account the orientation and relative position of the parts of the object to be detected. If an image contains eyes, a nose, lips, and ears, a CNN may detect a face even when those parts are placed anywhere in the image. This is where capsule networks come into play, since they can identify objects together with their specific orientation.


How Do Capsule Networks Work?

Capsule networks rest on two main concepts:

The first concept is the “representation of multidimensional entities.” A capsule handles this by grouping the properties of an entity into a single feature vector.

The second concept is “activating higher-level features through agreement between lower-level features.” This is also known as “routing by agreement.”

Let's see how this works with an example:

Consider a face detector. The basic features of a face, such as “eyes,” “ears,” “nose,” and “lips,” are all multidimensional features, and the capsules train themselves to represent them accordingly. Useful predictions, however, also require agreement between these features, which is what connects them into a face.
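The idea of agreement in the face example can be illustrated with a toy calculation. In the sketch below, each part capsule (eyes, nose, lips) contributes a hypothetical 2-D prediction of the face pose, and agreement is measured as the scalar product of each prediction with the mean prediction. The vectors and the helper name `agreement_scores` are made up for illustration.

```python
import numpy as np

def agreement_scores(predictions):
    """Scalar product of each prediction with the mean of all predictions."""
    predictions = np.asarray(predictions, dtype=float)
    consensus = predictions.mean(axis=0)
    return predictions @ consensus

# Hypothetical 2-D "face pose" predicted by the eye, nose and lip capsules.
agreeing = [[1.0, 0.1], [0.9, 0.0], [1.0, -0.1]]

# Same parts, but the lips point to a very different face pose.
conflicting = [[1.0, 0.1], [0.9, 0.0], [-1.0, 0.2]]

scores_agree = agreement_scores(agreeing)
scores_conflict = agreement_scores(conflicting)
```

When all parts predict a similar pose, every score is positive. When the lips disagree, their score turns negative, which is the kind of signal routing by agreement uses to reduce the influence of a part that does not fit.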


Capsule Networks Benefits

As stated above, CNNs are poor at encoding the orientation of an object, so they need a dataset of consistently oriented images. Capsule networks are designed to remove this limitation from the start.

CNNs give preference to the presence of specific features rather than to their location. This makes the model invariant to where the features sit relative to one another, a side effect that capsule networks avoid by keeping that spatial information.

In a capsule network, the feature information of the lower-level neurons is passed to the higher-level neurons that specifically need it. It is not sent with equal weight to every higher-level neuron, so each higher-level neuron mostly processes the inputs that are relevant to it.

This routing from lower-level to higher-level neurons can also improve prediction accuracy.

It makes the model behave in a more human-like way than a CNN model.


Why Does Adopting Capsule Networks Matter?

In modern computer vision, the goal is increasingly to make models more human-like. That means getting closer to how the human brain processes information and how it responds when the eyes see an object.

In particular, the human brain pays attention to the orientation and arrangement of an object's parts. Suppose a face has the eyes of a cat, the nose of a human, the ears of an elephant, and the lips of a human. A CNN trained solely on images of human faces may still recognize this mismatched combination as a human face. To an extent this is Artificial Intelligence, but not true Artificial Intelligence, because it lacks the human ability to see such misplacements and identify them as misplacements rather than as something else.

That is why more evolved technologies such as capsule networks are needed in the real world. In simple words, capsule networks try to keep the good parts of CNNs and drop the bad parts through two properties:

Invariance - Capsules can signal the presence of a specific feature. In this way they retain translation invariance, just as CNNs do.

Equivariance - As a feature changes from image to image, its feature vector representation changes correspondingly, which makes the model equivariant.
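A toy illustration may help separate the two properties. The sketch below assumes a capsule output is a 2-D vector whose length encodes presence and whose direction encodes orientation. This encoding and the helper names are simplifying assumptions for illustration only.

```python
import math

def rotate(vec, angle):
    """Rotate a 2-D vector counter-clockwise by `angle` radians."""
    x, y = vec
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (x * cos_a - y * sin_a, x * sin_a + y * cos_a)

def length(vec):
    """Euclidean length, read here as 'how present is the feature'."""
    return math.hypot(*vec)

# Hypothetical capsule output for an upright feature.
pose = (0.6, 0.0)

# The same feature seen rotated by 90 degrees.
rotated_pose = rotate(pose, math.pi / 2)
```

The length of the vector is the same before and after the rotation (invariance of presence), while its direction follows the rotation of the feature (equivariance of pose).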


How to adopt Capsule Networks?

Capsule networks can be implemented in frameworks such as TensorFlow, Keras, and MXNet, or in any other platform built for deep learning. After choosing the framework, follow these steps:

First, define the conventional helper functions of a neural network, such as placeholders, a one-hot encoding function, and an optimizer. Helpers for data visualization and analysis can be defined at this stage as well.
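As one example of such a helper, a one-hot encoding function can be sketched in a few lines of NumPy. The function name and signature are illustrative, and most frameworks ship their own equivalent.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows of length num_classes."""
    labels = np.asarray(labels)
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype=np.float32)[labels]

encoded = one_hot([0, 2, 1], num_classes=3)
```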

The next step is to build the first layer of the network, which is a simple convolutional layer. The filter size is 9×9, the stride is 1, and the padding is set to “valid.”
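With “valid” padding the layer shrinks its input, and the resulting size follows from simple arithmetic. The sketch below assumes a 28×28 input image (the article does not state an input size). The helper name is hypothetical.

```python
def valid_conv_output_size(input_size, kernel_size, stride):
    """Spatial output size of a convolution with 'valid' (no) padding."""
    if input_size < kernel_size:
        raise ValueError("kernel does not fit inside the input")
    return (input_size - kernel_size) // stride + 1

# Assumed 28x28 input, 9x9 filter, stride 1 (filter and stride from the text).
conv1_size = valid_conv_output_size(28, kernel_size=9, stride=1)
```

Under that assumption the first layer would produce a 20×20 feature map per filter.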

The next layer has a stride of 2 and differs slightly from the convolutional layer in its activation function. This layer is known as the PrimaryCaps layer.
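One way to picture the PrimaryCaps layer is as a convolution whose output channels are regrouped into short vectors, one per capsule. The sketch below only shows that regrouping on random data. The 20×20 input map, the 9×9 kernel, the 32 capsule maps, and the 8-dimensional capsules are assumptions borrowed from a commonly used MNIST configuration, not values given in this article.

```python
import numpy as np

# Assumed: 20x20 input feature map, 9x9 kernel, stride 2, valid padding.
grid = (20 - 9) // 2 + 1          # spatial size of the output map

num_maps, caps_dim = 32, 8        # assumed capsule maps and capsule size

# Stand-in for the convolution output: (height, width, channels).
rng = np.random.default_rng(0)
conv_out = rng.normal(size=(grid, grid, num_maps * caps_dim))

# Regroup the channels so that every row is one capsule vector.
primary_caps = conv_out.reshape(-1, caps_dim)
```

With these assumed sizes the layer yields 6 × 6 × 32 = 1152 capsules of 8 values each, and each of them would then go through the capsule non-linearity.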

A squash function is defined here and used as the non-linearity. It is applied to each capsule's vector as a whole rather than to individual scalars.
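A minimal NumPy sketch of a squash function follows. It uses the commonly cited form that keeps a vector's direction and maps its length into the range from 0 to 1. The `axis` and `eps` parameters are choices made for this example.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Shrink short vectors toward 0 and long vectors toward length 1."""
    squared_norm = np.sum(np.square(s), axis=axis, keepdims=True)
    # Scale factor in [0, 1): close to 0 for short vectors, close to 1 for long ones.
    scale = squared_norm / (1.0 + squared_norm)
    # Unit vector in the direction of s; eps guards against division by zero.
    unit = s / np.sqrt(squared_norm + eps)
    return scale * unit

short_vec = squash(np.array([0.1, 0.0]))
long_vec = squash(np.array([10.0, 0.0]))
```

Here `short_vec` ends up with a length close to 0 and `long_vec` with a length just under 1, while both keep pointing along the first axis.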

A small epsilon constant can be added here to keep the computations numerically stable. The reason is simple: it avoids NaN values in the calculations.
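The effect of the epsilon term is easiest to see on a zero vector. The sketch below compares a naive normalization with one that adds a small constant under the square root. The function names and the value `1e-8` are illustrative choices.

```python
import numpy as np

def unit_vector_naive(s):
    """Normalize s without any safeguard (0 / 0 for a zero vector)."""
    with np.errstate(invalid="ignore", divide="ignore"):
        return s / np.sqrt(np.sum(s ** 2, axis=-1, keepdims=True))

def unit_vector_safe(s, eps=1e-8):
    """Normalize s with a small epsilon under the square root."""
    return s / np.sqrt(np.sum(s ** 2, axis=-1, keepdims=True) + eps)

zero = np.zeros(3)
naive_result = unit_vector_naive(zero)   # every entry is NaN
safe_result = unit_vector_safe(zero)     # every entry is 0.0
```

For vectors that are not close to zero, the two versions give practically the same result, so the epsilon only matters in the degenerate case.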

Now it is time to define dynamic routing between adjacent capsule layers, where the dynamic routing algorithm is implemented. The dynamic routing algorithm is the heart of the whole model.
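The following is a compact NumPy sketch of routing by agreement between two capsule layers, assuming the predictions of the lower-level capsules are already available as an array. The function names, the three iterations, and the toy input are assumptions for illustration. A framework implementation would also handle batches and gradients.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def squash(s, eps=1e-8):
    squared_norm = np.sum(np.square(s), axis=-1, keepdims=True)
    return (squared_norm / (1.0 + squared_norm)) * s / np.sqrt(squared_norm + eps)

def dynamic_routing(u_hat, num_iterations=3):
    """u_hat[i, j] is the prediction of lower capsule i for upper capsule j."""
    num_lower, num_upper, _ = u_hat.shape
    b = np.zeros((num_lower, num_upper))          # routing logits
    for _ in range(num_iterations):
        c = softmax(b, axis=1)                    # coupling coefficients per lower capsule
        s = np.sum(c[..., None] * u_hat, axis=0)  # weighted sum for each upper capsule
        v = squash(s)                             # output vectors of the upper capsules
        b = b + np.sum(u_hat * v[None], axis=-1)  # reward predictions that agree with v
    return v, c

# Toy input: four lower capsules agree about upper capsule 0
# and contradict each other about upper capsule 1.
u_hat = np.zeros((4, 2, 4))
u_hat[:, 0, 0] = 2.0
u_hat[:, 1, 1] = [2.0, -2.0, 2.0, -2.0]

v, c = dynamic_routing(u_hat)
```

In this toy case the coupling coefficients shift toward upper capsule 0, whose output vector grows close to length 1, while the contradictory predictions for upper capsule 1 cancel out.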


Capsule Networks Best Practices

Implementing capsule networks well requires:

  • A solid grasp of the fundamentals of deep learning.
  • Knowledge of the specifics of the chosen framework (such as TensorFlow or Keras).
  • Knowledge of how to place convolutional layers in a neural network.
  • Last but most important, knowledge of the “Dynamic Routing” algorithm.

Capsule Network Frameworks

A capsule network is just a different way to build a neural network, so it can be implemented in any language that supports deep learning, such as Python. The central decision is the choice of framework. The following frameworks can be used to implement capsule networks:

  • TensorFlow
  • Caffe
  • Torch/PyTorch
  • MXNet
  • Keras
  • Chainer
