Video Transcript

And what is pooling? So we discussed pooling, and as you can see here, pooling downsamples the feature maps, which shrinks their spatial size and, in turn, reduces the number of parameters and computation in the layers that follow. What that means is we take specific patches of the feature map, depending on the size of our pooling window: if we set the pooling window to 2x2, we look at four pixels at a time and take the maximum value from them. That maximum becomes the corresponding value in the downsampled feature map. So typically this would be a feature map coming from a conv layer; it goes into the 2x2 pooling layer, and we end up with a smaller feature map. The reason for applying pooling is this reduction in size, which makes training faster and helps prevent overfitting. And like I said before, there are various types of pooling available; it is not only max pooling. There are pooling operations like sum, mean, max, and so on, but max pooling is typically the one that is preferred. One reason is that taking the max of a patch of pixels is computationally cheaper than computing a mean or a sum. It has also generally been found to give better performance than mean or sum pooling, which matches what I've seen in some of my own projects from a while back. The other aspect is that it enhances the key features in the feature maps by keeping the strongest activations, the pixels that are most strongly activated, and that is another reason for applying max pooling. And the pooling size, as I discussed, is simply the window of pixels you consider when doing the downsampling.

Dropout, I will touch upon briefly; I'm hoping you may have heard about it before. Dropout is a technique where randomly selected neurons are completely ignored during training. So randomly, a fraction of the neurons or units, for example 30 percent if the dropout rate is 0.3, will be completely ignored during training. Because this is done at random, it helps prevent overfitting, and it is essentially a regularization mechanism.
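To make the 2x2 max pooling arithmetic concrete, here is a minimal NumPy sketch; the function name max_pool_2x2 and the toy 4x4 feature map are only for illustration and are not from the lecture:

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Downsample a 2D feature map by taking the max of each non-overlapping 2x2 patch."""
    h, w = feature_map.shape
    # For simplicity, assume height and width are divisible by 2
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fm = np.array([
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 5, 7],
    [1, 1, 3, 2],
])
print(max_pool_2x2(fm))
# [[6 2]
#  [2 7]]  -> each output value is the max of one 2x2 patch
```

Each 2x2 patch of the input collapses to a single number, so a 4x4 feature map becomes 2x2, which is exactly the downsampling described above.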
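And a similarly minimal sketch of dropout as described, written as plain NumPy since the lecture does not tie it to a specific framework; the helper name dropout and the use of "inverted" dropout (scaling the surviving units so no rescaling is needed at inference) are assumptions for illustration:

```python
import numpy as np

def dropout(activations, rate=0.3, training=True, rng=None):
    """Randomly zero out roughly `rate` of the units during training (inverted dropout)."""
    if not training:
        # At inference time, dropout is a no-op
        return activations
    rng = rng or np.random.default_rng()
    keep_prob = 1.0 - rate
    # Each unit is independently kept with probability keep_prob
    mask = rng.random(activations.shape) < keep_prob
    # Scale survivors so the expected activation magnitude is unchanged
    return activations * mask / keep_prob

x = np.ones((2, 8))
print(dropout(x, rate=0.3, rng=np.random.default_rng(0)))
```

With rate=0.3, on average 30 percent of the units are zeroed on each training pass, and because a different random subset is dropped every time, the network cannot rely on any single unit, which is the regularization effect mentioned above.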