Demystifying Deep Learning
Machine learning models power modern society: from image recognition to website recommendations, from web search to helping people navigate. They are increasingly popular and have become essential in smartphones, cameras, automobiles, and more. Traditional machine learning algorithms, however, were limited in their ability to process raw data. For many decades, designing a machine learning system required meticulous hand-crafted feature engineering and deep expertise in the problem domain.
Deep learning is the branch of machine learning that allows computational models composed of multiple layers to learn patterns in data with multiple levels of abstraction. These models have dramatically improved the state of the art in speech recognition, visual object recognition, object detection, and many other domains, including healthcare, drug discovery, computational neuroscience, and genomics. Deep learning discovers intricate structure in large data sets by using algorithms such as backpropagation to indicate how a machine should change the internal parameters that compute the representation in each layer from the representation in the previous layer. It has brought breakthroughs in the processing of images, video, and audio.
Deep learning is built on neural network models that are loosely based on how we think the brain works; they are an abstract version of the brain's behavior. An artificial neural network may have single or multiple inputs and outputs. The input is fed through many layers, each containing multiple neurons, and an output is produced. Based on the output, the network tries to learn by itself; the learning happens in the weights attached to the neurons. These weights adjust themselves after each iteration to better match the desired output. This is called learning: the connections are strengthened or weakened every time the prediction differs from the actual value. In our brain, real neurons take in some combination of their inputs and decide whether or not to fire; we call these firings spikes. In artificial neural networks, the neurons do not emit a spike but instead give out a real-number value. Each neuron computes a weighted sum of its inputs plus a bias, a constant term added to improve the neural network's flexibility and performance.
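The weighted-sum-plus-bias neuron described above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation; the sigmoid activation and the example numbers are assumptions chosen for demonstration.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs plus a bias,
    squashed by a sigmoid activation into a real number in (0, 1)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation

# Example: two inputs feeding a single neuron (illustrative values)
output = neuron([1.0, 0.5], [0.4, -0.2], bias=0.1)
```

Unlike a biological spike, which either fires or doesn't, this neuron always emits a real number; the sigmoid is just one common choice of activation function.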
Neural networks are not new. The perceptron was invented by Frank Rosenblatt in the late 1950s, but interest collapsed after Dr. Marvin Minsky of MIT showed that a single-layer perceptron cannot learn the XOR function. The field revived in the late 1980s, only to fade again in the 1990s: the lack of the computational power needed to train large models meant neural nets couldn't be applied to larger, more interesting problems, and large, interesting data sets were scarce. Pioneers such as Dr. Frank Rosenblatt, Dr. Geoffrey Hinton, Dr. Yann LeCun, Dr. Yoshua Bengio, Dr. David Rumelhart, and Dr. Andrew Ng kept working and made breakthroughs in the field. Dr. Hinton is often called the father of artificial neural networks; his contributions were instrumental in algorithms such as backpropagation, dropout, and deep belief nets, which we will tell you about in future posts.
The reason deep learning can be applied across such a diverse set of fields is that the same concepts and building blocks carry over to different areas such as audio, video, robotics, and web search. The most amazing part of the idea is that we don't have to tell the network which features to learn; the neural nets learn them themselves. As with all machine learning, there are three learning approaches: 1) Supervised Learning, 2) Unsupervised Learning, 3) Reinforcement Learning.
For example, we show a neural network labelled images of Gandhiji, which is called training the network. We then show it a new set of images it has never seen, known as test data, and ask it to recognize Gandhiji; the trained network identifies his images. This is called Supervised Learning.
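The supervised loop described above, where weights adjust after each labelled example until predictions match the labels, can be sketched with a classic perceptron learning rule. This is a toy stand-in for real image training; the OR-function data and learning rate are assumptions for illustration.

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Supervised learning: nudge the weights after each labelled
    example so the predicted label moves toward the actual one."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, label in samples:
            total = sum(x * w for x, w in zip(inputs, weights)) + bias
            predicted = 1 if total > 0 else 0
            error = label - predicted  # zero when the prediction is right
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

def predict(inputs, weights, bias):
    """Apply the trained weights to unseen (test) inputs."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0

# Labelled training data: the logical OR function
train = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_perceptron(train)
```

After training, `predict` plays the role of showing the network "test data" it must classify on its own.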
Sometimes we have many images of Gandhiji with no labels at all: when you type "Gandhiji" into Google search you get many photos, so how does the machine group different photos under the same name? This is called clustering. Given a group of images, a centroid is chosen for each group and images with similar structure are gathered around it. This is called Unsupervised Learning.
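The centroid-based grouping described above is the core step of k-means-style clustering. Here is a minimal sketch on one-dimensional values standing in for image features; the data points and centroids are illustrative assumptions, and a real algorithm would also recompute the centroids repeatedly.

```python
def assign_to_centroids(points, centroids):
    """One clustering step: group each point with its nearest centroid."""
    clusters = {c: [] for c in centroids}
    for p in points:
        nearest = min(centroids, key=lambda c: abs(p - c))
        clusters[nearest].append(p)
    return clusters

# Feature values drawn from two loose groups (illustrative data)
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
clusters = assign_to_centroids(points, centroids=[1.0, 9.0])
```

No labels are involved: the structure of the data alone decides which points belong together, which is what makes the learning unsupervised.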
The other kind of learning is Reinforcement Learning, which is reward-based. The output is refined according to the rewards received. For example, when a robot tries to move its arm to a target position (x, y), it may instead reach some nearby position (x1, y1); a reward is then given based on its performance, and the farther the arm is from the target, the smaller the reward. It is like scoring marks in an exam: solve an equation completely and you get full marks, solve half of it and you get half the marks. Here the reward is the marks. Based on the reward, the arm keeps adjusting its position until it reaches the target. Reinforcement learning takes a lot of data and time, but it powers some of today's most advanced systems, such as self-driving cars and chess-playing programs.
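The arm-and-reward idea can be sketched as a toy reward-driven search: try a small random move and keep it only when the reward (here, negative distance to the target) improves. This is a deliberately simplified stand-in, since real reinforcement learning learns a policy over many episodes rather than hill-climbing a single position; the target, step size, and seed are all illustrative assumptions.

```python
import random

def reward(position, target):
    """Higher reward the closer the arm is to the target, like exam marks:
    the reward is the negative Euclidean distance, so the maximum is 0."""
    dx, dy = position[0] - target[0], position[1] - target[1]
    return -((dx * dx + dy * dy) ** 0.5)

def move_toward_target(target, steps=500, step_size=0.1, seed=0):
    """Try a small random arm movement each step and keep it only
    when it earns a higher reward than the current position."""
    rng = random.Random(seed)
    position = (0.0, 0.0)
    best = reward(position, target)
    for _ in range(steps):
        candidate = (position[0] + rng.uniform(-step_size, step_size),
                     position[1] + rng.uniform(-step_size, step_size))
        r = reward(candidate, target)
        if r > best:
            position, best = candidate, r
    return position

final = move_toward_target(target=(1.0, 2.0))
```

The reward signal alone steers the arm: no one tells it which moves were right, only how well it scored afterward.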
Deep learning uses the learning approaches above for different problems, depending on the objective and complexity.
TensorFlow Playground from Google gives you an interactive visualization of neural networks and deep learning, offering insight into how neural networks work. Follow the link below to explore the power of neural networks for yourself: Tensorflow Playground