The internet is full of tutorials for getting started with Deep Learning. If you are an absolute beginner, you can start with the superb Stanford courses CS221 or CS224, the Fast AI courses or the Deep Learning AI courses. All except Deep Learning AI are free and accessible from the comfort of your home. All you need is a good computer (preferably with an Nvidia GPU) and you are ready to take your first steps into Deep Learning.
This blog, however, is not aimed at the absolute beginner. Once you have a bit of intuition about how Deep Learning algorithms work, you might want to understand how things work under the hood. Most work in Deep Learning (the 10% that is left after Data Munging, which is 90% of the total work) consists of adding layers like Conv2d, changing hyperparameters of optimization strategies like ADAM, or using batchnorm and other techniques, all by writing one-line commands in Python (thanks to the awesome frameworks available). Still, a lot of people feel a deep desire to know what happens behind the scenes. This is a list of resources that might help you understand what happens under the hood when you (say) add a conv2d layer or call T.grad in Theano.
The Deep Learning Book is of course the most famous and well-known resource. Other good resources are Professor Charniak’s course and paper, which is a technical introduction to Deep Learning. There are other resources too which you might want to take up if you want to understand things from a particular perspective. For example, this tutorial was written from the point of view of applied mathematicians, and if you just want to start coding without going into any theory then read here. One more recommended resource is the Deep Learning course in PyTorch here. This course covers things bottom up and helps you get a bigger perspective.
Issues about Backpropagation
People are often unsure about questions like “are gradient descent and backpropagation the same thing?” or “what exactly are the chain rule and backpropagation?”. To understand the basics, we might choose to read the original paper by Rumelhart, Hinton, and Williams, where it all started. The paper is located here and is a very easy-to-understand document.
Some other very useful resources to read on top of this are Karpathy’s blog on the derivation of backprop and this video explaining the derivation.
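The two questions above can be made concrete with a tiny sketch. It is only an illustration, assuming a made-up two-weight "network" (y = w2 * (w1 * x)) with a squared-error loss; the function names forward, backward and train are hypothetical and not taken from any of the linked resources. Backpropagation is the part that applies the chain rule to get the gradients, and gradient descent is the separate part that uses those gradients to update the weights.

```python
def forward(w1, w2, x):
    """Forward pass of a toy two-weight network: h = w1 * x, y = w2 * h."""
    h = w1 * x
    y = w2 * h
    return h, y


def backward(w1, w2, x, h, y, target):
    """Backpropagation: apply the chain rule from the loss back to each weight.

    The loss is (y - target) ** 2.
    """
    dloss_dy = 2.0 * (y - target)   # derivative of the loss w.r.t. the output
    dloss_dw2 = dloss_dy * h        # chain rule through y = w2 * h
    dloss_dh = dloss_dy * w2        # gradient flowing back into the hidden value
    dloss_dw1 = dloss_dh * x        # chain rule through h = w1 * x
    return dloss_dw1, dloss_dw2


def train(w1, w2, x, target, lr=0.05, steps=200):
    """Gradient descent: repeatedly step against the gradients from backward()."""
    for _ in range(steps):
        h, y = forward(w1, w2, x)
        g1, g2 = backward(w1, w2, x, h, y, target)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2


# Illustrative starting weights, input and target (all assumed values).
w1, w2 = train(0.5, 0.8, x=1.5, target=2.0)
_, prediction = forward(w1, w2, 1.5)
print(prediction)  # close to the target after training
```

Swapping train for another optimizer (momentum, ADAM and so on) would leave backward untouched, which is one way to see that the two ideas are not the same thing.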
Linear Algebra and other Maths
Anyone would point someone aspiring to learn Linear Algebra to Prof. Strang’s course. This is probably the best resource on Linear Algebra on the planet. The same goes for Prof. Boyd’s Optimization course here, or the Calculus on Manifolds book for vector calculus (you can find a pdf when you Google “Calculus on Manifolds”). However, one doesn’t need to study these subjects at the depth of these resources to jump into Deep Learning. A very quick way to get started is the quick refresher on all prerequisite Calculus for Deep Learning, which is available here. There is also this very good set of lecture notes covering just the convex optimization used in Deep Learning. Another good resource is Sebastian Ruder’s paper here. I also like these lecture notes for understanding derivatives on tensors.
Automatic Differentiation and Deep Learning libraries
Automatic Differentiation isn’t something that you absolutely need to know when you are doing Deep Learning. Most frameworks like Torch, Theano or TensorFlow do it for you automatically. In most cases, you don’t even have to know how the differentiation is being done. That said, if you are determined to understand how Deep Learning frameworks work, you might want to understand how automatic differentiation works here. Other good resources for understanding how Deep Learning libraries function can be found in this blog and video.
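To give a flavour of what a framework does for you, here is a minimal sketch of reverse-mode automatic differentiation on scalars. It is a toy under stated assumptions, not how Torch, Theano or TensorFlow are actually implemented: the Var class and its methods are hypothetical names, and only addition and multiplication are supported.

```python
class Var:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each entry is (parent, local derivative of this node w.r.t. that parent).
        self.parents = parents

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        """Accumulate d(self)/d(node) into node.grad for every node in the graph."""
        order, seen = [], set()

        def visit(node):
            # Post-order walk: a node is appended only after all of its parents.
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Reverse order guarantees a node's gradient is complete before
        # it is pushed on to its parents via the chain rule.
        for node in reversed(order):
            for parent, local_grad in node.parents:
                parent.grad += local_grad * node.grad


x = Var(3.0)
y = Var(4.0)
z = x * y + x * x   # analytically: dz/dx = y + 2x, dz/dy = x
z.backward()
print(z.value, x.grad, y.grad)  # 21.0 10.0 3.0
```

The graph is recorded while the expression is evaluated and then walked backwards once, which is the basic idea behind calls such as T.grad, even though real libraries add tensors, many more operations and heavy optimization on top.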
Convolutional Neural Networks
After you have done some courses which enable you to use basic convnets, the most useful thing is to understand how convolutions work on images: “What is the output shape after you apply a certain type of convolution on an input?”, “How does stride affect convolutions?”, “What is Batch Normalization?” and so on. The two best resources I have seen for these types of applied questions are the tutorial here and Ian Goodfellow’s talk here. A more thorough review of convnets is here if you want to get an idea. This review on object detection is a very good resource on the topic.
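As a quick illustration of the output-shape question, the usual rule for one spatial dimension is out = floor((in + 2 * padding - kernel) / stride) + 1. The helper below is a minimal sketch of that rule, assuming a square kernel, symmetric zero padding and no dilation; conv2d_output_shape is a hypothetical name, not a function from any framework.

```python
def conv2d_output_shape(height, width, kernel_size, stride=1, padding=0):
    """Spatial output size of a 2D convolution.

    Assumes a square kernel, the same zero padding on every side,
    and no dilation.
    """
    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1
    return out_h, out_w


# A 3x3 kernel on a 32x32 input (illustrative sizes).
no_padding = conv2d_output_shape(32, 32, kernel_size=3)
same_padding = conv2d_output_shape(32, 32, kernel_size=3, padding=1)
strided = conv2d_output_shape(32, 32, kernel_size=3, stride=2, padding=1)

print(no_padding)    # (30, 30): the border shrinks without padding
print(same_padding)  # (32, 32): padding of 1 preserves the size
print(strided)       # (16, 16): stride 2 roughly halves each dimension
```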
Deep Learning in NLP
The Stanford CS224 course I pointed out earlier in the blog is a very good starting point and should be good enough for almost everything. There is also a course on YouTube by Graham Neubig (which uses DyNet) here. There is also an NLP book by Yoav Goldberg which you might like. (Newer) advances in NLP made after this book was written are reviewed here. There is also a very common question about whether to use convnets or RNNs (LSTMs/GRUs) on text. A good overview is here.
Reinforcement Learning
Sutton and Barto is the bible for getting started with Reinforcement Learning methods. The book is free and is available here. A very good review of recent Deep Reinforcement Learning methods is available here. There is also this very interesting tutorial on Reinforcement Learning here.
A good review of Monte Carlo Tree Search (which is a part of the AlphaGo algorithm by DeepMind, alongside Deep Reinforcement Learning techniques) is here. I, however, used this quick tutorial to learn about it.
Some other good reviews/tutorials
A good tutorial about GANs (Generative Adversarial Networks) and generative models in general is the one Goodfellow gave at ICLR 2016. It can be found here. Neural Networks have been used to transfer art (for example in the Prisma app); a detailed survey of methods to do that can be found here. Another good survey, on Multi-Task Learning (combining multiple tasks in the same neural network), by Ruder is here.
Although Deep Learning works amazingly well on multiple problems, we know there will always be some areas it has not reached yet. Some good criticisms to read are Failures of gradient-based learning by Shalev-Shwartz et al., this talk by Hinton which lists some difficulties for convnets, and how convnets cannot decipher negatives of images they train on. Another criticism that went viral and controversial a few days back is this. There is also this extensive report on the malicious use of Deep Learning.
Adversarial examples are a huge field concerned with making artificial/real data points that can fool convnets. I could have put them with the criticisms above but didn’t, as
1) they are not a technical challenge for all applications and
2) I am not very well read on them. One very cool case to get you started and interested is here, where they generate “adversarial objects” to fool neural networks.
You can also read about Machine Learning algorithms you should know to become a Data Scientist here.