It seems that quite a few people with an interest in deep learning think of it in terms of unsupervised pre-training, autoencoders, stacked RBMs and deep belief networks. It’s easy to get into this groove by watching one of Geoff Hinton’s videos from a few years ago, where he bashes backpropagation in favour of unsupervised methods that are able to discover the structure in data by themselves, the same way the human brain does. Those videos, papers and tutorials linger. They were state of the art once, but things have changed since then.
Geoff Hinton is a living legend. He was instrumental in popularizing backpropagation for training feed-forward neural networks. Despite being universal function approximators in theory, these networks turned out to be pretty much useless for more complex problems, like computer vision and speech recognition. Professor Hinton responded by creating deep networks and deep learning, the ultimate form of machine learning. Recently we were fortunate enough to ask Geoff a few questions and have him answer them.
RStudio is an IDE for R. It gives the language a bit of the slickness it so badly needs. The nice thing about the software, besides good looks, is that it integrates the console, help pages, plots and an editor (if you want it) in one place.
How to represent features for machine learning is important business. For example, deep learning is all about finding good representations. What exactly they are depends on the task at hand. We investigate how to use available labels to obtain good representations.
We have already written a few articles about Pylearn2. Today we’ll look at PyBrain. It is another Python neural network library, and this is where the similarities end.
We’d like to be able to predict the stock market. That seems like a nice way of making money. We’ll address the fundamental issue: can stocks be predicted in the short term, that is, a few days ahead?
Out of 215 contestants, we placed 8th in the Cats and Dogs competition at Kaggle. The top ten finish gave us the master badge. The competition was about discerning the animals in images and here’s how we did it.
IPython is known for the notebooks. But the first thing they list on their homepage is a “powerful interactive shell”. And that’s true - if you use Python interactively, you’ll dig IPython.
A while ago we showed how to get predictions from a Pylearn2 model. It is a little tricky, partly because of splitting data into batches. If you’re able to fit your data in memory, you can strip the batch handling code and it becomes easier to see what’s going on. We exercise the concept to distinguish cats from dogs again, with superior results.
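To make the batching point concrete, here is a minimal sketch in plain NumPy, not Pylearn2 code. The names `predict_in_batches` and `toy_predict` are hypothetical, and `toy_predict` is only a stand-in for whatever function a trained model exposes for turning inputs into outputs. It compares predicting chunk by chunk with predicting on the whole array at once.

```python
import numpy as np


def predict_in_batches(predict_fn, X, batch_size):
    """Call predict_fn on consecutive slices of X and join the results."""
    outputs = []
    for start in range(0, X.shape[0], batch_size):
        batch = X[start:start + batch_size]  # the last slice may be shorter
        outputs.append(predict_fn(batch))
    return np.concatenate(outputs, axis=0)


def toy_predict(batch):
    # Hypothetical stand-in for a model: fixed linear map followed by a sigmoid.
    weights = np.array([0.5, -0.25, 1.0])
    return 1.0 / (1.0 + np.exp(-(batch @ weights)))


rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))  # 10 examples, 3 features

batched = predict_in_batches(toy_predict, X, batch_size=4)
in_memory = toy_predict(X)  # no batch handling when the data fits in memory
```

When the data fits in memory, the single call on the last line gives the same predictions as the loop, and the bookkeeping around slices disappears.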
Recently at least two research teams have made their pre-trained deep convolutional networks available, so you can classify your images right away. We’ll see how to go about it, with data from the Cats & Dogs competition at Kaggle as an example.