RStudio is an IDE for R. It gives the language a bit of a slickness factor it so badly needs. The nice thing about the software, beside good looks, is that it integrates console, help pages, plots and editor (if you want it) in one place.
Good representations, distance, metric learning and supervised dimensionality reduction
How to represent features for machine learning is an important business. For example, deep learning is all about finding good representations. What exactly they are depends on a task at hand. We investigate how to use available labels to obtain good representations.
PyBrain - a simple neural networks library in Python
We have already written a few articles about Pylearn2. Today we’ll look at PyBrain. It is another Python neural networks library, and this is where similiarites end.
Are stocks predictable?
We’d like to be able to predict stock market. That seems like a nice way of making money. We’ll address the fundamental issue: can stocks be predicted in the short term, that is a few days ahead?
Yesterday a kaggler, today a Kaggle master: a wrap-up of the cats and dogs competition
Out of 215 contestants, we placed 8th in the Cats and Dogs competition at Kaggle. The top ten finish gave us the master badge. The competition was about discerning the animals in images and here’s how we did it.
Why IPy: reasons for using IPython interactively
IPython is known for the notebooks. But the first thing they list on their homepage is a “powerful interactive shell”. And that’s true - if you use Python interactively, you’ll dig IPython.
How to get predictions from Pylearn2
A while ago we’ve shown how to get predictions from a Pylearn2 model. It is a little tricky, partly because of splitting data into batches. If you’re able to fit your data in memory, you can strip the batch handling code and it becomes easier to see what’s going on. We exercise the concept to distinguish cats from dogs again, with superior results.
Classifying images with a pre-trained deep network
Recently at least two research teams made their pre-trained deep convolutional networks available, so you can classify your images right away. We’ll see how to go about it, with data from the Cats & Dogs competition at Kaggle as an example.
Regularizing neural networks with dropout and with DropConnect
We continue with CIFAR-10-based competition at Kaggle to get to know DropConnect. It’s supposed to be an improvement over dropout. And dropout is certainly one of the bigger steps forward in neural network development. Is DropConnect really better than dropout?
A/B testing with bayesian bandits in Google Analytics
A/B testing is a way to optimize a web page. Half of visitors see one version, the other half another, so you can tell which version is more conducive to your goal - for example selling something. Since June 2013 A/B testing can be conveniently done with Google Analytics. Here’s how.