Recently Rob Zinkov published his selection of interesting-looking NIPS papers. Inspired by this, we list some more. Rob seems to like Bayesian stuff; we’re more into neural networks. If you feel like browsing, Andrej Karpathy has a page with all NIPS 2013 papers, categorized by topics discovered by running LDA. When you see an interesting paper, you can find others ranked as similar by TF-IDF. Here’s what we found.
Object recognition in images with cuda-convnet
Object recognition in images is where deep learning, and specifically convolutional neural networks, are often applied and benchmarked these days. To get a piece of the action, we’ll be using Alex Krizhevsky’s cuda-convnet, a shining diamond of machine learning software, in a Kaggle competition.
CUDA on a Linux laptop
After testing CUDA on a desktop, we now switch to a Linux laptop with 64-bit Xubuntu. Getting CUDA to work is harder here. Will the results be worth the effort?
Maxing out the digits
Recently we’ve been investigating the basics of Pylearn2. Now it’s time for a more advanced example: a multilayer perceptron with dropout and maxout activation for the MNIST digits.
How much data is enough?
A Reddit reader asked how much data is needed for a machine learning project to get meaningful results. Prof. Yaser Abu-Mostafa from Caltech answered this very question in his online course.
Big data made easy
An overview of key points about big data. This post was inspired by a very good article about big data by Chris Stucchio (linked below). The article is about hype and technology. We hate the hype.
Pylearn2 in practice
What do you get when you mix one part brilliant and one part daft? You get Pylearn2, a cutting-edge neural networks library from Montreal that’s rather hard to use. Here we’ll show how to get through the daft part with your mental health relatively intact.
What you wanted to know about AUC
AUC, or Area Under Curve (the ROC curve, in this case), is a metric for binary classification. It’s probably the second most popular one, after accuracy. Unfortunately, it’s nowhere near as intuitive. That is, until you have read this article.
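To give a taste of the intuition, here is a minimal sketch, not the code from the article. It assumes the usual ROC AUC and relies on one well-known interpretation: AUC equals the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one, with ties counting as half. The function name `auc` and the toy data are our own, made up for illustration.

```python
def auc(labels, scores):
    """ROC AUC computed by comparing every positive/negative pair.

    labels: iterable of 0/1 class labels
    scores: iterable of classifier scores, higher means "more positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two negatives, two positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))  # 0.75: three of the four pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, so it is only meant for building intuition. For real work a rank-based implementation, such as the one in scikit-learn, is the better choice.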
Predicting solar energy from weather forecasts plus a NetCDF4 tutorial
Kaggle again. This time, solar energy prediction. We will show how to get data out of NetCDF4 files in Python and then beat the benchmark.
Our followers and who else they follow
Recently we hit the 400-follower mark on Twitter. To celebrate, we decided to do some data mining on you, specifically to discover who our followers are and who else they follow. For your viewing pleasure, we packaged the results nicely with Bootstrap. Here’s some data science in action.