Evaluating recommender systems

2015-08-31

If you dig a little, there’s no shortage of recommendation methods. The question is which model to choose. One of the primary decision factors here is the quality of recommendations. You estimate it through validation, and validation for recommender systems can be tricky. There are a few things to consider, including the formulation of the task, the form of available feedback, and the metric to optimize for. We address these issues and present an example.

Deep nets generating stuff

2015-06-30

The last few weeks have been a time of neural nets generating stuff. By deep nets we mean recurrent and convolutional neural networks, while the stuff is text, music, images and even video.

Classifying text with bag-of-words: a tutorial

2015-06-08

There is a Kaggle training competition where you attempt to classify text, specifically movie reviews. No other data - this is a perfect opportunity to do some experiments with text classification.

Kaggle has a tutorial for this contest which takes you through the popular bag-of-words approach, and a take on word2vec. The tutorial hardly represents best practices, most likely so that competitors can improve on it easily. And that’s what we’ll do.
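For readers new to the term, here is a minimal sketch of what a bag-of-words representation looks like. It is not the code from the Kaggle tutorial or from the post; the function names and the two toy reviews are made up for illustration, and the tokenizer is a deliberately crude assumption.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (crude, illustrative)."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct word to a column index, in sorted order."""
    words = sorted({word for doc in documents for word in tokenize(doc)})
    return {word: index for index, word in enumerate(words)}


def bag_of_words(text, vocabulary):
    """Return a count vector; words outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    # Dicts keep insertion order, so columns follow the sorted vocabulary.
    return [counts.get(word, 0) for word in vocabulary]


# Hypothetical toy data standing in for movie reviews.
reviews = ["A great movie, great acting", "A dull movie"]
vocabulary = build_vocabulary(reviews)
vectors = [bag_of_words(review, vocabulary) for review in reviews]
```

Each review becomes a vector of word counts with word order discarded, and those vectors are what a classifier would be trained on. In practice you would likely reach for a library vectorizer rather than rolling your own.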

Real-time interactive movie recommendation

2015-05-25

Research into recommender systems took off with the Netflix challenge, which started in 2006. For three years many contenders worked hard to achieve the prescribed error threshold. Finally, in 2009 Netflix awarded the prize, one million dollars.

The emperor’s new clothes: distributed machine learning

2015-04-27

We can think of two reasons for using distributed machine learning: because you have to (so much data), or because you want to (hoping it will be faster). Only the first reason is good.

Distributed computation is generally hard because it adds a layer of complexity and communication overhead. The ideal case is scaling linearly with the number of nodes; it rarely happens. Emerging evidence shows that very often, one big machine, or even a laptop, outperforms a cluster.

What you wanted to know about AI, part II

2015-03-23

In part one we attempted to show that fears of true AI have very little to do with present reality. That doesn’t stop people from believing: they say it might take many decades for machine intelligence to emerge.

How to dispute such claims? It is possible that real AI will appear. It’s also possible that a giant asteroid will hit the earth. Or a meteorite, or a comet. Maybe hostile aliens will land; there were a few movies about that too.

What you wanted to know about AI

2015-03-16

Recently a number of famous people, including Bill Gates, Stephen Hawking and Elon Musk, warned everybody about the dangers of machine intelligence. You know, SkyNet. Terminators. The Matrix. HAL 9000. (Her would be OK probably, we haven’t seen that movie.) Better check that AI, then, maybe it’s the last moment to keep it at bay.

Juergen Schmidhuber’s answers from the Reddit AMA

2015-03-05

On March 4th Jürgen Schmidhuber tackled “ask me anything” questions on Reddit. The professor was very keen to answer; in fact, he continued to do so on the 5th, 6th and beyond. Here are some of his thoughts we found interesting, grouped by topic.

Torch vs Theano

2015-02-09

Recently we took a look at Torch 7 and found its data ingestion facilities less than impressive. Torch’s biggest competitor seems to be Theano, a popular deep-learning framework for Python.

Loading data in Torch (is a mess)

2015-01-19

Torch 7 is a GPU-accelerated deep learning framework. It had been rather obscure until the recent publicity caused by its adoption by Facebook and DeepMind. This entirely anecdotal article describes our experiences trying to load some data in Torch. In short: it’s impossible, unless you’re dealing with images.


FastML

Machine learning made easy