Machine learning made easy

Best Buy mobile contest

There’s a contest on Kaggle called ACM Hackathon. Actually, there are two: one based on small data and one on big data. Here we will be talking about the small data contest - specifically, about beating the benchmark - but the ideas apply equally to both.

The deal is, we have a training set from Best Buy with search queries and the items users clicked after issuing those queries, plus some other data. The items in this case are Xbox games like “Batman” or “Rocksmith” or “Call of Duty”. We are asked to predict which item a user clicked given the query. The metric is MAP@5 (see an explanation of MAP).
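For reference, MAP@5 can be computed in a few lines. This is a minimal sketch, not the official evaluation code; it assumes each query has exactly one clicked item, in which case average precision at 5 is simply 1/rank if the item appears in the top five, and 0 otherwise:

```python
def apk(actual, predicted, k=5):
    """Average precision at k for a single query,
    where `actual` is the one clicked item."""
    for i, p in enumerate(predicted[:k]):
        if p == actual:
            return 1.0 / (i + 1)
    return 0.0

def mapk(actuals, predictions, k=5):
    """Mean average precision at k over all queries."""
    return sum(apk(a, p, k) for a, p in zip(actuals, predictions)) / len(actuals)
```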

The problem isn’t typical for Kaggle, because it doesn’t really require machine learning in the traditional sense. To beat the benchmark, it’s enough to write a short script. Concretely ;), we’re gonna build a mapping from queries to items, using the training set. It will be just a Python dictionary looking like this:

{'forzasteeringwheel': {'2078113': 1}, 'finalfantasy13': {'9461183': 3, '3519923': 2}, 'guitarps3': {'2633103': 1}}

Keys in the dictionary are queries, and values map game IDs to click counts. One thing worth noting here is that we “prepare” the queries, meaning that “Rock Smith_1” becomes “rocksmith1” - the process is to filter out all non-alphanumeric characters and convert to lower case. This way, queries like “Rocksmith 1” and “Rock Smith_1” become the same key. This makes sense, because the intent behind them is the same; only the spelling is different.
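The preparation and dictionary building steps above might look like this. A minimal sketch: the function names and the shape of the input (an iterable of raw query / item ID pairs pulled from the training CSV) are assumptions, not the contest’s actual file format:

```python
import re
from collections import defaultdict

def prepare(query):
    # strip all non-alphanumeric characters and lowercase,
    # so "Rock Smith_1" and "Rocksmith 1" collapse to "rocksmith1"
    return re.sub(r'[^a-zA-Z0-9]', '', query).lower()

def build_mapping(pairs):
    # pairs: iterable of (raw_query, item_id) from the training set
    mapping = defaultdict(lambda: defaultdict(int))
    for query, item in pairs:
        mapping[prepare(query)][item] += 1
    return mapping
```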

When we are asked to predict an item for a given query, we check whether the prepared query is in our dictionary. If it is, we recommend up to five IDs found there, starting from the one with the most clicks. When there are fewer than five IDs, we take the rest from the benchmark. That’s easy, because the benchmark always recommends the five most popular items:

[9854804, 2107458, 2541184, 2670133, 2173065]

Similarly, if the query is not in our dictionary, we take all five recommendations from the benchmark.
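Putting the prediction logic together, a sketch of the lookup-with-fallback step might look like this (the IDs are stored as strings here to match the dictionary example above; that representation is an assumption):

```python
import re

# the benchmark's five most popular items
BENCHMARK = ['9854804', '2107458', '2541184', '2670133', '2173065']

def prepare(query):
    # same normalization as when building the dictionary
    return re.sub(r'[^a-zA-Z0-9]', '', query).lower()

def predict(query, mapping, k=5):
    counts = mapping.get(prepare(query), {})
    # item IDs sorted by click count, descending
    recs = sorted(counts, key=counts.get, reverse=True)[:k]
    # fill the remaining slots from the benchmark,
    # skipping anything we already recommended
    for item in BENCHMARK:
        if len(recs) >= k:
            break
        if item not in recs:
            recs.append(item)
    return recs
```

For an unseen query the dictionary lookup comes back empty, so the result is exactly the benchmark’s five items.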

The script takes a few seconds to run and produces a score of 72.8%, while the benchmark scores 14.5%. The leader at the time of writing had 77.3%.

Usage: <train file> <test file> <output file>

For example: train.csv test.csv predictions.txt