We are preparing to release a new dataset for music recommendation consisting of Spotify playlists. The Spotify API has rate limits, so we’re counting on you to help download some playlists.
How to solve the cheaters problem in online shooter games
The most common form of cheating in first-person shooter games is wall-hacking, or seeing enemy players through obstacles. We propose a solution to this problem, building on a mechanism already used in some professional e-sports matches: taking random screenshots during gameplay.
If a game takes screenshots and uploads them to “the cloud”, either interested players or neural networks can look at them and detect cheating, so that people who deal with banning cheaters only have to handle a relatively small number of high-probability cases. We explain the details and address the objections and problems that might come up.
How to solve the cheaters problem in Counter Strike, with or without machine learning
Counter Strike is the most popular and long-lived first-person shooter game ever. As great as it can be, it has its problems. The most important one is cheating, which has become a lot worse lately. In this article, we offer a few relatively simple solutions.
One weird regularity of the stock market
Everybody has had the fantasy of predicting the stock market. We investigated the subject in “Are stocks predictable?”. In short, they are not, at least not their prices. The next step would be to go from prices to volatility measures. The reason is that one can use volatility to properly price stock options with the Black-Scholes model. Wikipedia says that the formula has only one parameter that cannot be directly observed in the market: the average future volatility of the underlying asset. Therefore, the question is: can one predict that volatility?
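To make the role of volatility concrete, here is a minimal sketch of the standard Black-Scholes price of a European call option, using only the Python standard library. The function and parameter names are our own choices for illustration, and the inputs in the usage note are made-up numbers, not market data.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(spot, strike, rate, volatility, years):
    """Black-Scholes price of a European call on a non-dividend-paying stock.

    spot, strike and rate can be read off the market; volatility is the
    one input that has to be estimated or predicted.
    """
    d1 = (log(spot / strike) + (rate + 0.5 * volatility ** 2) * years) / (
        volatility * sqrt(years)
    )
    d2 = d1 - volatility * sqrt(years)
    return spot * norm_cdf(d1) - strike * exp(-rate * years) * norm_cdf(d2)


# Hypothetical inputs: same option, two different volatility guesses.
price_low_vol = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
price_high_vol = black_scholes_call(100.0, 100.0, 0.05, 0.3, 1.0)
```

With everything else held fixed, the higher volatility guess gives the higher call price, which is why a good volatility forecast matters for pricing.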
Classifying time series using feature extraction
When you want to classify a time series, there are two options. One is to use a time series specific method. An example would be LSTM, or a recurrent neural network in general. The other one is to extract features from the series and use them with normal supervised learning. In this article, we look at how to automatically extract relevant features with a Python package called tsfresh.
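The second option can be sketched by hand: turn each series into a fixed-length set of summary numbers, then feed those rows to any ordinary classifier. The snippet below is only an illustration of the idea with a few hand-picked features; it does not use tsfresh or mirror its API, and the feature names are our own.

```python
from statistics import mean, pstdev


def extract_features(series):
    """Summarize one time series as a small dictionary of features."""
    # Step-to-step differences capture how much the series moves around.
    changes = [b - a for a, b in zip(series, series[1:])]
    return {
        "mean": mean(series),
        "std": pstdev(series),
        "min": min(series),
        "max": max(series),
        "mean_abs_change": mean([abs(c) for c in changes]),
    }


# Two made-up series of different lengths yield rows with identical columns,
# which is what a standard supervised learner expects.
rows = [extract_features(s) for s in ([1, 2, 4, 7], [5, 5, 6, 5, 5, 4])]
```

A package like tsfresh automates this step, computing many such features and selecting the relevant ones.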
Google’s principles on AI weapons, mass surveillance, and signing out
In June Google published its “AI principles” in a post signed by the CEO himself. It talks about AI sensors for predicting the risk of wildfires. Of farmers using AI to monitor the health of their herds. Of doctors starting to use AI to help diagnose cancer and prevent blindness. Great stuff! We take a look at the context.
How to use the Python debugger
This article is not about machine learning, but about a piece of software engineering that often comes in handy in data science practice. When writing code, everybody gets errors. Sometimes it is difficult to debug them. Using a debugger may help, but can also be intimidating. This is a TLDR tutorial on using pdb in IPython, focused on looking at variables inside functions.
Preparing continuous features for neural networks with GaussRank
We present a novel method for feature transformation, akin to standardization. The method comes from Michael Jahrer, who recently won another competition and afterwards shared the approach he used.
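As a rough illustration of the general idea, the sketch below replaces each value with a Gaussian quantile of its rank, so that the transformed feature looks roughly normal whatever its original distribution. This is our own simplified rank-based variant written with the standard library; the exact steps of the original approach are not spelled out here, so treat the details (the `(rank + 0.5) / n` scaling, the arbitrary handling of ties) as assumptions.

```python
from statistics import NormalDist


def rank_gauss(values):
    """Map values to standard normal quantiles based on their ranks."""
    n = len(values)
    # Indices of the values in ascending order; ties keep their input order.
    order = sorted(range(n), key=lambda i: values[i])
    standard_normal = NormalDist()
    transformed = [0.0] * n
    for rank, index in enumerate(order):
        # Shift ranks into the open interval (0, 1) so inv_cdf stays finite.
        transformed[index] = standard_normal.inv_cdf((rank + 0.5) / n)
    return transformed


# A heavily skewed, made-up feature becomes symmetric around zero.
skewed = [10, 1000, 1, 100, 5]
gaussianized = rank_gauss(skewed)
```

Only the ordering of the values matters here, so outliers such as 1000 no longer dominate the scale of the feature.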
Two faces of overfitting
Overfitting is one of the primary problems, if not THE primary problem, in machine learning. There are many aspects to it, but in a general sense, overfitting means that estimates of performance on unseen test examples are overly optimistic. That is, a model generalizes worse than expected.
We explain two common cases of overfitting: including information from a test set in training, and the more insidious form: overusing a validation set.
Goodbooks-10k: a new dataset for book recommendations
There have been a few recommendation datasets for movies (Netflix, Movielens) and music (Million Songs), but not for books. That is, until now.