In R community, there’s this one guy, Hadley Wickam, who by himself made R great again. One of the many, many things he came up with - so many they call it a *hadleyverse* - is the dplyr package, which aims to make data analysis easy and fast. It works by allowing a user to take a data frame and apply to it a pipeline of operations resulting in a desired outcome (an example in just a minute). This approach turned out to be successful. Then people have ported key pieces to Pandas.

# Deep learning architecture diagrams

As a wild stream after a wet season in African savanna diverges into many smaller streams forming lakes and puddles, so deep learning has diverged into a myriad of specialized architectures. Each architecture has a diagram. Here are some of them.

# Factorized convolutional neural networks, AKA separable convolutions

The paper in question proposes a way to reduce the amount of computation needed in convolutional networks roughly three times, while keeping the same accuracy. Here’s what you wanted to know about this method (already available in TensorFlow), reprinted from two smart folks.

# How to make those 3D data visualizations

In this article we show how to produce interactive 3D visualization of datasets. These are very good visualizations. The best, really.

Now, you can use Cubert to make these beauties. However, if you’re more of a do-it-yourself type, here’s a HOWTO.

# Adversarial validation, part two

In this second article on adversarial validation we get to the meat of the matter: what we can do when train and test sets differ. Will we be able to make a better validation set?

# ^one weird trick for training char-^r^n^ns

Character-level recurrent neural networks are attractive for modelling text specifically because of their low input and output dimensionality. You have only so many chars to represent - lowercase letters, uppercase letters, digits and various auxillary characters, so you end up with 50-100 dimensions (each char is represented in one-hot encoding).

Still, it’s a drag to model upper and lower case separately. It adds to dimensionality, and perhaps more importantly, a network gets no clue that ‘a’ and ‘A’ actually represent pretty much the same thing.

# Adversarial validation, part one

Many data science competitions suffer from a test set being markedly different from a training set (a violation of the “identically distributed” assumption). It is then difficult to make a representative validation set. We propose a method for selecting training examples most similar to test examples and using them as a validation set. The core of this idea is training a probabilistic classifier to distinguish train/test examples.

In part one, we inspect the ideal case: training and testing examples coming from the same distribution, so that the validation error should give good estimation of the test error and classifier should generalize well to unseen test examples.

# Coming out

People often ask how we’ve been able to learn about and cover so many different and diverse topics in machine learning (using at least three different programming languages - Python, Matlab, and R) and generally achieve such prominence in the community, all this in a relatively short time. Today we finally give a definitive answer.

# Bayesian machine learning

So you know the Bayes rule. How does it relate to machine learning? It can be quite difficult to grasp how the puzzle pieces fit together - we know it took us a while. This article is an introduction we wish we had back then.

# What next?

We have a few ideas about what to write about next and are looking for your feedback. Vote in the poll at the bottom of this post.