FastML

Machine learning made easy

How to make those 3D data visualizations

In this article we show how to produce interactive 3D visualization of datasets. These are very good visualizations. The best, really.

Now, you can use Cubert to make these beauties. However, if you’re more of a do-it-yourself type, here’s a HOWTO.

Let’s say you’ve performed dimensionality reduction with a method of your choosing and have some data points looking like this:

cid,x,y,z
1.0,0.131364496515,-0.590685372085,-1.00062387318
-1.0,-1.90206919581,-0.0518527188196,-1.01665336703
1.0,2.29749236265,-0.982830132008,0.0511009011955

First goes the class label and then the three dimensions. The software we use, data-projector, needs a JSON file:

{"points": [
    {"y": "-79.0866574", "x": "-3.15971493", "z": "-98.5084333", "cid": "1.0"}, 
    {"y": "-50.3503514", "x": "-100.0", "z": "-100.0", "cid": "0.0"}, 
    {"y": "-100.0", "x": "100.0", "z": "-0.643983041", "cid": "1.0"}
]}

The dimensions in the cube go from -100 to 100, so we rescale the data accordingly:

d = pd.read_csv( input_file )
assert set( d.columns ) == set([ 'cid', 'x', 'y', 'z' ])

scaler = MinMaxScaler( feature_range=( -100, 100 ))
d[[ 'x', 'y', 'z' ]] = scaler.fit_transform( d[[ 'x', 'y', 'z' ]])

If our labels are in order (starting from 0), we’re ready to save to JSON:

d_json = { 'points': json.loads( d.astype( str ).to_json( None, orient= 'records' )) }
json.dump( d_json, open( output_file, 'wb' ))

Why the acrobatics in the first line? We could save directly with:

d.astype( str ).to_json( output_file, orient = 'records' )

The reason is that we need to wrap the data in a dictionary with one key called ‘points’. Therefore, we:

  • convert the data frame to json to a string
  • load it into a JSON object
  • dump the object to a file

The complete code is available at GitHub.

Viewing

Now move data.json to the data-projector directory and open index.html with your browser. That is, if your browser happens to be Firefox.

If you’re using Chrome, you’ll need to access index.html through HTTP, because apparently Chrome policy doesn’t allow loading data from external files when opening a file from a local disk.

Yasser Souri gives one solution:

  • open a console and in the data-projector directory
  • type python -m SimpleHTTPServer 80 (assuming python 2.x)
  • open http://localhost in your browser

Comments