Machine learning made easy

Towards goodbooks-100k

Last year, we published a new dataset for book recommendations, goodbooks-10k. As the name suggests, it contains ratings for ten thousand popular books. The dataset is available from GitHub.

We have more raw data of the same kind, so it would make sense to publish a bigger dataset, say with ratings for 100k books, or even a million if there are that many. Also, people have asked about rating timestamps. Currently we don’t have any, but it may be possible to collect them.

To gauge interest, we would like to raise some money. Progress will be updated on this page. The current total is 1 contributor, 17 EUR.

  • initial target: the first donation - ACHIEVED
  • current target: five donations

If you’re interested in making this bigger dataset happen, please make it known by contributing. By default, the names of the contributors will be recorded in the docs for posterity.