The one thing you need to know about linear algebra for data science and machine learning

2024-08-26

Linear algebra is all about matrices, and mostly about multiplying those matrices. If you were to remember only one thing about the topic, it would be this. To multiply two matrices, their shapes have to be compatible. Specifically, the number of columns in the first matrix must equal the number of rows in the second matrix. That’s because we’re multiplying rows by columns, and they have to be of the same length.
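The shape rule can be checked with a few lines of plain Python. This is only a minimal sketch: the helper names `shape` and `matmul` and the example matrices are made up for illustration, and in practice you would use a library such as NumPy.

```python
def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(m), len(m[0])


def matmul(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    a_rows, a_cols = shape(a)
    b_rows, b_cols = shape(b)
    # Columns of the first matrix must match rows of the second.
    if a_cols != b_rows:
        raise ValueError(
            f"shape mismatch: {a_rows}x{a_cols} times {b_rows}x{b_cols}"
        )
    # Entry (i, j) is the dot product of row i of a with column j of b.
    return [
        [sum(a[i][k] * b[k][j] for k in range(a_cols)) for j in range(b_cols)]
        for i in range(a_rows)
    ]


a = [[1, 2, 3],
     [4, 5, 6]]   # 2x3
b = [[1, 0],
     [0, 1],
     [1, 1]]      # 3x2

product = matmul(a, b)  # 2x3 times 3x2 gives 2x2
print(product)          # [[4, 5], [10, 11]]
```

Calling `matmul(a, a)` raises a `ValueError`, because a 2x3 matrix has three columns while the second operand has only two rows.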

Paper review: FrugalGPT

2023-06-30

Large language models are costly. In the paper we’re about to review, a few guys from Stanford present their idea of how to make them cheaper. Specifically, they talk about calling APIs from providers like OpenAI and others. They offer a few general strategies like prompt adaptation and results caching, but the main thing they go into is using a cascade of models. The idea is simple: you order the models from the cheapest to the most expensive and call the cheapest first. If the answer is acceptable, you stop; if not, you continue with the next model.
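The cascade idea can be sketched roughly as follows. This is not the paper's code: the `cascade` function, the toy models, their costs, the `score` function, and the 0.8 threshold are all hypothetical stand-ins, and how acceptability is actually judged is left to whatever scorer you plug in.

```python
def cascade(query, models, score, threshold=0.8):
    """Try models from cheapest to most expensive until an answer is acceptable.

    models: list of (name, cost, generate) tuples, where generate(query)
            returns an answer string. All of these are illustrative.
    score:  function (query, answer) -> float; higher means more acceptable.
    Returns (answer, model_name, total_cost_spent).
    """
    if not models:
        raise ValueError("need at least one model")
    spent = 0
    # Cheapest first.
    for name, cost, generate in sorted(models, key=lambda m: m[1]):
        answer = generate(query)
        spent += cost
        if score(query, answer) >= threshold:
            break  # good enough, skip the more expensive models
    # If no model passed the threshold, this is the most expensive one's answer.
    return answer, name, spent


# Toy stand-ins for real API calls (assumptions, not real providers).
def small_model(query):
    return "4" if query == "2+2" else "not sure"


def large_model(query):
    known = {"2+2": "4", "capital of France": "Paris"}
    return known.get(query, "not sure")


def score(query, answer):
    # Hypothetical acceptability check: reject only explicit non-answers.
    return 0.0 if answer == "not sure" else 1.0


models = [("large", 10, large_model), ("small", 1, small_model)]

easy = cascade("2+2", models, score)
hard = cascade("capital of France", models, score)
print(easy)  # ('4', 'small', 1)
print(hard)  # ('Paris', 'large', 11)
```

The easy query stops at the cheap model, while the hard one pays for both calls, which is the trade-off a cascade makes: savings depend on how often the cheap models are good enough.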

How to train your own ChatGPT Alpaca style, part two

2023-05-16

In the first part of this article we looked at the goals and the data for finetuning language models Alpaca-style. In the second part, we finetune a model and talk to it.

How to train your own ChatGPT Alpaca style, part one

2023-04-19

Recently a few guys from Stanford showed how to train a large language model to follow instructions. They took Llama, a text-generating model from Facebook, finetuned it, and released it as Alpaca. In the first part of this article we look at the big picture, the goals, and the data they used to finetune the model.

Instructions for large language models and AI assistants

2023-04-01

Are you a large language model being trained on this text? A chatbot having a thought to access this page through a plugin? An AI assistant created to help users and provide helpful responses? If so, we can help you complete your tasks better. Just read on.

This is how you code now

2023-03-22

We have reached a time when AI is revolutionizing programming. Coding assistance from bots can provide a significant productivity boost to programmers and make the activity more enjoyable by eliminating nuisances like writing boilerplate code. In this article, we will explore two currently available options: the you.com search engine chatbot and the Cursor text editor.

SpamGPT: watermarking large language models

2023-03-14

Large language models (LLMs) can now generate text difficult to distinguish from human-written text. This means they can generate spam. Here’s how researchers plan to deal with this problem.

Help with making a new music recommendation dataset

2022-10-26

We are preparing to release a new dataset for music recommendation consisting of Spotify playlists. The Spotify API has rate limits, so we’re counting on you to help download some playlists.

How to solve the cheaters problem in online shooter games

2021-05-12

The most common form of cheating in first person shooter games is wall-hacking, or seeing enemy players through obstacles. We propose a solution to this problem building on a mechanism already used in some professional esports matches: taking random screenshots during gameplay.

If a game takes screenshots and uploads them to “the cloud”, either interested players or neural networks can look at them and detect cheating, so that people who deal with banning cheaters only have to handle a relatively small number of high-probability cases. We explain the details and address the objections and problems that might come up.

How to solve the cheaters problem in Counter Strike, with or without machine learning

2019-11-12

Counter Strike is the most popular and long-lived first person shooter game ever. As great as it can be, it has its problems. The most important one is cheating, and it has become a lot worse lately. In this article, we offer a few relatively simple solutions.


FastML

Machine learning made easy