How ChatGPT actually works

March 13, 2023

Machine learning models generally learn by making small adjustments to the parameters in their layers as they process more sample data; this process is known as training a model. They determine which adjustments to make by attempting to minimize a loss function, which typically measures the difference between their outputs and their labels. These labels are the expected results for the training data, and the goal of training is to make the model's outputs match its dataset's labels.
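
To make this concrete, here is a minimal sketch (not OpenAI's actual training code, and with made-up numbers) of a single training step for a one-parameter model, where the loss is the squared difference between the model's output and its label and the parameter is nudged in the direction that shrinks that loss:

```python
# Minimal sketch: one training step for a one-parameter model y = w * x,
# minimizing a squared-error loss. The sample, label, and learning rate are invented.
x, label = 2.0, 10.0      # one training sample and its expected result (label)
w = 1.0                   # the model's single adjustable parameter
learning_rate = 0.1

prediction = w * x
loss = (prediction - label) ** 2          # large when output and label disagree
gradient = 2 * (prediction - label) * x   # how the loss changes as w changes
w -= learning_rate * gradient             # slight modification that reduces the loss

print(loss, w)  # loss: 64.0, updated parameter: 4.2
```

Repeating this update over many samples is what gradually drives the loss down during training.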

For example, if our cat-detection model predicted that a given photo didn't include a cat, but the photo's label specified that there actually was a cat, the computed loss would be large. On the other hand, if both the label and the model agreed that the photo included a cat, the loss would be relatively small. Since this strategy requires humans to guide the model's training by providing labels, it's often referred to as supervised learning.
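
As an illustration (with hypothetical probabilities, not data from any real model), a common loss for this kind of yes/no prediction is binary cross-entropy: it is large when the model's predicted probability contradicts the label and small when the two agree:

```python
import math

# Illustrative only: binary cross-entropy loss for a hypothetical cat detector
# that outputs the probability that a photo contains a cat.
def loss(predicted_prob_cat, label_is_cat):
    target = 1.0 if label_is_cat else 0.0
    return -(target * math.log(predicted_prob_cat)
             + (1 - target) * math.log(1 - predicted_prob_cat))

print(loss(0.05, True))   # model says "probably no cat" but the label says cat -> ~3.0 (large)
print(loss(0.95, True))   # model and label agree that there is a cat -> ~0.05 (small)
```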

Unfortunately, labeling a dataset can require humans to manually verify the contents of millions of samples, so it's often easier to create a massive unlabeled dataset than even a small labeled one. A technique known as unsupervised learning makes it possible to train on unlabeled datasets, typically by generating labels from the training data itself. AI researchers frequently use unsupervised pre-training to teach a neural network the structure of a certain type of data before fine-tuning it on a much smaller labeled dataset to solve a specific problem.
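
One way to picture how labels can come from the data itself: every prefix of an unlabeled sentence can serve as an input, with the word that follows it acting as the label. The helper below is a hypothetical sketch of that idea, not code from any real training pipeline:

```python
# Sketch: manufacture (input, label) pairs from raw, unlabeled text by treating
# each word as the "label" for the words that come before it.
def make_training_pairs(sentence):
    words = sentence.split()
    return [(words[:i], words[i]) for i in range(1, len(words))]

for inputs, label in make_training_pairs("the cat sat on the mat"):
    print(inputs, "->", label)
# ['the'] -> cat
# ['the', 'cat'] -> sat
# ['the', 'cat', 'sat'] -> on
# ...
```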

OpenAI’s Generative Pre-trained Transformer (GPT) line of transformer neural networks all underwent unsupervised pre-training on massive text datasets scraped from various online sources, such as Common Crawl. These neural networks received incomplete sentences and attempted to predict the word or token that followed the input they were given. With billions of words in their training data, the GPT models learned to write fluent text with correct grammar and punctuation.
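
This next-word-prediction behavior is easy to see in a publicly released predecessor of these models. The sketch below uses the open-source Hugging Face transformers library with GPT-2 as a stand-in (an assumption chosen for illustration; it is not ChatGPT's actual model or code), and it downloads the model weights the first time it runs:

```python
# Requires: pip install transformers torch
from transformers import pipeline

# GPT-2 is an earlier, openly released GPT model. Given the start of a sentence,
# it predicts likely continuations one token at a time, like advanced autocomplete.
generator = pipeline("text-generation", model="gpt2")

result = generator("The cat sat on the", max_new_tokens=5, num_return_sequences=1)
print(result[0]["generated_text"])
```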

Since this pre-training focused on raw text from across the internet, GPT-3 initially acted like an advanced autocomplete tool, not reliably able to answer questions or converse with people. OpenAI’s alignment team (which tries to “align” the goals of each AI model with those of its human users) fine-tuned GPT-3 on human-written instructional queries and responses to create InstructGPT, which had many of ChatGPT’s instruction-following capabilities but lacked the ability to hold a conversation. After making some further improvements to GPT-3, OpenAI fine-tuned it on a dataset similar to InstructGPT’s, but with added human conversation data. The final result was ChatGPT.
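
For a rough sense of what such fine-tuning data can look like, the records below are hypothetical examples in a generic format, not OpenAI's actual dataset: instruction/response pairs like those described for InstructGPT, plus multi-turn conversations like those added for ChatGPT:

```python
# Hypothetical fine-tuning records (illustrative format only).
instruction_examples = [
    {"prompt": "Summarize the water cycle in one sentence.",
     "response": "Water evaporates, condenses into clouds, and falls back as precipitation."},
]

conversation_examples = [
    {"messages": [
        {"role": "user", "content": "What is a loss function?"},
        {"role": "assistant", "content": "A measure of how far a model's output is from the expected answer."},
        {"role": "user", "content": "How does the model use it?"},
        {"role": "assistant", "content": "It adjusts its parameters to make that measure as small as possible."},
    ]},
]
```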
