LLM
Large language model, or a very strong “next-word guesser”
“LLM” went mainstream with tools like ChatGPT. In plain language: a large language model is software trained to work with text at scale—and its core trick is deceptively simple.
An LLM is a whiz at “what comes next?”
In one sentence: an LLM is a very capable next-token predictor. Given some text, it scores what should follow, appends the best candidate, then repeats.
If you hear “Once upon a time, in a certain place…” you can already guess a few likely continuations. The model has seen enormous amounts of text, so it internalizes many such patterns—not by memorizing your chat, but by learning statistics of language at scale.
Stack many of those tiny decisions in a row, and you get a long reply that reads as if someone planned it—even though the training objective was basically “make the next word look plausible.”
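That predict-append-repeat loop can be sketched with a toy model. This is a deliberately tiny illustration, not how a real LLM works inside: instead of a trained neural network, it just counts which word follows which in a small sample corpus, then greedily picks the most frequent next word at each step.

```python
from collections import Counter, defaultdict

# Toy corpus; a real LLM trains on vastly more text, and on sub-word
# tokens rather than whole words.
corpus = (
    "once upon a time in a certain place there lived an old man "
    "once upon a time there was a king"
).split()

# Count how often each word follows each other word (a bigram table).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def generate(start: str, steps: int = 5) -> str:
    """Greedily append the most frequent next word, one step at a time."""
    words = [start]
    for _ in range(steps):
        candidates = following.get(words[-1])
        if not candidates:  # no observed continuation: stop
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(generate("once"))  # continues with the most statistically likely words
```

Even this crude counter reproduces "once upon a time…" from its data, which is the same idea, scaled down enormously: each output word is just the locally most plausible continuation.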
Why the word “large”?
“Large” usually points to two things:
- Training data: far more text (and code, and more) than any human can read—books, the web, forums, and other sources, filtered and mixed depending on the project.
- Parameters: the adjustable numbers inside the model. Frontier systems can have hundreds of billions to trillions of them. More capacity often (not always) means richer patterns—and heavier compute to train and run.
As scale grows, models can pick up not only word n-grams but also more context, nuance, and something closer to “common sense” in language—still with limits and blind spots, but impressively useful for many tasks.
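The "heavier compute" part is easy to see with back-of-envelope arithmetic. A hedged sketch, assuming 2 bytes per parameter (16-bit precision) and counting only the weights themselves, ignoring activations, KV cache, and optimizer state:

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the model weights, in gigabytes.

    Real deployments need more (activations, KV cache, etc.);
    quantization to fewer bits per parameter can shrink this.
    """
    return num_params * bytes_per_param / 1e9

print(weight_memory_gb(7e9))   # 7B-parameter model → 14.0 GB of weights
print(weight_memory_gb(70e9))  # 70B-parameter model → 140.0 GB of weights
```

So a hundreds-of-billions-of-parameters model needs hundreds of gigabytes just to load, which is why frontier systems run on clusters of accelerators rather than a single machine.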
Summary
- An LLM builds text by repeatedly choosing high-probability next words (tokens), not by "looking things up" like a search engine in one step.
- "Large" points to huge training data and very big models (many parameters).
- Today's chat assistants (e.g., ChatGPT-class products, Gemini) are often LLMs with extra tooling and safety layers on top.