Cynobi AI Academy
Course 1 · AI Foundations · Lesson 03

How an LLM Actually Works

Underneath the magic, it does one thing: predict the next chunk of text, over and over. It’s autocomplete at planetary scale — and the same trick that makes it beautifully fluent is exactly what makes it confidently wrong.

Free · Course 1 · ~5 min video · Type: Misconception

The one mental model

Read the text → predict the next token → add it → look again → repeat. To decide what’s likely, it turns words into numbers (similar meanings sit close together), and at each step it picks from a ranked list of probabilities. How boldly it picks is set by temperature.
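The loop above can be sketched in a few lines of Python. This is a toy under loud assumptions: the probability table `NEXT_TOKEN_PROBS` and the helper names `apply_temperature` and `generate` are made up for illustration, and the "model" only looks at the last token. A real LLM computes its probabilities with a neural network that reads all of the text so far.

```python
import math
import random

# Hypothetical toy "model": for each token, the candidate next tokens and
# their probabilities. These numbers are invented for this sketch.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.6, "dog": 0.3, "moon": 0.1},
    "cat": {"sat": 0.7, "ran": 0.2, "sang": 0.1},
    "dog": {"ran": 0.6, "sat": 0.4},
    "moon": {"sang": 1.0},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
    "sang": {"<end>": 1.0},
}


def apply_temperature(probs, temperature):
    """Reshape a distribution: divide log-probabilities by T, then renormalize.

    T < 1 sharpens the ranking (the top pick gets more weight);
    T > 1 flattens it (unlikely picks get more weight).
    """
    logits = {tok: math.log(p) / temperature for tok, p in probs.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(value - top) for tok, value in logits.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


def generate(start, temperature=1.0, max_tokens=10, rng=None):
    """Read the text, predict the next token, add it, look again, repeat."""
    rng = rng or random.Random()
    tokens = [start]
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS[tokens[-1]]  # "read" the text so far
        if temperature <= 0:
            # Treat zero temperature as "always take the top-ranked token".
            next_token = max(probs, key=probs.get)
        else:
            reshaped = apply_temperature(probs, temperature)
            next_token = rng.choices(
                list(reshaped), weights=list(reshaped.values())
            )[0]
        if next_token == "<end>":
            break
        tokens.append(next_token)  # add it, then loop to look again
    return " ".join(tokens)


print(generate("the", temperature=0))    # → the cat sat
print(generate("the", temperature=1.5))  # varies from run to run
```

At temperature 0 the sketch always takes the top-ranked token, so the output repeats exactly. Raise the temperature and lower-ranked tokens such as "moon" start to appear.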

Key terms

Next-token prediction
The whole engine: guess the next chunk of text, add it, repeat. It isn’t planning the ending — it’s choosing the next step.
Embeddings
Words turned into long lists of numbers (coordinates). Meaning becomes position: “king” sits near “queen.” Directions carry meaning too, so simple arithmetic on the coordinates works: king − man + woman ≈ queen.
Probabilities
At each step it assigns a probability to every candidate next token, and usually (not always) picks one near the top.
Temperature
How adventurous the pick is. Low = focused, repeatable (facts/code). High = creative, riskier.
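To make the Embeddings entry above concrete, here is a minimal sketch with hand-picked 2-D coordinates. Everything in it is an assumption for illustration: the `EMBEDDINGS` values are invented (one axis loosely stands for "royalty", the other for "gender"), and the helpers `cosine_similarity`, `nearest`, and `analogy` are hypothetical names. Real embeddings are learned from data and have hundreds or thousands of dimensions.

```python
import math

# Hypothetical 2-D embeddings, hand-picked so the arithmetic is easy to follow.
# Axis 0 loosely means "royalty", axis 1 loosely means "gender".
EMBEDDINGS = {
    "king": (0.9, 0.3),
    "queen": (0.9, -0.3),
    "man": (0.1, 0.3),
    "woman": (0.1, -0.3),
    "apple": (-0.7, 0.0),
}


def cosine_similarity(a, b):
    """Return 1.0 for vectors pointing the same way, -1.0 for opposite ways."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(vector, exclude=()):
    """Find the word whose embedding points most nearly the same way."""
    candidates = {w: v for w, v in EMBEDDINGS.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine_similarity(vector, candidates[w]))


def analogy(a, b, c):
    """Compute a - b + c on the coordinates and return the closest other word."""
    target = tuple(
        x - y + z for x, y, z in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])
    )
    return nearest(target, exclude={a, b, c})


print(analogy("king", "man", "woman"))  # → queen
```

In this toy, "king" also scores closer to "queen" than to "apple" under `cosine_similarity`, which is the "meaning becomes position" idea in miniature.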

The misconception to drop

“There’s a mind in the box that knows things and looks them up.”
It’s an extremely well-read autocomplete predicting the next word. That single mechanism explains both sides: the fluency is real, and so is the confident-but-wrong — it aims for what sounds likely, not what’s true.
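One way to picture "sounds likely, not true" is a tiny sketch where the most common continuation in the text the model has seen is simply the wrong one. The counts in `CONTINUATIONS` are made up for this example, as is the helper name `most_likely`. The only real-world fact used is that Australia's capital is Canberra.

```python
from collections import Counter

# Hypothetical counts of what followed "The capital of Australia is"
# in some imagined training text. These numbers are invented.
CONTINUATIONS = Counter({"Sydney": 7, "Canberra": 5, "Melbourne": 2})

TRUE_ANSWER = "Canberra"


def most_likely(counts):
    """Return the most frequent continuation and its share of all continuations."""
    total = sum(counts.values())
    token, count = counts.most_common(1)[0]
    return token, count / total


prediction, share = most_likely(CONTINUATIONS)
print(prediction, share)          # → Sydney 0.5
print(prediction == TRUE_ANSWER)  # → False
```

Nothing in the mechanism checks the answer against the world. It only asks which continuation is most probable, which is why a fluent answer still needs verifying.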

Use it better today

1. Because it predicts a continuation, a rich, specific start makes the answer you want the most likely one. A vague prompt leaves it guessing.
2. Because it predicts, not knows, verify the specifics (names, numbers, quotes) every time.

Ask the AI Tutor

Pause anytime and ask — the tutor answers from this lesson’s material.

What does “predict the next token” mean?
How can words become numbers?
What does temperature actually change?
Why does the same trick cause hallucination?