Large Language Model (LLM)
A neural network trained to predict the next token in a sequence. Chain-of-thought (CoT) prompting demonstrates that LLMs, despite this simple training objective, can produce step-by-step reasoning when shown appropriate worked examples in the prompt. Modern LLMs (GPT-3, PaLM, Claude, etc.) are commonly used with CoT variants.
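
As a rough illustration of what "shown appropriate examples" can mean in practice, the sketch below assembles a few-shot prompt in which each example spells out its reasoning before the answer. The helper name `build_cot_prompt`, the `Q:`/`A:` layout, and the sample questions are assumptions made for this example, not a format prescribed by any particular model or library.

```python
def build_cot_prompt(examples, question):
    """Assemble a few-shot chain-of-thought prompt (hypothetical helper).

    Each example contributes its question, a worked reasoning chain, and the
    final answer; the new question is appended with an empty answer slot for
    the model to complete.
    """
    parts = []
    for ex in examples:
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: {ex['reasoning']} The answer is {ex['answer']}."
        )
    # Leave the final answer open so the model continues with its own reasoning.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Illustrative worked example; the content is made up for this sketch.
examples = [
    {
        "question": "Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
                    "How many balls does he have now?",
        "reasoning": "Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. "
                     "5 + 6 = 11.",
        "answer": "11",
    }
]

prompt = build_cot_prompt(
    examples, "A baker has 23 muffins and sells 9. How many are left?"
)
print(prompt)
```

The resulting string would be sent to an LLM as-is; because the model only continues the text token by token, the worked example is what nudges it to write out intermediate steps before answering.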