Inference-Time Scaling

Appears in 1 paper

The broader principle of improving model performance by allocating more compute at inference time, rather than only at training time.

As used in Paper 23 — Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Model Parameters

Here the term names the general idea of spending additional compute when the model answers a query, rather than only during training, in order to improve performance. Test-time compute is one form of inference-time scaling.
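One common way to spend extra inference compute is best-of-N sampling: draw several candidate answers and keep the one a scorer rates highest. The sketch below is only a toy illustration of that idea, not the method of the paper cited above. `sample_answer` and `score` are hypothetical stand-ins for a stochastic model call and a verifier, and the numeric "answers" are an assumption made purely so the example runs without a real model.

```python
import random


def sample_answer(rng):
    """Hypothetical stand-in for one stochastic model sample."""
    return rng.gauss(0.0, 1.0)


def score(answer, target=2.0):
    """Hypothetical verifier: higher is better (closer to the target)."""
    return -abs(answer - target)


def best_of_n(n, seed=0):
    """Draw n candidates and return the one the verifier scores highest.

    n is the inference-time compute knob: a larger n costs more samples
    per query but never lowers the best score for a fixed seed, because
    the first n draws are a prefix of any longer run.
    """
    rng = random.Random(seed)
    candidates = [sample_answer(rng) for _ in range(n)]
    return max(candidates, key=score)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        best = best_of_n(n)
        print(f"n={n:3d}  best score={score(best):.3f}")
```

The model itself is unchanged across runs; only the number of samples per query grows, which is what distinguishes this from scaling compute at training time.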