Verifier

Appears in 2 papers

A component that evaluates whether a proposed solution is correct.

As used in Paper 23 — Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Model Parameters →

A component that evaluates whether a proposed solution is correct. For math problems, a verifier can be as simple as a Python interpreter: execute the candidate code and check whether it produces the right answer. Best-of-N selection returns the candidate the verifier scores highest, so the selection is only as good as the verifier.
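The idea can be illustrated with a minimal sketch, which is not taken from the paper. It assumes each candidate is a short Python snippet that stores its result in a variable named `answer`, and that a reference answer is available to check against. The names `execution_verifier` and `best_of_n` are hypothetical.

```python
from typing import Callable, Optional, Sequence


def execution_verifier(candidate_code: str, expected: object) -> float:
    """Return 1.0 if the code runs and sets `answer` to `expected`, else 0.0.

    Assumption: candidates store their result in a variable called `answer`.
    """
    namespace: dict = {}
    try:
        # NOTE: exec() is unsafe for untrusted code; a real system would sandbox this.
        exec(candidate_code, namespace)
    except Exception:
        return 0.0  # crashes and syntax errors count as incorrect
    return 1.0 if namespace.get("answer") == expected else 0.0


def best_of_n(
    candidates: Sequence[str], score: Callable[[str], float]
) -> Optional[str]:
    """Return the candidate with the highest verifier score (first one on ties)."""
    if not candidates:
        return None
    return max(candidates, key=score)


candidates = [
    "answer = (2 + 2) * 3",  # wrong operator precedence for the intended problem
    "answer = 1 / 0",        # raises ZeroDivisionError
    "answer = 2 + 2 * 3",    # evaluates to 8
]
best = best_of_n(candidates, lambda code: execution_verifier(code, expected=8))
print(best)
```

If the verifier gives every candidate the same score, `max` simply returns the first one, which is one concrete way a weak verifier degrades Best-of-N into an arbitrary pick.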

As used in Paper 24 — rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking →

A component (usually a process reward model, PRM, or an outcome reward model, ORM) that evaluates solution quality. rStar-Math uses both a PRM for step-level verification and code execution for outcome-level verification.
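One way to picture the two levels working together is the sketch below. It is an illustration under stated assumptions, not rStar-Math's actual procedure: `toy_step_scorer` is a hypothetical stand-in for a trained PRM, the min-aggregation over steps is an assumed choice, and the final code is assumed to store its result in a variable named `answer`.

```python
from typing import Callable, List


def step_level_score(steps: List[str], step_scorer: Callable[[str], float]) -> float:
    """Aggregate per-step scores with min, so one weak step sinks the solution."""
    return min((step_scorer(step) for step in steps), default=0.0)


def outcome_level_check(final_code: str, expected: object) -> bool:
    """Run the final code and compare its `answer` variable with the reference."""
    namespace: dict = {}
    try:
        # NOTE: exec() is unsafe for untrusted code; sandbox it in practice.
        exec(final_code, namespace)
    except Exception:
        return False
    return namespace.get("answer") == expected


def combined_score(
    steps: List[str],
    final_code: str,
    expected: object,
    step_scorer: Callable[[str], float],
) -> float:
    """Gate on the outcome check, then rank by step-level quality."""
    if not outcome_level_check(final_code, expected):
        return 0.0
    return step_level_score(steps, step_scorer)


def toy_step_scorer(step: str) -> float:
    """Hypothetical stand-in for a PRM: penalize steps that admit to guessing."""
    return 0.2 if "guess" in step.lower() else 0.9


careful = combined_score(
    ["Expand the product", "Collect like terms"], "answer = 3 * 4", 12, toy_step_scorer
)
lucky = combined_score(
    ["Guess the factorization", "Collect like terms"], "answer = 3 * 4", 12, toy_step_scorer
)
wrong = combined_score(["Expand the product"], "answer = 3 + 4", 12, toy_step_scorer)
print(careful, lucky, wrong)
```

Gating on the outcome first means a solution with a wrong final answer scores zero regardless of how plausible its steps look, while the step-level score separates careful solutions from ones that reached the right answer by luck.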
