Human Rater Agreement / Inter-Rater Reliability

Appears in 1 paper

Measure of how often different human raters agree on which output is better.

As used in Paper 15 — Training Language Models to Follow Instructions with Human Feedback →

In this paper, raters agreed with one another about 73% of the time. Lower agreement indicates that the preference between outputs is ambiguous; higher agreement indicates a clearer preference signal. Some disagreement is expected, because judgments of which output is better are partly a matter of subjective taste.
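As a rough illustration of how an agreement rate like this can be computed, the sketch below pools every pair of raters over every comparison and reports the fraction of pairs that picked the same output. This is an assumption about the calculation, not necessarily the procedure used in the paper; the function name `pairwise_agreement` and the sample labels are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(labels_by_item):
    """Fraction of rater pairs that chose the same output, pooled over items.

    labels_by_item: list of lists; each inner list holds one label per rater
    (for example "A" or "B") for a single comparison.
    """
    agree = 0
    total = 0
    for labels in labels_by_item:
        # Compare every unordered pair of raters on this item.
        for a, b in combinations(labels, 2):
            agree += (a == b)
            total += 1
    if total == 0:
        raise ValueError("need at least one item with two or more raters")
    return agree / total


# Hypothetical data: three comparisons, three raters each.
example = [
    ["A", "A", "A"],  # all 3 pairs agree
    ["A", "B", "A"],  # 1 of 3 pairs agrees
    ["B", "B", "A"],  # 1 of 3 pairs agrees
]
rate = pairwise_agreement(example)
print(f"{rate:.3f}")  # 5 agreeing pairs out of 9 -> 0.556
```

Raw agreement of this kind is not corrected for chance: with two options, raters choosing at random would still agree about half the time, so a figure such as 73% is best read against that baseline.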

