BPE (Byte Pair Encoding)
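A subword tokenisation algorithm that splits words into common subunits. It builds its vocabulary by repeatedly merging the most frequent adjacent pair of symbols, starting from single characters, so frequent words end up as single tokens while rare words are decomposed into smaller known pieces. For example, "unhappiness" might be split as "un" + "happi" + "ness" and stored as three tokens; the exact split depends on the merges learned from the training corpus. GPT-1's vocabulary: 40,478 BPE tokens.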
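
The merge procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tokeniser used by GPT-1: the function names (learn_bpe, merge_pair, encode) and the toy word counts are hypothetical, and real implementations add details such as end-of-word markers and byte-level fallback.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_counts, num_merges):
    """Learn up to `num_merges` merges from a {word: frequency} dict (toy version)."""
    # Each word starts as a tuple of single characters.
    vocab = {tuple(word): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, count in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += count
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): count for symbols, count in vocab.items()}
    return merges


def encode(word, merges):
    """Tokenise `word` by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical toy corpus: word -> frequency.
toy_corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges = learn_bpe(toy_corpus, num_merges=4)
print(encode("lowest", merges))  # ['low', 'est']
```

"lowest" never appears in the toy corpus, yet it is encoded as two known pieces, which is the sense in which BPE handles rare words by decomposing them.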