Query Chunk (Q Chunk)
The subset of query vectors held on a given GPU. In Ring Attention, GPU i computes attention for its Q chunk against all P KV chunks, one chunk per round. Q chunks remain on their original GPU; only the KV chunks circulate.
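The stationary-Q, circulating-KV pattern can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not a definitive implementation: the function names `full_attention` and `ring_attention` are hypothetical, the "GPUs" are plain Python lists, P is taken to be the number of GPUs in the ring, the sequence length is assumed divisible by P, and the usual 1/sqrt(d) score scaling and any masking are omitted. Partial results are merged across rounds with a running-max (online) softmax, which is one common way to combine them.

```python
import math

def full_attention(Q, K, V):
    """Reference: softmax(Q K^T) V with every key visible at once."""
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) for k in K]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(wj * v[c] for wj, v in zip(w, V)) / z
                    for c in range(len(V[0]))])
    return out

def ring_attention(Q, K, V, P):
    """Simulate P GPUs: each keeps its Q chunk and sees one KV chunk per round."""
    size = len(Q) // P  # assumption: sequence length divisible by P
    d = len(V[0])
    q_chunks = [Q[i * size:(i + 1) * size] for i in range(P)]
    kv_chunks = [(K[i * size:(i + 1) * size], V[i * size:(i + 1) * size])
                 for i in range(P)]
    # Per-query running state: (max score, normalizer, weighted value sum).
    state = [[(-math.inf, 0.0, [0.0] * d) for _ in range(size)]
             for _ in range(P)]
    held = list(range(P))             # held[i]: KV chunk currently on GPU i
    visited = [[] for _ in range(P)]  # KV chunks each GPU has processed
    for _ in range(P):                # P rounds, one KV chunk per round
        for i in range(P):
            Kc, Vc = kv_chunks[held[i]]
            visited[i].append(held[i])
            for r, q in enumerate(q_chunks[i]):  # Q chunk never moves
                m, z, acc = state[i][r]
                for k, v in zip(Kc, Vc):
                    s = sum(a * b for a, b in zip(q, k))
                    m_new = max(m, s)
                    scale = math.exp(m - m_new)  # rescale earlier partials
                    w = math.exp(s - m_new)
                    z = z * scale + w
                    acc = [a * scale + w * x for a, x in zip(acc, v)]
                    m = m_new
                state[i][r] = (m, z, acc)
        # Ring step: GPU i receives the KV chunk previously held by GPU i-1.
        held = [held[(i - 1) % P] for i in range(P)]
    out = [[a / z for a in acc] for gpu in state for (_, z, acc) in gpu]
    return out, visited
```

After P rounds, each simulated GPU has seen every KV chunk exactly once, and the concatenated per-GPU outputs should match `full_attention` up to floating-point error.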