Latency Hiding

Appears in 1 paper

Making communication latency disappear by overlapping it with computation.

As used in Paper 19 — Ring Attention with Blockwise Transformers for Near-Infinite Context

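If each step's computation time is at least as long as its communication time, the transfer finishes while the device is still computing, so the communication cost is "hidden" and adds nothing to wall-clock time. This is Ring Attention's key advantage: each device computes attention on its current key-value block while that block is being passed to the next device in the ring.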
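The hiding condition can be illustrated with a minimal sketch. It assumes an idealized model in which communication and computation overlap perfectly and do not contend for resources. The function names `step_time` and `exposed_comm` are hypothetical, chosen for illustration, and do not come from the paper.

```python
def step_time(compute_s: float, comm_s: float, overlap: bool) -> float:
    """Wall-clock time of one step under an idealized model (assumption).

    Without overlap the two phases run back to back, so their times add.
    With perfect overlap they run concurrently, so the slower one dominates.
    """
    if overlap:
        return max(compute_s, comm_s)
    return compute_s + comm_s


def exposed_comm(compute_s: float, comm_s: float) -> float:
    """Communication time that is NOT hidden behind computation.

    Zero whenever compute_s >= comm_s, which is the hiding condition.
    """
    return max(0.0, comm_s - compute_s)


if __name__ == "__main__":
    # Illustrative numbers only: 4 s of compute, 3 s of communication per step.
    print(step_time(4.0, 3.0, overlap=False))
    print(step_time(4.0, 3.0, overlap=True))
    print(exposed_comm(4.0, 3.0))
```

In this model, when communication takes longer than computation, only the difference shows up in wall-clock time, so even partial overlap reduces the cost.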