All-to-All Communication
A communication pattern where every GPU sends data to every other GPU.
With P GPUs this requires P(P-1) point-to-point transfers, so the number of messages grows as O(P²). Ring Attention avoids this by using a ring topology, in which each GPU communicates with only its 2 neighbours.
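
To make the scaling concrete, the following is a minimal sketch in plain Python that enumerates the directed (sender, receiver) pairs for each pattern. It does not use a real communication library, and the function names `all_to_all_links`, `ring_links`, and `ring_neighbours` are hypothetical, chosen only for illustration. It assumes GPUs are numbered 0 to P-1 and that in the ring each GPU sends to the next rank and receives from the previous one.

```python
def all_to_all_links(num_gpus):
    """Directed (src, dst) pairs when every GPU sends to every other GPU."""
    return [
        (src, dst)
        for src in range(num_gpus)
        for dst in range(num_gpus)
        if src != dst  # a GPU does not send to itself
    ]


def ring_links(num_gpus):
    """Directed (src, dst) pairs when each GPU sends only to the next rank."""
    return [(src, (src + 1) % num_gpus) for src in range(num_gpus)]


def ring_neighbours(rank, num_gpus):
    """The (previous, next) ranks that a GPU talks to in the ring."""
    return ((rank - 1) % num_gpus, (rank + 1) % num_gpus)


if __name__ == "__main__":
    # Illustrative GPU counts only; not taken from any benchmark.
    for p in (4, 8, 16):
        print(p, len(all_to_all_links(p)), len(ring_links(p)))
```

For P = 8 the all-to-all pattern has 8 × 7 = 56 directed pairs, while the ring has 8. Note that this counts distinct communicating pairs, not the total volume of data moved over the course of an algorithm.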