Alignment matrix (attention heatmap)
A grid where each row corresponds to a target word and each column to a source word. Each cell shows the attention weight αₜᵢ, that is, how much the model "looked at" source word i when generating target word t. Cells with high weights appear bright on the heatmap. Visualised in Figure 3 of the original paper.
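
As a rough illustration of how such a grid can be built and read, the sketch below turns a small table of raw scores into attention weights and prints them as a text heatmap. It assumes the weights in each row come from a softmax over the source positions, so every row sums to 1. The scores, the example words, and the helper names (softmax, alignment_matrix, render_heatmap) are made up for this example and are not taken from the original paper.

```python
import math

def softmax(scores):
    """Turn one row of raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def alignment_matrix(score_rows):
    """Row t holds the weights alpha[t][i] over all source positions i."""
    return [softmax(row) for row in score_rows]

def render_heatmap(alpha, target_words, shades=" .:*#"):
    """Draw the matrix as text: denser characters stand for larger weights."""
    lines = []
    for word, row in zip(target_words, alpha):
        cells = "".join(
            shades[min(int(w * len(shades)), len(shades) - 1)] for w in row
        )
        lines.append(f"{word:>8} |{cells}|")
    return "\n".join(lines)

# Made-up scores for illustration only: 3 target words x 3 source words.
source_words = ["the", "black", "cat"]
target_words = ["le", "chat", "noir"]
scores = [
    [4.0, 0.5, 0.5],
    [0.5, 0.5, 4.0],
    [0.5, 4.0, 0.5],
]

alpha = alignment_matrix(scores)
print(render_heatmap(alpha, target_words))

# For each target word, the source word with the largest weight in its row.
best = [source_words[row.index(max(row))] for row in alpha]
print(best)
```

Reading across a row shows where the attention went for that target word. Here the row for "chat" puts most of its weight on the third column ("cat"), which is the cell that would appear brightest in a real heatmap.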