Sigmoid
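The activation function σ(z) = 1/(1+e⁻ᶻ). It squashes any real number to a value strictly between 0 and 1. Key property: it is differentiable everywhere, so gradients can flow through it via the chain rule. Key problem: its derivative, σ'(z) = σ(z)(1 − σ(z)), never exceeds 0.25 (the maximum, reached at z = 0) and approaches 0 as |z| grows. Each sigmoid layer therefore scales the gradient by a factor of at most 0.25, which contributes to the vanishing gradient problem in deep networks.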
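
A minimal Python sketch of the definition and of the 0.25 bound on the derivative. The function names (sigmoid, sigmoid_grad, max_gradient_through) are illustrative and not taken from any library, and the last function looks only at the sigmoid factors in the chain rule, ignoring the weights, so it shows a best case for the activation's contribution rather than a real network's gradient.

```python
import math


def sigmoid(z: float) -> float:
    """sigma(z) = 1 / (1 + e^-z), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z),
    # so that exp() is never called on a large positive number.
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_grad(z: float) -> float:
    """Derivative sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


def max_gradient_through(depth: int) -> float:
    """Largest possible product of `depth` sigmoid derivatives.

    Assumption: every pre-activation sits at z = 0, where the derivative
    peaks at 0.25, and weights are ignored.
    """
    return sigmoid_grad(0.0) ** depth


if __name__ == "__main__":
    for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"z={z:+.1f}  sigmoid={sigmoid(z):.4f}  grad={sigmoid_grad(z):.4f}")
    for depth in (1, 5, 10):
        print(f"depth={depth:2d}  max sigmoid factor={max_gradient_through(depth):.3e}")
```

Even in this best case, the combined sigmoid factor after 10 layers is 0.25¹⁰, roughly 9.5 × 10⁻⁷. Away from z = 0 the derivative is smaller still, so the product shrinks faster.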