Section 04

How It Works — Step by Step

First Learning Machine The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain 1958

Let us walk through the Perceptron on a concrete example. We will train it to recognise the AND gate — one of the simplest possible classification tasks.

The AND gate: given two inputs (0 or 1), output 1 only if both inputs are 1. Otherwise output 0.

Input 1   Input 2   Correct Output
0         0         0
0         1         0
1         0         0
1         1         1

Step 1: Initialise the weights

The Perceptron usually starts with small random weights. For this walkthrough we pick the starting values by hand:

  • Weight for Input 1: w₁ = 0.5
  • Weight for Input 2: w₂ = 0.5
  • Threshold: θ = 0.7 (equivalent to a bias weight w₀ = −θ = −0.7 on a constant input of 1)
  • Learning rate (how big each update step is): η = 0.1

The learning rate controls how fast we adjust. Too big and we overshoot. Too small and learning is very slow. 0.1 is a common starting point.


Step 2: Feed in the first training example

Take the first row: Input 1 = 0, Input 2 = 0. Correct output = 0.

Compute the weighted sum:

sum = (w₁ × input₁) + (w₂ × input₂)
    = (0.5 × 0) + (0.5 × 0)
    = 0

Apply the threshold:

If sum ≥ threshold (0.7): output = 1
If sum < threshold (0.7): output = 0

0 < 0.7, so output = 0

Compare with correct answer (0). Correct! No update needed.
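The weighted sum and threshold check from this step can be sketched in a few lines of Python. This is a minimal illustration rather than code from the original paper; the function name `predict` and its parameter names are assumptions chosen for this example.

```python
def predict(inputs, weights, threshold):
    """Return 1 if the weighted sum reaches the threshold, otherwise 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


weights = [0.5, 0.5]  # w1 and w2 from Step 1
threshold = 0.7       # theta from Step 1

# First training row: both inputs are 0, so the weighted sum is 0.
output = predict([0, 0], weights, threshold)
print(output)  # 0, which matches the correct output for this row
```

Because the output already matches the target, the learning rule would leave the weights untouched here.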


Step 3: Feed in the second training example

Input 1 = 0, Input 2 = 1. Correct output = 0.

sum = (0.5 × 0) + (0.5 × 1) = 0.5
0.5 < 0.7, so output = 0

Correct again. No update.


Step 4: Feed in the third training example

Input 1 = 1, Input 2 = 0. Correct output = 0.

sum = (0.5 × 1) + (0.5 × 0) = 0.5
0.5 < 0.7, so output = 0

Correct. No update.


Step 5: Feed in the fourth training example

Input 1 = 1, Input 2 = 1. Correct output = 1.

sum = (0.5 × 1) + (0.5 × 1) = 1.0
1.0 ≥ 0.7, so output = 1

Correct! No update.

All four examples are correct. Strictly speaking, no learning was needed this time: our initial weights already happened to classify the AND gate correctly, so the first pass made no updates.
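To double-check Steps 2 through 5 in one go, the sketch below runs all four AND rows through the same weighted-sum-and-threshold rule with w₁ = w₂ = 0.5 and θ = 0.7. The names `predict` and `AND_EXAMPLES` are illustrative assumptions, not part of the original description.

```python
def predict(inputs, weights, threshold):
    """Return 1 if the weighted sum reaches the threshold, otherwise 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


# The AND truth table as (inputs, correct output) pairs.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0.5, 0.5]
threshold = 0.7

mistakes = 0
for inputs, target in AND_EXAMPLES:
    output = predict(inputs, weights, threshold)
    if output != target:
        mistakes += 1
    print(inputs, "predicted", output, "expected", target)

print("mistakes:", mistakes)  # 0 with these starting values
```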


Step 6: What a weight update looks like (when we make a mistake)

Let us see what would happen if we started with worse weights. Say w₁ = 0.2, w₂ = 0.2, θ = 0.7.

Feed in Input 1 = 1, Input 2 = 1. Correct output = 1.

sum = (0.2 × 1) + (0.2 × 1) = 0.4
0.4 < 0.7, so output = 0

WRONG. We predicted 0 but the answer is 1. This is a false negative.

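The Perceptron Learning Rule says: increase the weights for inputs that are active (non-zero) when we make this kind of mistake. The error is defined as correct output − predicted output, so here error = 1 − 0 = +1. For a false positive the error would be −1 and the same rule would decrease the weights instead.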

New w₁ = w₁ + (η × error × input₁)
       = 0.2 + (0.1 × 1 × 1)    ← error = +1 because we needed a 1 and got a 0
       = 0.3

New w₂ = w₂ + (η × error × input₂)
       = 0.2 + (0.1 × 1 × 1)
       = 0.3

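After this update: w₁ = 0.3, w₂ = 0.3. The sum for (1,1) is now 0.6, which is still below 0.7 and therefore still wrong, but we are getting warmer. One more update of the same kind gives w₁ = w₂ = 0.4, so the sum for (1,1) becomes 0.8 and clears the threshold. (In this walkthrough the threshold θ is held fixed and only the weights are adjusted.)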
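The single update above can be reproduced with a short Python sketch. The helper names `predict` and `updated_weights` are assumptions made for this illustration, and the threshold is kept fixed at 0.7 exactly as in the worked numbers.

```python
def predict(inputs, weights, threshold):
    """Return 1 if the weighted sum reaches the threshold, otherwise 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


def updated_weights(weights, inputs, target, output, learning_rate):
    """Apply new_w = w + learning_rate * error * input to every weight."""
    error = target - output  # +1 for a false negative, -1 for a false positive
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]


weights = [0.2, 0.2]
threshold = 0.7
learning_rate = 0.1
inputs, target = (1, 1), 1

output = predict(inputs, weights, threshold)  # 0.4 < 0.7, so a false negative
weights = updated_weights(weights, inputs, target, output, learning_rate)

print([round(w, 2) for w in weights])       # [0.3, 0.3]
print(predict(inputs, weights, threshold))  # still 0, because 0.6 < 0.7
```

The values are rounded for display only, since floating-point addition gives 0.30000000000000004 rather than exactly 0.3.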

The math for this is explained in full in The Mathematics section.


Step 7: Repeat until convergence

We cycle through all training examples repeatedly — each full cycle is called an epoch. After each epoch, we check: are all predictions correct? If yes, we are done. If no, keep going.

Rosenblatt proved that for linearly separable data (like AND), the Perceptron will always converge, meaning it classifies every training example correctly, after a finite number of updates, no matter what random weights you start with.
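The epoch loop can be sketched as follows. This is a simplified illustration under stated assumptions: it starts from the weaker weights used in Step 6, keeps the threshold fixed at 0.7 as the walkthrough does (many presentations also adjust the threshold or bias), and uses a hypothetical `max_epochs` cap as a safety stop. The function names `predict` and `train` are likewise assumptions.

```python
def predict(inputs, weights, threshold):
    """Return 1 if the weighted sum reaches the threshold, otherwise 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


def train(examples, weights, threshold, learning_rate, max_epochs=100):
    """Cycle through the examples until one full epoch has no mistakes."""
    weights = list(weights)  # copy so the caller's list is not modified
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(inputs, weights, threshold)
            if error != 0:
                mistakes += 1
                # Same rule as Step 6: w + learning_rate * error * input.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
        if mistakes == 0:
            return weights, epoch
    raise RuntimeError("no mistake-free epoch within max_epochs")


AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, epochs = train(AND_EXAMPLES, [0.2, 0.2], threshold=0.7, learning_rate=0.1)
print([round(w, 2) for w in weights], "after", epochs, "epochs")
```

With these starting values the loop should make one correction in each of the first two epochs and then pass a third epoch cleanly, ending with both weights near 0.4. The returned weights can then be used for prediction without any further updates.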


Step 8: Use the trained Perceptron

Once trained, the Perceptron’s weights are fixed. Given any new input, it computes the weighted sum, applies the threshold, and outputs 0 or 1. No more learning — just prediction.

This distinction between training (learning the weights) and inference (using the trained weights to predict) is fundamental to all machine learning systems, including the largest models today.

