How Neural Networks Learn

1

02

The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain
Frank Rosenblatt · 1958 · Beginner · 50 minutes

Turing asked whether machines could think. Rosenblatt built one that could learn. The Perceptron is the grandfather of every neural network in use today: an early machine that adjusted its own weights based on experience, rather than following rules someone wrote by hand.

2

03

Learning Representations by Back-propagating Errors
David Rumelhart, Geoffrey Hinton, Ronald Williams · 1986 · Intermediate · 55 minutes

The Perceptron could learn, but only simple patterns. Multi-layer networks could represent complex patterns, but there was no widely known way to train them. This paper popularized the answer: backpropagation, a single elegant algorithm that is still the beating heart of nearly every neural network trained today.

3

04

Long Short-Term Memory
Sepp Hochreiter, Jürgen Schmidhuber · 1997 · Intermediate

4

08

Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Łukasz Kaiser, Illia Polosukhin · 2017 · Intermediate

5

13

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei · 2020 · Intermediate

6

16

Let's Verify Step by Step
Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, Karl Cobbe · 2023 · Intermediate

Want to go deeper? Browse all 24 papers or explore the math behind them.
