1. 02
    The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain
    Frank Rosenblatt 1958 Beginner 50 minutes

    Turing asked if machines could think. Rosenblatt built one that could learn. The Perceptron is the ancestor of today's neural networks: one of the first machines that adjusted itself based on experience rather than following rules someone wrote by hand.

  2. 03
    Learning Representations by Back-propagating Errors
    David Rumelhart, Geoffrey Hinton, Ronald Williams 1986 Intermediate 55 minutes

    The Perceptron could learn, but only simple patterns. Multi-layer networks could represent complex patterns, but there was no widely known way to train them. This paper answered that question with a single elegant algorithm, backpropagation, which is still at the core of nearly every neural network trained today.

  3. 04
    Long Short-Term Memory
    Sepp Hochreiter, Jürgen Schmidhuber 1997 Intermediate
  4. 08
    Attention Is All You Need
    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Łukasz Kaiser, Illia Polosukhin 2017 Intermediate
  5. 13
    Scaling Laws for Neural Language Models
    Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei 2020 Intermediate
  6. 16
    Let's Verify Step by Step
    Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, Karl Cobbe 2023 Intermediate

Want to go deeper? Browse all 24 papers or explore the math behind them.