1.
    01
    Computing Machinery and Intelligence
    Alan Turing 1950 Beginner 45 minutes

    The paper that started it all. Alan Turing asked a deceptively simple question, "Can machines think?", and in trying to answer it proposed the imitation game, now known as the Turing test, which still frames much of how we think about artificial intelligence today.

  2.
    08
    Attention Is All You Need
    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Łukasz Kaiser, Illia Polosukhin 2017 Intermediate
  3.
    10
    Improving Language Understanding by Generative Pre-Training
    Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever 2018 Intermediate
  4.
    11
    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
    Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova 2018 Intermediate
  5.
    12
    Language Models are Few-Shot Learners
    Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei 2020 Intermediate
  6.
    13
    Scaling Laws for Neural Language Models
    Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei 2020 Intermediate
  7.
    14
    Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc V. Le, Denny Zhou 2022 Intermediate
  8.
    15
    Training Language Models to Follow Instructions with Human Feedback
    Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe 2022 Intermediate
  9.
    23
    Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Model Parameters
    Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar 2024 Intermediate
  10.
    24
    rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
    Xinyu Guan, Li Lyna Zhang, Yifei Liu, Ning Shang, Youran Sun, Yi Zhu, Fan Yang, Mao Yang 2025 Intermediate

Want to go deeper? Browse all 24 papers or explore the math behind them.