1. Efficient Estimation of Word Representations in Vector Space (Word2Vec)
    05 | Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean | 2013 | Intermediate
  2. Sequence to Sequence Learning with Neural Networks
    06 | Ilya Sutskever, Oriol Vinyals, Quoc V. Le | 2014 | Intermediate
  3. Neural Machine Translation by Jointly Learning to Align and Translate
    07 | Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio | 2014 | Intermediate
  4. Attention Is All You Need
    08 | Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Łukasz Kaiser, Illia Polosukhin | 2017 | Intermediate
  5. Improving Language Understanding by Generative Pre-Training
    10 | Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever | 2018 | Intermediate
  6. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
    11 | Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova | 2018 | Intermediate
  7. Language Models are Few-Shot Learners
    12 | Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei | 2020 | Intermediate
  8. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    14 | Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc V. Le, Denny Zhou | 2022 | Intermediate

Want to go deeper? Browse all 24 papers or explore the math behind them.