RoBERTa
Robustly Optimized BERT Pretraining Approach.
Facebook AI's 2019 replication study of BERT pretraining, which trained longer on more data with larger batch sizes and dropped the next sentence prediction (NSP) objective. Outperformed BERT-large on GLUE, SQuAD, and RACE, showing the original BERT was significantly undertrained.