Improving Language Models by Retrieving from Trillions of Tokens

Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche 0002, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, Laurent Sifre. Improving Language Models by Retrieving from Trillions of Tokens. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu 0001, Sivan Sabato, editors, International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA. Volume 162 of Proceedings of Machine Learning Research, pages 2206-2240, PMLR, 2022. [doi]

@inproceedings{BorgeaudMHCRM0L22,
  title = {Improving Language Models by Retrieving from Trillions of Tokens},
  author = {Sebastian Borgeaud and Arthur Mensch and Jordan Hoffmann and Trevor Cai and Eliza Rutherford and Katie Millican and George van den Driessche 0002 and Jean-Baptiste Lespiau and Bogdan Damoc and Aidan Clark and Diego de Las Casas and Aurelia Guy and Jacob Menick and Roman Ring and Tom Hennigan and Saffron Huang and Loren Maggiore and Chris Jones and Albin Cassirer and Andy Brock and Michela Paganini and Geoffrey Irving and Oriol Vinyals and Simon Osindero and Karen Simonyan and Jack W. Rae and Erich Elsen and Laurent Sifre},
  year = {2022},
  url = {https://proceedings.mlr.press/v162/borgeaud22a.html},
  researchr = {https://researchr.org/publication/BorgeaudMHCRM0L22},
  cites = {0},
  citedby = {0},
  pages = {2206-2240},
  booktitle = {International Conference on Machine Learning, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA},
  editor = {Kamalika Chaudhuri and Stefanie Jegelka and Le Song and Csaba Szepesvári and Gang Niu 0001 and Sivan Sabato},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
}