Language models with Transformers

arXiv preprint arXiv:1904.09408 (arXiv 2019), 2019-04-01