dm.cs.tu-dortmund.de/en/mlbits/neural-nlp-decoders/
Decoder Models – Lecture Notes
Le, Q.V. and Salakhutdinov, R. 2019. Transformer-XL: Attentive language models beyond a fixed-length context. Proc. Annual Meeting of the Association for Computational Linguistics (ACL) (2019), 2978–2988.
[FaLeDa18]
Fan, A. [...]
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D. and Sutskever, I. 2019. Language models are unsupervised multitask learners. (2019).
[XYHZ20]
Xiong, R., Yang, Y., He, D., Zheng, K., Zheng, S., Xing, …