dm.cs.tu-dortmund.de/en/mlbits/neural-nlp-finetuning/
Finetuning and Optimization – Lecture Notes
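| Type | Bits | Sign | Exponent | Mantissa | Hardware support |
|---|---|---|---|---|---|
| f64 | 64 | 1 | 11 | 52+1 | CPU |
| f32 | 32 | 1 | 8 | 23+1 | CPU |
| f16 | 16 | 1 | 5 | 10+1 | CPU since 2013 |
| bf16 | 16 | 1 | 8 | 7+1 | GPU since RTX, Ampere |
| bf8 E4M3 | 8 | 1 | 4 | 3+1 | GPU since Hopper |
| bf8 E5M2 | 8 | 1 | 5 | 2+1 | GPU since Hopper |
| int8 | 8 | 1 | 0 | 7 | CPU, GPU since [...] |

[...] translated data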
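The bit layouts of the formats listed above can be checked directly. The following is a minimal sketch, assuming only Python's standard `struct` module; the helper names (`split_bits`, `f32_fields`, `f16_fields`, `to_bf16_bits`, `from_bf16_bits`) are made up for illustration. It splits f32 and f16 values into their sign, exponent, and stored mantissa fields (the "+1" in the mantissa column presumably refers to the implicit leading bit, which is not stored). It also emulates bf16 by keeping the upper 16 bits of an f32, which works because both share 8 exponent bits. Plain truncation is a simplifying assumption here; real hardware typically rounds to nearest.

```python
import struct

def split_bits(x, fmt, exp_bits, man_bits):
    """Pack x with the given struct format and split the raw bits
    into (sign, exponent, stored mantissa) fields."""
    raw = int.from_bytes(struct.pack(fmt, x), "big")
    mantissa = raw & ((1 << man_bits) - 1)
    exponent = (raw >> man_bits) & ((1 << exp_bits) - 1)
    sign = raw >> (man_bits + exp_bits)
    return sign, exponent, mantissa

def f32_fields(x):
    # f32: 1 sign bit, 8 exponent bits, 23 stored mantissa bits
    return split_bits(x, ">f", 8, 23)

def f16_fields(x):
    # f16: 1 sign bit, 5 exponent bits, 10 stored mantissa bits
    return split_bits(x, ">e", 5, 10)

def to_bf16_bits(x):
    """Emulate bf16 by truncating an f32 to its upper 16 bits
    (assumption: round toward zero, not round to nearest)."""
    raw32 = int.from_bytes(struct.pack(">f", x), "big")
    return raw32 >> 16

def from_bf16_bits(bits):
    """Expand 16 bf16 bits back to a Python float by zero-padding
    the lower 16 bits of an f32."""
    return struct.unpack(">f", (bits << 16).to_bytes(4, "big"))[0]

if __name__ == "__main__":
    print(f32_fields(1.0))    # biased exponent 127, mantissa 0
    print(f16_fields(1.0))    # biased exponent 15, mantissa 0
    print(from_bf16_bits(to_bf16_bits(3.14159)))  # only 7 mantissa bits survive
```

The round trip through `to_bf16_bits` shows the trade-off in the table: bf16 keeps the full f32 exponent range but only 7 stored mantissa bits, so `3.14159` comes back as `3.140625`.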
References
[DLBZ22]
Dettmers, T., Lewis, M., Belkada, Y. and Zettlemoyer, L. 2022. LLM.int8(): 8-bit matrix multiplication for transformers at scale. CoRR abs/2208.07339 (2022). DOI: 10.48550/ARXIV …