LLM Efficiency Optimization

Reference

https://mp.weixin.qq.com/s/tdPrtsxOfnpyQzE25psdUQ
https://intro-llm.github.io/chapter/LLM-TAP-v2.pdf
https://medium.com/@florian_algo/model-quantization-1-basic-concepts-860547ec6aa9
KV cache
Model quantization
Training GPU memory calculation
Low-precision training