Large Language Models (LLMs)
References

- NLP fundamentals
    - https://web.stanford.edu/class/cs224n/index.html#schedule
- Build a Large Language Model (From Scratch)
    - https://github.com/rasbt/LLMs-from-scratch
- 大规模语言模型:从理论到实践 (Large-Scale Language Models: From Theory to Practice)
    - https://intro-llm.github.io/#chapter
    - https://github.com/intro-llm/intro-llm-code/tree/main
- llmbook
    - https://github.com/datawhalechina/llmbook/tree/main/slides/
- https://web.stanford.edu/~jurafsky/slp3/10.pdf
- https://luhengshiwo.github.io/LLMForEverybody/
- Hands-on projects
    - https://github.com/PTGWong/MovieAssistant-ChatGLM4-RAG
    - https://github.com/shubhamprajapati7748/ecommerce-chatbot
Contents

- Large Language Models
- Transformer variants
- Decoder-Only: GPT, LLaMA
- Encoder-Only: BERT
- MoEs
- Human alignment
- Distributed Training
- DeepSpeed in practice
- Supervised Fine-Tuning (SFT): LoRA, QLoRA (see the LoRA sketch after this list)
- Training a ChatGPT-style dialogue model with DeepSpeed-Chat
- Reinforcement Learning
- verl in practice
- Complex reasoning: Agent
- Building an Agent with LangChain
- RAG
- Implementing a retrieval-augmented generation system with LangChain
- LLM efficiency optimization
- vLLM in practice
- LLM evaluation
- Local LLM deployment in practice
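
The SFT entry above only names LoRA and QLoRA. As a minimal sketch of the LoRA idea (a frozen pretrained weight plus a trainable low-rank update), assuming PyTorch; the rank `r` and scaling `alpha` values are illustrative defaults, not taken from any of the listed references:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update: W x + (alpha/r) * B A x."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weight
        self.r, self.alpha = r, alpha
        # Only A and B are trained; B starts at zero so the wrapped layer
        # initially behaves exactly like the frozen original.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = (x @ self.lora_A.T) @ self.lora_B.T  # low-rank path
        return self.base(x) + (self.alpha / self.r) * delta

# Example: adapt a single projection layer
layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 10, 768))
print(out.shape)  # torch.Size([2, 10, 768])
```

Zero-initializing `lora_B` is the usual design choice: fine-tuning starts from the unmodified pretrained behavior, and only the small `A`/`B` matrices accumulate gradients.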
Paper Reading

- LLMs:
    - GPT-4
    - LLaMA
    - DeepSeek
- Position Embedding:
    - RoPE (see the formula sketch after this list)
- SFT:
    - LoRA
- RAG:
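
Since the reading list only names RoPE, here is the standard rotary-position-embedding rotation written in our own notation (a sketch of the idea, not a restatement of the paper's exact formulation): each pair of query/key dimensions $(2i, 2i+1)$ at position $m$ is rotated by the angle $m\theta_i$:

$$
\begin{pmatrix} q'_{m,2i} \\ q'_{m,2i+1} \end{pmatrix}
=
\begin{pmatrix} \cos m\theta_i & -\sin m\theta_i \\ \sin m\theta_i & \cos m\theta_i \end{pmatrix}
\begin{pmatrix} q_{m,2i} \\ q_{m,2i+1} \end{pmatrix},
\qquad
\theta_i = 10000^{-2i/d}.
$$

Because the same rotation is applied to queries and keys, the attention score $\langle R_m q, R_n k \rangle$ depends only on the relative offset $m - n$, which is how RoPE encodes relative position through a per-position absolute transform.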