Paper Reading Notes

Notes from my reading of papers and blog posts.

Contents

  • Su Jianlin's (苏神) series of blog posts on the Transformer

    • https://spaces.ac.cn/content.html?tag=attention
  • CNN:

    • Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position

    • LeNet-5:

      • Gradient-Based Learning Applied to Document Recognition
    • AlexNet:

      • ImageNet Classification with Deep Convolutional Neural Networks
    • VGG:

      • Very Deep Convolutional Networks for Large-Scale Image Recognition
    • GoogLeNet:

      • Going Deeper with Convolutions
    • ResNet:

      • Deep Residual Learning for Image Recognition
    • CNN Survey:

      • Convolutional Neural Networks: A Survey (Moez Krichen 2023)
  • RNN:

    • Neural Networks and Physical Systems with Emergent Collective Computational Abilities

    • Learning Internal Representations by Error Propagation

    • Finding Structure in Time

    • LSTM:

      • Long Short-Term Memory
    • Bi-RNN:

      • Bidirectional Recurrent Neural Networks
    • GRU:

      • Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
    • RNN Survey:

      • Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks
      • Recurrent Neural Networks and Long Short-Term Memory Networks: Tutorial and Survey (Benyamin Ghojogh)
  • Encoder-Decoder and Attention:

    • Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
    • Sequence to Sequence Learning with Neural Networks
    • Neural Machine Translation by Jointly Learning to Align and Translate
    • Effective Approaches to Attention-based Neural Machine Translation
  • Transformer:

    • Attention Is All You Need (see the attention sketch after this list)
    • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
    • T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
    • Longformer: The Long-Document Transformer
    • Performer: Rethinking Attention with Performers
    • Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
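
To anchor the Transformer entries above, here is a minimal NumPy sketch of the scaled dot-product attention defined in "Attention Is All You Need", Attention(Q, K, V) = softmax(QKᵀ/√d_k)·V. The function name, shapes, and toy data are illustrative assumptions, not taken from the paper or any library.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d_k), K: (m, d_k), V: (m, d_v) -> (n, d_v). Names are illustrative."""
    d_k = Q.shape[-1]
    # Similarity scores, scaled by sqrt(d_k) to keep the softmax well-conditioned.
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax: weights in each row sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage: 2 query positions attending over 3 key/value positions.
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 8)
```

The multi-head attention in the paper runs several such maps in parallel over linearly projected Q, K, V and concatenates the outputs; variants in the list above such as Longformer and Performer replace the full n×m score matrix with sparse or kernel-based approximations.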