
Transformer


Sequence-to-sequence (Seq2seq)


Hokkien (Min Nan / Taiwanese)


Text to Speech (TTS) Synthesis


Seq2Seq for Chatbot


Most Natural Language Processing applications…


Seq2seq for Syntactic Parsing


Seq2seq for Multi-label Classification


Seq2seq for Object Detection


Seq2seq


Encoder


Batch Norm: for the same dimension, across different features from different examples (i.e., across the batch), compute the mean $m$ and standard deviation $\sigma$.

Layer Norm: for the same example and the same feature vector, across its different dimensions, compute the mean $m$ and standard deviation $\sigma$.
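The difference between the two is just the axis the statistics are taken over. A minimal NumPy sketch (the toy batch shape and the epsilon are illustrative assumptions, not from the lecture):

```python
import numpy as np

# Toy batch: 4 examples, each a 3-dimensional feature vector.
x = np.random.randn(4, 3)

# Batch Norm: statistics per dimension, computed across the examples
# in the batch (axis 0), so each of the 3 dimensions gets its own m, sigma.
bn_mean = x.mean(axis=0, keepdims=True)   # shape (1, 3)
bn_std = x.std(axis=0, keepdims=True)
x_bn = (x - bn_mean) / (bn_std + 1e-5)

# Layer Norm: statistics per example, computed across the dimensions
# of that one feature vector (axis 1), so each example gets its own m, sigma.
ln_mean = x.mean(axis=1, keepdims=True)   # shape (4, 1)
ln_std = x.std(axis=1, keepdims=True)
x_ln = (x - ln_mean) / (ln_std + 1e-5)
```

Layer Norm is what the Transformer uses, since it needs no batch statistics and works for a single variable-length sequence.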


To learn more…


Autoregressive
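An autoregressive decoder emits one token at a time, feeding each output token back in as input until it produces END. A minimal greedy-decoding sketch (`step_fn`, the token ids, and the toy vocabulary are hypothetical stand-ins for a real decoder network):

```python
def greedy_decode(step_fn, bos, eos, max_len=20):
    """Autoregressive decoding: each chosen token is appended to the
    prefix and fed back in; stop at the end-of-sequence token."""
    seq = [bos]
    for _ in range(max_len):
        logits = step_fn(seq)  # scores over the vocabulary given the prefix
        tok = max(range(len(logits)), key=logits.__getitem__)  # greedy argmax
        seq.append(tok)
        if tok == eos:
            break
    return seq

# Toy step_fn: vocab {0: BOS, 1: EOS, 2: 'a'}; emits 'a' twice, then EOS.
def toy_step(prefix):
    return [0.0, 1.0, 0.0] if len(prefix) >= 3 else [0.0, 0.0, 1.0]

greedy_decode(toy_step, bos=0, eos=1)  # -> [0, 2, 2, 1]
```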


Self-attention → Masked Self-attention
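In the decoder, position $t$ may only attend to positions $\le t$, since the future tokens have not been generated yet. A minimal NumPy sketch (assumes Q, K, V have already been projected from the inputs):

```python
import numpy as np

def masked_self_attention(Q, K, V):
    """Self-attention where position t attends only to positions <= t."""
    T, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                       # (T, T) attention logits
    mask = np.triu(np.ones((T, T)), k=1).astype(bool)   # True above the diagonal
    scores[mask] = -np.inf                              # block attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V
```

Position 0 can only attend to itself, so its output is exactly its own value vector.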


AT vs NAT


Transformer


Cross Attention
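In cross attention the queries come from the decoder, while the keys and values come from the encoder's output, which is how the decoder reads the input sequence. A minimal NumPy sketch (the projection matrices `Wq`, `Wk`, `Wv` are illustrative parameters):

```python
import numpy as np

def cross_attention(decoder_h, encoder_h, Wq, Wk, Wv):
    """Queries from decoder states; keys and values from encoder output."""
    Q = decoder_h @ Wq   # (T_dec, d)
    K = encoder_h @ Wk   # (T_enc, d)
    V = encoder_h @ Wv   # (T_enc, d)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # (T_dec, T_enc)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)             # softmax over encoder positions
    return w @ V                                   # (T_dec, d)
```

The output length matches the decoder side, while each output row is a weighted mix of encoder value vectors.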


Training


Copy Mechanism


Guided Attention


Optimizing Evaluation Metrics?


Exposure Bias
