BERT and its family


Pre-train Model


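BERT-style pre-training is self-supervised: a fraction of the input tokens is corrupted and the model is trained to recover the originals from context. A minimal sketch of the corruption step, using the 80/10/10 rule from the BERT paper (the `[MASK]` string and the vocabulary here are illustrative stand-ins, not real WordPiece ids):

```python
import random

def mask_tokens(tokens, vocab, mask_rate=0.15, rng=None):
    """BERT-style corruption: pick ~15% of positions; of the picked tokens,
    replace 80% with [MASK], 10% with a random token, and leave 10% unchanged.
    Returns the corrupted sequence and the positions the model must predict."""
    rng = rng or random.Random(0)
    corrupted, targets = list(tokens), []
    for i in range(len(tokens)):
        if rng.random() < mask_rate:
            targets.append(i)  # the model is trained to recover tokens[i] here
            r = rng.random()
            if r < 0.8:
                corrupted[i] = "[MASK]"
            elif r < 0.9:
                corrupted[i] = rng.choice(vocab)
            # else: keep the original token (the model still predicts it)
    return corrupted, targets
```

Because some selected tokens are left unchanged, the model cannot assume that an unmasked token is always correct, which keeps its representations useful at fine-tuning time when no `[MASK]` appears.
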
Bigger Model


Smaller Model


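Most of the smaller BERT variants (DistilBERT, TinyBERT, and similar) rely on knowledge distillation: a small student network is trained to match the large teacher's softened output distribution rather than just the hard labels. A hedged sketch of the distillation loss; the temperature value and logits are illustrative:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled, numerically stable softmax."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between the teacher's and student's temperature-softened
    distributions; a higher T exposes more of the teacher's 'dark knowledge'
    about which wrong classes are almost right."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T))
    return -(p_teacher * log_p_student).sum()
```

When the student's logits match the teacher's exactly, the loss reduces to the entropy of the teacher's softened distribution, which is its minimum.
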
Network Architecture


How to fine-tune


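Fine-tuning attaches a small task-specific head to the pre-trained model; one can either freeze the pre-trained part and train only the head, or update everything end-to-end. A minimal sketch of the "frozen backbone + trained head" option, with the pre-trained encoder stubbed out as a fixed feature function (the toy data and projection are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_encoder(x):
    """Stand-in for a pre-trained encoder: a fixed, never-updated projection."""
    W = np.array([[1.0, -1.0], [0.5, 2.0]])
    return np.tanh(x @ W)

# Toy binary task whose labels depend on the frozen features.
X = rng.normal(size=(200, 2))
feats = frozen_encoder(X)
y = (feats[:, 0] + feats[:, 1] > 0).astype(float)

# Task head: logistic regression trained by gradient descent;
# only w and b are updated -- the encoder stays frozen.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(feats @ w + b)))
    grad_w = feats.T @ (p - y) / len(y)
    grad_b = (p - y).mean()
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

acc = (((feats @ w + b) > 0) == (y > 0.5)).mean()
```

Updating the whole network usually performs better but costs a full copy of the weights per task; the frozen variant is cheaper and is the setting that the adaptor and weighted-feature tricks below improve on.
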
NLP tasks


Input


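For tasks whose input is more than one sentence (NLI, QA, and similar), the sentences are packed into a single sequence with special tokens: `[CLS]` at the front and `[SEP]` between and after the segments, plus segment ids telling the model which sentence each token came from. A sketch over token strings (real BERT would use WordPiece ids):

```python
def pack_pair(tokens_a, tokens_b):
    """Build a BERT-style two-segment input:
    [CLS] a1 .. an [SEP] b1 .. bm [SEP], with matching segment ids."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids
```
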
Output


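The two common output shapes are one prediction for the whole input (read off the `[CLS]` position) and one prediction per token (a shared classifier applied to every position, as in tagging tasks). A sketch over a dummy encoder output matrix `H` with one row per token; the weight matrix is an illustrative stand-in for the task head:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_classes = 7, 16, 3
H = rng.normal(size=(n, d))          # encoder output, one row per token

W = rng.normal(size=(d, n_classes))  # task head (shared across positions)

sentence_logits = H[0] @ W           # [CLS] row -> one label for the input
token_logits = H @ W                 # every row -> one label per token
```
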
Copy from Input (BERT)


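Extractive QA ("copy from input") does not generate the answer; it points at a span of the document. Two learned vectors score every document token as a candidate start and end, and the answer is the best-scoring span. A sketch with random stand-ins for the trained vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10, 8
H = rng.normal(size=(n, d))   # encoder output for the document tokens
v_start = rng.normal(size=d)  # learned start-scoring vector (stand-in)
v_end = rng.normal(size=d)    # learned end-scoring vector (stand-in)

start = int(np.argmax(H @ v_start))
# the end position is only searched at or after the start position
end = start + int(np.argmax(H[start:] @ v_end))
answer_span = (start, end)    # copy tokens start..end out of the document
```
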
General Sequence (V1)


General Sequence (V2)


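In the V2 setup the pre-trained model itself produces the output sequence token by token: each step feeds the tokens generated so far back in and picks the next one, stopping at an end-of-sequence symbol. A greedy-decoding sketch, with a toy next-token function standing in for the model:

```python
def greedy_decode(next_token_fn, max_len=20, eos="<eos>"):
    """Autoregressive greedy decoding: repeatedly append the model's
    most likely next token until <eos> or the length limit."""
    out = []
    while len(out) < max_len:
        tok = next_token_fn(out)
        if tok == eos:
            break
        out.append(tok)
    return out

# Toy 'model': spells a fixed word, then emits <eos>.
word = list("bert")
toy_model = lambda prefix: word[len(prefix)] if len(prefix) < len(word) else "<eos>"
```
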
Adaptor


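An adaptor is a small bottleneck module inserted inside each layer: down-project to a few dimensions, apply a nonlinearity, up-project back, and add a residual connection. Only the adaptor weights are trained, so each downstream task stores a tiny set of extra parameters instead of a full copy of the model. A NumPy sketch; initializing the up-projection to zeros makes the adaptor start as the identity, a common initialization trick:

```python
import numpy as np

class Adapter:
    def __init__(self, d, bottleneck, rng):
        self.W_down = rng.normal(scale=0.02, size=(d, bottleneck))
        self.W_up = np.zeros((bottleneck, d))  # zero init => identity at start

    def __call__(self, h):
        # residual connection: h + up(relu(down(h)))
        z = np.maximum(h @ self.W_down, 0.0)
        return h + z @ self.W_up

rng = np.random.default_rng(0)
adapter = Adapter(d=16, bottleneck=4, rng=rng)
h = rng.normal(size=(5, 16))
out = adapter(h)
```

Because the adaptor is the identity at initialization, inserting it does not perturb the pre-trained network before training begins.
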
Weighted Features


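Instead of taking only the last layer, the downstream task can learn a weighted sum of every layer's representation (as ELMo does), since different layers capture different kinds of linguistic information. A sketch with softmax-normalized scalar weights; the learnable scores are illustrative:

```python
import numpy as np

def weighted_features(layer_outputs, scores):
    """layer_outputs: (L, n, d) stack of per-layer representations.
    scores: L learnable scalars; the softmax makes the weights sum to 1."""
    w = np.exp(scores - np.max(scores))
    w = w / w.sum()
    return np.tensordot(w, layer_outputs, axes=1)  # -> (n, d)
```

With equal scores this reduces to a plain average over layers; training the scores lets the task pick out the layers it finds most useful.
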
Why Pre-train Models?


Why Fine-tune?
