BERT and its family

Published 2024-01-01 · Updated 2024-05-26

Contents:
- Pre-train Model
  - Bigger Model
  - Smaller Model
  - Network Architecture
- How to finetune
  - NLP tasks
    - Input
    - Output
      - Copy from Input (BERT)
      - General Sequence (V1)
      - General Sequence (V2)
  - Adaptor
  - Weighted Features
- Why Pre-train Models?
- Why Fine-tune?