BERT and its family

Published 2024-01-01 · Updated 2024-05-26

Contents:
- Pre-train Model
  - Bigger Model
  - Smaller Model
  - Network Architecture
- How to finetune
  - NLP tasks
    - Input
    - Output
      - Copy from Input (BERT)
      - General Sequence (V1)
      - General Sequence (V2)
  - Adaptor
  - Weighted Features
- Why Pre-train Models?
- Why Fine-tune?