Model NLP
The config file of an NLP model contains three main sections:
- trainer: contains the configs for PTL training; more details can be found in ../../introduction/core.html#model-training and in the PTL Trainer class API (https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html#trainer-class-api).
- exp_manager: contains the configs of the experiment manager; more details can be found in ../../introduction/core.html#experiment-manager.
- model: contains the configs of the datasets, model architecture, tokenizer, optimizer, scheduler, etc.
The following sub-sections of the model section are shared among most of the NLP models (a minimal layout sketch follows this list):
- tokenizer: specifies the tokenizer
- language_model: specifies the underlying model to be used as the encoder
- optim: the configs of the optimizer and scheduler; see ../../introduction/core.html
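As an illustration, the overall layout might look like the sketch below. This is only a minimal, hypothetical example: the concrete keys and values (number of epochs, optimizer settings, experiment name, etc.) are placeholders and vary per model.

```yaml
# Hypothetical sketch of an NLP model config layout; values are illustrative only.
trainer:
  max_epochs: 5
  precision: 16

exp_manager:
  exp_dir: null                 # where logs and checkpoints are written
  name: my_nlp_experiment       # placeholder experiment name

model:
  tokenizer:
    tokenizer_name: bert-base-uncased   # usually derived from the language model name
  language_model:
    pretrained_model_name: bert-base-uncased
  optim:
    name: adam                  # optimizer
    lr: 2e-5
    sched:
      name: WarmupAnnealing     # learning rate scheduler
```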
The tokenizer and language_model sections have the following parameters:
| Parameter | Data Type | Description |
|---|---|---|
| model.tokenizer.tokenizer_name | string | Tokenizer name; filled automatically based on model.language_model.pretrained_model_name |
| model.tokenizer.vocab_file | string | Path to the tokenizer vocabulary file |
| model.tokenizer.tokenizer_model | string | Path to the tokenizer model (only for the SentencePiece tokenizer) |
| model.language_model.pretrained_model_name | string | Pre-trained language model name, for example: bert-base-cased or bert-base-uncased |
| model.language_model.lm_checkpoint | string | Path to the pre-trained language model checkpoint |
| model.language_model.config_file | string | Path to the pre-trained language model config file |
| model.language_model.config | dictionary | Config of the pre-trained language model |
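Taken together, these parameters might appear in the model section as in the sketch below; the paths shown are hypothetical placeholders, and fields that are not needed can typically be left unset.

```yaml
# Hypothetical values; paths are placeholders, not taken from the original documentation.
model:
  tokenizer:
    tokenizer_name: bert-base-uncased   # usually filled automatically from pretrained_model_name
    vocab_file: /path/to/vocab.txt      # placeholder path
    tokenizer_model: null               # only needed for a SentencePiece tokenizer
  language_model:
    pretrained_model_name: bert-base-uncased
    lm_checkpoint: null                 # optional path to a pre-trained checkpoint
    config_file: null                   # optional path to a language model config file
    config: null                        # optional dictionary with the language model config
```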
The parameter model.language_model.pretrained_model_name can be one of the following (see the example after the list):

- megatron-bert-345m-uncased, megatron-bert-345m-cased, biomegatron-bert-345m-uncased, biomegatron-bert-345m-cased
- bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased
- distilbert-base-uncased, distilbert-base-cased
- roberta-base, roberta-large, distilroberta-base
- albert-base-v1, albert-large-v1, albert-xlarge-v1, albert-xxlarge-v1, albert-base-v2, albert-large-v2, albert-xlarge-v2, albert-xxlarge-v2
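Selecting one of these models amounts to setting this parameter in the config; roberta-base below is used purely as an illustration.

```yaml
# Hypothetical example: choosing one of the supported pre-trained model names.
model:
  language_model:
    pretrained_model_name: roberta-base
```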