The TF-NLP library provides a collection of scripts for training and evaluating transformer-based models on various tasks such as sentence classification, question answering, and translation. Additionally, we provide checkpoints of pretrained models that can be finetuned on downstream tasks.
Model Garden can be easily installed with `pip install tf-models-nightly`. After installation, check out this instruction on how to train models with this codebase.
By default, the experiment runs on GPUs. To run on TPUs, one should overwrite `runtime.distribution_strategy` and set the TPU address. See `RuntimeConfig` for details.
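As a sketch, a TPU run might be launched via `--params_override` like this. The field names `runtime.distribution_strategy` and `runtime.tpu` are assumptions based on `RuntimeConfig`; verify them against your installed version, and note that `${TPU_NAME}` is a placeholder for your Cloud TPU name or gRPC address:

```shell
# Hypothetical sketch: point the runtime at a TPU via --params_override.
# runtime.distribution_strategy and runtime.tpu are assumed RuntimeConfig
# fields; check them against your version of the codebase.
python3 train.py \
  --experiment=${EXPERIMENT} \
  --mode=train_and_eval \
  --model_dir=${MODEL_DIR} \
  --params_override="runtime.distribution_strategy=tpu,runtime.tpu=${TPU_NAME}"
```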
In general, the experiments can run with the following command by setting the corresponding `${EXPERIMENT}`, `${TASK_CONFIG}`, `${MODEL_CONFIG}`, and `${EXTRA_PARAMS}`:
```shell
EXPERIMENT=???
TASK_CONFIG=???
MODEL_CONFIG=???
EXTRA_PARAMS=???
MODEL_DIR=???  # a folder to hold checkpoints and logs
python3 train.py \
  --experiment=${EXPERIMENT} \
  --mode=train_and_eval \
  --model_dir=${MODEL_DIR} \
  --config_file=${TASK_CONFIG} \
  --config_file=${MODEL_CONFIG} \
  --params_override=${EXTRA_PARAMS}
```
- `EXPERIMENT` can be found under `configs/`
- `TASK_CONFIG` can be found under `configs/experiments/`
- `MODEL_CONFIG` can be found under `configs/models/`
- `train.py` looks up the registered `ExperimentConfig` with `${EXPERIMENT}`
- Overrides params in `TaskConfig` with those in `${TASK_CONFIG}`
- Overrides the `model` params in `TaskConfig` with `${MODEL_CONFIG}`
- Overrides any params in `ExperimentConfig` with `${EXTRA_PARAMS}`
Note that `${TASK_CONFIG}`, `${MODEL_CONFIG}`, and `${EXTRA_PARAMS}` can be omitted when the `EXPERIMENT` defaults are enough, and each is only guaranteed to be compatible with the `${EXPERIMENT}` that defines it.
| NAME | EXPERIMENT | TASK_CONFIG | MODEL_CONFIG | EXTRA_PARAMS |
|---|---|---|---|---|
| BERT-base GLUE/MNLI-matched finetune | bert/sentence_prediction | glue_mnli_matched.yaml | bert_en_uncased_base.yaml | data and bert-base hub init: `task.train_data.input_path=/path-to-your-training-data,task.validation_data.input_path=/path-to-your-val-data,task.hub_module_url=https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4` |
| BERT-base GLUE/MNLI-matched finetune | bert/sentence_prediction | glue_mnli_matched.yaml | bert_en_uncased_base.yaml | data and bert-base ckpt init: `task.train_data.input_path=/path-to-your-training-data,task.validation_data.input_path=/path-to-your-val-data,task.init_checkpoint=gs://tf_model_garden/nlp/bert/uncased_L-12_H-768_A-12/bert_model.ckpt` |
| BERT-base SQuAD v1.1 finetune | bert/squad | squad_v1.yaml | bert_en_uncased_base.yaml | data and bert-base hub init: `task.train_data.input_path=/path-to-your-training-data,task.validation_data.input_path=/path-to-your-val-data,task.hub_module_url=https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4` |
| ALBERT-base SQuAD v1.1 finetune | bert/squad | squad_v1.yaml | albert_base.yaml | data and albert-base hub init: `task.train_data.input_path=/path-to-your-training-data,task.validation_data.input_path=/path-to-your-val-data,task.hub_module_url=https://tfhub.dev/tensorflow/albert_en_base/3` |
| Transformer-large WMT14/en-de scratch | wmt_transformer/large | | | ende-32k sentencepiece: `task.sentencepiece_model_path='gs://tf_model_garden/nlp/transformer_wmt/ende_bpe_32k.model'` |
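As a concrete illustration, the first table row might be launched as follows. This is a sketch only: the relative locations of the YAML files under `configs/` are assumptions based on the directories listed above, the data paths are placeholders you must fill in, and `MODEL_DIR` is an arbitrary example path:

```shell
# Hypothetical invocation of the BERT-base GLUE/MNLI-matched finetune row.
# Adjust the config paths to where glue_mnli_matched.yaml and
# bert_en_uncased_base.yaml live in your checkout.
EXPERIMENT=bert/sentence_prediction
TASK_CONFIG=configs/experiments/glue_mnli_matched.yaml
MODEL_CONFIG=configs/models/bert_en_uncased_base.yaml
EXTRA_PARAMS="task.train_data.input_path=/path-to-your-training-data,task.validation_data.input_path=/path-to-your-val-data,task.hub_module_url=https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4"
MODEL_DIR=/tmp/mnli_finetune  # example path for checkpoints and logs
python3 train.py \
  --experiment=${EXPERIMENT} \
  --mode=train_and_eval \
  --model_dir=${MODEL_DIR} \
  --config_file=${TASK_CONFIG} \
  --config_file=${MODEL_CONFIG} \
  --params_override=${EXTRA_PARAMS}
```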