# Multi-task library

## Overview

The multi-task library offers lightweight interfaces and common components to
support multi-task training and evaluation. It makes no assumptions about task
types or specific model structure details; instead, it is designed as a
scaffold that effectively composes single tasks together. Common training
schedules are implemented in the default modules, leaving room for further
extension to customized use cases.

The multi-task library supports:

- *joint* training: individual tasks perform forward passes to produce a
  joint loss, and a single backward pass follows.
- *alternating* training: individual tasks perform independent forward and
  backward passes. The mixture of tasks is controlled by sampling which task
  trains at each step.
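The difference between the two modes can be sketched in a few lines of plain
Python (toy losses, weights, and sampling probabilities, not the library's
API):

```python
import random

# Toy stand-ins: each "task" returns a scalar loss for the current step.
task_losses = {"classify": lambda: 0.7, "tag": lambda: 0.3}
task_weights = {"classify": 1.0, "tag": 0.5}

def joint_step():
    # Joint training: every task runs a forward pass, the weighted losses
    # are summed, and a single backward pass would follow on the total.
    return sum(task_weights[name] * loss_fn()
               for name, loss_fn in task_losses.items())

def alternating_step(rng):
    # Alternating training: sample one task for this step; that task alone
    # runs its forward and backward pass, so the mixture of tasks comes
    # from the sampling distribution.
    name = rng.choices(list(task_losses), weights=[0.75, 0.25])[0]
    return name, task_losses[name]()

print(joint_step())  # weighted sum: 1.0 * 0.7 + 0.5 * 0.3
```

In the real library the summed loss would feed one optimizer update, while the
alternating variant applies each task's own update in turn.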

## Library components

### Interfaces

* [multitask.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/multitask.py#L15)
  serves as a holder of multiple
  [`Task`](https://github.com/tensorflow/models/blob/master/official/core/base_task.py#L34)
  instances and carries information about multi-task scheduling, such as
  task weights.

* [base_model.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/base_model.py)
  offers access to each single task's forward computation, where each task is
  represented as a `tf.keras.Model` instance. Parameter sharing between tasks
  is left to the concrete implementation.
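The constructor-injection style of parameter sharing that the base model
interface leaves to implementations can be sketched with plain Python classes
standing in for `tf.keras.Model` (all names below are illustrative, not the
library's API):

```python
# A shared "backbone" passed into two task models via their constructors,
# so both tasks compute over one set of backbone parameters.
class Backbone:
    def encode(self, x):
        # Stand-in for the shared forward computation.
        return [v * 2 for v in x]

class ClassifierModel:
    def __init__(self, backbone):
        self.backbone = backbone        # injected, possibly shared
    def forward(self, x):
        return sum(self.backbone.encode(x))   # toy classification head

class TaggerModel:
    def __init__(self, backbone):
        self.backbone = backbone        # same instance -> shared parameters
    def forward(self, x):
        return max(self.backbone.encode(x))   # toy tagging head

shared = Backbone()
clf, tagger = ClassifierModel(shared), TaggerModel(shared)
print(clf.forward([1, 2]), tagger.forward([1, 2]))  # 6 4
assert clf.backbone is tagger.backbone  # one backbone, two task heads
```

With `tf.keras.Model` subclasses the same pattern applies: constructing both
task models around one encoder instance is what makes the weights shared.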

### Common components

* [base_trainer.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/base_trainer.py)
  provides an abstraction for optimizing a multi-task model over
  heterogeneous datasets. By default it conducts a joint backward step. Tasks
  can be balanced by setting a different task weight on each corresponding
  task loss.

* [interleaving_trainer.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/interleaving_trainer.py)
  derives from the base trainer and hence shares its housekeeping logic, such
  as loss and metric aggregation and reporting. Unlike the base trainer,
  which conducts a joint backward step, the interleaving trainer alternates
  between tasks and effectively mixes single-task training steps on
  heterogeneous datasets. Task sampling with respect to a probability
  distribution will be supported to facilitate task balancing.

* [evaluator.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/evaluator.py)
  combines the evaluation of each single task. It simply loops through the
  specified tasks and conducts evaluation on the corresponding datasets.

* [train_lib.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/train_lib.py)
  puts together the model, tasks, and trainer, then triggers the training and
  evaluation execution.

* [configs.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/configs.py)
  provides a top-level view of the entire system. Configuration objects are
  mimicked or composed from the corresponding single-task components to reuse
  them whenever possible and maintain consistency. For example,
  [`TaskRoutine`](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/configs.py#L25)
  effectively reuses
  [`Task`](https://github.com/tensorflow/models/blob/master/official/core/base_task.py#L34),
  and
  [`MultiTaskConfig`](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/configs.py#L34)
  serves a role similar to that of
  [`TaskConfig`](https://github.com/tensorflow/models/blob/master/official/core/config_definitions.py#L211).
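How these components could fit together can be sketched with toy stand-ins
(hypothetical names, not the library's API): an interleaving-style schedule
samples tasks by probability, a training loop dispatches one single-task step
at a time, and an evaluator-style loop visits every task:

```python
import collections
import random

def interleave_schedule(task_probs, num_steps, seed=0):
    """Sample which task trains at each step, proportional to task_probs."""
    rng = random.Random(seed)
    names, weights = zip(*task_probs.items())
    return [rng.choices(names, weights=weights)[0] for _ in range(num_steps)]

def run_training(train_steps, schedule):
    """Interleaving-style loop: one single-task step per scheduled task."""
    for name in schedule:
        train_steps[name]()

def evaluate_all(eval_fns):
    """Evaluator-style loop: evaluate every task in turn."""
    return {name: fn() for name, fn in eval_fns.items()}

# Toy tasks that just count how often they were trained.
counts = collections.Counter()
train_steps = {"classify": lambda: counts.update(["classify"]),
               "tag": lambda: counts.update(["tag"])}

schedule = interleave_schedule({"classify": 0.75, "tag": 0.25}, 1000)
run_training(train_steps, schedule)
results = evaluate_all({"classify": lambda: counts["classify"],
                        "tag": lambda: counts["tag"]})
print(results)  # roughly a 3:1 mix between the two tasks
```

In the real library the scheduled steps call each task's own `train_step()`
on its own dataset, and the evaluator collects per-task metrics instead of
step counts.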

### Notes on single-task composability

The library is designed to assemble a multi-task model by composing
single-task implementations. This is reflected in many aspects:

* The base model interface allows a single task's `tf.keras.Model`
  implementation to be reused, given that the parts shared in a potential
  multi-task case are passed in through the constructor. A good example of
  this is
  [`BertClassifier`](https://github.com/tensorflow/models/blob/master/official/nlp/modeling/models/bert_classifier.py#L24)
  and
  [`BertSpanLabeler`](https://github.com/tensorflow/models/blob/master/official/nlp/modeling/models/bert_span_labeler.py),
  where the backbone network is initialized outside of the classifier object.
  Hence a multi-task model that conducts both classification and sequence
  labeling with a shared backbone encoder can easily be created from existing
  code.

* The multi-task interface holds a set of `Task` objects, hence completely
  reusing the input functions, loss functions, and metrics, together with the
  corresponding aggregation and reduction logic. **Note: in the multi-task
  training situation,
  [`build_model()`](https://github.com/tensorflow/models/blob/master/official/core/base_task.py#L144)
  is not used**, because a partially shared structure cannot be specified
  from only one single task.

* The interleaving trainer works on top of each single task's
  [`train_step()`](https://github.com/tensorflow/models/blob/master/official/core/base_task.py#L223).
  This hides the optimization details of each single task and lets the
  trainer focus on optimization scheduling and task balancing.
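The division of labor in the last bullet can be sketched as follows: each task
owns its full train step, while a hypothetical round-robin scheduler (not the
library's actual code) only decides which task runs next and which data it
sees:

```python
import itertools

# Each "task" owns its full train step (forward, loss, backward); the
# trainer below knows nothing about those details.
def make_task(name, log):
    def train_step(batch):
        log.append((name, batch))   # stand-in for forward/backward work
    return train_step

def round_robin_train(task_steps, batches_per_task, num_steps):
    """Hypothetical scheduler: cycle through tasks, feeding each its own
    data and leaving optimization details inside each task's train_step."""
    order = itertools.cycle(task_steps)
    iters = {name: iter(batches_per_task[name]) for name in task_steps}
    for _ in range(num_steps):
        name = next(order)
        task_steps[name](next(iters[name]))

log = []
task_steps = {"classify": make_task("classify", log),
              "tag": make_task("tag", log)}
data = {"classify": range(10), "tag": range(10)}
round_robin_train(task_steps, data, num_steps=4)
print(log)  # [('classify', 0), ('tag', 0), ('classify', 1), ('tag', 1)]
```

Swapping the round-robin order for probabilistic sampling is all it takes to
recover the task-balancing behavior described above, without touching any
task's `train_step`.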