
Commit 3afb839

Updated internal links with external github links in Multi-Task Library
PiperOrigin-RevId: 640221637
1 parent c42c666 commit 3afb839


official/nlp/docs/multi_task.md

Lines changed: 97 additions & 0 deletions
# Multi-task library

## Overview

The multi-task library offers lightweight interfaces and common components to
support multi-task training and evaluation. It makes no assumptions about task
types or specific model structure details; instead, it is designed as a
scaffold that effectively composes single tasks together. Common training
schedules are implemented in the default modules, which leave room for further
extension to customized use cases.

The multi-task library supports two training modes, sketched in code after the
list below:
-   *joint* training: individual tasks perform forward passes to produce a
    joint loss, and a single backward pass happens.
-   *alternating* training: individual tasks perform independent forward and
    backward passes. The mixture of tasks is controlled by sampling different
    tasks for train steps.
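
To make the two modes concrete, here is a minimal conceptual sketch in plain
TensorFlow. The `tasks` list and its `weight`, `iterator`, and `forward_loss`
attributes are hypothetical stand-ins for illustration, not the library's
actual API.

```python
import random

import tensorflow as tf


def joint_train_step(tasks, trainable_vars, optimizer):
  """Joint training: one weighted joint loss, then one backward pass."""
  with tf.GradientTape() as tape:
    losses = [t.weight * t.forward_loss(next(t.iterator)) for t in tasks]
    joint_loss = tf.add_n(losses)
  grads = tape.gradient(joint_loss, trainable_vars)
  optimizer.apply_gradients(zip(grads, trainable_vars))


def alternating_train_step(tasks, trainable_vars, optimizer):
  """Alternating training: sample one task and run its own step."""
  task = random.choices(tasks, weights=[t.weight for t in tasks], k=1)[0]
  with tf.GradientTape() as tape:
    loss = task.forward_loss(next(task.iterator))
  grads = tape.gradient(loss, trainable_vars)
  optimizer.apply_gradients(zip(grads, trainable_vars))
```
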
## Library components

### Interfaces

*   [multitask.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/multitask.py#L15)
    serves as the holder of multiple
    [`Task`](https://github.com/tensorflow/models/blob/master/official/core/base_task.py#L34)
    instances, as well as of information about multi-task scheduling, such as
    task weights.

*   [base_model.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/base_model.py)
    offers access to each single task's forward computation, where each task is
    represented as a `tf.keras.Model` instance. Parameter sharing between tasks
    is left to the concrete implementation, as sketched below.

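As a generic illustration of that constructor-based sharing pattern, the hedged
sketch below builds two single-task `tf.keras.Model` heads around one shared
backbone layer; all names here are hypothetical, and this is not the
`base_model.py` API itself.

```python
import tensorflow as tf

# Hypothetical shared backbone; any encoder works the same way.
inputs = tf.keras.Input(shape=(16,))
backbone = tf.keras.layers.Dense(64, activation="relu", name="shared_backbone")
features = backbone(inputs)

# Two single-task models built around the same backbone layer, so its
# parameters are shared between the two tasks.
classifier = tf.keras.Model(
    inputs, tf.keras.layers.Dense(3, name="class_logits")(features))
regressor = tf.keras.Model(
    inputs, tf.keras.layers.Dense(1, name="regression")(features))

# Both models reference the identical backbone variables.
assert backbone.kernel is classifier.get_layer("shared_backbone").kernel
```
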
### Common components
*   [base_trainer.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/base_trainer.py)
    provides an abstraction for optimizing a multi-task model that involves
    heterogeneous datasets. By default it conducts a joint backward step. Tasks
    can be balanced by setting a different task weight on each corresponding
    task loss.

*   [interleaving_trainer.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/interleaving_trainer.py)
    derives from the base trainer and hence shares its housekeeping logic, such
    as loss and metric aggregation and reporting. Unlike the base trainer,
    which conducts a joint backward step, the interleaving trainer alternates
    between tasks and effectively mixes single-task training steps on
    heterogeneous datasets. Task sampling with respect to a probability
    distribution will be supported to facilitate task balancing (see the
    sampling sketch after this list).

*   [evaluator.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/evaluator.py)
    conducts the evaluation of each single task: it simply loops through the
    specified tasks and evaluates each one with its corresponding dataset.

*   [train_lib.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/train_lib.py)
    puts together the model, tasks, and trainer, and triggers training and
    evaluation execution.

*   [configs.py](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/configs.py)
    provides a top-level view of the entire system. Configuration objects are
    mirrored or composed from the corresponding single-task components to
    maximize reuse and maintain consistency. For example,
    [`TaskRoutine`](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/configs.py#L25)
    effectively reuses
    [`Task`](https://github.com/tensorflow/models/blob/master/official/core/base_task.py#L34),
    and
    [`MultiTaskConfig`](https://github.com/tensorflow/models/blob/master/official/modeling/multitask/configs.py#L34)
    plays a role similar to that of
    [`TaskConfig`](https://github.com/tensorflow/models/blob/master/official/core/config_definitions.py#L211).

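Here is a minimal sketch of the probability-based task sampling mentioned for
the interleaving trainer; the task names and weights are hypothetical, and this
is not the library's own sampling code.

```python
import tensorflow as tf

# Hypothetical task weights, normalized into a sampling distribution.
task_names = ["classification", "span_labeling"]
task_weights = tf.constant([2.0, 1.0])
logits = tf.math.log(task_weights / tf.reduce_sum(task_weights))


def sample_task_index():
  """Draws a task index with probability proportional to its weight."""
  return int(tf.random.categorical(logits[tf.newaxis, :], num_samples=1)[0, 0])


# Each train step would then run the sampled task's own train step.
counts = [0, 0]
for _ in range(1000):
  counts[sample_task_index()] += 1
print(dict(zip(task_names, counts)))  # roughly 2:1 on average
```
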
### Notes on single task composability

The library is designed to put together a multi-task model by composing
single-task implementations. This is reflected in several aspects:

*   The base model interface allows a single task's `tf.keras.Model`
    implementation to be reused, given that the parts shared in a potential
    multi-task case are passed in through the constructor. A good example of
    this is
    [`BertClassifier`](https://github.com/tensorflow/models/blob/master/official/nlp/modeling/models/bert_classifier.py#L24)
    and
    [`BertSpanLabeler`](https://github.com/tensorflow/models/blob/master/official/nlp/modeling/models/bert_span_labeler.py),
    where the backbone network is initialized outside the classifier object and
    passed in. Hence a multi-task model that conducts both classification and
    sequence labeling with a shared backbone encoder can easily be created from
    existing code (see the sketch after this list).

*   The multi-task interface holds a set of `Task` objects, and hence
    completely reuses their input functions, loss functions, and metrics,
    together with the corresponding aggregation and reduction logic. **Note:
    in the multi-task training situation, the tasks'
    [`build_model()`](https://github.com/tensorflow/models/blob/master/official/core/base_task.py#L144)
    methods are not used**, since a partially shared structure cannot be
    specified with only one single task.

*   The interleaving trainer works on top of each single task's
    [`train_step()`](https://github.com/tensorflow/models/blob/master/official/core/base_task.py#L223).
    This hides the optimization details inside each single task, so that the
    trainer can focus on optimization scheduling and task balancing.
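
To illustrate the shared-backbone composition above, here is a hedged sketch
assuming the `official.nlp.modeling` API (`networks.BertEncoder`,
`models.BertClassifier`, `models.BertSpanLabeler`); the exact constructor
arguments may differ across library versions.

```python
from official.nlp.modeling import models
from official.nlp.modeling import networks

# One shared backbone encoder, constructed outside the task models.
encoder = networks.BertEncoder(vocab_size=30522, num_layers=4)

# Two single-task models reuse the same encoder instance, so the backbone
# parameters are shared between classification and span labeling.
classifier = models.BertClassifier(network=encoder, num_classes=2)
span_labeler = models.BertSpanLabeler(network=encoder)
```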
