|
| 1 | +# Model Garden overview |
| 2 | + |
| 3 | +The TensorFlow Model Garden provides implementations of many state-of-the-art |
| 4 | +machine learning (ML) models for vision and natural language processing (NLP), |
| 5 | +as well as workflow tools to let you quickly configure and run those models on |
| 6 | +standard datasets. Whether you are looking to benchmark performance for a |
| 7 | +well-known model, verify the results of recently released research, or extend |
| 8 | +existing models, the Model Garden can help you drive your ML research and |
| 9 | +applications forward. |
| 10 | + |
| 11 | +The Model Garden includes the following resources for machine learning |
| 12 | +developers: |
| 13 | + |
| 14 | +- [**Official models**](#official) for vision and NLP, maintained by Google |
| 15 | + engineers |
| 16 | +- [**Research models**](#research) published as part of ML research papers |
| 17 | +- [**Training experiment framework**](#training_framework) for fast, |
| 18 | + declarative training configuration of official models |
| 19 | +- [**Specialized ML operations**](#ops) for vision and natural language |
| 20 | + processing (NLP) |
| 21 | +- [**Model training loop**](#orbit) management with Orbit |
| 22 | + |
| 23 | +These resources are built to be used with the TensorFlow Core framework and |
| 24 | +integrate with your existing TensorFlow development projects. Model |
| 25 | +Garden resources are also provided under an [open |
| 26 | +source](https://github.com/tensorflow/models/blob/master/LICENSE) license, so |
| 27 | +you can freely extend and distribute the models and tools. |
| 28 | + |
| 29 | +Practical ML models are computationally intensive to train and run, and may |
| 30 | +require accelerators such as Graphics Processing Units (GPUs) and Tensor |
| 31 | +Processing Units (TPUs). Most of the models in Model Garden were trained on |
| 32 | +large datasets using TPUs. However, you can also train and run these models on |
| 33 | +GPU and CPU processors. |
| 34 | + |
| 35 | +## Model Garden models |
| 36 | + |
| 37 | +The machine learning models in the Model Garden include full code so you can |
| 38 | +test, train, or re-train them for research and experimentation. The Model Garden |
| 39 | +includes two primary categories of models: *official models* and *research |
| 40 | +models*. |
| 41 | + |
| 42 | +### Official models {:#official} |
| 43 | + |
| 44 | +The [Official Models](https://github.com/tensorflow/models/tree/master/official) |
| 45 | +repository is a collection of state-of-the-art models, with a focus on |
| 46 | +vision and natural language processing (NLP). |
| 47 | +These models are implemented using current TensorFlow 2.x high-level |
| 48 | +APIs. Model libraries in this repository are optimized for fast performance and |
| 49 | +actively maintained by Google engineers. The official models include additional |
| 50 | +metadata you can use to quickly configure experiments using the Model Garden |
| 51 | +[training experiment framework](#training_framework). |
| 52 | + |
| 53 | +### Research models {:#research} |
| 54 | + |
| 55 | +The [Research Models](https://github.com/tensorflow/models/tree/master/research) |
| 56 | +repository is a collection of models published as code resources for research |
| 57 | +papers. These models are implemented using both TensorFlow 1.x and 2.x. Model |
| 58 | +libraries in the research folder are supported by the code owners and the |
| 59 | +research community. |
| 60 | + |
| 61 | +## Training experiment framework {:#training_framework} |
| 62 | + |
| 63 | +The Model Garden training experiment framework lets you quickly assemble and |
| 64 | +run training experiments using its official models and standard datasets. The |
| 65 | +training framework uses additional metadata included with the Model Garden's |
| 66 | +official models to allow you to configure models quickly using a declarative |
| 67 | +programming model. You can define a training experiment using Python commands in |
| 68 | +the [TensorFlow Model library](../../api_docs/python/tfm/core) |
| 69 | +or configure training using a YAML configuration file, like this |
| 70 | +[example](https://github.com/tensorflow/models/blob/master/official/vision/configs/experiments/image_classification/imagenet_resnet50_tpu.yaml). |
| 71 | + |
| 72 | +The training framework uses |
| 73 | +[`tfm.core.base_trainer.ExperimentConfig`](../../api_docs/python/tfm/core/base_trainer/ExperimentConfig) |
| 74 | +as the configuration object, which contains the following top-level |
| 75 | +configuration objects: |
| 76 | + |
| 77 | +- [`runtime`](https://www.tensorflow.org/api_docs/python/tfm/core/base_task/RuntimeConfig): |
| 78 | + Defines the processing hardware, distribution strategy, and other |
| 79 | + performance optimizations |
| 80 | +- [`task`](https://www.tensorflow.org/api_docs/python/tfm/core/config_definitions/TaskConfig): |
| 81 | + Defines the model, training data, losses, and initialization |
| 82 | +- [`trainer`](https://www.tensorflow.org/api_docs/python/tfm/core/base_trainer/TrainerConfig): |
| 83 | + Defines the optimizer, training loops, evaluation loops, summaries, and |
| 84 | + checkpoints |
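
The three-part layout above can be sketched in plain Python. This is only an illustration of the declarative pattern: the `*Sketch` classes, their fields, and `apply_overrides` are hypothetical stand-ins, not the `tfm.core` classes. The nested dictionary plays the role of values loaded from a YAML file.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, Dict


@dataclass
class RuntimeSketch:
    # Hypothetical hardware and distribution settings.
    distribution_strategy: str = "mirrored"
    mixed_precision_dtype: str = "float32"


@dataclass
class TaskSketch:
    # Hypothetical model and data settings.
    model_name: str = "resnet50"
    train_batch_size: int = 64


@dataclass
class TrainerSketch:
    # Hypothetical optimizer and loop settings.
    optimizer: str = "sgd"
    train_steps: int = 1000
    checkpoint_interval: int = 100


@dataclass
class ExperimentSketch:
    # Mirrors the runtime / task / trainer split described above.
    runtime: RuntimeSketch = field(default_factory=RuntimeSketch)
    task: TaskSketch = field(default_factory=TaskSketch)
    trainer: TrainerSketch = field(default_factory=TrainerSketch)


def apply_overrides(config: Any, overrides: Dict[str, Any]) -> Any:
    """Recursively copies values from a nested dict onto a dataclass tree."""
    known = {f.name for f in fields(config)}
    for key, value in overrides.items():
        if key not in known:
            raise KeyError(f"Unknown config key: {key}")
        current = getattr(config, key)
        if is_dataclass(current) and isinstance(value, dict):
            # Descend into a nested section such as "runtime" or "trainer".
            apply_overrides(current, value)
        else:
            setattr(config, key, value)
    return config


# Start from defaults, then override only the values that differ.
experiment = apply_overrides(
    ExperimentSketch(),
    {
        "runtime": {"distribution_strategy": "tpu"},
        "trainer": {"train_steps": 5000},
    },
)
```

Sections that are not overridden keep their defaults, which is the main convenience of configuring an experiment declaratively rather than constructing every object by hand.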
| 85 | + |
| 86 | +For a complete example using the Model Garden training experiment framework, |
| 87 | +see the |
| 88 | +[Image classification with Model Garden](../../tutorials/images/classification_with_model_garden) |
| 89 | +tutorial. For more information on the training experiment framework, check out the |
| 90 | +[TensorFlow Models API documentation](../../api_docs/python/tfm/core). |
| 91 | +If you are looking for a solution to manage training loops for your model |
| 92 | +training experiments, check out [Orbit](#orbit). |
| 93 | + |
| 94 | +## Specialized ML operations {:#ops} |
| 95 | + |
| 96 | +The Model Garden contains many vision and NLP operations specifically designed |
| 97 | +to execute state-of-the-art models that run efficiently on GPUs and TPUs. Review |
| 98 | +the TensorFlow Models Vision library API docs for a list of specialized [vision |
| 99 | +operations](../../api_docs/python/tfm/vision). Review the |
| 100 | +TensorFlow Models NLP Library API docs for a list of [NLP |
| 101 | +operations](../../api_docs/python/tfm/nlp). These libraries |
| 102 | +also include additional utility functions used for vision and NLP data |
| 103 | +processing, training, and model execution. |
| 104 | + |
| 105 | +## Training loops with Orbit {:#orbit} |
| 106 | + |
| 107 | +The Orbit tool is a flexible, lightweight library designed to make it easier to |
| 108 | +write custom training loops in TensorFlow 2.x, and works well with the Model |
| 109 | +Garden [training experiment framework](#training_framework). Orbit handles |
| 110 | +common model training tasks such as saving checkpoints, running model |
| 111 | +evaluations, and setting up summary writing. It seamlessly integrates with |
| 112 | +`tf.distribute` and supports running on different device types, including CPU, |
| 113 | +GPU, and TPU hardware. The Orbit tool is also [open |
| 114 | +source](https://github.com/tensorflow/models/blob/master/orbit/LICENSE), so you |
| 115 | +can extend and adapt it to your model training needs. |
| 116 | + |
| 117 | +You generally train TensorFlow models by writing a |
| 118 | +[custom training loop](https://www.tensorflow.org/guide/keras/writing_a_training_loop_from_scratch), |
| 119 | +or by using the high-level Keras |
| 120 | +[Model.fit](../../api_docs/python/tf/keras/Model#fit) |
| 121 | +method. For simple models, either approach works: you can manage a custom |
| 122 | +training loop with low-level TensorFlow APIs such as `tf.GradientTape` and |
| 123 | +`tf.function`, or let `Model.fit` run the loop for you. |
| 124 | + |
| 125 | +However, if your model is complex and your training loop requires more flexible |
| 126 | +control or customization, then you should use Orbit. You can define most of your |
| 127 | +training loop by extending Orbit's `AbstractTrainer` class. Learn more about the |
| 128 | +Orbit tool in the [Orbit API documentation](../../api_docs/python/orbit). |
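
To make the division of labor concrete, here is a minimal pure-Python sketch of the kind of outer loop a library like Orbit manages for you: run a block of training steps, save a checkpoint, evaluate, and repeat. It does not use the Orbit API. `CounterTrainer`, `train_and_evaluate`, and all parameter names are hypothetical, and the "training" is a toy loss decay.

```python
class CounterTrainer:
    """Toy trainer: each 'step' shrinks a fake loss by 10%."""

    def __init__(self):
        self.global_step = 0
        self.loss = 10.0

    def train(self, num_steps):
        # The inner loop is the part you would write yourself.
        for _ in range(num_steps):
            self.loss *= 0.9
            self.global_step += 1
        return {"loss": self.loss}


def train_and_evaluate(trainer, evaluate_fn, checkpoint_fn,
                       total_steps, steps_per_loop):
    """Outer loop: train a block of steps, checkpoint, then evaluate."""
    history = []
    while trainer.global_step < total_steps:
        # Never run past total_steps on the final block.
        num_steps = min(steps_per_loop, total_steps - trainer.global_step)
        train_logs = trainer.train(num_steps)
        checkpoint_fn(trainer.global_step)
        eval_logs = evaluate_fn()
        history.append((trainer.global_step, train_logs, eval_logs))
    return history


trainer = CounterTrainer()
saved_steps = []  # Stands in for writing checkpoints to disk.
history = train_and_evaluate(
    trainer,
    evaluate_fn=lambda: {"eval_loss": trainer.loss},
    checkpoint_fn=saved_steps.append,
    total_steps=25,
    steps_per_loop=10,
)
```

In this sketch only `CounterTrainer.train` is model-specific. The surrounding bookkeeping is generic, which is why it is a good candidate to hand off to a library.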
| 129 | + |
| 130 | +Note: You can use the Keras API to do what Orbit does, but you must override |
| 131 | +the Keras `Model.train_step` method or use callbacks like `ModelCheckpoint` or |
| 132 | +`TensorBoard`. For more information about modifying the behavior of `train_step`, |
| 133 | +check out the |
| 134 | +[Customize what happens in Model.fit](https://www.tensorflow.org/guide/keras/customizing_what_happens_in_fit) |
| 135 | +page. |