Commit 4306a91

Update model-card-annotated.md

Updates to address [this issue](#1125):
- Addition of training regime in the annotated model card to keep this doc and the template in sync.
- Defined `training_regime`, along with examples.

1 parent fdbce63 · commit 4306a91

File tree

1 file changed: +10 −0 lines changed


docs/hub/model-card-annotated.md

Lines changed: 10 additions & 0 deletions
```diff
@@ -158,6 +158,9 @@ _Write 1-2 sentences on what the training data is. Ideally this links to a Datas
 
 ## Training Procedure [optional]
 
+_When you want to know what hardware you'll need to fine-tune a model, consider the following factors: the number of parameters in the model and the training regime you plan to use._
+
+_For instance, a model with 3B parameters in fp32 precision format needs at least 48GB of GPU memory, while bf16 requires at least 24GB of memory with Ampere or higher hardware. Mixed fp16 requires at least 54GB of GPU memory._
 
 ### Preprocessing
 
```
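The memory figures in the added text come from a rough accounting of what full fine-tuning keeps on the GPU: the weights, their gradients, and the optimizer states. A minimal sketch of that arithmetic (my own illustrative heuristic, not from the commit; real requirements also depend on batch size, sequence length, and activation memory, so these numbers will not reproduce any published figure exactly):

```python
# Illustrative rule-of-thumb estimator for fine-tuning GPU memory.
# Assumptions (not from the diff): full fine-tuning with Adam; mixed
# precision keeps an fp32 master copy of the weights alongside the
# fp16/bf16 working copy. Activations and framework overhead are ignored.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}

def estimate_training_gib(n_params: float, regime: str) -> float:
    """Very rough lower bound on GPU memory (GiB) for full fine-tuning."""
    if regime in ("fp32", "fp16", "bf16"):
        # weights + gradients in the working precision,
        # plus two Adam moment buffers in fp32
        per_param = 2 * BYTES_PER_PARAM[regime] + 2 * 4
    elif regime in ("mixed fp16", "mixed bf16"):
        half = regime.split()[1]
        # fp16/bf16 weights + gradients, an fp32 master copy of the
        # weights, plus two fp32 Adam moment buffers
        per_param = 2 * BYTES_PER_PARAM[half] + 4 + 2 * 4
    else:
        raise ValueError(f"unknown training regime: {regime}")
    return n_params * per_param / 2**30

for regime in ("fp32", "bf16", "mixed fp16"):
    gib = estimate_training_gib(3e9, regime)
    print(f"{regime}: ~{gib:.0f} GiB for 3B parameters")
```

Note that the heuristic lands in the same ballpark as the diff's 48GB fp32 figure (3B × 16 bytes ≈ 45 GiB), while the 24GB bf16 figure is closer to what inference or memory-efficient optimizers need than to full Adam fine-tuning.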
```diff
@@ -166,6 +169,13 @@ _Write 1-2 sentences on what the training data is. Ideally this links to a Datas
 
 _Detail tokenization, resizing/rewriting (depending on the modality), etc._
 
+### Training Hyperparameters
+
+
+* **Training regime:** `training_regime`
+
+_Detail the model training process, specifically the type of precision used - whether it is **fp32/fp16/bf16** - and whether it is **mixed or non-mixed precision**._
+
 ### Speeds, Sizes, Times
 
```
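Filled in for a concrete model, the new `training_regime` field and its accompanying detail might read as follows (hypothetical values, not part of the commit):

```markdown
### Training Hyperparameters

* **Training regime:** bf16 mixed precision

The model was fine-tuned in bf16 mixed precision: forward and backward
passes in bf16, with fp32 master weights and optimizer states.
```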
