@@ -313,6 +313,17 @@ rm -rf $OUTPUT_DIR && \
--gin_bindings=train_eval.warmstart_policy_dir=\"$WARMSTART_OUTPUT_DIR/saved_policy\"
```

+ You can also start TensorBoard to monitor the training process:
+
+ ```shell
+ tensorboard --logdir=$OUTPUT_DIR
+ ```
+
+ Check the reward_distribution section first to assess model performance. It
+ shows the average reward and the percentiles of the reward distribution
+ during training. A positive reward means an improvement over the heuristic,
+ and a negative reward means a regression.
+
### Evaluate trained policy on a corpus (Optional)

Optionally, if you are interested in seeing how the trained policy (`$OUTPUT_DIR/saved_policy`)
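The sign convention described in the added paragraph can be sketched in shell. This is a minimal illustration and not part of the change itself: `classify_reward` is a hypothetical helper, and the sample values are made-up inputs rather than real training results. Actual rewards would be read from the reward_distribution charts in TensorBoard.

```shell
# Hypothetical helper (not part of the repository): interpret the sign of a
# reward value as the README text describes it.
classify_reward() {
  # awk treats the numeric-looking value passed via -v as a number,
  # so the comparisons below are numeric.
  awk -v r="$1" 'BEGIN {
    if (r > 0)      print "improvement"   # policy beats the heuristic
    else if (r < 0) print "regression"    # policy is worse than the heuristic
    else            print "no change"
  }'
}

# Illustrative inputs only.
classify_reward 0.05    # prints: improvement
classify_reward -0.02   # prints: regression
```

A helper like this is only useful for quick sanity checks on values copied out of the dashboard; the percentile curves in TensorBoard give a fuller picture than any single number.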