FAQ
Models should be retrained based on newly downloaded data when any of these conditions are true:
- Existing labels have been deleted, renamed, or otherwise modified such that the predicted labels are stale
- New labels have been created and applied to issues/pulls, and those labels should begin being predicted
- The repository has gained a high volume of issues/pulls compared to when it was trained, and prediction accuracy is low
- The predicted labels are not meeting expectations for any other reason
If a model is not retrained under these circumstances:
- A label that has been deleted or renamed can still be predicted, which automatically recreates the label
- New labels will not be predicted
- Prediction accuracy degrades over time
High-volume repositories with stable labels can go years without needing retraining. Because retraining is straightforward and self-service, however, teams are empowered to retrain their models at whatever cadence they find valuable. The results of testing predictions will inform whether a newly trained model should be promoted into use.
Teams may be tempted to use a cron schedule to automate retraining on a regular basis, but this must not be done. Training must remain a human-triggered event, with review of the test data before promotion into usage.
When onboarding, the workflows added to invoke the issue labeler reference "reusable workflows" in the dotnet/issue-labeler repository using the full-length commit SHA of the associated issue-labeler version (see the sketch after this list).
- Reusable workflows can be referenced using either tags or full-length commit SHAs
- GitHub's Security hardening for GitHub Actions documentation recommends pinning to the commit SHA as the most secure approach, and we adhere to that guidance
- Short SHAs are not supported by GitHub in this context; the full-length SHA must be used
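
As a concrete illustration, a caller workflow pinned to a full-length commit SHA looks like the sketch below. This is a hypothetical minimal example: the workflow path `.github/workflows/predict-issues.yml`, the trigger, the SHA, and the version comment are placeholders rather than the actual onboarding output, and a real caller would also pass whatever inputs the reusable workflow defines.

```yaml
# Hypothetical caller workflow. The path, SHA, and version comment below are
# placeholders; onboarding generates the real values.
name: "Labeler: Predict Issue"

on:
  issues:
    types: [opened]

jobs:
  predict:
    # Pin to the full-length (40-character) commit SHA of the issue-labeler
    # version, per GitHub's security hardening guidance. A trailing comment
    # records the human-readable version the SHA corresponds to.
    uses: dotnet/issue-labeler/.github/workflows/predict-issues.yml@0123456789012345678901234567890123456789 # vX.Y.Z
```

Pinning to the SHA makes the reference immutable: unlike a tag, a commit SHA cannot be moved to point at different code.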