Commit 65d79ea

Eugene Vinitsky and claude committed
Log avg_distance_per_infraction (port from emerge/temp_training)
Adds the metric:

    avg_distance_per_infraction = total_distance_travelled / total_infractions

which tracks how far agents drive between offroad/collision/red-light events, a useful single-scalar driving-quality signal for wandb.

The two underlying log fields already exist in puffer-4 and are already aggregated per-step in add_log; only the binding-side ratio was missing. Both fields are normalized per-agent by static_vec_aggregate_logs, but the ratio is invariant to that 1/n scaling, so we just compute it directly.

A tiny epsilon clamp (1e-6) on the denominator guards against division by zero in the rare case of zero infractions across the entire aggregation window.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 094a2af commit 65d79ea

1 file changed

Lines changed: 9 additions & 0 deletions

sim/binding.c

@@ -183,6 +183,14 @@ void my_init(Env *env, Dict *kwargs) {
 }
 
 void my_log(Log *log, Dict *out) {
+    // Log fields are normalized per-agent by static_vec_aggregate_logs, but
+    // a ratio of two such fields is invariant to that 1/n scaling, so the
+    // per-agent rates give the same answer as the cross-population totals.
+    // Tiny epsilon clamp guards against div-by-zero when no agent had an
+    // infraction in the entire aggregation window.
+    float avg_distance_per_infraction =
+        log->total_distance_travelled / fmaxf(1e-6f, log->total_infractions);
+
     dict_set(out, "score", log->score);
     dict_set(out, "episode_return", log->episode_return);
     dict_set(out, "episode_length", log->episode_length);
@@ -191,6 +199,7 @@ void my_log(Log *log, Dict *out) {
     dict_set(out, "num_goals_reached", log->num_goals_reached);
     dict_set(out, "avg_speed_per_agent", log->avg_speed_per_agent);
     dict_set(out, "dnf_rate", log->dnf_rate);
+    dict_set(out, "avg_distance_per_infraction", avg_distance_per_infraction);
     dict_set(out, "n", log->n);
 }
 
