Log avg_distance_per_infraction #417
Open
eugenevinitsky wants to merge 1 commit into puffer-4
Pull request overview
This PR adds a new driving-quality metric to the Drive environment logs by deriving avg_distance_per_infraction from existing episode log totals in the binding layer.
Changes:
- Added an `n` parameter to `my_log` so binding code can reconstruct totals from the normalized aggregate log.
- Logged a new `avg_distance_per_infraction` metric in `sim/binding.c`.
- Updated both vector-environment log call sites to pass the aggregate count through.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/vecenv.h` | Updates the `my_log` declaration and both aggregate logging call sites to pass `n`. |
| `sim/binding.c` | Computes and exports the new `avg_distance_per_infraction` metric from aggregated log data. |
Comment on lines 188 to +201
@@ -191,6 +198,7 @@ void my_log(Log *log, Dict *out) {
     dict_set(out, "num_goals_reached", log->num_goals_reached);
     dict_set(out, "avg_speed_per_agent", log->avg_speed_per_agent);
     dict_set(out, "dnf_rate", log->dnf_rate);
+    dict_set(out, "avg_distance_per_infraction", avg_distance_per_infraction);
fa236fa to 65d79ea
Adds the metric:

    avg_distance_per_infraction = total_fleet_distance / max(1, total_fleet_infractions)

which tracks how far agents drive between offroad/collision/red-light events, a useful single-scalar driving-quality signal for wandb. The two underlying log fields already exist in puffer-4 and are already aggregated per-step in add_log; only the binding-side ratio was missing.

my_log gains an `n` parameter and multiplies the per-agent-normalized log->total_* fields by n to recover raw fleet totals. The ratio itself is invariant to the 1/n scaling, but the un-normalization makes the fmaxf(1.0f, total_infractions) clamp behave correctly: it floors the denominator at "one infraction across the whole fleet", so a window with zero infractions reports total fleet distance instead of distance/epsilon.

Both static_vec_log call sites in vecenv.h pass n.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
65d79ea to edae760
Summary
- Adds `env/avg_distance_per_infraction = total_distance_travelled / max(1, total_infractions)`. Tracks how far agents drive between offroad / collision / red-light events, a useful single-scalar driving-quality signal for wandb.
- The two underlying fields (`total_distance_travelled`, `total_infractions`) already existed in `Log` and were already accumulated per-step in `add_log`; only the binding-side ratio was missing.
- … `pufferlib/ocean/drive/binding.c:1820`).

Implementation note
`my_log` gains an `n` parameter so it can recover totals from the per-agent-normalized aggregate produced by `static_vec_aggregate_logs`. Both call sites in `vecenv.h` are updated to pass `n`. Without `n`, `log->total_infractions` is the per-agent rate and the `fmaxf(1.0f, ...)` clamp would mis-fire for any rate < 1.

Test plan
- `env/avg_distance_per_infraction`

🤖 Generated with Claude Code