Fix: Reward-hacking SectorCREnv and observation airspeed #45

Draft
StefanHamm wants to merge 2 commits into TUDelft-CNS-ATM:main from StefanHamm:fix/reward-hacking-sector

Conversation

@StefanHamm
Switched the speed inputs from TAS to CAS to align with the speedupdate logic. The input is now normalized by dividing by D_VELOCITY, so the agent receives relative deviation feedback rather than absolute values. Additionally, introduced a penalty term for speed changes to reduce reward hacking.
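
A minimal sketch of the two ideas in this description, for illustration only and not the code in this PR. The value of `D_VELOCITY`, the reference speed that the deviation is measured against, the penalty weight, and both function names are assumptions.

```python
import math

# Hypothetical values: the real D_VELOCITY and penalty weight live in the
# environment and may differ.
D_VELOCITY = 10.0          # speed scale in m/s (assumed)
SPEED_CHANGE_WEIGHT = 0.1  # penalty weight (assumed)


def normalized_cas_deviation(cas, reference_cas, d_velocity=D_VELOCITY):
    """Return the CAS deviation from a reference speed in units of d_velocity.

    The agent sees a relative value (for example 1.0 means "one D_VELOCITY
    above the reference") instead of an absolute speed in m/s.
    """
    return (cas - reference_cas) / d_velocity


def speed_change_penalty(previous_cas, commanded_cas,
                         d_velocity=D_VELOCITY, weight=SPEED_CHANGE_WEIGHT):
    """Return a non-positive reward term that grows with the size of the
    commanded speed change, so frequent large changes are discouraged."""
    return -weight * abs(commanded_cas - previous_cas) / d_velocity
```

With these assumed values, a commanded change of one `D_VELOCITY` costs `SPEED_CHANGE_WEIGHT` in reward, and holding speed costs nothing.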

…ation and integrating it into the reward system
@StefanHamm
Author

Fixes #44

@StefanHamm changed the title from "Enhance speed management in SectorCREnv by adding speed change calcul…" to "Fix: Reward-hacking SectorCREnv and observation airspeed" on Jan 19, 2026
@StefanHamm
Author

For the reward formulation, it may be better to use CAS, e.g. to keep the aircraft within safe operating speeds where enough lift is generated, and to keep TAS for the overall speed, which is important for knowing how fast the aircraft is flying through the airspace.
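
To illustrate why the two speeds serve different purposes, here is a rough sketch of how they diverge with altitude. It assumes the ISA troposphere model and the low-speed approximation TAS ≈ CAS / sqrt(rho / rho0), which ignores compressibility; the function names are hypothetical, and the simulator's own conversion routines should be preferred in the environment.

```python
import math

# ISA sea-level constants
T0 = 288.15          # temperature, K
LAPSE = 0.0065       # temperature lapse rate, K/m
G0 = 9.80665         # gravitational acceleration, m/s^2
R_AIR = 287.05287    # specific gas constant of air, J/(kg K)


def isa_density_ratio(altitude_m):
    """Density ratio rho/rho0 in the ISA troposphere (valid up to about 11 km)."""
    temperature_ratio = (T0 - LAPSE * altitude_m) / T0
    return temperature_ratio ** (G0 / (R_AIR * LAPSE) - 1.0)


def approx_tas_from_cas(cas, altitude_m):
    """Approximate TAS for a given CAS, neglecting compressibility (CAS ~ EAS)."""
    return cas / math.sqrt(isa_density_ratio(altitude_m))


# The same CAS (same aerodynamic margin) maps to a much higher TAS at altitude.
for altitude in (0.0, 5000.0, 10000.0):
    print(altitude, round(approx_tas_from_cas(130.0, altitude), 1))
```

Under this approximation, a constant CAS corresponds to a TAS that grows with altitude, which is why a lift-related safety term fits CAS while progress through the sector fits TAS.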
