Open
Description
from @jyoung8607:
Not sure if this is mici-specific, but I encountered it on mici. This wasn't a shutdown from manager or the UI. The actual culprit appears to be the AGNOS backup power monitor.
Route: 89c35171edb9a761/00000009--17b7760bc1
(openpilot) jyoung@jy-workstation-ubuntu:~/openpilot$ selfdrive/debug/filter_log_message.py 89c35171edb9a761/00000009--17b7760bc1 | grep "until shutdown"
[0.817446] MAIN 10433 python - 0:59:58.988530 until shutdown / timestamps={'startup': 13.977595046, 'watchdog': 8005.121139925, 'engaged': 735.443136548}
[61.235272] MAIN 10433 python - 0:59:59.733559 until shutdown / timestamps={'startup': 13.977595046, 'watchdog': 8066.063220115, 'engaged': 735.443136548}
[121.236893] MAIN 10433 python - 0:59:59.618171 until shutdown / timestamps={'startup': 13.977595046, 'watchdog': 8126.069350563, 'engaged': 735.443136548}
[181.485775] MAIN 10433 python - 0:59:59.490381 until shutdown / timestamps={'startup': 13.977595046, 'watchdog': 8186.064851272, 'engaged': 735.443136548}
[241.561063] MAIN 10433 python - -1 day, 22:54:48.765652 until shutdown / timestamps={'startup': 13.977595046, 'watchdog': 0, 'engaged': 735.443136548}
(openpilot) jyoung@jy-workstation-ubuntu:~/openpilot$ selfdrive/debug/filter_log_message.py 89c35171edb9a761/00000009--17b7760bc1 | grep poweroff
[241.538999] MAIN 69307 sudo - root : PWD=/ ; USER=root ; COMMAND=/usr/sbin/poweroff
The manager watchdog read returned zero. When this happens, the device will shut down if both the time since boot and the time since the last engagement are more than 60 minutes. That will trigger whether onroad or offroad. In my case, I was onroad but I hadn't engaged yet. I think there were also complaints of devices shutting down unexpectedly early when offroad; this could be a trigger.
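For reference, the arithmetic behind the last log line can be reproduced with a rough sketch. It assumes the monitor's deadline is the latest timestamp plus 60 minutes, which is inferred from the log output and not taken from the AGNOS source. The names `SHUTDOWN_DELAY_S` and `describe` are hypothetical. It also shows why the sign is easy to miss: Python prints a negative `timedelta` as `-1 day, HH:MM:SS`.

```python
from datetime import timedelta

# Timestamps copied from the final "until shutdown" log line (seconds since boot).
timestamps = {"startup": 13.977595046, "watchdog": 0, "engaged": 735.443136548}

# Assumption: the deadline is the latest timestamp plus 60 minutes.
SHUTDOWN_DELAY_S = 3600
deadline_s = max(timestamps.values()) + SHUTDOWN_DELAY_S

# Remaining time from the final log line, expressed in whole microseconds.
remaining = timedelta(microseconds=-3_911_234_348)

# Uptime implied by the assumed deadline; it lands about 60 s after the
# previous watchdog value (8186.06), which is consistent with the log cadence.
implied_now_s = deadline_s - remaining.total_seconds()


def describe(remaining: timedelta) -> str:
    """Hypothetical formatter that makes an expired deadline obvious."""
    if remaining < timedelta(0):
        return f"OVERDUE by {-remaining}"
    return f"{remaining} until shutdown"


print(str(remaining))       # -1 day, 22:54:48.765652
print(describe(remaining))  # OVERDUE by 1:05:11.234348
```

With the watchdog entry zeroed, the latest remaining timestamp is `engaged` at roughly 735 s, so the assumed deadline had already passed by more than an hour when the line was logged.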
Suggestions:
- A simple truncate-and-write probably allowed this; move power_watchdog to a param for proper atomic writes
- Add clear logging of AGNOS power monitor shutdown events; the minus sign on the shutdown timer was extremely easy to miss
- Maybe assert on monotonic values going backward?
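
The first and third suggestions can be illustrated with a generic sketch of the write-to-temp-then-rename pattern plus a reader that refuses zero, unparsable, or backward values. This is not openpilot's Params code or the AGNOS monitor; `atomic_write` and `read_watchdog` are hypothetical names, and the file layout (a single float in a text file) is an assumption.

```python
import os
import tempfile


def atomic_write(path: str, data: str) -> None:
    """Write data so a concurrent reader sees the old or new content, never an empty file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # atomic swap on POSIX
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_watchdog(path: str, last_seen: float) -> float:
    """Return the watchdog timestamp, keeping last_seen if the read looks bogus."""
    try:
        with open(path) as f:
            value = float(f.read().strip())
    except (OSError, ValueError):
        return last_seen  # missing, empty, or partially written file
    if value <= 0 or value < last_seen:
        return last_seen  # a monotonic timestamp should never go backward
    return value
```

A reader that catches the file between truncate and write gets an empty string, which is the likely source of the zero in the log above; with this sketch such a read keeps the previous value instead.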