-
Notifications
You must be signed in to change notification settings - Fork 927
fix(log_router): Do not enter critical section until mutex is taken (AEGHB-1190) #565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Hello Vincent-Dalstra, we appreciate your contribution to this project! Click to see more instructions ...
Review and merge process you can expect ...
|
49bec8a to
b21245c
Compare
|
Hi, @Vincent-Dalstra , thanks for your contribution. In addition to modifying the code, please also update the component version under components/utilities/log_router:
|
| esp_err_t esp_log_router_to_file(const char* file_path, const char* tag, esp_log_level_t level) | ||
| { | ||
| // Create mutex for thread safety | ||
| g_log_router_mutex = xSemaphoreCreateMutex(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mutex should be ensured to be initialized only once, because esp_log_router_to_file needs to support routing to different files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for catching that. Fixed: [62f8dd4]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have also changed the failure message from ESP_LOGW, to ESP_LOGE, to reflect its severity.
|
|
||
| First release version. | ||
|
|
||
| ## v0.1.1 - 2025-08-25 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please place the latest changes at the top, i.e., before ## v0.1.0 - 2025-7-1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my bad. Fixed: [adb014e]
2e03166 to
62f8dd4
Compare
|
LGTM! It would be better to adjust it into a single commit. @leeebo ,please add MR TAG |
Wait as long as it takes to acquire the mutex (previously, it would only wait 100ms for the mutex before entering the critical section anyway!) fix(log_router): ensure mutex is created successfully, or else return an error Check if the mutex (which ensures thread safety) was created successfully, and return an error if it wasn't. Moved this check to the top of the function, so that it doesn't need to clean up memory in the case of failure. Since successfully creating the mutex is now a precondition for creating the list, we can safely remove the NULL checks when using the mutex.
38c0c6d to
944a2eb
Compare
|
Thank you. I've squashed it all into 944a2eb |
leeebo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vincent-Dalstra LGTM! Thanks for your contribution!
|
sha=944a2eb0d82a7a40b90b411b4cc3cead774a2675 |
Description
There are two commits: the first fixes the actual issue I encountered, while the second eleminates two redundant checks for NULL, discovered while investigating the issue.
(1) Previously, esp_log_router_flash_vprintf() would wait up to 100ms for a mutex, but if the mutex was still not available it would still enter the critical section, and panic when it attempted to return the mutex.
Removing the timeout gets us the desired behaviour.
(2) In addition, it could also enter the critical section without taking the mutex if the mutex handle was NULL. While in practice this is almost impossible, it still doesn't make any sense. Better to make it actually impossible, and eliminate these null checks entirely.
Related
Fixes #564
Testing
Tested with the same hardware and software where I initially found issue #564
Before applying these changes, the problem would occur shortly after boot with approximately 50% chance.
After, the issue stopped. It also does not hang or trigger the watchdog.
Checklist
Before submitting a Pull Request, please ensure the following: