-
Notifications
You must be signed in to change notification settings - Fork 748
Report stage with error in jobs #5784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThe changes update the backup file copying and error reporting mechanisms. In the backup manager module, the Changes
Sequence Diagram(s)sequenceDiagram
participant API as API Client
participant BM as BackupManager
participant Copy as shutil.copy
participant ER as Error Handler
API->>BM: Initiate backup process
BM->>Copy: Copy backup file to additional locations
Copy-->>BM: Raises OSError (simulated)
BM->>ER: Report error with stage information
ER-->>BM: Log error, update backup paths
BM->>API: Return backup details (only original location)
sequenceDiagram
participant Client as API Client
participant Job as SupervisorJob
participant Inner as Nested Job Logic
participant SE as SupervisorJobError
Client->>Job: Start job execution
Job->>Job: Set stage "test" and call inner method
Inner-->>Job: Raise SupervisorError("bad")
Job->>SE: Capture error with stage "test"
SE-->>Job: Return error details with stage
Job->>Client: Provide job info including error and stage
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🧰 Additional context used🧬 Code Definitions (1)tests/backups/test_manager.py (3)
⏰ Context from checks skipped due to timeout of 90000ms (4)
🔇 Additional comments (4)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
efe19c5 to
7b93c4e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/api/test_backups.py (1)
675-676: Consider usingside_effectparameter more specifically.While the current implementation correctly simulates a failure, you might want to make the test more specific to ensure it only affects copies to the additional location.
- with patch("supervisor.backups.manager.shutil.copy", side_effect=OSError): + with patch("supervisor.backups.manager.shutil.copy", side_effect=OSError("Permission denied")):This provides a more specific error message that would appear in logs, making debugging easier.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
supervisor/backups/manager.py(4 hunks)supervisor/jobs/__init__.py(2 hunks)tests/api/test_backups.py(1 hunks)tests/api/test_jobs.py(2 hunks)tests/jobs/test_job_manager.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- tests/jobs/test_job_manager.py
- tests/api/test_jobs.py
- supervisor/backups/manager.py
- supervisor/jobs/init.py
🧰 Additional context used
🧬 Code Definitions (1)
tests/api/test_backups.py (4)
tests/conftest.py (3)
api_client(488-511)coresys(324-404)backups(654-673)supervisor/core.py (1)
set_state(76-92)supervisor/backups/backup.py (3)
slug(112-114)all_locations(211-213)location(206-208)supervisor/backups/manager.py (1)
get(106-108)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Run tests Python 3.13.2
- GitHub Check: Build armhf supervisor
- GitHub Check: Build aarch64 supervisor
🔇 Additional comments (3)
tests/api/test_backups.py (3)
661-709: Well-structured test for backup failures during copy stage.This test effectively validates the core objective of the PR - ensuring errors during the copy stage are correctly reported with stage information while preserving successful operations. The test appropriately:
- Simulates a failure during the copy process by patching
shutil.copy- Verifies the backup exists in the original location
- Confirms the error is reported with the correct stage information
- Checks that the backup's locations only include the successful copy
690-693: Great verification of the backup state after error.The test effectively verifies that despite the copy failure, the original backup file still exists and is tracked in the locations dictionary. This validates that successful operations are preserved even when an error occurs during the copy stage.
702-708: Excellent verification of error stage information.The test properly validates that the error includes the stage information (
copy_additional_locations), which is a key requirement from the PR objectives. This ensures that errors can be properly diagnosed by identifying exactly where in the backup process they occurred.
4f975e6 to
8e92e55
Compare
8e92e55 to
9f6d9e6
Compare
| "type": self.type_.__name__, | ||
| "message": self.message, | ||
| "stage": self.stage, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bends the job system to a specific requirement by the Backup system (so far we only have stages for backups). I wonder if it wouldn't be cleaner to just handle and return error "in-line" in Backup instead of relying on the Job system 🤔
But then, we already leaned in a lot into the Job system, and it seems to work. So 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't return it inline because core isn't listening for that. They do backups asynchronously. This is our only means of communicating status of the backup to core currently.
Also Backup wasn't the only intended consumer of stage, just the only one so far. For example we had plans to try and report status and progress better with docker actions as well, updates in particular. Just haven't gotten to it yet. Might be other places as well.
Proposed change
Report the stage an error occurred in during a job. This allows core to separate errors which occurred during the
copy_additional_locationsstage from errors during the process of creating the backup.Additionally tweak the logic of
_copy_to_additional_locationsso it does not lose track of the copies it successfully made. It will still stop on first error but any copies it made before that will be recorded in the backup rather then forgotten about like today.Type of change
Additional information
Checklist
ruff format supervisor tests)If API endpoints or add-on configuration are added/changed:
Summary by CodeRabbit
stagein the error handling classes for better tracking of job states.