Implement production-ready update system with security and reliability features #1180

amilcarlucas · 2026-01-08T23:44:45Z

Upgrade the software update subsystem from prototype to production-ready with comprehensive integrity verification, automatic retry, and resume capabilities.

Security Enhancements:

SHA256 checksum verification for all downloaded files
File format validation (PE headers for Windows .exe)
Source URL validation (GitHub-only)
File permission hardening (chmod 0o600)
Document security trade-offs (DOS header vs full PE validation)

Reliability Features:

Resume interrupted downloads using HTTP Range headers
Exponential backoff with jitter (3 retries, configurable)
Automatic retry on network/filesystem errors
Proper error classification and logging

New Capabilities:

Direct wheel installation from GitHub release assets
Automatic checksum discovery from multiple asset formats (SHA256SUMS, *.sha256, checksums.txt)
Fallback to release notes for checksums
Corporate proxy support via environment variables

Code Quality:

Extract magic values to named constants (DOWNLOAD_BLOCK_SIZE, PE_MAGIC_BYTES)
Replace generic exceptions with specific types
Use contextlib.suppress() for optional operations
Document non-cryptographic use of random.uniform() for timing jitter

Testing:

Add checksum parsing tests for various release asset formats
Update download tests to use allow_resume parameter
All existing tests remain passing (70+ tests)

Documentation:

Update ARCHITECTURE_1_software_update.md marking features as implemented
Change 14 items from ❌ TODO to ✅ IMPLEMENTED
Add Known Limitations section documenting security trade-offs

Closes: Security requirements for production deployment
Ref: ARCHITECTURE_1_software_update.md

Copilot

Pull request overview

This PR upgrades the software update subsystem from prototype to production-ready status by implementing comprehensive security features, reliability mechanisms, and extensive test coverage. The changes transform a basic update checker into a robust, enterprise-ready update system with SHA256 verification, automatic retry with exponential backoff, resume capabilities, and thorough validation.

Key Changes:

Security enhancements: SHA256 checksum verification, PE header validation, URL whitelisting, and file permission hardening
Reliability features: Download resume with HTTP Range headers, exponential backoff with jitter, automatic cleanup of corrupted files
Comprehensive test suite: 2,067 lines of new tests covering unit, integration, BDD, and acceptance scenarios

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 12 comments.

Show a summary per file

File	Description
ardupilot_methodic_configurator/backend_internet.py	Major refactoring: adds checksum verification, retry logic with exponential backoff, download resume, and comprehensive validation helpers
ardupilot_methodic_configurator/data_model_software_updates.py	Integrates SHA256 verification into download workflow, fixes logging import bug, restructures error handling
tests/unit_frontend_tkinter_software_update.py	New comprehensive UI test suite (406 lines) covering dialog initialization, button behavior, progress updates, and window protocols
tests/unit_download_resume.py	New tests for download resume and retry mechanisms with mock responses
tests/unit_checksum_parsing.py	New tests for SHA256 checksum parsing from various release asset formats
tests/integration_software_update_github_api.py	New integration tests making real GitHub API calls (marked with @pytest.mark.integration)
tests/bdd_software_update.py	New BDD tests focusing on user workflows and business requirements
tests/acceptance_software_update.py	New acceptance tests validating complete end-to-end scenarios
tests/unit_data_model_software_updates.py	Updated to include expected_sha256 parameter in mock calls
tests/unit_backend_internet.py	Updated to include allow_resume parameter in download calls
ARCHITECTURE_1_software_update.md	Comprehensive documentation update marking 14 items as implemented, adds Known Limitations section

Copilot · 2026-01-08T23:53:56Z

ardupilot_methodic_configurator/backend_internet.py

+        # Using random.uniform for timing jitter (NOT for cryptographic purposes)
+        # This is safe because:
+        # 1. We only use it for sleep timing randomization, not security
+        # 2. The jitter prevents multiple clients from retrying simultaneously
+        # 3. Even if an attacker could predict retry times, there's no security impact
+        # 4. Cryptographic randomness would be overkill and slower for this use case
+        sleep_time = backoff_factor * (2 ** (attempt - 1))
+        sleep_time = sleep_time * random.uniform(0.8, 1.2)  # noqa: S311


The use of random.uniform for timing jitter is well-documented with an extensive comment explaining why it's safe for non-cryptographic use (lines 193-198). However, the comment is quite verbose. Consider condensing it while retaining the key justification that this is for retry timing, not security.

Copilot · 2026-01-08T23:53:56Z

ardupilot_methodic_configurator/backend_internet.py

+    """Validate that URL is from a trusted GitHub source."""
+    if not url.startswith("https://github.com/") and not url.startswith("https://objects.githubusercontent.com/"):
+        logging_error(_("Windows installer URL must be from github.com or githubusercontent.com: %s"), url)


The function _validate_windows_installer_url validates URLs against trusted GitHub domains. However, it only checks for specific prefixes ('https://github.com/' and 'https://objects.githubusercontent.com/'). Consider whether other GitHub CDN domains should be supported, or if this whitelist is intentionally restrictive for security reasons. If intentionally restrictive, consider adding a comment explaining the security rationale.

Suggested change

"""Validate that URL is from a trusted GitHub source."""

if not url.startswith("https://github.com/") and not url.startswith("https://objects.githubusercontent.com/"):

logging_error(_("Windows installer URL must be from github.com or githubusercontent.com: %s"), url)

"""Validate that URL is from a trusted GitHub source.

This check is intentionally strict: we only allow direct GitHub

release/download URLs and the GitHub Objects CDN used for release

assets. Other GitHub- or CDN-related hostnames are rejected to

reduce the risk of an attacker supplying a URL outside the official

ArduPilot/MethodicConfigurator GitHub release infrastructure.

If GitHub changes the domains used for release assets, this

whitelist must be updated explicitly.

"""

trusted_prefixes = (

"https://github.com/",

"https://objects.githubusercontent.com/",

)

if not url.startswith(trusted_prefixes):

logging_error(_("Windows installer URL must be from github.com or objects.githubusercontent.com: %s"), url)

Copilot · 2026-01-08T23:53:57Z

ardupilot_methodic_configurator/backend_internet.py

    ret = subprocess.check_call(  # noqa: S603
-        [sys.executable, "-m", "pip", "install", "--upgrade", "ardupilot_methodic_configurator"]
+        [
+            sys.executable,
+            "-m",
+            "pip",
+            "install",
+            "--upgrade",
+            "--index-url",
+            "https://pypi.org/simple",
+            "ardupilot_methodic_configurator",
+        ]


The pip installation now includes '--index-url' pointing to 'https://pypi.org/simple' (lines 477-485). While this ensures packages come from the official PyPI, it may break installations in environments with custom PyPI mirrors or internal package repositories. Consider whether this should be configurable or if there should be a fallback mechanism for corporate environments that require custom index URLs.

Copilot · 2026-01-08T23:53:57Z

tests/acceptance_software_update.py

+    def test_system_remains_stable_during_update_process(self) -> None:
+        """
+        System remains stable throughout update process.
+
+        GIVEN: User performs a software update
+        WHEN: Update operations execute (download, verify, install)
+        THEN: System handles interruptions gracefully
+          AND: Partial operations don't corrupt installation
+          AND: System can recover from failures
+          AND: Application stability is maintained
+        """
+        # This is validated through combination of other tests


The test function test_system_remains_stable_during_update_process is a placeholder with only a comment "This is validated through combination of other tests" (lines 556-567). Similar to the error recovery test, this provides no actual test implementation. Consider either implementing specific stability tests or removing this placeholder.

Copilot · 2026-01-08T23:53:57Z

ardupilot_methodic_configurator/data_model_software_updates.py

+        result = False
        if platform.system() == "Windows":
            try:
                # Look for .exe files first
                exe_assets = [
                    asset for asset in latest_release.get("assets", []) if asset.get("name", "").lower().endswith(".exe")
                ]

+                asset = None
                if exe_assets:
                    asset = exe_assets[0]  # Use the first .exe file
                elif latest_release.get("assets"):
                    asset = latest_release["assets"][0]  # Fallback to first asset
+
+                if asset is not None:
+                    expected_sha256 = get_expected_sha256_from_release(latest_release, asset["name"])
+                    result = download_and_install_on_windows(
+                        download_url=asset["browser_download_url"],
+                        file_name=asset["name"],
+                        progress_callback=self.dialog.update_progress if self.dialog else None,
+                        expected_sha256=expected_sha256,
+                    )
                else:
                    logging_error(_("No suitable assets found for Windows installation"))
-                    return False
-
-                return download_and_install_on_windows(
-                    download_url=asset["browser_download_url"],
-                    file_name=asset["name"],
-                    progress_callback=self.dialog.update_progress if self.dialog else None,
-                )
+                    result = False
            except (KeyError, IndexError) as e:
                logging_error(_("Error accessing release assets: %s"), e)
-                return False
+                result = False
            except Exception as e:  # pylint: disable=broad-exception-caught
                logging_error(_("Error during Windows download: %s"), e)
-                return False
+                result = False
+        else:
+            # For Linux/macOS, install from PyPI using pip
+            try:
+                result = (
+                    download_and_install_pip_release(progress_callback=self.dialog.update_progress if self.dialog else None)
+                    == 0
+                )
+            except Exception as e:  # pylint: disable=broad-exception-caught
+                logging_error(_("Error during pip installation: %s"), e)
+                result = False

-        try:
-            return (
-                download_and_install_pip_release(progress_callback=self.dialog.update_progress if self.dialog else None) == 0
-            )
-        except Exception as e:  # pylint: disable=broad-exception-caught
-            logging_error(_("Error during pip installation: %s"), e)
-            return False
+        return result


The variable 'result' is initialized but the subsequent code has multiple early return paths within the Windows block (lines 90, 93, 96) that don't set 'result' before returning. This means those error paths bypass the 'return result' statement at line 108. Consider refactoring to ensure all paths through the function use the 'result' variable consistently, or simplify by returning directly in error cases and only using 'result' for success paths.

Copilot · 2026-01-08T23:53:58Z

tests/acceptance_software_update.py

+    @patch("ardupilot_methodic_configurator.backend_internet.requests_get")
+    def test_user_experiences_graceful_recovery_from_all_error_types(self, mock_get) -> None:
+        """
+        User experiences graceful handling and recovery from various errors.
+
+        GIVEN: User is updating the application
+        WHEN: Various errors occur (network, disk, corruption)
+        THEN: Each error is handled gracefully
+          AND: User receives clear, actionable error messages
+          AND: System recovers automatically where possible
+          AND: Application remains stable
+        """
+        # Test will be expanded with specific error scenarios
+        # This is a placeholder for comprehensive error testing
+


The test function test_user_experiences_graceful_recovery_from_all_error_types is essentially a placeholder with no implementation (lines 422-436). The function docstring describes testing various error scenarios, but the function body only contains a comment saying "Test will be expanded with specific error scenarios". Consider either implementing the test or removing it until it can be properly implemented, as placeholder tests can give false confidence in test coverage.

Copilot · 2026-01-08T23:53:58Z

tests/acceptance_software_update.py

+    def test_update_process_is_intuitive_and_requires_minimal_intervention(self) -> None:
+        """
+        Update process is intuitive with minimal user intervention.
+
+        GIVEN: User encounters an available update
+        WHEN: User goes through update process
+        THEN: Process is self-explanatory
+          AND: Requires minimal clicks/decisions
+          AND: Provides clear guidance at each step
+          AND: Error messages are actionable when they occur
+          AND: Overall experience is positive
+        """
+        # This test validates UX through structure of UpdateDialog
+        # Real validation would require user testing
+        formatted = format_version_info("1.0.0", "2.0.0", "- Improvement")
+
+        # Assert - Information is clear
+        assert "1.0.0" in formatted
+        assert "2.0.0" in formatted
+        assert "Improvement" in formatted


The test function test_update_process_is_intuitive_and_requires_minimal_intervention has minimal implementation (lines 577-596). The docstring describes testing the intuitiveness of the update process, but the actual test only validates that version strings appear in formatted output. This doesn't test usability aspects like minimal clicks, clear guidance, or actionable error messages as described in the docstring. Consider either implementing proper usability tests or revising the docstring to match what the test actually validates.

Copilot · 2026-01-08T23:53:59Z

ardupilot_methodic_configurator/backend_internet.py

+        with open(path, "rb") as _fh:
+            sig = _fh.read(2)
+        if sig != PE_MAGIC_BYTES:
+            logging_error(_("Downloaded installer does not appear to be a Windows executable"))
+            with contextlib.suppress(OSError, FileNotFoundError):
+                os.remove(path)
+            return False


The magic bytes validation only checks for the DOS header 'MZ' signature (PE_MAGIC_BYTES = b"MZ"). As documented in ARCHITECTURE_1_software_update.md lines 265-270, this is a known limitation - full PE structure validation is not performed. While this is a documented trade-off, consider adding a comment in the code itself (near line 342) explaining why only DOS header validation is performed, referencing the architecture documentation for the full rationale.

Copilot · 2026-01-08T23:53:59Z

ardupilot_methodic_configurator/backend_internet.py

+    with subprocess.Popen(  # noqa: S602
+        [batch_file_path],
+        shell=True,
+        creationflags=subprocess.CREATE_NO_WINDOW | subprocess.DETACHED_PROCESS,  # type: ignore[attr-defined]
+    ) as _result:


The batch file execution uses shell=True with subprocess.Popen (line 378), which is marked with noqa: S602 to suppress the security warning. While the batch_file_path is created internally and should be safe, consider adding a comment explaining why shell=True is necessary here (likely because .bat files need to be executed through the shell on Windows).

Copilot · 2026-01-08T23:53:59Z

ardupilot_methodic_configurator/backend_internet.py

+    # Exit this process to allow the installer to run
+    os._exit(0)  # Force exit without cleanup


The function uses os._exit(0) which is a force exit without cleanup (line 396). While this is intentional and documented, using a private function (_exit) in production code is generally discouraged. Consider adding a more detailed comment explaining why this specific approach is necessary (to ensure the installer can run without file locks from the current process) and why sys.exit() would not work here.

Suggested change

# Exit this process to allow the installer to run

os._exit(0) # Force exit without cleanup

# Exit this process to allow the installer to run.

# NOTE:

# We intentionally use os._exit(0) instead of sys.exit() here.

# - sys.exit() raises SystemExit and runs Python-level cleanup such as

# atexit handlers, finally blocks, and waiting for non-daemon threads.

# Any of that cleanup may keep file handles open on the running

# executable or its resources.

# - On Windows, open file handles can prevent the just-downloaded

# installer from replacing files, causing the update to fail or

# require a manual retry.

# Using os._exit(0) terminates the interpreter immediately without

# running Python cleanup code, minimizing the chance of lingering

# file locks so the external installer/batch script can run reliably.

os._exit(0) # Force exit without cleanup (intentional)

github-actions · 2026-01-08T23:57:41Z

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines	Covered	Coverage	Threshold	Status
10944	10098	92%	89%	🟢

New Files

No new covered files...

Modified Files

File	Coverage	Status
ardupilot_methodic_configurator/backend_internet.py	86%	🟢
ardupilot_methodic_configurator/data_model_software_updates.py	100%	🟢
TOTAL	93%	🟢

updated for commit: 370ab46 by action🐍

github-actions · 2026-01-08T23:57:49Z

Test Results

3 files ± 0 3 suites ±0 33m 47s ⏱️ + 1m 41s
2 915 tests + 52 2 908 ✅ + 52 7 💤 ±0 0 ❌ ±0
8 745 runs +156 8 724 ✅ +156 21 💤 ±0 0 ❌ ±0

Results for commit 370ab46. ± Comparison against base commit 57684c1.

This pull request removes 99 and adds 151 tests. Note that renamed tests count towards both.

tests.test_backend_internet ‑ test_download_and_install_pip_release
tests.test_backend_internet ‑ test_download_and_install_windows_download_failure
tests.test_backend_internet ‑ test_download_file_empty_response
tests.test_backend_internet ‑ test_download_file_from_url_empty_params
tests.test_backend_internet ‑ test_download_file_from_url_invalid_url
tests.test_backend_internet ‑ test_download_file_from_url_proxy_handling[env_vars0]
tests.test_backend_internet ‑ test_download_file_from_url_proxy_handling[env_vars1]
tests.test_backend_internet ‑ test_download_file_from_url_proxy_handling[env_vars2]
tests.test_backend_internet ‑ test_download_file_from_url_proxy_handling[env_vars3]
tests.test_backend_internet ‑ test_download_file_http_error
…

tests.acceptance_software_update.TestAcceptanceCompleteEndToEndWorkflows ‑ test_complete_linux_user_update_journey_with_pip
tests.acceptance_software_update.TestAcceptanceCompleteEndToEndWorkflows ‑ test_complete_windows_user_update_journey
tests.acceptance_software_update.TestAcceptanceDownloadManagement ‑ test_user_downloads_update_with_full_reliability_features
tests.acceptance_software_update.TestAcceptanceErrorHandlingAndRecovery ‑ test_user_experiences_graceful_recovery_from_all_error_types
tests.acceptance_software_update.TestAcceptanceInstallationProcess ‑ test_linux_user_completes_pip_update_installation
tests.acceptance_software_update.TestAcceptanceInstallationProcess ‑ test_windows_user_completes_full_update_installation_workflow
tests.acceptance_software_update.TestAcceptancePerformanceRequirements ‑ test_large_downloads_use_efficient_streaming
tests.acceptance_software_update.TestAcceptancePerformanceRequirements ‑ test_version_check_completes_quickly_for_user
tests.acceptance_software_update.TestAcceptanceReliabilityRequirements ‑ test_system_remains_stable_during_update_process
tests.acceptance_software_update.TestAcceptanceSecurityRequirements ‑ test_user_protected_by_complete_security_measures
…

♻️ This comment has been updated with latest results.

… and reliability features Upgrade the software update subsystem from prototype to production-ready with comprehensive integrity verification, automatic retry, and resume capabilities. Security Enhancements: - SHA256 checksum verification for all downloaded files - File format validation (PE headers for Windows .exe) - Source URL validation (GitHub-only) - File permission hardening (chmod 0o600) - Document security trade-offs (DOS header vs full PE validation) Reliability Features: - Resume interrupted downloads using HTTP Range headers - Exponential backoff with jitter (3 retries, configurable) - Automatic retry on network/filesystem errors - Proper error classification and logging New Capabilities: - Direct wheel installation from GitHub release assets - Automatic checksum discovery from multiple asset formats (SHA256SUMS, *.sha256, checksums.txt) - Fallback to release notes for checksums - Corporate proxy support via environment variables Code Quality: - Extract magic values to named constants (DOWNLOAD_BLOCK_SIZE, PE_MAGIC_BYTES) - Replace generic exceptions with specific types - Use contextlib.suppress() for optional operations - Document non-cryptographic use of random.uniform() for timing jitter Testing: - Add checksum parsing tests for various release asset formats - Update download tests to use allow_resume parameter - All existing tests remain passing (70+ tests) Documentation: - Update ARCHITECTURE_1_software_update.md marking features as implemented - Change 14 items from ❌ TODO to ✅ IMPLEMENTED - Add Known Limitations section documenting security trade-offs Closes: Security requirements for production deployment Ref: ARCHITECTURE_1_software_update.md

Copilot AI review requested due to automatic review settings January 8, 2026 23:44

Copilot started reviewing on behalf of amilcarlucas January 8, 2026 23:45 View session

Copilot AI reviewed Jan 8, 2026

View reviewed changes

amilcarlucas force-pushed the robust_update branch from bfdfe5b to 370ab46 Compare January 8, 2026 23:58

-    """Validate that URL is from a trusted GitHub source."""
-    if not url.startswith("https://github.com/") and not url.startswith("https://objects.githubusercontent.com/"):
-        logging_error(_("Windows installer URL must be from github.com or githubusercontent.com: %s"), url)
+    """Validate that URL is from a trusted GitHub source.
+    This check is intentionally strict: we only allow direct GitHub
+    release/download URLs and the GitHub Objects CDN used for release
+    assets. Other GitHub- or CDN-related hostnames are rejected to
+    reduce the risk of an attacker supplying a URL outside the official
+    ArduPilot/MethodicConfigurator GitHub release infrastructure.
+    If GitHub changes the domains used for release assets, this
+    whitelist must be updated explicitly.
+    """
+    trusted_prefixes = (
+        "https://github.com/",
+        "https://objects.githubusercontent.com/",
+    )
+    if not url.startswith(trusted_prefixes):
+        logging_error(_("Windows installer URL must be from github.com or objects.githubusercontent.com: %s"), url)

		# Exit this process to allow the installer to run
		os._exit(0) # Force exit without cleanup

-    # Exit this process to allow the installer to run
-    os._exit(0)  # Force exit without cleanup
+    # Exit this process to allow the installer to run.
+    # NOTE:
+    # We intentionally use os._exit(0) instead of sys.exit() here.
+    # - sys.exit() raises SystemExit and runs Python-level cleanup such as
+    #   atexit handlers, finally blocks, and waiting for non-daemon threads.
+    #   Any of that cleanup may keep file handles open on the running
+    #   executable or its resources.
+    # - On Windows, open file handles can prevent the just-downloaded
+    #   installer from replacing files, causing the update to fail or
+    #   require a manual retry.
+    # Using os._exit(0) terminates the interpreter immediately without
+    # running Python cleanup code, minimizing the chance of lingering
+    # file locks so the external installer/batch script can run reliably.
+    os._exit(0)  # Force exit without cleanup (intentional)

Implement production-ready update system with security and reliability features #1180

Are you sure you want to change the base?

Implement production-ready update system with security and reliability features #1180

Uh oh!

Conversation

amilcarlucas commented Jan 8, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

☂️ Python Coverage

Overall Coverage

New Files

Modified Files

Uh oh!

github-actions bot commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test Results

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions bot commented Jan 8, 2026 •

edited

Loading

github-actions bot commented Jan 8, 2026 •

edited

Loading