Testing: Management command for e2e test of ML jobs #1143
carlosgjs wants to merge 14 commits into RolnickLab:main from
Conversation
👷 Deploy request for antenna-ssec pending review.
✅ Deploy Preview for antenna-preview canceled.
📝 Walkthrough

Adds a new Django management command to create, enqueue, and monitor an ML Job end-to-end by locating a SourceImageCollection and Pipeline, creating a Job with a dispatch mode, polling status every 2 seconds, and printing live progress and final results until completion or interruption.
Sequence Diagram

```mermaid
sequenceDiagram
    actor User
    participant CMD as test_ml_job_e2e Command
    participant DB as Database
    participant JobSvc as Job Creation/Model
    participant Queue as Dispatcher/Queue
    participant Monitor as Monitor Loop
    participant Out as Console Output
    User->>CMD: run with collection, pipeline, dispatch args
    CMD->>DB: query SourceImageCollection
    alt not found
        CMD->>Out: print error & exit
    end
    CMD->>DB: query Pipeline
    alt not found
        CMD->>Out: print error & exit
    end
    CMD->>JobSvc: create/save Job (with dispatch mode)
    JobSvc->>DB: persist Job
    CMD->>Queue: enqueue Job
    CMD->>Monitor: start _monitor_job(job, start_time)
    loop every 2s until final state or interrupt
        Monitor->>DB: poll job status & stages
        Monitor->>Out: update single-line progress (elapsed, status, per-stage)
        alt final state reached
            Monitor->>Out: call _show_final_results
            Monitor->>Monitor: exit loop
        end
    end
    alt KeyboardInterrupt
        Monitor->>Out: print Job ID and stop monitoring
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (2 warnings)
Actionable comments posted: 3
🧹 Nitpick comments (2)

ami/jobs/management/commands/test_ml_job_e2e.py (2)

40-44: Same `from None` chaining needed here. Per Ruff B904, chain the re-raised `CommandError` to suppress the noisy inner traceback.

Proposed fix

```diff
 except Pipeline.DoesNotExist:
-    raise CommandError(f"Pipeline '{pipeline_slug}' not found")
+    raise CommandError(f"Pipeline '{pipeline_slug}' not found") from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/jobs/management/commands/test_ml_job_e2e.py` around lines 40 - 44, In the try/except that looks up Pipeline via Pipeline.objects.get(slug=pipeline_slug), update the exception re-raise to suppress the inner traceback by chaining to None: replace the current "raise CommandError(f\"Pipeline '{pipeline_slug}' not found\")" with raising CommandError from None (i.e., raise CommandError(...) from None) so the caught Pipeline.DoesNotExist does not emit a noisy inner traceback when CommandError is thrown.
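The `from None` behavior the prompt describes can be seen in a standalone sketch. The class names below (`DoesNotExist`, `CommandError`) are stand-ins for the Django originals, not the project's code:

```python
# Sketch of the B904 pattern: re-raising a friendlier error while
# controlling exception chaining. All names here are illustrative.

class DoesNotExist(Exception):
    pass

class CommandError(Exception):
    pass

def lookup(slug):
    raise DoesNotExist(slug)

def run_plain(slug):
    try:
        lookup(slug)
    except DoesNotExist:
        # Implicit chaining: the DoesNotExist traceback is attached as
        # __context__ and printed as a noisy inner traceback.
        raise CommandError(f"Pipeline '{slug}' not found")

def run_chained(slug):
    try:
        lookup(slug)
    except DoesNotExist:
        # `from None` suppresses the inner traceback entirely.
        raise CommandError(f"Pipeline '{slug}' not found") from None

try:
    run_plain("x")
except CommandError as e:
    print(e.__context__ is not None)   # True: implicit context kept

try:
    run_chained("x")
except CommandError as e:
    print(e.__suppress_context__)      # True: context suppressed
```

When the underlying cause is diagnostic (as in the MultipleObjectsReturned case below), `raise ... from err` keeps the chain visible instead of hiding it.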
71-85: Consider adding an optional timeout to prevent indefinite monitoring. For an E2E test utility, an unbounded polling loop is a minor concern (Ctrl+C works), but a `--timeout` argument would make this more robust for CI/scripted usage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/jobs/management/commands/test_ml_job_e2e.py` around lines 71 - 85, Add an optional timeout to the E2E job monitor so it cannot poll forever: update the _monitor_job method to accept a timeout (e.g., timeout=None) and while looping check if elapsed >= timeout (when timeout is set) and abort/raise a clear TimeoutError or mark the job as failed and call _show_final_results; wire this timeout into the management command by adding a --timeout CLI option in the command class (parse as seconds, default None) and pass it into _monitor_job when invoked so CI or scripts can limit wait time.
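A sketch of the suggested `--timeout` behavior. The function name, terminal-state strings, and `fetch_status` callback are assumptions for illustration, not the command's real API:

```python
import time

def monitor_job(fetch_status, interval=2.0, timeout=None):
    """Poll fetch_status() until a terminal state; raise TimeoutError after `timeout` seconds."""
    start = time.monotonic()
    while True:
        status = fetch_status()
        if status in {"SUCCESS", "FAILURE", "CANCELED"}:
            return status
        elapsed = time.monotonic() - start
        if timeout is not None and elapsed >= timeout:
            # Fail loudly instead of spinning forever in CI.
            raise TimeoutError(f"job still {status!r} after {elapsed:.1f}s")
        time.sleep(interval)

# Simulated job that succeeds on its third poll.
states = iter(["PENDING", "STARTED", "SUCCESS"])
print(monitor_job(lambda: next(states), interval=0.01))  # SUCCESS
```

Wiring this up would only need `parser.add_argument("--timeout", type=float, default=None)` in the command and passing the parsed value through to the monitor loop.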
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ami/jobs/management/commands/test_ml_job_e2e.py`:
- Around line 109-114: The three self.stdout.write calls that log job outcomes (the branches checking JobState.SUCCESS, JobState.FAILURE, and the else branch using job.status) use f-strings with no interpolation; remove the extraneous leading "f" from those string literals in the block where job.status is evaluated so they become normal strings (e.g., change f"✅ Job completed successfully" to a plain string); update the three occurrences where these self.stdout.write(...) calls appear to silence Ruff F541.
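The F541 finding is easy to see in isolation: without placeholders, the `f` prefix is inert and the literal is just a plain string. The messages below mirror the review text:

```python
# Ruff F541 flags f-string literals that contain no {placeholders}.
with_prefix = f"✅ Job completed successfully"  # F541: `f` does nothing here
plain = "✅ Job completed successfully"          # identical value, no lint warning

status = "FAILURE"
interpolated = f"Job ended with status {status}"  # fine: has a placeholder

print(with_prefix == plain)  # True
print(interpolated)
```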
```python
try:
    collection = SourceImageCollection.objects.get(name=collection_name)
    self.stdout.write(self.style.SUCCESS(f"✓ Found collection: {collection.name}"))
    self.stdout.write(f"  Project: {collection.project.name}")
except SourceImageCollection.DoesNotExist:
    raise CommandError(f"SourceImageCollection '{collection_name}' not found")
```
`SourceImageCollection.objects.get(name=...)` may raise MultipleObjectsReturned.

The name field on SourceImageCollection is not unique. If multiple collections share the same name, this will throw an unhandled MultipleObjectsReturned exception with a raw traceback. Either catch MultipleObjectsReturned or use `.filter(...).first()` with an appropriate message.

Also, per static analysis (B904), chain the original exception for better diagnostics.

Proposed fix

```diff
 try:
     collection = SourceImageCollection.objects.get(name=collection_name)
     self.stdout.write(self.style.SUCCESS(f"✓ Found collection: {collection.name}"))
     self.stdout.write(f"  Project: {collection.project.name}")
 except SourceImageCollection.DoesNotExist:
-    raise CommandError(f"SourceImageCollection '{collection_name}' not found")
+    raise CommandError(f"SourceImageCollection '{collection_name}' not found") from None
+except SourceImageCollection.MultipleObjectsReturned:
+    raise CommandError(
+        f"Multiple SourceImageCollections found with name '{collection_name}'. Use a unique name."
+    ) from None
```

🧰 Tools
🪛 Ruff (0.15.1)
[warning] 37-37: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
[warning] 37-37: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ami/jobs/management/commands/test_ml_job_e2e.py` around lines 32 - 37, The
current use of SourceImageCollection.objects.get(name=collection_name) can raise
MultipleObjectsReturned; modify the lookup to handle duplicates by either
catching SourceImageCollection.MultipleObjectsReturned (or
MultipleObjectsReturned from django.core.exceptions) and raising a CommandError
chaining the original exception (raise CommandError(...) from e) with a clear
message, or replace .get(...) with
SourceImageCollection.objects.filter(name=collection_name).first() and then
raise CommandError if result is None or if you detect multiple matches and want
to surface that as an error; update the block around
SourceImageCollection.objects.get, the except path, and ensure you chain the
original exception when re-raising CommandError.
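The two lookup strategies the prompt contrasts can be sketched without the ORM. The toy `rows` list and helper names below are illustrative stand-ins for SourceImageCollection querysets:

```python
class CommandError(Exception):
    pass

# Toy rows standing in for collections; names need not be unique.
rows = [
    {"name": "moths-2024", "id": 1},
    {"name": "moths-2024", "id": 2},
    {"name": "beetles", "id": 3},
]

def get_unique(name):
    """Mimics QuerySet.get(): error on zero matches AND on multiple matches."""
    matches = [r for r in rows if r["name"] == name]
    if not matches:
        raise CommandError(f"Collection '{name}' not found")
    if len(matches) > 1:
        raise CommandError(f"Multiple collections named '{name}'; use a unique name")
    return matches[0]

def get_first(name):
    """Mimics .filter(...).first(): quietly picks the first match."""
    matches = [r for r in rows if r["name"] == name]
    if not matches:
        raise CommandError(f"Collection '{name}' not found")
    return matches[0]

print(get_first("moths-2024")["id"])  # 1 (first of two duplicates, silently)
print(get_unique("beetles")["id"])    # 3
```

For a test utility, surfacing duplicates as an error (`get_unique`) is usually preferable to silently picking one (`get_first`), since the chosen collection determines what the job processes.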
```python
for param in stage.params:
    if param.value:
        self.stdout.write(f"  {param.name}: {param.value}")
```
`if param.value:` silently hides falsy values like 0, False, or "".

If a stage parameter can legitimately hold 0 or another falsy value, it will be skipped. Use `if param.value is not None:` to only skip truly absent values.
Proposed fix

```diff
 for param in stage.params:
-    if param.value:
+    if param.value is not None:
         self.stdout.write(f"  {param.name}: {param.value}")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ami/jobs/management/commands/test_ml_job_e2e.py` around lines 123 - 125, The
loop that writes stage parameters skips falsy but valid values because it tests
"if param.value:"; change that check to explicitly test for None so values like
0, False or empty string are preserved — update the conditional in the loop
iterating over stage.params (the block that calls self.stdout.write(f"
{param.name}: {param.value}")) to use "if param.value is not None:" so only
absent values are skipped.
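A quick demonstration of why the truthiness check drops legitimate values; the parameter names and values below are hypothetical:

```python
# Hypothetical stage parameters: several hold valid-but-falsy values.
params = [
    ("confidence", 0.0),
    ("enabled", False),
    ("label", ""),
    ("batch", 64),
    ("extra", None),
]

# Truthiness check: drops 0.0, False, and "" along with None.
shown_truthy = [name for name, value in params if value]

# Explicit None check: only truly absent values are skipped.
shown_not_none = [name for name, value in params if value is not None]

print(shown_truthy)    # ['batch']
print(shown_not_none)  # ['confidence', 'enabled', 'label', 'batch']
```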
Pull request overview

This pull request adds a new Django management command test_ml_job_e2e for end-to-end testing of ML job processing. The command creates ML jobs with specified collections and pipelines, monitors their progress in real-time, and displays final results in the console. This is a useful debugging and testing tool for developers working with the ML job pipeline.

Changes:
- Added `test_ml_job_e2e.py` management command with support for creating and monitoring ML jobs through different dispatch modes (internal, sync_api, async_api)
- Implements real-time progress monitoring with console updates every 2 seconds showing job status and per-stage progress
- Provides final results summary including stage completion status and parameter values
* fix: prevent NATS connection flooding and stale job task fetching
  - Add connect_timeout=5, allow_reconnect=False to NATS connections to prevent leaked reconnection loops from blocking Django's event loop
  - Guard /tasks endpoint against terminal-status jobs (return empty tasks instead of attempting NATS reserve)
  - IncompleteJobFilter now excludes jobs by top-level status in addition to progress JSON stages
  - Add stale worker cleanup to integration test script
  Found during PSv2 integration testing where stale ADC workers with default DataLoader parallelism overwhelmed the single uvicorn worker thread by flooding /tasks with concurrent NATS reserve requests.
  Co-Authored-By: Claude <noreply@anthropic.com>
* docs: PSv2 integration test session notes and NATS flooding findings
  Session notes from 2026-02-16 integration test including root cause analysis of stale worker task competition and NATS connection issues. Findings doc tracks applied fixes and remaining TODOs with priorities.
* docs: update session notes with successful test run #3
  PSv2 integration test passed end-to-end (job 1380, 20/20 images). Identified ack_wait=300s as cause of ~5min idle time when GPU processes race for NATS tasks.
* fix: batch NATS task fetch to prevent HTTP timeouts
  Replace N×1 reserve_task() calls with a single reserve_tasks() batch fetch. The previous implementation created a new pull subscription per message (320 NATS round trips for batch=64), causing the /tasks endpoint to exceed HTTP client timeouts. The new approach uses one psub.fetch() call for the entire batch.
* docs: add next session prompt
* feat: add pipeline__slug__in filter for multi-pipeline job queries
  Workers that handle multiple pipelines can now fetch jobs for all of them in a single request: ?pipeline__slug__in=slug1,slug2
* chore: remove local-only docs and scripts from branch
  These files are session notes, planning docs, and test scripts that should stay local rather than be part of the PR.
* feat: set job dispatch_mode at creation time based on project feature flags
  ML jobs with a pipeline now get dispatch_mode set during setup() instead of waiting until run() is called by the Celery worker. This lets the UI show the correct mode immediately after job creation.
* fix: add timeouts to all JetStream operations and restore reconnect policy
  Add NATS_JETSTREAM_TIMEOUT (10s) to all JetStream metadata operations via asyncio.wait_for() so a hung NATS connection fails fast instead of blocking the caller's thread indefinitely. Also restore the intended reconnect policy (2 attempts, 1s wait) that was lost in a prior force push.
* fix: propagate NATS timeouts as 503 instead of swallowing them
  asyncio.TimeoutError from _ensure_stream() and _ensure_consumer() was caught by the broad `except Exception` in reserve_tasks(), silently returning [] and making NATS outages indistinguishable from empty queues. Workers would then poll immediately, recreating the flooding problem.
  - Add explicit `except asyncio.TimeoutError: raise` in reserve_tasks()
  - Catch TimeoutError and OSError in the /tasks view, return 503
  - Restore allow_reconnect=False (fail-fast on connection issues)
  - Add return type annotation to get_connection()
* fix: address review comments (log level, fetch timeout, docstring)
  - Downgrade reserve_tasks log to DEBUG when zero tasks reserved (avoid log spam from frequent polling)
  - Pass timeout=0.5 from /tasks endpoint to avoid blocking the worker for 5s on empty queues
  - Fix docstring examples using string 'job123' for int-typed job_id
* fix: catch nats.errors.Error in /tasks endpoint for proper 503 responses
  NoServersError, ConnectionClosedError, and other NATS exceptions inherit from nats.errors.Error (not OSError), so they escaped the handler and returned 500 instead of 503.

Co-authored-by: Claude <noreply@anthropic.com>
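One of the fixes above wraps JetStream metadata operations in `asyncio.wait_for()` so a hung connection fails fast. A minimal sketch of that fail-fast pattern follows; the coroutine and the short demo timeout are stand-ins, not the project's code (the commit message's `NATS_JETSTREAM_TIMEOUT` default is 10 s):

```python
import asyncio

DEMO_TIMEOUT = 0.05  # short so the sketch runs quickly; production would use ~10 s

async def slow_metadata_call():
    # Stand-in for a JetStream stream/consumer metadata call on a hung connection.
    await asyncio.sleep(60)
    return "never reached"

async def ensure_stream():
    try:
        # wait_for cancels the inner coroutine and raises once the timeout elapses,
        # instead of blocking the caller's thread indefinitely.
        return await asyncio.wait_for(slow_metadata_call(), timeout=DEMO_TIMEOUT)
    except asyncio.TimeoutError:
        return "timed out fast"

print(asyncio.run(ensure_stream()))  # timed out fast
```

The companion fix is equally important: a caller must re-raise (or map to a 503) rather than swallow the `TimeoutError`, or an outage becomes indistinguishable from an empty queue.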
…olnickLab#1142)
* feat: configurable NATS tuning and gunicorn worker management
  Rebase onto main after RolnickLab#1135 merge. Keep only the additions unique to this branch:
  - Make TASK_TTR configurable via NATS_TASK_TTR Django setting (default 30s)
  - Make max_ack_pending configurable via NATS_MAX_ACK_PENDING setting (default 100)
  - Local dev: switch to gunicorn+UvicornWorker by default for production parity, with USE_UVICORN=1 escape hatch for raw uvicorn
  - Production: auto-detect WEB_CONCURRENCY from CPU cores (capped at 8) when not explicitly set in the environment
* fix: address PR review comments
  - Fix max_ack_pending falsy-zero guard (use `is not None` instead of `or`)
  - Update TaskQueueManager docstring with Args section
  - Simplify production WEB_CONCURRENCY fallback (just use nproc)

Co-authored-by: Michael Bunsen <notbot@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
* fix: include pipeline_slug in MinimalJobSerializer (ids_only response)
  The ADC worker fetches jobs with ids_only=1 and expects pipeline_slug in the response to know which pipeline to run. Without it, Pydantic validation fails and the worker skips the job.
* Update ami/jobs/serializers.py

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 2
♻️ Duplicate comments (2)
ami/jobs/management/commands/test_ml_job_e2e.py (2)
32-37: ⚠️ Potential issue | 🟡 Minor — Handle duplicate collection names and chain CommandError.

`SourceImageCollection.objects.get(name=...)` can raise `MultipleObjectsReturned` when names aren't unique; also, raising `CommandError` without chaining triggers B904. This was already flagged before.

✅ Proposed fix

```diff
 except SourceImageCollection.DoesNotExist:
-    raise CommandError(f"SourceImageCollection '{collection_name}' not found")
+    raise CommandError(f"SourceImageCollection '{collection_name}' not found") from None
+except SourceImageCollection.MultipleObjectsReturned:
+    raise CommandError(
+        f"Multiple SourceImageCollections found with name '{collection_name}'. Use a unique name."
+    ) from None
```

```shell
#!/bin/bash
# Inspect SourceImageCollection.name uniqueness to confirm duplicate risk
rg -n "class SourceImageCollection" -C4
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/jobs/management/commands/test_ml_job_e2e.py` around lines 32 - 37, The current lookup using SourceImageCollection.objects.get(name=collection_name) can raise MultipleObjectsReturned and the CommandError is raised without exception chaining; update the try/except to catch both SourceImageCollection.DoesNotExist and SourceImageCollection.MultipleObjectsReturned (or Exception as e) and when raising CommandError use "raise CommandError(... ) from e" so the original exception is chained; reference the lookup call (SourceImageCollection.objects.get) and the raised CommandError to locate and modify the handling.
40-44: ⚠️ Potential issue | 🟡 Minor — Chain CommandError exceptions to avoid B904 violations.

Lines 37 and 44 should use `raise ... from None` to explicitly handle exception chaining per Ruff B904. When intentionally replacing a caught exception with a more informative one, explicit chaining syntax distinguishes intentional replacement from accidental context loss.

Affected locations

Line 37 (SourceImageCollection):

```diff
 except SourceImageCollection.DoesNotExist:
-    raise CommandError(f"SourceImageCollection '{collection_name}' not found")
+    raise CommandError(f"SourceImageCollection '{collection_name}' not found") from None
```

Line 44 (Pipeline):

```diff
 except Pipeline.DoesNotExist:
-    raise CommandError(f"Pipeline '{pipeline_slug}' not found")
+    raise CommandError(f"Pipeline '{pipeline_slug}' not found") from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/jobs/management/commands/test_ml_job_e2e.py` around lines 40 - 44, The except handlers that replace underlying exceptions with CommandError (specifically the block calling Pipeline.objects.get(slug=pipeline_slug) and the similar SourceImageCollection lookup) must explicitly suppress implicit exception chaining; change the raise statements to use "raise CommandError(...) from None" so the new CommandError replaces the original exception intentionally (use the exact symbols Pipeline.objects.get and SourceImageCollection lookup/exception handlers to find the two places to update).
```python
            help="Job dispatch mode",
        )

    def handle(self, *args, **options):
```
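Django management commands delegate CLI parsing to argparse, so the interface around this snippet can be mirrored in a plain-argparse sketch. The positional names, option name, and choices below are assumptions based on this PR's description, not the command's verified signature:

```python
import argparse

def build_parser():
    # Mirrors what a Django Command.add_arguments(parser) would register.
    parser = argparse.ArgumentParser(prog="test_ml_job_e2e")
    parser.add_argument("collection", help="SourceImageCollection name")
    parser.add_argument("pipeline", help="Pipeline slug")
    parser.add_argument(
        "--dispatch-mode",
        choices=["internal", "sync_api", "async_api"],
        default="internal",
        help="Job dispatch mode",
    )
    return parser

args = build_parser().parse_args(["moths-2024", "panama-moths", "--dispatch-mode", "async_api"])
print(args.dispatch_mode)  # async_api
```

In a real command the parsed values arrive in `handle(self, *args, **options)` via the `options` dict (e.g. `options["dispatch_mode"]`), which is also why the unused `*args` can safely be renamed `*_args`.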
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Check if the file exists and view content around line 26
if [ -f "ami/jobs/management/commands/test_ml_job_e2e.py" ]; then
    echo "=== File exists, showing lines 20-35 ==="
    sed -n '20,35p' ami/jobs/management/commands/test_ml_job_e2e.py | cat -n
else
    echo "File not found"
fi
```

Repository: RolnickLab/antenna
Length of output: 849

🏁 Script executed:

```shell
#!/bin/bash
# Find Ruff configuration files
echo "=== Looking for Ruff configuration files ==="
fd -t f '(pyproject\.toml|ruff\.toml|\.ruff\.toml|setup\.cfg|tox\.ini)' -x ls -lh {}
echo -e "\n=== Searching for Ruff/ARG configuration ==="
fd -t f '(pyproject\.toml|ruff\.toml|\.ruff\.toml)' -x rg -A 20 "^\[tool\.ruff\]|extend-ignore|ignore.*ARG|ARG002"
```

Repository: RolnickLab/antenna
Length of output: 330

🏁 Script executed:

```shell
#!/bin/bash
echo "=== Content of ruff.toml ==="
cat ruff.toml
echo -e "\n=== Content of pyproject.toml ==="
cat pyproject.toml
echo -e "\n=== Content of setup.cfg ==="
cat setup.cfg
```

Repository: RolnickLab/antenna
Length of output: 3366
Rename unused *args to *_args to satisfy Ruff ARG002 enforcement.

Line 26 contains an unused *args parameter that violates the ARG002 rule, which is actively enforced in this repository's Ruff configuration. Rename to *_args to indicate the parameter is intentionally unused.

Suggested fix

```diff
-    def handle(self, *args, **options):
+    def handle(self, *_args, **options):
```
🧰 Tools
🪛 Ruff (0.15.2)
[warning] 26-26: Unused method argument: args (ARG002)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ami/jobs/management/commands/test_ml_job_e2e.py` at line 26, The handle
method currently defines an unused variadic positional parameter as *args which
violates Ruff ARG002; rename it to *_args in the function signature for handle
to mark it intentionally unused and satisfy the linter (ensure any internal
references to the old name are updated or removed). Keep the rest of the
signature and behavior unchanged.
```python
# Check if finished
if job.status in JobState.final_states():
    self._show_final_results(job, start_time)
    break
```
Terminate monitoring on UNKNOWN to avoid infinite loop.

`JobState.final_states()` excludes UNKNOWN, yet `JobState.failed_states()` includes it. If the job ever becomes UNKNOWN, this loop never ends. Treat UNKNOWN as terminal (or use failed_states() in addition to final_states()).

🔧 Suggested fix

```diff
-            if job.status in JobState.final_states():
+            if job.status in JobState.final_states() or job.status == JobState.UNKNOWN:
                 self._show_final_results(job, start_time)
                 break
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ami/jobs/management/commands/test_ml_job_e2e.py` around lines 78 - 81, The
loop termination misses the UNKNOWN state because JobState.final_states()
doesn't include it; update the check inside the monitoring loop to treat UNKNOWN
as terminal by changing the condition to check job.status against both
JobState.final_states() and JobState.failed_states() (e.g., if job.status in
JobState.final_states() or job.status in JobState.failed_states(): call
self._show_final_results(job, start_time) and break) so the loop cannot spin
forever when status becomes UNKNOWN.
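The suggested terminal-state check can be sketched with a toy enum. The real `JobState` lives in the project's jobs app; the member names below follow the review text, but the state sets are otherwise assumed:

```python
from enum import Enum

class JobState(str, Enum):
    PENDING = "PENDING"
    STARTED = "STARTED"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    UNKNOWN = "UNKNOWN"

    @classmethod
    def final_states(cls):
        # Per the review, UNKNOWN is NOT in this set...
        return {cls.SUCCESS, cls.FAILURE}

    @classmethod
    def failed_states(cls):
        # ...but it IS in this one, hence the union below.
        return {cls.FAILURE, cls.UNKNOWN}

def is_terminal(status):
    # Union of both sets so UNKNOWN also stops the monitor loop.
    return status in JobState.final_states() or status in JobState.failed_states()

print(is_terminal(JobState.UNKNOWN))  # True
print(is_terminal(JobState.STARTED))  # False
```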
This pull request introduces a new management command for end-to-end testing of ML jobs in the Django project. The command allows users to create and monitor ML jobs using specified collections and pipelines, providing real-time progress updates and final results in the console.
New ML job testing command:
test_ml_job_e2e.pymanagement command for end-to-end testing of ML job processing, including job creation, progress monitoring, and final results display.SourceImageCollection,Pipeline, andJobDispatchModevia command-line arguments, with informative error handling for missing resources.Summary by CodeRabbit