---
title: Diagnosing test failures
shortTitle: Diagnose test failures
intro: '{% data variables.copilot.copilot_chat_short %} can help you understand why a test is failing and suggest how to fix it.'
versions:
  feature: copilot
category:
  - Debugging code
  - Author and optimize with Copilot
complexity:
  - Intermediate
octicon: bug
topics:
  - Copilot
contentType: tutorials
---

{% data variables.copilot.copilot_chat_short %} can analyze test failures and help identify potential causes.

## Example scenario 1: Tests passing locally but failing in CI

Consider a scenario where you have a test that passes on your local machine but sometimes fails in CI. {% data variables.copilot.copilot_chat_short %} can help identify the reason for the failure.

In this example, the code being tested defines a simple order service (`order.py`), and there's a corresponding test that checks if an order was created today (`test_order_service.py`).
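
The exact contents of those files aren't important for the prompt, but a minimal sketch of what they might look like is shown below. The `create_order` function, the `"widget"` item, and the field names are illustrative assumptions rather than code from a real project.

```python
# order.py (hypothetical sketch)
from datetime import date

def create_order(item: str) -> dict:
    # Stamps the order with "today" from the machine's local clock.
    return {"item": item, "created_date": date.today()}


# test_order_service.py (hypothetical sketch)
from datetime import date
from order import create_order

class TestOrderService:
    def test_order_created_today(self):
        order = create_order("widget")
        # Calls date.today() a second time when the assertion runs.
        assert order["created_date"] == date.today()
```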

## Example prompt 1

The prompt below provides {% data variables.product.prodname_copilot_short %} with the relevant code and test files (using `#file:`) and includes the relevant excerpt from the CI failure output.

```copilot copy
Please take a look at this CI failure message. The test passes locally, but intermittently fails in CI. Can you help me figure out if this looks like a code bug, an environment issue, or a flaky test?

Failure:

___ TestOrderService.test_order_created_today ___
> assert order["created_date"] == date.today()
E AssertionError: assert datetime.date(2024, 1, 15) == datetime.date(2024, 1, 16)

test_order_service.py:45: AssertionError

#file:order.py
#file:test_order_service.py
```

## Example response 1

{% data reusables.copilot.example-prompts.response-is-an-example %}

{% data variables.copilot.copilot_chat_short %} notices that the dates are exactly one day apart and identifies that this could be a **timezone** or **time-boundary** issue.

The local machine and CI runner may be using different timezone settings or deriving `today` from different clocks (UTC vs. local time), so when the test runs near midnight, `date.today()` can return different dates in each environment.

{% data variables.copilot.copilot_chat_short %} suggests treating the failure as test flakiness caused by environment and time assumptions rather than a logic bug, and fixing it by standardizing how `today` is computed across environments.
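
As an illustration only, a fix along those lines might derive the date from UTC in both the code and the test, so both environments agree regardless of their local timezone settings. This builds on the hypothetical sketch above:

```python
# order.py: derive "today" from UTC instead of the local clock.
from datetime import datetime, timezone

def create_order(item: str) -> dict:
    return {"item": item, "created_date": datetime.now(timezone.utc).date()}


# test_order_service.py: compare against the same UTC-based date.
class TestOrderService:
    def test_order_created_today(self):
        order = create_order("widget")
        assert order["created_date"] == datetime.now(timezone.utc).date()
```

Alternatively, freezing the clock in the test (for example, with a library such as `freezegun`) removes the time-of-day edge case entirely.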

## Example scenario 2: Intermittent test failures

Consider a scenario where a test sometimes passes and sometimes fails on the same machine. {% data variables.copilot.copilot_chat_short %} can compare logs from passing and failing runs to help identify the cause.

In this example, the code under test uses a background job in `order_service.py` to update an order's status asynchronously, and a test in `test_order_service.py` asserts that the final status is `"processed"`.
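
As before, the exact file contents aren't important, but a minimal sketch of the kind of code and test involved might look like the following. The `OrderService` class, the threading-based background job, and the test function name are illustrative assumptions:

```python
# order_service.py (hypothetical sketch)
import threading

class Order:
    def __init__(self, order_id: int):
        self.id = order_id
        self.status = "pending"

class OrderService:
    def create_order(self, order_id: int) -> Order:
        order = Order(order_id)
        # The status update runs on a background thread, so it may not
        # have finished by the time the caller reads order.status.
        threading.Thread(target=self._process, args=(order,)).start()
        return order

    def _process(self, order: Order) -> None:
        order.status = "processed"


# test_order_service.py (hypothetical sketch)
def test_order_is_processed():
    order = OrderService().create_order(1234)
    # Races the background thread: passes only if _process has already run.
    assert order.status == "processed"
```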

## Example prompt 2

The prompt below provides {% data variables.product.prodname_copilot_short %} with the failure message, log excerpts from both a passing and a failing run, and the relevant code files (using `#file:`).

```copilot copy
This test passes sometimes and fails sometimes. Can you compare the logs and help me figure out why?

Failure message:

> assert order.status == "processed"
E AssertionError: assert "pending" == "processed"

test_order_service.py:62: AssertionError

Logs from a passing run:
[DEBUG] Created order #1234
[DEBUG] Background job started for order #1234
[DEBUG] Background job completed (52ms)
[DEBUG] Checking order status
[DEBUG] Order #1234 status: processed

Logs from the failing run:
[DEBUG] Created order #1234
[DEBUG] Background job started for order #1234
[DEBUG] Checking order status
[DEBUG] Order #1234 status: pending

#file:order_service.py
#file:test_order_service.py
```

## Example response 2
| 94 | + |
| 95 | +{% data reusables.copilot.example-prompts.response-is-an-example %} |
| 96 | + |
| 97 | +{% data variables.copilot.copilot_chat_short %} compares the two logs and notices that in the passing run, the background job completed *before* the status check, while in the failing run, the status was checked while the job was still running. {% data variables.copilot.copilot_chat_short %} notes that this is a **race condition**, as the test doesn't wait for the background job to finish. |
| 98 | + |
| 99 | +{% data variables.copilot.copilot_chat_short %} suggests adding a mechanism to ensure the job completes before asserting, such as running the job synchronously, awaiting completion (for example, via a callback), or polling. |
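
As an illustration, a polling-based version of the test (building on the hypothetical `OrderService` sketch above) might wait for the status to change before asserting:

```python
import time

def wait_for_status(order, expected: str, timeout: float = 2.0, interval: float = 0.01) -> bool:
    # Poll until the background job updates the status, or give up after the timeout.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if order.status == expected:
            return True
        time.sleep(interval)
    return False

def test_order_is_processed():
    order = OrderService().create_order(1234)
    assert wait_for_status(order, "processed"), "background job did not complete in time"
```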

## Further reading

{% data reusables.copilot.example-prompts.further-reading-items %}