2 changes: 1 addition & 1 deletion tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml
@@ -96,7 +96,7 @@ l0_rtx_pro_6000:
# - accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4_4gpus_online_eplb[fp8kv=True] # Verify GDRCopy availability on Blossom pods
# - accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_bfloat16_4gpus_online_eplb[mtp_nextn=2] # Verify GDRCopy availability on Blossom pods
# - accuracy/test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[False] # hopper only
- # - accuracy/test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[True]
+ - accuracy/test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[True]
Comment on lines 98 to +99

⚠️ Potential issue | 🟡 Minor

Add context explaining why the flaky test is being re-enabled.

The uncommented test at line 99 has no inline comment explaining the fix or rationale, unlike nearby entries (e.g., "# hopper only" on line 98 or the GDRCopy notes on lines 96–97). It is therefore unclear whether the underlying flaky condition has been resolved or whether the re-enablement is experimental. Add a comment stating why the test is being re-enabled.

Suggested fix:

- - accuracy/test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[True]
+ - accuracy/test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[True] # https://nvbugs/5543035 - flaky issue resolved
🤖 Prompt for AI Agents
In tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml around lines 98 to
99, the uncommented test entry at line 99 re-enables a previously flaky test but
lacks an explanatory inline comment; add a short inline comment after that line
describing why the test was re-enabled (e.g., what root cause was fixed, what
environment or config change stabilizes it, and reference to the related
issue/PR or commit) so future readers know this is intentional and not
experimental.
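
The request above (an inline rationale on every re-enabled entry) could be checked mechanically. The following is a minimal sketch, not part of the repository: `missing_rationale` is a hypothetical helper, and it assumes each test-list entry sits on one line of the form `- <test id> # <comment>`.

```python
def missing_rationale(text, test_ids):
    """Return the given test IDs whose active entry has no trailing '#' comment.

    Hypothetical helper: assumes one entry per line, written as
    '- <test id>' optionally followed by ' # <comment>'.
    """
    missing = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("- "):
            continue  # skips commented-out entries and non-list lines
        entry, _, comment = stripped[2:].partition(" #")
        entry = entry.strip()
        if entry in test_ids and not comment.strip():
            missing.append(entry)
    return missing


TEST_ID = "accuracy/test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[True]"

# Entry as it appears in this PR, without a rationale comment.
print(missing_rationale(f"  - {TEST_ID}\n", {TEST_ID}))
# Entry with the comment proposed in the suggested fix.
print(missing_rationale(f"  - {TEST_ID} # https://nvbugs/5543035 - flaky issue resolved\n", {TEST_ID}))
```

The first call reports the entry as missing a rationale; the second returns an empty list because the trailing comment is present.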

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the scope of changes matches the AI summary.

The AI summary claims the test is uncommented "across two locations," but the annotated code shows only one marked change at line 99. Confirm whether this is the only modification or if another location was changed outside the provided context.


🏁 Script executed:

#!/bin/bash
# Description: Verify all uncommented occurrences of TestQwen3_8B::test_auto_dtype[True] in l0_rtx_pro_6000.yml

# Find all non-commented occurrences of the test
rg -n 'TestQwen3_8B::test_auto_dtype\[True\]' tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml | grep -v '^\s*#'

# Also show surrounding context to confirm uncommenting
rg -B2 -A2 'TestQwen3_8B::test_auto_dtype\[True\]' tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml

Length of output: 624


🏁 Script executed:

#!/bin/bash
# Get precise line numbers and full context for all occurrences
cat -n tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml | grep -A5 -B5 'TestQwen3_8B::test_auto_dtype\[True\]'

Length of output: 88



🏁 Script executed:

#!/bin/bash
# Get precise line numbers and full context for all occurrences of the test
sed -n '90,110p' tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml | cat -n

Length of output: 3083


🏁 Script executed:

#!/bin/bash
# Search entire file for all uncommented occurrences of the test
echo "=== All lines matching test_auto_dtype[True] (excluding comments) ==="
rg -n 'TestQwen3_8B::test_auto_dtype\[True\]' tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml | grep -v '^\s*#'

echo ""
echo "=== Total count of uncommented occurrences ==="
rg -n 'TestQwen3_8B::test_auto_dtype\[True\]' tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml | grep -v '^\s*#' | wc -l

Length of output: 202


Remove incorrect AI summary claim and add required explanation for the uncommented test.

Verification confirms the AI summary is incorrect: there is only ONE uncommented instance at line 99, not "across two locations." However, this uncommented test lacks explanation or context. Related code (line 98) includes a helpful inline comment ("# hopper only"), but the newly uncommented test has no annotation explaining why it's being re-enabled in response to NVBUG 5543035. Add an inline comment clarifying the reason for uncommenting this test case.
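
The occurrence count can also be reproduced without shell pipelines. Below is a minimal Python sketch under stated assumptions: `classify_occurrences` is a hypothetical helper, and `SAMPLE` is an illustrative excerpt, not the full file. It inspects each raw line directly; by contrast, `rg -n` prefixes every match with its line number, so a filter anchored at `^\s*#` applied to that output would not see the comment marker.

```python
def classify_occurrences(text, test_id):
    """Return (active, commented) lists of 1-based line numbers that mention test_id.

    Hypothetical helper: a line counts as commented when its first
    non-whitespace character is '#'.
    """
    active, commented = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if test_id not in line:
            continue
        if line.lstrip().startswith("#"):
            commented.append(lineno)
        else:
            active.append(lineno)
    return active, commented


# Illustrative excerpt only; line numbers here do not match the real file.
SAMPLE = """\
l0_rtx_pro_6000:
  # - accuracy/test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[False] # hopper only
  - accuracy/test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[True]
"""

active, commented = classify_occurrences(SAMPLE, "TestQwen3_8B::test_auto_dtype[True]")
print("active:", active, "commented:", commented)
```

On the sample, only the third line is reported as active, which matches the single re-enabled entry described above.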

🤖 Prompt for AI Agents
In tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml around line 99, the
AI summary incorrectly reported two uncommented instances; there is only ONE
uncommented test, at line 99, and it lacks context. Update the file by adding an
inline comment immediately above or beside the uncommented test line that
explains why this specific test was re-enabled (e.g., reference NVBUG-5543035 and
any scope/conditions like "hopper only"), and remove or correct any text
elsewhere claiming multiple uncommented locations so the summary and comment
accurately reflect the single re-enabled test and its rationale.

- accuracy/test_disaggregated_serving.py::TestDeepSeekV3Lite::test_guided_decoding[xgrammar-mtp_nextn=0]
- accuracy/test_disaggregated_serving.py::TestDeepSeekV3Lite::test_guided_decoding[xgrammar-mtp_nextn=2]
- accuracy/test_disaggregated_serving.py::TestDeepSeekV3Lite::test_guided_decoding[llguidance-mtp_nextn=0]