Conversation

@mohammad-arif662
Contributor

@mohammad-arif662 mohammad-arif662 commented Feb 5, 2026

Problem

While a simulation was running, a loss of internet connectivity caused the connector to terminate unexpectedly, leaving the run stuck in "running" status indefinitely. This happened because PublishConnectorStatus was declared as async void: the caller cannot await an async void method, so an unhandled HttpRequestException thrown inside it crashed the process.

Fix

Changed PublishConnectorStatus from async void to async Task and added the missing await calls. Network exceptions now propagate to the outer try/catch block, which marks the run as "failure" and lets the connector keep running and recover once connectivity is restored.
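
The propagation described above can be illustrated with a minimal, self-contained sketch. It is not the connector's actual code: the class name, `RunOnceAsync`, and the simulated outage are hypothetical, and the body of `PublishConnectorStatusAsync` is only a stand-in that always fails.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class StatusPublishingSketch
{
    // Hypothetical stand-in for the real status update.
    // It always fails, as if the network were down.
    // Had this been declared `async void`, the caller could not await it
    // and the exception would escape the caller's try/catch.
    public static async Task PublishConnectorStatusAsync()
    {
        await Task.Yield(); // force an asynchronous continuation
        throw new HttpRequestException("simulated network outage");
    }

    // Hypothetical caller. Because the method returns Task and is awaited,
    // the exception surfaces here and the run can be marked as failed.
    public static async Task<string> RunOnceAsync()
    {
        try
        {
            await PublishConnectorStatusAsync();
            return "success";
        }
        catch (HttpRequestException)
        {
            return "failure";
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunOnceAsync()); // prints "failure"
    }
}
```

The key point is the `await` inside the `try` block: the returned Task carries the exception back to the caller instead of letting it go unobserved.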

@mohammad-arif662 mohammad-arif662 changed the title from "replace async void with async Task in PublishConnectorStatus" to "fix: PROSPER connector doesn't recover from failure after network issue" Feb 5, 2026
@codecov

codecov bot commented Feb 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.97%. Comparing base (6108d05) to head (5da2537).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #328      +/-   ##
==========================================
+ Coverage   75.48%   75.97%   +0.48%     
==========================================
  Files          38       38              
  Lines        4083     4083              
  Branches      487      488       +1     
==========================================
+ Hits         3082     3102      +20     
+ Misses        817      796      -21     
- Partials      184      185       +1     
| Files with missing lines | Coverage Δ |
|---|---|
| Cognite.Simulator.Utils/SimulationRunnerBase.cs | 76.04% <100.00%> (+3.47%) ⬆️ |

... and 4 files with indirect coverage changes


@abdullah-cognite
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses a potential stability issue by changing the PublishConnectorStatus method from async void to async Task. This prevents unhandled exceptions from crashing the application and allows for proper error propagation. My review includes a few suggestions to align the naming of this asynchronous method with the project's style guide by adding the Async suffix. Overall, this is a solid and important fix.

@mohammad-arif662 mohammad-arif662 marked this pull request as ready for review February 5, 2026 09:11
@mohammad-arif662 mohammad-arif662 requested a review from a team as a code owner February 5, 2026 09:11
@abdullah-cognite
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses a critical issue where a network failure during a status update could crash the connector. By changing PublishConnectorStatus from async void to async Task and awaiting the calls, unhandled exceptions are now properly propagated and managed, allowing the connector to recover gracefully. A comprehensive test case has been added to validate this fix. My review includes one high-severity suggestion to improve exception handling consistency in PublishConnectorStatusAsync by rethrowing the caught ResponseException, in line with the repository's style guide.

@abdullah-cognite
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses a critical issue where an unhandled exception from an async void method could crash the connector during network failures. Changing PublishConnectorStatus from async void to async Task and awaiting it is the right approach, ensuring exceptions are properly propagated and handled. The addition of an integration test to simulate and verify this scenario is a great improvement. I've identified one issue in the test utility code that could affect the reliability of the new test.

@abdullah-cognite
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a critical issue where a network failure during a simulation run could crash the connector due to an unhandled exception in an async void method. The change to use async Task and properly await the operation is the correct approach to fix this.

The addition of a new unit test to specifically cover this network failure scenario is excellent and greatly improves the robustness of the codebase.

I've found one critical issue in the implementation where an exception is caught but not re-thrown. This prevents the error from being handled correctly by the calling method, which contradicts the stated goal of the PR. I have left a specific comment with a suggestion to fix this.
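
The catch-without-rethrow concern can be sketched as follows. This is an illustration only, not the code under review: every name here is hypothetical, and `InvalidOperationException` stands in for the `ResponseException` mentioned in the earlier review.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class RethrowSketch
{
    // Hypothetical in-memory log, used instead of a real logger.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical failing call standing in for the status update request.
    private static Task SendStatusAsync() =>
        Task.FromException(new InvalidOperationException("simulated API error"));

    public static async Task PublishAsync()
    {
        try
        {
            await SendStatusAsync();
        }
        catch (InvalidOperationException e)
        {
            Log.Add("publish failed: " + e.Message);
            // Without this rethrow the caller would never learn of the failure.
            // A bare `throw;` preserves the original stack trace.
            throw;
        }
    }

    // Hypothetical caller that reports whether the failure reached it.
    public static async Task<bool> CallerSawFailureAsync()
    {
        try
        {
            await PublishAsync();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await CallerSawFailureAsync()); // prints "True"
    }
}
```

Logging and then rethrowing keeps the local diagnostic message while still letting the calling method decide how to handle the failed run.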

@polomani
Collaborator

polomani commented Feb 11, 2026

why does this PR say "PROSPER"? this is a generic library

@mohammad-arif662 mohammad-arif662 changed the title from "fix: PROSPER connector doesn't recover from failure after network issue" to "fix: connector doesn't recover from failure after network issue" Feb 11, 2026
Collaborator

@abdullah-cognite abdullah-cognite left a comment

looks good

@mohammad-arif662 mohammad-arif662 added the waiting-for-risk-review label (Waiting for a member of the risk review team to take an action) Feb 12, 2026
@polomani polomani added the risk-review-ongoing (Risk review is in progress) and waiting-for-team (Waiting for the submitter or reviewer of the PR to take an action) labels and removed the waiting-for-risk-review label (Waiting for a member of the risk review team to take an action) Feb 12, 2026
@polomani polomani self-assigned this Feb 12, 2026
Collaborator

@polomani polomani left a comment

🦄

Labels

risk-review-ongoing: Risk review is in progress
waiting-for-team: Waiting for the submitter or reviewer of the PR to take an action

3 participants