Skip to content

Conversation

@surayya-MS
Copy link
Member

Fixes #12660

Context

The following assert fails:
/mt /m sometimes crashes with error:

Microsoft.Build.Framework.InternalErrorException: MSB0001: Internal MSBuild Error: Why are we trying to disconnect from a context that we already disconnected from?  Did we call DisconnectFromHost twice?
   at Microsoft.Build.Shared.ErrorUtilities.ThrowInternalError(String message, Exception innerException, Object[] args) in C:\Users\alinama\work\msbuild\src\Shared\ErrorUtilities.cs:line 69
   at Microsoft.Build.Shared.ErrorUtilities.VerifyThrow(Boolean condition, String unformattedMessage) in C:\Users\alinama\work\msbuild\src\Shared\ErrorUtilities.cs:line 195
   at Microsoft.Build.BackEnd.NodeProviderOutOfProcTaskHost.DisconnectFromHost(Int32 nodeId) in C:\Users\alinama\work\msbuild\src\Build\BackEnd\Components\Communications\NodeProviderOutOfProcTaskHost.cs:line 607
   at Microsoft.Build.BackEnd.TaskHostTask.Execute() in C:\Users\alinama\work\msbuild\src\Build\Instance\TaskFactories\TaskHostTask.cs:line 382
   at Microsoft.Build.BackEnd.TaskExecutionHost.Execute() in C:\Users\alinama\work\msbuild\src\Build\BackEnd\TaskExecutionHost\TaskExecutionHost.cs:line 636

Changes Made

Changed _nodeIdToPacketFactory and _nodeIdToPacketHandler from Dictionary to ConcurrentDictionary.

Testing

Manually tested to make sure the error is not there.

Notes

@surayya-MS surayya-MS requested review from AR-May and Copilot and removed request for Copilot October 30, 2025 11:27
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves thread safety in NodeProviderOutOfProcTaskHost by converting two dictionary fields from IDictionary<int, T> to ConcurrentDictionary<int, T>.

  • Changed field type declarations for _nodeIdToPacketFactory and _nodeIdToPacketHandler to use ConcurrentDictionary
  • Updated initialization from Dictionary to ConcurrentDictionary
  • Replaced Remove() calls with TryRemove() to use the concurrent dictionary API

Copy link
Member

@rainersigwald rainersigwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I see how this solves the problem. Can you elaborate please?

@AR-May
Copy link
Member

AR-May commented Oct 31, 2025

I'm not sure I see how this solves the problem. Can you elaborate please?

oh, that's a mystery here. This fix solves the race condition in the prototype branch - I can attest to that fact. But neither me nor @surayya-MS (please correct me if you figured it out since our last sync) figured yet why it solves it, more precisely, why it runs into a race condition in the first place.

We need to investigate this bug a bit more

@AR-May AR-May marked this pull request as draft October 31, 2025 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Race condition in NodeProviderOutOfProcTaskHost

3 participants