| 1 | +# Source Selection Algorithm |
| 2 | + |
| 3 | +When the source indexer processes multiple builds for the same assembly (e.g., generic builds, platform-specific builds, or builds with different target frameworks), it uses a scoring algorithm to select the "best" implementation to include in the final source index. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The deduplication process groups all compiler invocations by `AssemblyName` and then calculates a score for each build. The build with the highest score is selected and included in the generated solution file. |
| 8 | + |
| 9 | +## Scoring Priorities |
| 10 | + |
| 11 | +The scoring algorithm sums the following contributions into a single score per build. They are listed in priority order, from highest to lowest: |
| 12 | + |
| 13 | +### 1. UseForSourceIndex Property (Highest Priority) |
| 14 | +- **Score**: `int.MaxValue` (2,147,483,647) |
| 15 | +- **Description**: When a project explicitly sets the `UseForSourceIndex` property to `true`, it receives the maximum possible score, ensuring it will always be selected regardless of other factors. |
| 16 | +- **Use Case**: Provides an escape hatch for a build that must be the one included in the source index, whatever its other scores. |
| 17 | + |
| 18 | +### 2. Platform Support Status (Second Priority) |
| 19 | +- **Score**: `-10,000` penalty for platform-not-supported assemblies |
| 20 | +- **Description**: If a project has the `IsPlatformNotSupportedAssembly` property set to `true`, it receives a heavy penalty. |
| 21 | +- **Use Case**: Ensures that stub implementations containing mostly `PlatformNotSupportedException` are avoided in favor of real implementations. |
| 22 | + |
| 23 | +### 3. Target Framework Version (Third Priority) |
| 24 | +- **Score**: `Major * 1000 + Minor * 100` |
| 25 | +- **Description**: Newer framework versions receive higher scores. For example: |
| 26 | + - .NET 8.0 = 8,000 + 0 = 8,000 points |
| 27 | + - .NET 6.0 = 6,000 + 0 = 6,000 points |
| 28 | + - .NET Framework 4.8 = 4,000 + 800 = 4,800 points |
| 29 | +- **Use Case**: Prefers more recent implementations that are likely to contain the latest features and bug fixes. |
| 30 | + |
| 31 | +### 4. Platform Specificity (Fourth Priority) |
| 32 | +- **Score**: `+500` for platform-specific frameworks |
| 33 | +- **Additional**: `+100` bonus for Linux platforms, `+50` bonus for Unix platforms |
| 34 | +- **Description**: Platform-specific builds (e.g., `net8.0-linux`, `net8.0-windows`) receive bonuses over generic builds. |
| 35 | +- **Use Case**: Platform-specific implementations often contain more complete functionality than generic implementations. |
| 36 | + |
| 37 | +### 5. Source File Count (Lowest Priority) |
| 38 | +- **Score**: `+1` per source file |
| 39 | +- **Description**: Builds with more source files receive higher scores. |
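| 40 | +- **Use Case**: Acts mainly as a tiebreaker when other factors are equal, on the assumption that more source files indicate a more complete implementation. Because the scores are additive, a large enough difference in file count can also outweigh the smaller platform bonuses. |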
| 41 | + |
| 42 | +## Example Scoring |
| 43 | + |
| 44 | +Consider these hypothetical builds for `System.Net.NameResolution`: |
| 45 | + |
| 46 | +| Build | UseForSourceIndex | IsPlatformNotSupportedAssembly | Framework | Platform | Source Files | Total Score | |
| 47 | +|-------|-------------------|------------------------|-----------|----------|--------------|-------------| |
| 48 | +| Generic Build | false | true | net8.0 | none | 45 | -1,955 | |
| 49 | +| Linux Build | false | false | net8.0-linux | linux | 127 | 8,727 | |
| 50 | +| Windows Build | false | false | net8.0-windows | windows | 98 | 8,598 | |
| 51 | +| Override Build | true | false | net6.0 | none | 23 | 2,147,483,647 | |
| 52 | + |
| 53 | +In this example: |
| 54 | +- The **Override Build** would be selected because `UseForSourceIndex=true` gives it the maximum score |
| 55 | +- Without the override, the **Linux Build** would be selected with the highest remaining score (8,727 = 8,000 + 500 + 100 + 127) |
| 56 | +- The **Generic Build** ends up at -1,955 (8,000 - 10,000 + 45) because of the 10,000-point platform-not-supported penalty |
| 57 | + |
| 58 | +## Implementation Details |
| 59 | + |
| 60 | +The scoring logic is implemented in the `CalculateInvocationScore` method in `BinLogToSln/Program.cs`. The method: |
| 61 | + |
| 62 | +1. Reads project properties from the binlog file |
| 63 | +2. Applies scoring rules in priority order |
| 64 | +3. Handles parsing errors gracefully |
| 65 | +4. Returns a base score of 1 for builds that fail scoring to avoid complete exclusion |
| 66 | + |
| 67 | +## Configuration |
| 68 | + |
| 69 | +The algorithm can be influenced through MSBuild project properties: |
| 70 | + |
| 71 | +- **UseForSourceIndex**: Set to `true` to force selection of this build |
| 72 | +- **IsPlatformNotSupportedAssembly**: Set to `true` to indicate this is a stub implementation |
| 73 | +- **TargetFramework**: Automatically detected from the project file |
| 74 | + |
| 75 | +These properties are captured from the binlog during the build analysis phase. |
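
The scoring rules and the example table in this document can be sketched in a few lines of Python. This only restates the documented rules for illustration; it is not the `CalculateInvocationScore` code from `BinLogToSln/Program.cs`. The function name `calculate_score`, the `PLATFORM_BONUS` table, and the TFM regex are hypothetical. The sketch also assumes that only an exact `linux` or `unix` suffix earns the extra bonus, and that any framework string it cannot parse (including .NET Framework monikers such as `net48`) falls back to the base score of 1.

```python
import re

INT_MAX = 2**31 - 1  # same value as C#'s int.MaxValue (2,147,483,647)

# Assumption: extra bonus keyed by the exact platform suffix of the TFM.
PLATFORM_BONUS = {"linux": 100, "unix": 50}

# Assumption: only "netX.Y" and "netX.Y-platform" style monikers are handled.
_TFM = re.compile(r"^net(\d+)\.(\d+)(?:-([a-z]+))?", re.IGNORECASE)


def calculate_score(use_for_source_index, is_platform_not_supported,
                    target_framework, source_file_count):
    """Return a score for one build, following the documented rules."""
    # 1. Explicit override always wins.
    if use_for_source_index:
        return INT_MAX
    try:
        match = _TFM.match(target_framework)
        if match is None:
            raise ValueError(f"unrecognized framework: {target_framework!r}")
        major, minor = int(match.group(1)), int(match.group(2))
        platform = (match.group(3) or "").lower()

        # 3. Newer frameworks score higher.
        score = major * 1000 + minor * 100
        # 2. Heavy penalty for platform-not-supported stubs.
        if is_platform_not_supported:
            score -= 10_000
        # 4. Platform-specific builds get +500, plus any Linux/Unix bonus.
        if platform:
            score += 500 + PLATFORM_BONUS.get(platform, 0)
        # 5. One point per source file.
        return score + source_file_count
    except (ValueError, TypeError):
        # Builds that cannot be scored keep a base score of 1.
        return 1


# The four hypothetical builds from the example table:
# (UseForSourceIndex, IsPlatformNotSupportedAssembly, framework, source files)
builds = {
    "Generic Build": (False, True, "net8.0", 45),
    "Linux Build": (False, False, "net8.0-linux", 127),
    "Windows Build": (False, False, "net8.0-windows", 98),
    "Override Build": (True, False, "net6.0", 23),
}

scores = {name: calculate_score(*args) for name, args in builds.items()}
best = max(scores, key=scores.get)  # build with the highest score
```

With these inputs, `scores` matches the Total Score column of the example table and `best` is `"Override Build"`. Removing the override entry from `builds` leaves the Linux build with the highest score.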