Description
Hello! First off, thank you for creating and maintaining MemoryPack. It's an incredibly powerful and high-performance serialization library.
I'm opening this issue to discuss a potential performance improvement in the source generator that would greatly enhance the developer experience within an IDE like Visual Studio.
The Problem
I've noticed that the MemoryPackGenerator seems to regenerate source files on every single keystroke, even when the code changes are completely unrelated to any [MemoryPackable] types. This frequent regeneration can lead to noticeable UI lag and a suboptimal development workflow, especially in larger solutions.
Root Cause Analysis
The root cause appears to be how the incremental pipeline is constructed. The transform delegate within SyntaxProvider.ForAttributeWithMetadataName directly returns the raw TypeDeclarationSyntax from the context.
The code returns the raw syntax node, which is not durable across compilations:
`MemoryPack/src/MemoryPack.Generator/MemoryPackGenerator.cs`, lines 81 to 84 at commit `bbe6e75`:

```csharp
transform: static (context, token) =>
{
    return (TypeDeclarationSyntax)context.TargetNode;
})
```
Because SyntaxNode objects are not durable (a new tree is created for each new Compilation, which happens on every keystroke), the incremental pipeline's caching mechanism is effectively bypassed. The pipeline sees a "new" input object every time and is forced to re-trigger the final RegisterSourceOutput stage, even when the semantic meaning of the target type hasn't changed at all.
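The caching difference comes down to equality semantics: `record` types get compiler-generated value equality, while plain reference types such as `SyntaxNode` compare by reference, so a freshly parsed tree never equals the previous one. A minimal, self-contained illustration (the `TypeToGenerateInfo` shape here is hypothetical, mirroring the model proposed below):

```csharp
using System;

// Hypothetical model type; records get value-based Equals/GetHashCode for free.
public record TypeToGenerateInfo(string Namespace, string TypeName);

public static class Demo
{
    public static void Main()
    {
        // Two separately constructed models with identical data are equal,
        // which is exactly what lets Roslyn's incremental cache say "unchanged".
        var a = new TypeToGenerateInfo("MyApp", "Person");
        var b = new TypeToGenerateInfo("MyApp", "Person");
        Console.WriteLine(a == b);      // True: value equality

        // A reference type without value equality (like SyntaxNode) is only
        // equal to itself, so a re-parsed tree always looks "changed".
        var x = new object();
        var y = new object();
        Console.WriteLine(x.Equals(y)); // False: reference equality
    }
}
```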
Proposed Solution
To make the generator truly incremental and leverage Roslyn's caching, the transform step should extract the required information into a custom, immutable, and equatable data model (a record is perfect for this). This model would act as a stable representation of the type to be generated.
Here is a conceptual example of the recommended pattern:

```csharp
// 1. Define a cacheable, equatable model to hold generation data.
public record TypeToGenerateInfo(string Namespace, string TypeName /* ... other needed properties */);

// 2. Update the transform delegate to create and return this model.
transform: static (context, token) =>
{
    var typeNode = (TypeDeclarationSyntax)context.TargetNode;
    var symbol = context.SemanticModel.GetDeclaredSymbol(typeNode, token) as INamedTypeSymbol;
    if (symbol == null) return null;

    // Extract all necessary data from the symbol and its attributes here...
    // Return the new, stable model.
    return new TypeToGenerateInfo(
        Namespace: symbol.ContainingNamespace.ToDisplayString(),
        TypeName: symbol.Name
        // ... populate other properties
    );
},
```

For a concrete, real-world example of this pattern in practice, my FourSer.Gen source generator (to my knowledge) implements it correctly:
- The `ForAttributeWithMetadataName` call passes a method group (`TypeInfoProvider.GetSemanticTargetForGeneration`) as the `transform:`.
- The transform function (`GetSemanticTargetForGeneration`) then extracts all necessary data and returns a new `TypeToGenerate` record, which is the cacheable model.
Impact of This Change
By adopting this pattern, the generator would only execute the final, expensive code-generation step when the semantic information of an attributed type actually changes. A keystroke inside a method body, for example, would produce an identical TypeToGenerateInfo model, and the pipeline would correctly halt execution early.
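For context, a conceptual sketch of how the equatable model would slot into the rest of the pipeline (the attribute name, predicate, and `EmitSource` helper here are illustrative placeholders, not MemoryPack's actual wiring):

```csharp
public void Initialize(IncrementalGeneratorInitializationContext context)
{
    // The provider now yields equatable models instead of raw syntax nodes.
    var typesToGenerate = context.SyntaxProvider
        .ForAttributeWithMetadataName(
            "MemoryPack.MemoryPackableAttribute",
            predicate: static (node, _) => node is TypeDeclarationSyntax,
            transform: static (ctx, token) => /* build and return TypeToGenerateInfo */ default(TypeToGenerateInfo))
        .Where(static model => model is not null);

    // Because TypeToGenerateInfo has value equality, this expensive stage only
    // re-runs when the extracted data for an attributed type actually changes.
    context.RegisterSourceOutput(typesToGenerate, static (spc, model) =>
    {
        // EmitSource is a hypothetical helper that renders the generated file.
        // spc.AddSource($"{model.TypeName}.MemoryPackFormatter.g.cs", EmitSource(model));
    });
}
```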
This would significantly improve IDE responsiveness and provide the full performance benefits intended by the incremental generator architecture.
Thank you for your consideration! I hope this feedback is helpful.