Commit a632542

More docs
1 parent b7d9d1e commit a632542

1 file changed

+307
-4
lines changed

docs/UnderTheHood.md

Lines changed: 307 additions & 4 deletions
@@ -1,17 +1,320 @@
## Under the hood

IncrementalCompiler for Unity3D was my fun project for the 2016 New Year's holiday.
At that time I was thinking about the slow compilation speed of unity3d-mono and
wondering whether it could be made faster with minimal effort.
After a couple of hours of digging, it turned out to be feasible.

## Making incremental compiler

Neither the Microsoft nor the Mono C# compiler supports incremental compilation so far.
They once had an "/incremental" option, but it was dropped long ago.
Because the C# compiler is really fast and the C# language itself lends itself to
fast compilation, incremental compilation has never been a top priority.
The community also tends to keep projects small and to split big projects into smaller ones,
which handles the problem indirectly.

Unity3D uses the Mono C# infrastructure. It is hard to make compilation faster there
because Unity3D still uses the old, slow Mono 3 and doesn't provide an easy way to divide
a big project into small ones. So I decided to make an incremental compiler for Unity3D.

### Unity build process

Unity3D makes three projects from the Assets directory if your project has only C# sources.

- Assembly-CSharp-firstpass
  - Consists of scripts in the Plugins directory.
  - No dependency.
  - This project is rarely rebuilt because these sources are seldom modified.
- Assembly-CSharp
  - Consists of almost all sources in Assets
    (except for sources in the Plugins and Editor directories).
  - Depends on Assembly-CSharp-firstpass.
  - This project is rebuilt all the time because these sources are where most of the work happens.
- Assembly-CSharp-Editor
  - Consists of scripts in the Editor directory.
  - Depends on Assembly-CSharp.
  - This project is rebuilt all the time, not because its own sources are modified but because
    Assembly-CSharp, which it depends on, is rebuilt.

[Unity Manual: Special Folders and Script Compilation Order](http://docs.unity3d.com/Manual/ScriptCompileOrderFolders.html)
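
The dependency chain above decides which projects get rebuilt after an edit. The following is only a small sketch of that propagation, not Unity's actual build logic. The class and method names (`UnityProjects`, `RebuildSet`) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnityProjects {
    // Project -> the project it depends on, as listed above (null = no dependency).
    static readonly Dictionary<string, string> DependsOn = new Dictionary<string, string> {
        { "Assembly-CSharp-firstpass", null },
        { "Assembly-CSharp", "Assembly-CSharp-firstpass" },
        { "Assembly-CSharp-Editor", "Assembly-CSharp" },
    };

    // Returns the projects rebuilt when sources of `changed` are edited:
    // the project itself plus everything that depends on it, directly or indirectly.
    public static List<string> RebuildSet(string changed) {
        var result = new List<string> { changed };
        for (var i = 0; i < result.Count; i++) {
            var current = result[i];
            result.AddRange(DependsOn.Where(p => p.Value == current).Select(p => p.Key));
        }
        return result;
    }
}
```

In this sketch, `RebuildSet("Assembly-CSharp")` yields Assembly-CSharp and Assembly-CSharp-Editor, which is the everyday case that makes every edit cost two builds.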

### How to replace a builtin compiler with a new one

At first, replacing the Mono compiler in the Unity directory with a new one was considered.
But that is not easy for users, and it is intrusive because it affects every project.
However, [alexzzzz](https://bitbucket.org/alexzzzz/unity-c-5.0-and-6.0-integration/src)
found a smart workaround, as follows:

```csharp
[InitializeOnLoad]
public static class CSharp60SupportActivator {
    static CSharp60SupportActivator() {
        // Swap Unity's builtin C# language entry for a custom one.
        var list = GetSupportedLanguages();
        list.RemoveAll(language => language is CSharpLanguage);
        list.Add(new CustomCSharpLanguage());
    }
    private static List<SupportedLanguage> GetSupportedLanguages() {
        // The list is a private static field, so it is read through reflection.
        var fieldInfo = typeof(ScriptCompilers).GetField("_supportedLanguages", BindingFlags.NonPublic | BindingFlags.Static);
        var languages = (List<SupportedLanguage>)fieldInfo.GetValue(null);
        return languages;
    }
}
```

These types are internal to Unity3D and cannot be accessed from external DLLs.
To get around this, he renamed the plugin DLL to the name of one of the internal friend DLLs,
`Unity.PureCSharpTests`.

### Roslyn

[Roslyn](https://github.com/dotnet/roslyn) is a new open-source project that provides
the C# and Visual Basic compilers. With this project it's really easy to use features
that a compiler can provide, such as parsing, analysis and even compilation itself.

With just [Microsoft.CodeAnalysis.CSharp](https://www.nuget.org/packages/Microsoft.CodeAnalysis/),
a compiler can be written without much work.

```csharp
// Minimal C# compiler
// Namespaces: System, System.IO, System.Linq, System.Reflection, Microsoft.CodeAnalysis, Microsoft.CodeAnalysis.CSharp
Assembly Build(string[] sourcePaths, string[] referencePaths, string[] defines) {
    var assemblyName = Path.GetRandomFileName();
    var parseOptions = CSharpParseOptions.Default.WithPreprocessorSymbols(defines);
    var syntaxTrees = sourcePaths.Select(file => CSharpSyntaxTree.ParseText(File.ReadAllText(file), parseOptions, file)).ToArray();
    var references = referencePaths.Select(file => MetadataReference.CreateFromFile(file)).ToArray();
    var compilation = CSharpCompilation.Create(assemblyName, syntaxTrees, references, new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));
    using (var ms = new MemoryStream()) {
        var result = compilation.Emit(ms);
        if (!result.Success)
            throw new InvalidOperationException(string.Join("\n", result.Diagnostics));
        return Assembly.Load(ms.ToArray());
    }
}
```

### Incremental compiler

From the API's point of view, the Roslyn C# compiler compiles in two steps.
The first is the parsing step: it loads the sources and parses them.
The second is the emitting step. From the compiler's point of view, most of the work is done here,
such as semantic analysis, optimization and code generation.
Because library users cannot access the internal phases of the emitting step,
the incremental compiler is written in a simple way:

Full compilation for the first build:
```csharp
var syntaxTrees = sourcePaths.Select(file => /* PARSE */);
var references = referencePaths.Select(file => /* LOAD */);
var compilation = CSharpCompilation.Create(assemblyName, syntaxTrees, references);
compilation.Emit(ms);
```

Incremental compilation for subsequent builds:
```csharp
compilation = compilation.RemoveReferences(oldLoadedReferences)
                         .AddReferences(load(newReference))
                         .RemoveSyntaxTrees(oldSourceSyntaxTrees)
                         .AddSyntaxTrees(parse(newSource));
compilation.Emit(ms);
```

By telling the compilation object what changed in the project, Roslyn can reuse
the pre-parsed syntax trees and, I hope, some other information as well.
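
To tell the compilation object what changed, the compiler first has to work out which files were added, changed or removed since the last build. The following is a minimal sketch of that bookkeeping, assuming files are compared by a modification stamp. `SourceChangeTracker` and its members are hypothetical names, not the tool's actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class SourceChangeTracker {
    // Path -> last seen modification stamp (for example a file's last write time).
    private readonly Dictionary<string, DateTime> _seen = new Dictionary<string, DateTime>();

    public sealed class Changes {
        public List<string> Added = new List<string>();
        public List<string> Changed = new List<string>();
        public List<string> Removed = new List<string>();
    }

    // Compares the current file list with the previous one and remembers the new state.
    public Changes Update(IDictionary<string, DateTime> current) {
        var changes = new Changes();
        foreach (var file in current) {
            DateTime old;
            if (!_seen.TryGetValue(file.Key, out old))
                changes.Added.Add(file.Key);
            else if (old != file.Value)
                changes.Changed.Add(file.Key);
        }
        changes.Removed.AddRange(_seen.Keys.Where(path => !current.ContainsKey(path)));
        _seen.Clear();
        foreach (var file in current)
            _seen[file.Key] = file.Value;
        return changes;
    }
}
```

With such a result, the syntax trees of `Removed` and `Changed` files would go to `RemoveSyntaxTrees`, and freshly parsed trees of `Added` and `Changed` files to `AddSyntaxTrees`. References could be tracked the same way.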

### Compile server

OK. Keeping the compilation object and reusing it makes an incremental compiler.
But where can we keep this object? Every time Unity3D wants to build a DLL, it
invokes the C# compiler.
The C# compiler runs for a while, exits and leaves the built DLL for Unity3D, which means
the compilation object is thrown away.

We have two options for keeping this object alive:

1. Save all information to disk and load it at the next invocation.
1. Make a compile server. Keep it alive while Unity3D is alive and
   forward every compile invocation to it.

Because the first one involves an IO-intensive process that slows things down,
and I don't know how to (de)serialize a Roslyn compilation object,
the second one was chosen.

When the incremental compiler is requested to compile an assembly,
it looks for a compile server process. If there is none, it spawns one.
The compiler forwards the compilation request to this server and waits for the result.
The compile server processes the request, returns the result to the requester and keeps
the intermediate object for subsequent requests.

To communicate between the compile client and server,
[WCF on Named pipe](https://msdn.microsoft.com/en-us/library/ms733769%28v=vs.110%29.aspx) is used
so that the tool could be written quickly, even though Mono doesn't support it.
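
The core idea of the server is that state survives between requests. The following sketch shows only that part, in process: the WCF named-pipe transport and the spawning of the server process are left out, and a plain counter stands in for the Roslyn compilation object. `CompileServerSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public sealed class CompileServerSketch {
    // Output assembly name -> state kept between requests.
    // In the real tool this would be the compilation object; a build counter stands in here.
    private readonly Dictionary<string, int> _buildCounts = new Dictionary<string, int>();

    // Handles one request and reports which kind of build it was.
    public string Build(string assemblyName) {
        int count;
        _buildCounts.TryGetValue(assemblyName, out count); // count stays 0 when nothing is kept yet
        _buildCounts[assemblyName] = count + 1;
        return count == 0 ? "full" : "incremental";
    }
}
```

Keeping one entry per output assembly means Assembly-CSharp and Assembly-CSharp-Editor each get their own kept state, so a request for one does not disturb the other.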

## Modification for Unity3D/Mono

### Reuse prebuilt DLLs

When sources in Assembly-CSharp are edited, Unity3D builds Assembly-CSharp.
At the same time Unity3D also builds Assembly-CSharp-Editor
because it depends on Assembly-CSharp. That is the natural build protocol.

But most of the time this is not useful.
If nothing has changed in the sources of Assembly-CSharp-Editor, it is considered
relatively safe to skip rebuilding Assembly-CSharp-Editor.

TODO: TEST & SHOW ERROR CASE
TODO: DETAILED DESCRIPTION OF OPTION

PrebuiltOutputReuse in IncrementalCompiler.xml

- WhenNoChange:
  When nothing changed in sources and references, reuse prebuilt results.
- WhenNoSourceChange:
  When nothing changed in sources and references (except update of some references),
  reuse prebuilt results.
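
The two options can be read as a small decision rule. The sketch below is one possible reading of the descriptions above, not the tool's actual implementation. The `None` value, the `ReusePolicy` class and the three boolean inputs are assumptions made for illustration, in particular the split between a changed reference list and an updated reference file.

```csharp
public enum PrebuiltOutputReuse {
    None,               // hypothetical "always rebuild" value, not taken from the document
    WhenNoChange,
    WhenNoSourceChange
}

public static class ReusePolicy {
    // sourcesChanged:          a source file was added, removed or edited
    // referenceListChanged:    a reference was added or removed
    // referenceContentUpdated: an existing reference was rebuilt or updated
    public static bool CanReuse(PrebuiltOutputReuse option, bool sourcesChanged,
                                bool referenceListChanged, bool referenceContentUpdated) {
        if (sourcesChanged || referenceListChanged)
            return false;
        switch (option) {
            case PrebuiltOutputReuse.WhenNoChange:
                return !referenceContentUpdated;
            case PrebuiltOutputReuse.WhenNoSourceChange:
                return true; // updated references are tolerated
            default:
                return false;
        }
    }
}
```

Under this reading, editing only Assembly-CSharp updates a reference of Assembly-CSharp-Editor without touching its sources, so `WhenNoSourceChange` would reuse the Editor DLL while `WhenNoChange` would rebuild it.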

### MDB instead of PDB

Roslyn emits a PDB file as debugging symbols. But Unity3D cannot understand PDB files
because it is based on the Mono compiler. For Unity3D to get proper debugging information,
an MDB file has to be constructed. A tool for converting PDB to MDB already exists, and
jbevain updated it to support the output of Visual Studio 2015: [pdb2mdb](https://gist.github.com/jbevain/ba23149da8369e4a966f)

So the simple process for supporting Unity3D is:

- Emit a PDB via Roslyn
- Convert the PDB to an MDB with the pdb2mdb tool

But how about emitting the MDB from Roslyn directly? It would save the time spent generating and converting the PDB.
A guy at Xamarin already tried it, but his work is no longer updated. So I grabbed it and updated it to work with the latest Roslyn.
184+
### Renaming symbol for UnityVS debugging
185+
186+
[UnityVS](https://www.visualstudio.com/features/unitytools-vs)
187+
does a lot of hard works to support Unity3D application debugging in Visual Studio.
188+
.NET Assembly is required to provide debugging information to debugger such as
189+
variable names, source information for IL code and etc but Visual Studio cannot
190+
understand Mono assembly well enough to support debugging.
191+
To deal with this problem, UnityVS examines assemblies carefully
192+
and making use of common pattern of mono DLLs by itself.
193+
194+
.NET assembly built with Roslyn, however, is different with one with Mono.
195+
Therefore UnityVS misses some information and cannot give enough debugging
196+
information to users.
197+
198+
For example,
199+
200+
```csharp
201+
IEnumerator TestCoroutine(int a, Func<int, string> b) {
202+
var v = a;
203+
yield return null;
204+
GetComponent<Text>().text = v.ToString();
205+
v += 1;
206+
yield return null; // Breakpoint here
207+
GetComponent<Text>().text = b(v);
208+
}
209+
```
210+
211+
UnityVS can show variable v in watch for DLL from Mono3.
212+
213+
```csharp
214+
this = "Text01 (Test01)"
215+
v = 11
216+
a = 10
217+
b = (trimmed)
218+
```
219+
220+
UnityVS cannot show variable v in watch for DLL from Roslyn.
221+
222+
```csharp
223+
this = {Test01+<TestCoroutine>d__1}
224+
Current = null
225+
a = 10
226+
b = (trimmed)
227+
```

Mono and Roslyn name the local variables hoisted into the iterator class differently.

```csharp
// Mono3
private sealed class <TestCoroutine>c__Iterator0 : IEnumerator<object>, IEnumerator, IDisposable {
    internal int a;
    internal int <v>__0;
    internal Func<int, string> b;
    internal Test01 <>f__this;
    // trimmed

// Roslyn
private sealed class <TestCoroutine>d__1 : IEnumerator<object>, IEnumerator, IDisposable {
    public int a;
    public Func<int, string> b;
    public Test01 <>4__this;
    private int <v>5__1;
    // trimmed
```

From the disassembled UnityVS DLL, we can see how UnityVS determines which variables
need to be watched while debugging:

```csharp
// SyntaxTree.VisualStudio.Unity.Debugger.Properties.CSharpGeneratorEnvironment
public static bool IsCSharpGenerator(Value thisValue) {
    TypeMirror typeMirror = thisValue.GetTypeMirror();
    return typeMirror.Name.StartsWith("<") && typeMirror.Name.Contains("__Iterator");
}
public override UnityProperty[] VariableProperties() {
    return (from f in base.Type.GetInstanceFields()
            where !f.Name.StartsWith("<>") && !f.Name.StartsWith("<$")
            where f.Name.StartsWith("<")
            where f.Name.Contains(">__")
            select f into field
            select base.Field(field, field.Name.Substring(1, field.Name.IndexOf(">__", 1, System.StringComparison.InvariantCulture) - 1))).ToArray<UnityProperty>();
}
```

UnityVS has the following rules for finding iterators:

- The name of the iterator class has the form `<.*__Iterator.*`
- The name of a local variable has the form `<.+>__?`
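
These rules can be tried against the names from the Mono 3 and Roslyn listings. The following is a small standalone sketch that mirrors the string checks in the disassembled code. `UnityVSNameRules` and its methods are names made up here, not part of UnityVS.

```csharp
using System;

public static class UnityVSNameRules {
    // Mirrors IsCSharpGenerator: the type name starts with "<" and contains "__Iterator".
    public static bool IsIteratorClass(string typeName) {
        return typeName.StartsWith("<", StringComparison.Ordinal) && typeName.Contains("__Iterator");
    }

    // Mirrors the filter in VariableProperties: returns the name shown in the
    // watch window for a hoisted local field, or null when the field is skipped.
    public static string WatchName(string fieldName) {
        if (fieldName.StartsWith("<>", StringComparison.Ordinal) || fieldName.StartsWith("<$", StringComparison.Ordinal))
            return null;
        if (!fieldName.StartsWith("<", StringComparison.Ordinal))
            return null;
        var end = fieldName.IndexOf(">__", 1, StringComparison.Ordinal);
        return end < 0 ? null : fieldName.Substring(1, end - 1);
    }
}
```

With these checks, Mono's `<TestCoroutine>c__Iterator0` and `<v>__0` are accepted, while Roslyn's `<TestCoroutine>d__1` and `<v>5__1` are not, which matches the missing `v` in the Roslyn watch output.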

The only thing to do is to rename the iterator class and its fields.
But how can we do it? There are two ways:

1. Rename them after building the DLL with vanilla Roslyn.
1. Build the DLL with a modified Roslyn that follows the UnityVS rules.

Because renaming after the build needs a time-consuming DLL analysis step,
the second one was chosen.

Here is the Roslyn source related to this naming:
```csharp
internal static string MakeHoistedLocalFieldName(SynthesizedLocalKind kind, int slotIndex, string localNameOpt = null) {
    var result = PooledStringBuilder.GetInstance();
    var builder = result.Builder;
    builder.Append('<');
    if (localNameOpt != null)
        builder.Append(localNameOpt);
    builder.Append('>');
    if (kind == SynthesizedLocalKind.LambdaDisplayClass)
        builder.Append((char)GeneratedNameKind.DisplayClassLocalOrField);
    else if (kind == SynthesizedLocalKind.UserDefined)
        builder.Append((char)GeneratedNameKind.HoistedLocalField); // '5'
    else
        builder.Append((char)GeneratedNameKind.HoistedSynthesizedLocalField);
    builder.Append("__");
    builder.Append(slotIndex + 1);
    return result.ToStringAndFree();
}
```

Removing `builder.Append((char)GeneratedNameKind.HoistedLocalField);` solves
the problem, and the Roslyn source is open and easy to modify.
But I don't want to maintain a modified Roslyn source tree because it's too big.
Instead of the Roslyn source, the Roslyn DLL is modified with Mono.Cecil like this:

```csharp
static void Fix_GeneratedNames_MakeHoistedLocalFieldName(MethodDefinition method) {
    var il = method.Body.GetILProcessor();
    for (var i = 0; i < il.Body.Instructions.Count; i++) {
        var inst = il.Body.Instructions[i];
        // Look for the instruction that loads 53 ('5', GeneratedNameKind.HoistedLocalField).
        if (inst.OpCode.Code == Code.Ldc_I4_S && (sbyte)inst.Operand == 53) {
            // Remove the four instructions from i - 1 to i + 2, last one first,
            // so that the lower indexes stay valid.
            for (int j = 0; j < 4; j++)
                il.Remove(il.Body.Instructions[i + 2 - j]);
            break;
        }
    }
}
```
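
The effect of the patch on a user-defined local can be checked with a simplified model of the naming function. This is only a sketch: it covers just the `SynthesizedLocalKind.UserDefined` branch, and `HoistedNameSketch.Make` with its `patched` flag is a hypothetical helper, not Roslyn code.

```csharp
using System.Text;

public static class HoistedNameSketch {
    // Builds the field name for a hoisted user-defined local.
    // patched = true models the DLL after the '5' append has been removed.
    public static string Make(string localName, int slotIndex, bool patched) {
        var builder = new StringBuilder();
        builder.Append('<').Append(localName).Append('>');
        if (!patched)
            builder.Append('5'); // GeneratedNameKind.HoistedLocalField
        builder.Append("__").Append(slotIndex + 1);
        return builder.ToString();
    }
}
```

For the local `v` in slot 0 this gives `<v>5__1` before the patch and `<v>__1` after it, and the patched name contains the `>__` marker that UnityVS looks for.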
