|
1 | 1 | ## Under the hood |
2 | 2 |
|
3 | | -TODO: Write! |
| 3 | +IncrementalCompiler for Unity3D was my fun project for the 2016 New Year's holiday. |
| 4 | +At that time I was thinking about the slow compilation speed of Unity3D's Mono and |
| 5 | +wondering whether it could be made faster with minimal effort. |
| 6 | +After a couple of hours of digging, it turned out to be feasible. |
| 7 | + |
| 8 | +## Making an incremental compiler |
| 9 | + |
| 10 | +Neither the Microsoft nor the Mono C# compiler supports incremental compilation so far. |
| 11 | +There was an "/incremental" option long ago, but it has since been dropped. |
| 12 | +Because the C# compiler is really fast and the C# language itself makes it easy to |
| 13 | +keep a compiler fast, incremental compilation has never been a top priority. |
| 14 | +The community also tends to keep projects small and to split a big project into smaller ones, |
| 15 | +which handles this problem indirectly. |
| 16 | + |
| 17 | +Unity3D uses the Mono C# infrastructure. It is hard to make the compiler work faster there |
| 18 | +because Unity3D still uses the old, slow Mono 3 and doesn't provide an easy way to divide |
| 19 | +a big project into small ones. So I decided to make an incremental compiler for Unity3D. |
4 | 20 |
|
5 | 21 | ### Unity build process |
6 | 22 |
|
| 23 | +Unity3D makes three projects from the Assets directory if your project has only C# sources. |
| 24 | + |
7 | 25 | - Assembly-CSharp-firstpass |
| 26 | + - Consists of scripts in the Plugins directory. |
| 27 | + - No dependencies. |
| 28 | + - This project is rarely rebuilt because these sources are seldom modified. |
8 | 29 | - Assembly-CSharp |
| 30 | + - Consists of almost all sources in Assets |
| 31 | + (except for sources in the Plugins and Editor directories). |
| 32 | + - Depends on Assembly-CSharp-firstpass |
| 33 | + - This project is almost always rebuilt because these sources are where most everyday work happens. |
9 | 34 | - Assembly-CSharp-Editor |
| 35 | + - Consists of scripts in the Editor directory. |
| 36 | + - Depends on Assembly-CSharp |
| 37 | + - This project is almost always rebuilt too, not because its sources are modified but because |
| 38 | + Assembly-CSharp, which it depends on, is rebuilt. |
| 39 | + |
| 40 | +[Unity Manual: Special Folders and Script Compilation Order](http://docs.unity3d.com/Manual/ScriptCompileOrderFolders.html) |
| 41 | + |
| 42 | +### How to replace the built-in compiler with a new one |
| 43 | + |
| 44 | +At first, I considered replacing the Mono compiler in the Unity directory with a new one. |
| 45 | +But that is not easy for users, and it is intrusive because it affects every project. |
| 46 | +However, [alexzzzz](https://bitbucket.org/alexzzzz/unity-c-5.0-and-6.0-integration/src) |
| 47 | +found a smart workaround: |
10 | 48 |
|
11 | | -### How to replace a builtin compiler with an incremental one. |
| 49 | +```csharp |
| 50 | +[InitializeOnLoad] |
| 51 | +public static class CSharp60SupportActivator { |
| 52 | + static CSharp60SupportActivator() { |
| 53 | + var list = GetSupportedLanguages(); |
| 54 | + list.RemoveAll(language => language is CSharpLanguage); |
| 55 | + list.Add(new CustomCSharpLanguage()); |
| 56 | + } |
| 57 | + private static List<SupportedLanguage> GetSupportedLanguages() { |
| 58 | + var fieldInfo = typeof(ScriptCompilers).GetField("_supportedLanguages", BindingFlags.NonPublic | BindingFlags.Static); |
| 59 | + var languages = (List<SupportedLanguage>)fieldInfo.GetValue(null); |
| 60 | + return languages; |
| 61 | + } |
| 62 | +} |
| 63 | +``` |
12 | 64 |
|
13 | | -TODO: Plugin & UniversalCompiler |
| 65 | +These types are internal to Unity3D and cannot be accessed from external DLLs. |
| 66 | +To get access anyway, he named the plugin DLL after one of the internal friend assemblies, |
| 67 | +`Unity.PureCSharpTests`. |
| 68 | + |
| 69 | +### Roslyn |
| 70 | + |
| 71 | +[Roslyn](https://github.com/dotnet/roslyn) is a new open-source project providing |
| 72 | +C# and Visual Basic compilers. With this project it's really easy to use features |
| 73 | +that a compiler can provide, such as parsing, analysis and even compilation itself. |
| 74 | + |
| 75 | +With just [Microsoft.CodeAnalysis.CSharp](https://www.nuget.org/packages/Microsoft.CodeAnalysis/), |
| 76 | +a compiler can be written without much work. |
| 77 | + |
| 78 | +```csharp |
| 79 | +// Minimal C# compiler: builds a DLL in memory and returns null if compilation fails |
| 80 | +Assembly Build(string[] sourcePaths, string[] referencePaths, string[] defines) { |
| 81 | +    var assemblyName = Path.GetRandomFileName(); |
| 82 | +    var syntaxTrees = sourcePaths.Select(file => CSharpSyntaxTree.ParseText(File.ReadAllText(file), CSharpParseOptions.Default.WithPreprocessorSymbols(defines), file)).ToArray(); |
| 83 | +    var references = referencePaths.Select(file => MetadataReference.CreateFromFile(file)).ToArray(); |
| 84 | +    var compilation = CSharpCompilation.Create(assemblyName, syntaxTrees, references, new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary)); |
| 85 | +    using (var ms = new MemoryStream()) |
| 86 | +        return compilation.Emit(ms).Success ? Assembly.Load(ms.ToArray()) : null; |
| 87 | +} |
| 88 | +``` |
14 | 89 |
|
15 | 90 | ### Incremental compiler |
16 | 91 |
|
17 | | -TODO: Roslyn, Server, MdbWriter, Reuse DLLs |
| 92 | +From the API's point of view, the Roslyn C# compiler compiles in two steps. |
| 93 | +The first is the parsing step: it loads the sources and parses them. |
| 94 | +The second is the emitting step. From the compiler's point of view, most of the work is done here, |
| 95 | +such as semantic analysis, optimization and code generation. |
| 96 | +Because library users cannot access the internal phases of the emitting step, |
| 97 | +the incremental compiler is written in a simple way: |
| 98 | + |
| 99 | +Full compilation for the first build: |
| 100 | +```csharp |
| 101 | +var syntaxTrees = sourcePaths.Select(file => /* PARSE */); |
| 102 | +var references = referencePaths.Select(file => /* LOAD */); |
| 103 | +var compilation = CSharpCompilation.Create(assemblyName, syntaxTrees, references); |
| 104 | +compilation.Emit(ms); |
| 105 | +``` |
| 106 | + |
| 107 | +Incremental compilation for subsequent builds: |
| 108 | +```csharp |
| 109 | +compilation = compilation.RemoveReferences(oldLoadedReferences) |
| 110 | +                         .AddReferences(load(newReference)) |
| 111 | +                         .RemoveSyntaxTrees(oldSourceSyntaxTrees) |
| 112 | +                         .AddSyntaxTrees(parse(newSource)); |
| 113 | +compilation.Emit(ms); |
| 114 | +``` |
| 115 | + |
| 116 | +By telling the compilation object what changed in the project, Roslyn can reuse |
| 117 | +the pre-parsed syntax trees and, I hope, some other cached information. |
| 118 | + |
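Before the compilation object can be told what changed, something has to work out which sources were added, modified or removed since the last build. The following is only a sketch of one way to do that, by comparing last-write timestamps. `SourceChangeTracker` and its members are hypothetical names for illustration and are not taken from the project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (an assumption, not the project's code): remembers the
// last-seen timestamp of every source file and reports what changed since then.
public sealed class SourceChangeTracker {
    private Dictionary<string, DateTime> _seen = new Dictionary<string, DateTime>();

    public sealed class Changes {
        public readonly List<string> Added = new List<string>();    // parse, then AddSyntaxTrees
        public readonly List<string> Modified = new List<string>(); // remove the old tree, add a re-parsed one
        public readonly List<string> Removed = new List<string>();  // RemoveSyntaxTrees
    }

    // 'current' maps each source path of this build to its last-write time.
    public Changes Update(IDictionary<string, DateTime> current) {
        var changes = new Changes();
        foreach (var pair in current) {
            DateTime old;
            if (!_seen.TryGetValue(pair.Key, out old))
                changes.Added.Add(pair.Key);
            else if (old != pair.Value)
                changes.Modified.Add(pair.Key);
        }
        changes.Removed.AddRange(_seen.Keys.Where(path => !current.ContainsKey(path)));
        _seen = new Dictionary<string, DateTime>(current); // snapshot for the next build
        return changes;
    }
}
```

In a real build the timestamps could come from `File.GetLastWriteTimeUtc`. On the first call every file is reported as added, which corresponds to the full compilation.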
| 119 | +### Compile server |
| 120 | + |
| 121 | +OK. Keeping the compilation object and reusing it gives us an incremental compiler. |
| 122 | +But where can we keep this object? Every time Unity3D wants to build a DLL, it |
| 123 | +invokes the C# compiler. |
| 124 | +The C# compiler runs for a while, exits and leaves the built DLL for Unity3D, which means |
| 125 | +the compilation object is thrown away. |
| 126 | + |
| 127 | +We have two options to keep this object alive: |
| 128 | + |
| 129 | + 1. Save all information to disk and load it at the next invocation. |
| 130 | + 1. Make a compile server. Keep it alive while Unity3D is alive and |
| 131 | + forward every compile invocation to it. |
| 132 | + |
| 133 | +Because the first one involves an IO-intensive process that slows the work down, |
| 134 | +and I don't know how to (de)serialize a Roslyn compilation object, |
| 135 | +the second one was chosen. |
| 136 | + |
| 137 | +When the incremental compiler is asked to compile an assembly, |
| 138 | +it looks for a compile server process. If there is none, it spawns one. |
| 139 | +The compiler forwards the compilation request to this server and waits for the result. |
| 140 | +The compile server processes the request, returns the result to the requester and keeps |
| 141 | +the compilation object for subsequent requests. |
| 142 | + |
| 143 | +To communicate between the compile client and server, |
| 144 | +[WCF on Named pipe](https://msdn.microsoft.com/en-us/library/ms733769%28v=vs.110%29.aspx) is used |
| 145 | +so that this tool could be built quickly, even though Mono doesn't support it. |
| 146 | + |
| 147 | +## Modifications for Unity3D/Mono |
| 148 | + |
| 149 | +### Reuse prebuilt DLLs |
| 150 | + |
| 151 | +When sources in Assembly-CSharp are edited, Unity3D rebuilds Assembly-CSharp. |
| 152 | +At the same time Unity3D also rebuilds Assembly-CSharp-Editor |
| 153 | +because it depends on Assembly-CSharp. That is the natural build protocol. |
| 154 | + |
| 155 | +But most of the time it is not useful. |
| 156 | +If nothing has changed in the sources of Assembly-CSharp-Editor, it is considered |
| 157 | +relatively safe to skip rebuilding Assembly-CSharp-Editor. |
| 158 | + |
| 159 | +TODO: TEST & SHOW ERROR CASE |
| 160 | +TODO: DETAILED DESCRIPTION OF OPTION |
| 161 | + |
| 162 | +The PrebuiltOutputReuse option in IncrementalCompiler.xml controls this behavior: |
| 163 | + |
| 164 | +- WhenNoChange : |
| 165 | + When nothing has changed in the sources or the references, reuse the prebuilt results. |
| 166 | +- WhenNoSourceChange : |
| 167 | + When nothing has changed in the sources or the references (except that some references were updated), |
| 168 | + reuse the prebuilt results. |
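The two option values can be read as a small decision function. The sketch below is my reading of the descriptions above, not the project's actual code. The `None` value, the `PrebuiltOutputPolicy` class and the three boolean inputs are assumptions introduced for illustration.

```csharp
using System;

// WhenNoChange and WhenNoSourceChange come from the list above.
// "None" (never reuse) is an assumed extra value for this sketch.
public enum PrebuiltOutputReuse { None, WhenNoChange, WhenNoSourceChange }

public static class PrebuiltOutputPolicy {
    // sourcesChanged:       any source file was added, removed or edited
    // referenceListChanged: the set of referenced DLLs is different
    // referencesUpdated:    same set of DLLs, but some were rebuilt
    public static bool CanReuse(PrebuiltOutputReuse mode, bool sourcesChanged,
                                bool referenceListChanged, bool referencesUpdated) {
        if (sourcesChanged || referenceListChanged)
            return false; // neither mode tolerates these changes
        switch (mode) {
            case PrebuiltOutputReuse.WhenNoChange:
                return !referencesUpdated; // an updated reference still forces a rebuild
            case PrebuiltOutputReuse.WhenNoSourceChange:
                return true;               // updated references are tolerated
            default:
                return false;
        }
    }
}
```

Under this reading, editing only Assembly-CSharp leaves Assembly-CSharp-Editor with unchanged sources and one updated reference, so `WhenNoSourceChange` skips its rebuild while `WhenNoChange` does not.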
| 169 | + |
| 170 | +### MDB instead of PDB |
| 171 | + |
| 172 | +Roslyn emits a PDB file as debugging symbols. But Unity3D cannot read PDB files |
| 173 | +because it's based on the Mono compiler. For Unity3D to get proper debugging information, |
| 174 | +an MDB file has to be produced. Mono already provides a tool that converts PDB to MDB, and |
| 175 | +jbevain updated it to support the output of Visual Studio 2015: [pdb2mdb](https://gist.github.com/jbevain/ba23149da8369e4a966f) |
| 176 | + |
| 177 | +So the simple process for supporting Unity3D is: |
| 178 | + - Emit a PDB via Roslyn |
| 179 | + - Convert the PDB to MDB with the pdb2mdb tool |
| 180 | + |
| 181 | +But how about emitting MDB from Roslyn directly? It would save the time spent generating and converting the PDB. |
| 182 | +A guy at Xamarin already tried this, but his work is no longer updated. So I took his work and updated it to work with the latest Roslyn. |
| 183 | + |
| 184 | +### Renaming symbol for UnityVS debugging |
| 185 | + |
| 186 | +[UnityVS](https://www.visualstudio.com/features/unitytools-vs) |
| 187 | +does a lot of hard work to support debugging Unity3D applications in Visual Studio. |
| 188 | +A .NET assembly has to provide the debugger with debugging information such as |
| 189 | +variable names and source locations for IL code, but Visual Studio cannot |
| 190 | +understand Mono assemblies well enough to support debugging. |
| 191 | +To deal with this problem, UnityVS examines assemblies carefully |
| 192 | +and relies on common patterns found in Mono-built DLLs. |
| 193 | + |
| 194 | +A .NET assembly built with Roslyn, however, is different from one built with Mono. |
| 195 | +Therefore UnityVS misses some information and cannot give enough debugging |
| 196 | +information to users. |
| 197 | + |
| 198 | +For example, |
| 199 | + |
| 200 | +```csharp |
| 201 | +IEnumerator TestCoroutine(int a, Func<int, string> b) { |
| 202 | + var v = a; |
| 203 | + yield return null; |
| 204 | + GetComponent<Text>().text = v.ToString(); |
| 205 | + v += 1; |
| 206 | + yield return null; // Breakpoint here |
| 207 | + GetComponent<Text>().text = b(v); |
| 208 | +} |
| 209 | +``` |
| 210 | + |
| 211 | +UnityVS can show variable v in the watch window for a DLL built with Mono 3. |
| 212 | + |
| 213 | +```csharp |
| 214 | +this = "Text01 (Test01)" |
| 215 | +v = 11 |
| 216 | +a = 10 |
| 217 | +b = (trimmed) |
| 218 | +``` |
| 219 | + |
| 220 | +UnityVS cannot show variable v in the watch window for a DLL built with Roslyn. |
| 221 | + |
| 222 | +```csharp |
| 223 | +this = {Test01+<TestCoroutine>d__1} |
| 224 | +Current = null |
| 225 | +a = 10 |
| 226 | +b = (trimmed) |
| 227 | +``` |
| 228 | + |
| 229 | +Mono and Roslyn differ in how they name local variables in the iterator class. |
| 230 | + |
| 231 | +```csharp |
| 232 | +// Mono3 |
| 233 | +private sealed class <TestCoroutine>c__Iterator0 : IEnumerator<object>, IEnumerator, IDisposable { |
| 234 | + internal int a; |
| 235 | + internal int <v>__0; |
| 236 | + internal Func<int, string> b; |
| 237 | + internal Test01 <>f__this; |
| 238 | + // trimmed |
| 239 | + |
| 240 | +// Roslyn |
| 241 | +private sealed class <TestCoroutine>d__1 : IEnumerator<object>, IEnumerator, IDisposable { |
| 242 | + public int a; |
| 243 | + public Func<int, string> b; |
| 244 | + public Test01 <>4__this; |
| 245 | + private int <v>5__1; |
| 246 | + // trimmed |
| 247 | +``` |
| 248 | + |
| 249 | +From the disassembled UnityVS DLL, we can see how UnityVS determines which variables |
| 250 | +should be shown in the watch window while debugging: |
| 251 | + |
| 252 | +```csharp |
| 253 | +// SyntaxTree.VisualStudio.Unity.Debugger.Properties.CSharpGeneratorEnvironment |
| 254 | +public static bool IsCSharpGenerator(Value thisValue) { |
| 255 | + TypeMirror typeMirror = thisValue.GetTypeMirror(); |
| 256 | + return typeMirror.Name.StartsWith("<") && typeMirror.Name.Contains("__Iterator"); |
| 257 | +} |
| 258 | +public override UnityProperty[] VariableProperties() { |
| 259 | + return (from f in base.Type.GetInstanceFields() |
| 260 | + where !f.Name.StartsWith("<>") && !f.Name.StartsWith("<$") |
| 261 | + where f.Name.StartsWith("<") |
| 262 | + where f.Name.Contains(">__") |
| 263 | + select f into field |
| 264 | + select base.Field(field, field.Name.Substring(1, field.Name.IndexOf(">__", 1, System.StringComparison.InvariantCulture) - 1))).ToArray<UnityProperty>(); |
| 265 | +} |
| 266 | +``` |
| 267 | + |
| 268 | +UnityVS uses the following rules when looking for an iterator: |
| 269 | + |
| 270 | + - The name of the iterator class has the form <.\*__Iterator.\* |
| 271 | + - The name of a local variable has the form <.+>__.\* |
| 272 | + |
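To see how these rules apply to the names shown earlier, here is a small sketch that mirrors the two disassembled checks with plain string operations. `UnityVsNameRules` and its method names are made up for illustration. Only the string tests themselves come from the disassembly above.

```csharp
using System;

// Illustrative re-statement of the UnityVS checks; not UnityVS code.
public static class UnityVsNameRules {
    // Same test as IsCSharpGenerator: "<...__Iterator..."
    public static bool IsIteratorClass(string typeName) {
        return typeName.StartsWith("<", StringComparison.Ordinal)
            && typeName.Contains("__Iterator");
    }

    // Same filter as VariableProperties: "<name>__..." but not "<>..." or "<$..."
    public static bool IsWatchableLocal(string fieldName) {
        return !fieldName.StartsWith("<>", StringComparison.Ordinal)
            && !fieldName.StartsWith("<$", StringComparison.Ordinal)
            && fieldName.StartsWith("<", StringComparison.Ordinal)
            && fieldName.Contains(">__");
    }

    // Extracts the display name between '<' and ">__", or null if the field is filtered out.
    public static string WatchName(string fieldName) {
        if (!IsWatchableLocal(fieldName)) return null;
        var end = fieldName.IndexOf(">__", 1, StringComparison.Ordinal);
        return fieldName.Substring(1, end - 1);
    }
}

// IsIteratorClass("<TestCoroutine>c__Iterator0") -> true   (Mono 3)
// IsIteratorClass("<TestCoroutine>d__1")         -> false  (Roslyn)
// WatchName("<v>__0")                            -> "v"    (Mono 3)
// WatchName("<v>5__1")                           -> null   (Roslyn: ">5__" is not ">__")
```

The extra `5` that Roslyn puts between `>` and `__` is enough to make the field fail the `Contains(">__")` test, which is why `v` disappears from the watch window.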
| 273 | +The only thing to do is to rename the iterator class and its field variables. |
| 274 | +But how can we do it? There are two ways: |
| 275 | + |
| 276 | + 1. Rename them after building the DLL with vanilla Roslyn. |
| 277 | + 1. Build the DLL with a modified Roslyn that follows the UnityVS rules. |
| 278 | + |
| 279 | +Because renaming after the build needs a time-consuming DLL analysis step, |
| 280 | +the second one was chosen. |
| 281 | + |
| 282 | +Here is the Roslyn source related to this naming: |
| 283 | +```csharp |
| 284 | +internal static string MakeHoistedLocalFieldName(SynthesizedLocalKind kind, int slotIndex, string localNameOpt = null) { |
| 285 | + var result = PooledStringBuilder.GetInstance(); |
| 286 | + var builder = result.Builder; |
| 287 | + builder.Append('<'); |
| 288 | + if (localNameOpt != null) |
| 289 | + builder.Append(localNameOpt); |
| 290 | + builder.Append('>'); |
| 291 | + if (kind == SynthesizedLocalKind.LambdaDisplayClass) |
| 292 | + builder.Append((char)GeneratedNameKind.DisplayClassLocalOrField); |
| 293 | + else if (kind == SynthesizedLocalKind.UserDefined) |
| 294 | + builder.Append((char)GeneratedNameKind.HoistedLocalField); // '5' |
| 295 | + else |
| 296 | + builder.Append((char)GeneratedNameKind.HoistedSynthesizedLocalField); |
| 297 | + builder.Append("__"); |
| 298 | + builder.Append(slotIndex + 1); |
| 299 | + return result.ToStringAndFree(); |
| 300 | +} |
| 301 | +``` |
| 302 | + |
| 303 | +Removing `builder.Append((char)GeneratedNameKind.HoistedLocalField);` solves |
| 304 | +the problem, and the Roslyn source is open and easy to modify. |
| 305 | +But I don't want to maintain a modified Roslyn source tree because it's too big. |
| 306 | +Instead of the Roslyn source, the Roslyn DLL is patched with Mono.Cecil: |
| 307 | + |
| 308 | +```csharp |
| 309 | +static void Fix_GeneratedNames_MakeHoistedLocalFieldName(MethodDefinition method) { |
| 310 | + var il = method.Body.GetILProcessor(); |
| 311 | + for (var i = 0; i < il.Body.Instructions.Count; i++) { |
| 312 | + var inst = il.Body.Instructions[i]; |
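| 313 | + if (inst.OpCode.Code == Code.Ldc_I4_S && (sbyte)inst.Operand == 53) { // 53 == '5' (GeneratedNameKind.HoistedLocalField) |
| 314 | + for (int j = 0; j < 4; j++) // remove instructions i+2 down to i-1, i.e. the Append('5') statement |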
| 315 | + il.Remove(il.Body.Instructions[i + 2 - j]); |
| 316 | + break; |
| 317 | + } |
| 318 | + } |
| 319 | +} |
| 320 | +``` |