CoreCLR - 114 projects with mammoth # of cyclic dependencies into projects' global namespace "dumping grounds" #81413
Replies: 8 comments 19 replies
-
@craigajohnson There aren't any restrictions per se. We always appreciate community contributions that help improve the code base. The unmanaged portion of the code is very old in parts and has disparate conventions that can make it very difficult to understand. The big breakdown is three areas - JIT, GC, and the VM. The JIT and GC are largely internally consistent and have clear contracts with the VM. The VM itself is indeed a "big ball of mud". Namespaces are a great tool, but can create chaos with massive refactors for little practical benefit - readability is important and so is ensuring non-runtime developers can contribute, but the majority of development is happening in C#. We are continually pushing to share more code with the other runtimes, NativeAOT and mono, so C# is the preferred sharing solution. Sharing is also possible in pure C, but that requires a lot more justification and finesse to implement correctly.
That said, understanding the short, medium and long term goals for a refactor is key. Once those are understood, the work should be incremental and ideally tracked in a larger issue with nice checkboxes - example 1 example 2. As a suggestion, I would start with a small project and experiment with the impact and design. It is much easier to champion a prototype/experiment than a general question of "am I allowed to make this code more readable?". We look forward to all contributions.
-
@AaronRobinsonMSFT and @jkotas - that makes rational sense, thank you. And #78852 makes sense why you would value that as a refactor. The sheer scope of unpicking the cyclic dependencies (EDIT - At the namespace level) so there is a one-way flow may make it challenging to justify the effort. My first exploration would be to just experiment with finding some dependency leaf nodes and seeing what the impact might be to "layerize" them. If things look promising, I would build out a larger comprehensive checklist, just wanting to get a sense of whether this type of effort would be useful. I think the value add in any of this effort would not be functionality per se but rather improved intelligibility of some very complex processes, ease of maintenance, ease of layer replacement, etc. For instance, even with the VM <-> JIT and VM <-> GC interfaces in place, there are still static calls back into the VM and other higher order layers. I imagine that would make efforts like migrating various bits from unmanaged to managed more of a challenge?
Here's a fun example (see diagram) - the Thread class takes a dependency on EEContract, which makes logical sense. However, EEContract -also- takes a dependency on Thread. Ideally, EEContract would be near the end of the dependency chain and not be reaching back to Thread?
If I take ECMA-335 itself as a guide, there seems to be some natural cut-points with the primitives there and it would be excellent if the code represented all of that in a super clean way. Mapping the extremely well-defined concepts there back to the enmeshed soup is a bit of a grok.
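To make the Thread / EEContract point concrete, here is a minimal sketch of one way such a cycle could be cut with a small interface owned by the lower layer. Every name in it (IContractContext, ContractChecker, this Thread class, IsGCAllowed) is hypothetical and only illustrates the shape of the idea; it is not the actual CoreCLR code and makes no claim about why the real EEContract reaches back to Thread.

```cpp
// Hypothetical names throughout; none of this is actual CoreCLR code.

// Lowest layer: the contract checker only knows this small interface.
struct IContractContext {
    virtual ~IContractContext() = default;
    virtual bool IsGCAllowed() const = 0;
};

// Low layer: depends on IContractContext only, never on Thread.
class ContractChecker {
public:
    explicit ContractChecker(const IContractContext& ctx) : ctx_(ctx) {}

    // Returns true when a "may trigger GC" contract holds in this context.
    bool CheckGCTriggers() const { return ctx_.IsGCAllowed(); }

private:
    const IContractContext& ctx_;
};

// Higher layer: Thread implements the interface and uses the checker,
// so the dependency arrow only points downward.
class Thread : public IContractContext {
public:
    void SetGCAllowed(bool allowed) { gcAllowed_ = allowed; }
    bool IsGCAllowed() const override { return gcAllowed_; }

    // Validates the contract against this thread's own state.
    bool RunWithContract() const {
        ContractChecker checker(*this);
        return checker.CheckGCTriggers();
    }

private:
    bool gcAllowed_ = true;
};
```

With this shape the checker could sit near the end of the dependency chain, and anything that can answer the interface's questions (a Thread, a test double) can be handed to it.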
-
Another example - gcHelpers->AllocateObject. This appears to be super-important but it breaks -so- many rules :) The larger effort would probably be to build out a context and have all of the internal services attached to it with all of the encapsulation guarantees one would expect, eliminating as much as possible the global static helpers/utils/cyclic dependency intricacies. But that would be a super massive undertaking and everyone would hate the merge churn. As it stands though there must be perpetual concern of insidious bugs happening in these lower levels. Spooky stuff :-/
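A rough sketch of the "context with services attached" idea, under the assumption that a helper which is a global static today could instead be reached through an explicitly passed root object. RuntimeContext, Allocator and this AllocateObject signature are made up for illustration and are not CoreCLR APIs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch; names are illustrative, not CoreCLR APIs.

// A service that would otherwise be a global static helper.
class Allocator {
public:
    // Hands out a zero-initialized buffer and tracks total bytes requested.
    std::vector<unsigned char> Allocate(std::size_t size) {
        bytesAllocated_ += size;
        return std::vector<unsigned char>(size);
    }

    std::size_t BytesAllocated() const { return bytesAllocated_; }

private:
    std::size_t bytesAllocated_ = 0;
};

// Root context: owns its services, so two instances share no state.
class RuntimeContext {
public:
    Allocator& GetAllocator() { return allocator_; }

private:
    Allocator allocator_;
};

// Callers receive the context explicitly instead of reaching for a global.
// Returns the size of the buffer that was allocated.
inline std::size_t AllocateObject(RuntimeContext& ctx, std::size_t size) {
    return ctx.GetAllocator().Allocate(size).size();
}
```

Because the state hangs off the instance, two contexts in one process do not see each other's allocations, which is the encapsulation guarantee the global helpers cannot give.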
-
@jkotas am I hallucinating, or do the NativeAOT stuff, [UnmanagedCallersOnly], and the "Isolated VM" POC concept above together mean there would be no reason I couldn't write my own corerunner in C# itself, host an isolated VM "instance", and from there begin to hybridize out the C++ stuff and move more and more into a managed environment? Meta dogfooding: eat its own dogfood to make more dogfood to then eat the new dogfood.
IsolatedVM POC instance
The IsolatedVM would be the root context, and we break free of the static soup and instead instantiate/isolate the subsystems
Monkeying in process...
-
*** Isolated VM - Throwaway Monkeying *** Naive Flailing
Naive Monkeying - "pal" is static global state with conditional typedefs, heaven help us all
Naive Flailing - Templated IsolatedVM class, move pal here as first attempt
-
@jkotas is this good perf for a dev workstation for tests in debug configuration?
-
From https://github.com/dotnet/designs/blob/main/accepted/2020/form-factors.md:
Gotta say - From initial perusal of this repo, I completely and totally disagree with this and would consider the above almost strangely defeatist. I think there are reasonable ways to build this out, but there would be important prerequisites, the most important being (1) isolation in place of statics; and (2) continuing the good work that started with GC and JIT, building out all interop between subsystems arising from an instantiated root. From there, the runtime can be hybridized safely, retaining the massive implementations of key subsystems where they are right now, but allowing for migrations/cleanups/ports/replacements as needed. This could go on as long as necessary, with little disruption.
-
@craigajohnson If your goal is to understand the components of a .NET runtime, I've found the CoreCLR NativeAOT runtime to be much more approachable. It shares the GC and the JIT components with CoreCLR (albeit the JIT is used as an ahead-of-time compiler). But the native runtime part is much smaller and simpler. The layering of the runtime is easier to understand as well. Components are roughly, from lowest level to highest level:
-
Just wrapping my head around the CoreCLR source and the various interdependencies. I am awestruck.
Patient step-debugging (native and managed) along with reading ECMA-335 has yielded at least an initial picture of how the major components/features hang together.
Next step - I used CppDepend (like NDepend - glorious tool) which highlighted a few things:
Observations:
Ideally, there would be a "left to right" flow of higher-order namespaces dependent on lower-order namespaces without the lower-order flowing back to any higher-order dependencies. Based on the big ball of mud / Singleton ball of goo as it stands, this refactor would be somewhat of a gnarly beast.
Are there restrictions on PR contributions on these types of more structural mods? I assume yes.
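For what it's worth, the "left to right" property above can be checked mechanically once the namespace-level dependencies are exported from a tool. Below is a minimal sketch using Kahn's topological sort; the DependencyGraph type and the LayerOrder function are my own hypothetical names, and the graph contents would have to come from the real dependency data.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Maps each namespace to the namespaces it depends on (hypothetical type).
using DependencyGraph = std::map<std::string, std::set<std::string>>;

// Returns namespaces ordered lowest layer first (Kahn's algorithm),
// or an empty vector if the graph contains a cycle.
std::vector<std::string> LayerOrder(const DependencyGraph& graph) {
    std::map<std::string, std::size_t> pending;  // unresolved dependency counts
    std::map<std::string, std::vector<std::string>> dependents;  // reverse edges

    for (const auto& entry : graph) {
        pending[entry.first] += entry.second.size();
        for (const auto& dep : entry.second) {
            pending[dep] += 0;  // make sure leaf-only namespaces get an entry
            dependents[dep].push_back(entry.first);
        }
    }

    // Start from namespaces that depend on nothing.
    std::queue<std::string> ready;
    for (const auto& entry : pending) {
        if (entry.second == 0) ready.push(entry.first);
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::string current = ready.front();
        ready.pop();
        order.push_back(current);
        for (const auto& user : dependents[current]) {
            if (--pending[user] == 0) ready.push(user);
        }
    }

    // Anything left unresolved sits on a cycle, so no clean layering exists.
    if (order.size() != pending.size()) order.clear();
    return order;
}
```

A non-empty result is one valid lowest-to-highest layering; an empty result means at least one lower-order namespace flows back into a higher-order one, as in the Thread and EEContract case.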