PICMI Species handling refactoring #5578
Conversation
Force-pushed 08d0708 to 854c131
@PrometheusPi @pordyna @ikbuibui Should be ready for full review now. Please note (and read!) the updated description.
pordyna left a comment:
Partial review; I just went over the files in my IDE, as suggested. I still need to look deeper into it.
```python
append_this = True
for rhs in result:
    check_for_conflict(lhs, rhs)
    merge_success = try_update_with(rhs, lhs)
```
What if try_update_with updates rhs into a state that changes the outcome of check_for_conflict that was already checked?
Interesting question. Let's make an assumption to start off: `try_update_with` does not create new conflicts by itself, i.e., any conflict that the updated `rhs` has was already present in the original `rhs` or in `lhs`. We can revisit that assumption later.
Control flow:
- Starting point: `result` contains a partial list and (the instance that in your question will be) `rhs` is supposed to be added to `result`.
- `rhs` checks for conflicts with everything already existing in `result` and doesn't find any.
- `result` gets filled up further and every new entry will check for conflicts with the already added `rhs`.
- `lhs` is supposed to be added and checks for conflicts with everything in `result` up to `rhs`, where it is merged.
- `result` is filled up to completion while every new addition checks against updated-`rhs` for conflicts.
- Finally, we arrive at `result = [..., updated-rhs, ..., <position where lhs would have been added>, ...]`.
This creates the following checking situation:
- `rhs` was checked against everything because
  - original-`rhs` was checked against everything left of the would-be-`lhs` position,
  - updated-`rhs` was checked against everything right of updated-`rhs`, and
  - we assumed that `try_update_with` doesn't create new conflicts.
- original-`lhs` was checked against everything left of updated-`rhs` and right of the would-be-`lhs` position because
  - it was checked against everything left of updated-`rhs` before merging with `rhs`, and
  - updated-`rhs` was checked against everything right of the would-be-`lhs` position.
This leaves a hole between updated-`rhs` and the would-be-`lhs` position that `lhs` was never checked against. This hole could be closed by removing the short-circuiting.
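For concreteness, here is a minimal sketch of the loop under discussion (only `check_for_conflict`, `try_update_with`, and the loop skeleton come from the snippet above; the stub bodies and the surrounding function are assumed for illustration):

```python
def check_for_conflict(lhs, rhs):
    # Stub: the real implementation fails early with a clear error
    # message on conflicting requirements.
    pass


def try_update_with(rhs, lhs):
    # Stub: the real implementation merges lhs into rhs in place and
    # reports whether the merge succeeded.
    return False


def _make_unique(requirements):
    result = []
    for lhs in requirements:
        append_this = True
        for rhs in result:
            check_for_conflict(lhs, rhs)
            merge_success = try_update_with(rhs, lhs)
            if merge_success:
                # The short-circuiting in question: entries of `result`
                # to the right of this `rhs` are never checked against
                # `lhs`, i.e. the hole described above.
                append_this = False
                break
        if append_this:
            result.append(lhs)
    return result
```

Removing the `break` (continuing the conflict checks even after a successful merge) would close the hole at the cost of a few extra checks.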
Is that assumption reasonable? We can just make it a requirement and revisit it if that turns out wrong. Checking for conflicts is anyway more of a convenience feature: if there are conflicts, it would typically blow up at some point. The main functionality of `check_for_conflict` is to fail early with a clear error message and not something cryptic about ambiguous linking or the like during compilation.
What are your thoughts about this?
Yes, I'm really not happy about this function. It's a difficult protocol.
A different approach would be to maintain a list of possible actions, conflicts, etc. that we apply to the requirements:
```python
# Each of these returns a new list.
# The agreement is that they act on all pertinent entries
# and leave the rest as-is.
_RESOLUTION_ACTIONS = [
    # Checks for conflicts before dropping:
    DropDuplicateConstantsAndAttributes,
    MergeSimpleDensities,
    MergeIonizationModels,
    ...
]


# I'd probably rename this to resolve_requirements or the like
# but for the sake of consistency I'm sticking with _make_unique for now:
def _make_unique(requirements, actions=_RESOLUTION_ACTIONS):
    if len(actions) == 0:
        return requirements
    return _make_unique(actions[0](requirements), actions[1:])
```

Advantages:
- With this approach it is certainly easier to understand what actions are performed to resolve the requirements.
- The process is stateless / the style is functional, which I like.
Disadvantages:
- Customisation of this pipeline would be centralised, i.e., if anybody wants to add a new action they'd have to hack something into this list. (Also, we've got a mutable default argument which is always dangerous.)
- I don't like relying on a global constant.
- One could instead hard-code the actions one-after-another into the function body but that's even less customisable.
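For illustration, one entry of that list might look like this (the name is taken from the list above; the body is entirely assumed):

```python
def DropDuplicateConstantsAndAttributes(requirements):
    # Acts on all pertinent entries, leaves the rest as-is, and
    # returns a new list instead of mutating the input.
    result = []
    for req in requirements:
        if req in result:
            # A check_for_conflict-style early failure could go here
            # before silently dropping the duplicate.
            continue
        result.append(req)
    return result
```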
Further designs with the spirit of shifting responsibilities to an earlier stage:
- `Species.register_requirements` could already run the checking and the merging, etc. That would look similar to one of the above but it would be one loop each time a new element is registered instead of one big N^2 loop at the end. That doesn't sound like a significant gain but would already simplify things a little.
- We could expose the `Species._requirements` variable and allow registration via direct modification of that. So, the registering class is responsible for keeping requirements in a consistent state.
- `Species.register_requirements` could let the freshly registered requirement handle this:
```python
# trait to allow custom actions to be performed and provide some defaults otherwise:
def resolve(new_req, requirements):
    if hasattr(new_req, "resolve"):
        return new_req.resolve(requirements)
    if isinstance(new_req, Constant):
        if not is_unique_in(new_req, Constant):
            raise ValueError(...)
        return requirements + [new_req]
    ...


def register_requirements(self, new_requirements):
    for req in new_requirements:
        self.requirements = resolve(req, self.requirements)
```
@pordyna @PrometheusPi You might want to add opinions on this design decision.
I'll look it up in the GoF again at home.
I'll take a closer look. As an aside, for some interesting reading: this problem reminds me of this paper.
Can the check_for_conflict(lhs, rhs) simply be moved into a separate loop after the merging is done with try_update_with(rhs, lhs)? Is the check before merging somehow different from after merging? Is one check at the end not sufficient?
PrometheusPi left a comment:
There are some minor requests for changes and several questions, some of which might indicate further changes.
I did not yet review:

- lib/python/picongpu/pypicongpu/simulation.py
- lib/python/picongpu/picmi/species_requirements.py
- lib/python/picongpu/picmi/species.py
- lib/python/picongpu/picmi/simulation.py
and will do this (hopefully) tomorrow.
```python
return (
    isinstance(other, SimpleDensityOperation)
    and other.metadata.kwargs["profile"] == self.metadata.kwargs["profile"]
    and other.metadata.kwargs["layout"] == self.metadata.kwargs["layout"]
    and (self.metadata.kwargs["species"].extend(other.metadata.kwargs["species"]) or True)
)
```
I really find this syntax confusing every time I see it.
It is admittedly a tiny bit code-golfy that there is a mutating operation in there. Would you be more comfortable with an explicit form?
Suggested change:

```python
if (
    isinstance(other, SimpleDensityOperation)
    and other.metadata.kwargs["profile"] == self.metadata.kwargs["profile"]
    and other.metadata.kwargs["layout"] == self.metadata.kwargs["layout"]
):
    self.metadata.kwargs["species"].extend(other.metadata.kwargs["species"])
    return True
return False
```
@ikbuibui what is your opinion on @chillenzer's suggestion?
```python
picmistandard.PICMI_Simulation.__init__(self, **keyword_arguments)

if picongpu_typical_ppc is not None and picongpu_typical_ppc <= 0:
    raise ValueError(f"Typical ppc should be > 0, not {self.picongpu_typical_ppc=}.")
```
Does `PICMI_Simulation.__init__` modify `picongpu_typical_ppc`? Otherwise, maybe the check should be run before the base class init.
No.
Fine with me. Do you have a particular reason?
Done
```python
mi, ma = reduce(lambda lhs, rhs: (min(lhs[0], rhs), max(lhs[1], rhs)), iterable, (1000, 0))
return (ma - mi) // 2 + mi
```
What is the (1000, 0)? And why the integer division?
(1000, 0) is the neutral element of the minmax operation on the interval [0, 1000]. You might have a follow-up question like: "Why on earth would you want to put 'the neutral element of the minmax operation on the interval [0, 1000]' in there?" To which I reply: "Okay, fine! I'll choose a less arbitrary range like (-np.inf, np.inf)." On a more serious note: this is currently used for typical ppc, so the values under consideration are on the order of 1-10 and they are all integers. The latter made me try not to reach for floats here.

Integer division because I want to have an integer. Maybe I should rename that function.
@ikbuibui What is your opinion on @chillenzer's reply.
And @chillenzer what is the status of your "Maybe I should rename that function."?
I've removed the need for a neutral element by using the first element as the starting point.
I've made the function "_private" and added a small docstring clarifying the intent.
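For reference, the fixed version might look roughly like this (the name `_midpoint` and the exact shape are assumptions, not the PR's actual code):

```python
from functools import reduce


def _midpoint(iterable):
    """Integer midpoint between the minimum and maximum of `iterable`."""
    # Seeding the fold with the first element removes the need for an
    # arbitrary neutral element like (1000, 0).
    first, *rest = iterable
    mi, ma = reduce(lambda acc, x: (min(acc[0], x), max(acc[1], x)), rest, (first, first))
    return (ma - mi) // 2 + mi
```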
Force-pushed aedc4d4 to ff535dd
Switched to eager requirement resolution and added some not-at-all polished tests. We'll have to get back to this and properly reconsider the internal protocol. I think the existing code has sufficient encapsulation to not really affect other changes. Check if you're happy now!
PrometheusPi left a comment:
Some questions that need clarification. Looks otherwise good.
Merged e48b5d8 into ComputationalRadiationPhysics:dev
Species Refactoring
This PR is a complete rewrite of our `picmi.Species` class and the related protocols. It is somewhat related to #5500, although not quite aligned with the original vision of how that's going to proceed. This PR removes PyPIConGPU as an officially supported interface and removes all pypicongpu unit tests.

New Species protocol
The previous protocol of how species were handled in PyPIConGPU/PICMI had several significant drawbacks. The two main ones are:

- an `InitManager` instance that could not be re-used after having been "baked" once.

The main problem that this construction tried to address is the fact that an inversion of some dependencies happens from the PICMI level to the PyPIConGPU level. For example, on the PICMI level the `Species` contains the `initial_distribution`, but on the PyPIConGPU level we have a `SimpleDensity` operation that contains a (list of) `Species`. Also, various entities in the code impose requirements onto the `Species` that it must be informed about.

The new species protocol is as follows:

- `picmi.Species` provides a public `register_requirements` method. Consumers of the species are allowed to call this before any `get_as_pypicongpu()` calls to register requirements such as "you need to have a mass (optionally: of value = ...)". They should typically do so in their `__init__`.
- Requirements can be plain PyPIConGPU objects (only `Constant`s and `Attribute`s are considered currently) or a `DelayedConstruction` of any PyPIConGPU class that cannot be instantiated directly. This typically happens when that class depends on a species itself.
- A requirement can also be a `DependsOn(species=...)` instance which (hopefully unsurprisingly) signals a dependency on another species (e.g. ion species register `DependsOn(species=electrons)` for the electrons which they emit upon ionisation).
- At the time of `Simulation.get_as_pypicongpu` all information is available by definition and a partial ordering is defined via `DependsOn` (cyclic dependencies are currently not checked). The translation can call `get_as_pypicongpu` on any entity, including the `Species` instances. This call is stateless, i.e., it can be repeated and the order of execution doesn't matter. Thus any entity can independently construct any PyPIConGPU class instances it needs.
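To make the protocol concrete, a consumer could look roughly like this (the class `FieldIonization` and its arguments are invented for illustration; `register_requirements` and `DependsOn` are the names described above, import paths omitted):

```python
class FieldIonization:
    def __init__(self, ion_species, electron_species):
        # Register requirements before any get_as_pypicongpu() call,
        # typically right here in __init__:
        ion_species.register_requirements(
            [
                # The ions depend on the electrons they emit upon ionisation:
                DependsOn(species=electron_species),
            ]
        )
```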
As a side remark, one should note that it is undecided as of now if `register_requirements` should be considered public interface. It is designed as an implementation detail of our internal species handling. But it might provide an unintended resolution to some feature requests from @pordyna and @PrometheusPi. Their experience with the new protocol will show if the user experience is sufficiently intuitive to support its public usage. It is unlikely to make it into the PICMI standard.

For the documentation sprint:
Dropping PyPIConGPU as an officially supported interface
I'm dropping PyPIConGPU as an officially supported interface. Its design goal was to be a Python representation of the input parameters of PIConGPU. It currently fails to meet this goal for no apparent reason. But it seems quite natural to move more in that direction. This would also resolve some of the above-mentioned unnecessary inter-species relationships. As we are also moving to Pydantic, this layer will literally contain no logic in itself, just a plain representation of data. In that state it will need no testing (beyond checking for typos, which we can already do through PICMI). Once we have reached this state, it will be trivial to add this as an official interface again. For now, supporting, testing, and thereby fixing its interface is a maintenance burden because there are thousands of lines of code that need adaptation to the new interface. I have removed all PyPIConGPU unit tests. We are now free to change any behaviour unobservable from within PICMI.
In a future refactoring:
Other fixes and features
Some minor fixes, features, etc. sneaked in because they were on my to-do list for quite a while now:

- `pypicongpu.Runner`: it used to just `cd` around to its liking, which internally messed up the working directory and made it hard to reason about where exactly something was executed (particularly while debugging). I have provided a context manager that makes sure to always return back to where it came from after changing directory for, e.g., running `pic-build` (see the sketches after this list).
- Fixed `copy_attributes.py` because `inspect.getmembers` accesses `__fields__`, which is deprecated in Pydantic v2. I'm now checking for `pydantic.BaseModel` and use the `model_fields` property instead in that case.
- Added the `method` attribute to `picmi.Species` according to the PICMI standard. It indicates the pusher method used for this particle species. Our species class should now be standard-compliant.
- … `BaseModel`s now.
- … `Auto` diagnostic.
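A minimal sketch of the working-directory guard described in the first bullet (the PR's actual implementation may differ; on Python >= 3.11, `contextlib.chdir` provides the same behaviour out of the box):

```python
import contextlib
import os


@contextlib.contextmanager
def changed_directory(path):
    # Change into `path` and always return to the previous working
    # directory, even if the wrapped block (e.g. running pic-build) raises.
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)
```

And a sketch of the Pydantic v2 field discovery from the second bullet (the helper's shape is assumed; `model_fields` replacing the deprecated `__fields__` is the real Pydantic v2 API):

```python
import inspect

import pydantic


def attribute_names(obj):
    # Prefer model_fields for Pydantic v2 models...
    if isinstance(obj, pydantic.BaseModel):
        return list(type(obj).model_fields)
    # ...and fall back to inspect.getmembers otherwise.
    return [name for name, _ in inspect.getmembers(obj)]
```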
Testing status

I have performed the following testing:

- `quick` and `compiling` tests pass (and will be checked in CI again).
- `end-to-end` tests pass (and will not yet be checked in CI). Please run them again during review. They should run on any machine with a working PIConGPU environment including openPMD.
- Compared against `mainline/dev` output. The result is identical up to whitespace and the two known features/fixes (typical ppc and pusher method).

Implementation details
The main algorithm of requirement resolution is implemented in `evaluate_requirements` in `picmi/species.py`. It basically filters for the type of requirement it's looking for, evaluates/resolves/??? that filtered collection, and runs the delayed constructions. That intermediate "evaluates/resolves/???" step is a nasty, big, $\mathcal{O}(N^2)$ for-loop that on every pair of requirements checks for conflicts (`check_for_conflict`) and tries to merge (`try_update_with`). These steps are implemented as traits (free functions that run some default behaviour unless the given object(s) override that).
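The "trait" pattern meant here could be sketched as follows (a minimal illustration, not the PR's actual helper):

```python
def check_for_conflict(lhs, rhs):
    # Free function with a default behaviour...
    if hasattr(lhs, "check_for_conflict"):
        # ...that the given object may override with its own method.
        return lhs.check_for_conflict(rhs)
    # Default behaviour: no conflict.
    return None
```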
The `DelayedConstruction` is an interesting object; make sure you read its docstring. It is my solution to the conflict of requiring the freedom of arbitrary functions while at the same time having enough information/metadata about what is run to meaningfully reason about the intent, uniqueness, etc. that is encapsulated in that instance.
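A rough sketch of the idea (hypothetical shape; the real class carries richer metadata, hence "read its docstring"):

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class DelayedConstruction:
    # An arbitrary construction function plus enough metadata about
    # what will be run to reason about intent and uniqueness upfront.
    constructor: Callable[..., Any]
    kwargs: dict = field(default_factory=dict)

    def construct(self, **resolved):
        # Run the delayed construction once all dependencies are known.
        return self.constructor(**self.kwargs, **resolved)
```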
For review

Just despair. There's no hope! It's thousands of lines of code and you're never gonna make it!
...
Just kidding.
To the extent applicable to your situation:
- Run the `end-to-end` tests. (They are not yet run in CI.)
- Compare `mainline/dev` with this branch. A convenient way of doing so for me was:

I'd not primarily look at the new code in GitHub's diff view because most of it is a complete rewrite and there's little meaning to the diff. Start exploring it in your normal editor, jump around, and get a feel for the code: start from the `picmi.Simulation` class, specifically its `get_as_pypicongpu` method. Following the control flow will lead you through the algorithm of requirement resolution. Check the species-internal requirements added from `picmi.Species.__init__` and the ionisation-related changes for examples of registering requirements. Afterwards you'll have covered the main functional code changes.

Back in the GitHub diff view, the remainder of the changes will hopefully feel intuitive and self-explanatory as part of one of the following categories:
There's certainly a lot of bikeshedding applicable. Some namings and placements of code have not yet experienced the kind of careful consideration that they might deserve. Feel free to speak up about those. Some of the major ones I'm aware of:
- `_make_unique` in `species_requirements.py` is poorly named but I'm struggling to come up with a better name because it's quite an overload of responsibilities. That's probably a code smell but it's complex correlations being modeled here.
- Moving the `DelayedConstruction`s closer to their use cases. Currently, they are all bundled up in `species_requirements.py`. I was uncertain as to how much re-use was to be expected among the code base when I started out, so I put them all into one central place. That will not scale well, I'm afraid.
- `_DensityImpl` is poorly named and badly placed.

I have not cleaned up git history in any particular way and I'm uncertain what would be a safe and intuitive way to group these changes. It is admittedly quite a messy history, but I'm afraid that the risk of introducing bugs by trying to disentangle and squash individual aspects into a small number of commits is quite high. And a single commit of ~8kloc of changes might also not be what we want.