Description
What new feature would you like to see?
We want to perform a simple workflow with Quacc:
- Generate structures.
- Do a simple loop in a sub-flow to run them all.
- Collect results.
This seems easy, but plenty of things can go wrong. Computational chemistry is complex and calculations often do not go as expected: they crash, run out of time, or simply fail to converge.
In this case, the workflow above becomes much more complicated with Quacc. Users who want to be proactive can put all kinds of checks in place, e.g. try/except blocks, but with workflow engines this is often not ideal. The easiest way to avoid the issue then seems to be pre-filtering the calculations to run based on which ones have already finished.
However, this route is still somewhat complicated:
- There is no way to systematically label calculations in Quacc. Some jobs accept "additional_fields", but not all of them. As a result, it is hard to tell which calculation is which, because there are no labeling criteria. You can use atoms.info, but that's not ideal.
- Finding which calculation crashed, and for what reason, in the sea of "quacc-xxxxx-...." folders is less than ideal. I strongly believe that the reason for the crash should be somewhere in the results dict.
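To make the two points above concrete, here is a minimal sketch of the kind of labeled result I have in mind. It is plain Python, not Quacc's API: `run_labeled`, `fake_relax`, and the `label`/`status`/`error` keys are all hypothetical names used only for illustration. It also relies on a hand-written try/except, which hides the failure from a workflow engine, so it shows the desired shape of the output rather than a recommended implementation.

```python
import traceback
from typing import Any, Callable


def run_labeled(label: str, func: Callable[..., dict], *args: Any, **kwargs: Any) -> dict:
    """Run `func` and always return a dict carrying the label and the outcome.

    Hypothetical helper: the keys below are an assumed layout, not Quacc's schema.
    """
    record = {"label": label, "status": "ok", "error": None, "result": None}
    try:
        record["result"] = func(*args, **kwargs)
    except Exception as exc:
        # Keep the failure reason next to the label instead of only in a log file.
        record["status"] = "failed"
        record["error"] = f"{type(exc).__name__}: {exc}"
        record["traceback"] = traceback.format_exc()
    return record


def fake_relax(n_steps: int) -> dict:
    """Stand-in for a real calculation; fails when asked for too many steps."""
    if n_steps > 100:
        raise RuntimeError("SCF did not converge")
    return {"energy": -1.23, "converged": True}


records = [
    run_labeled("bulk-Cu", fake_relax, 10),
    run_labeled("slab-Cu", fake_relax, 500),
]

# Map each failed label to its reason, without opening any run folder.
failed = {r["label"]: r["error"] for r in records if r["status"] == "failed"}
```

With something like this, answering "which calculation crashed and why" becomes a lookup on the collected results instead of a search through folders.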
Ideally, each project would have its own database/results folder. Then, for each calculation with a given label, Quacc would check whether it was already done and converged, either from the database or from "quacc_results.json". Perhaps settings or keywords could control this behavior? Other options could be implemented to keep things flexible as well.
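As a rough illustration of that check, the sketch below filters a list of labels down to the ones that still need to run. Everything here is an assumption made for the example: the `<results_dir>/<label>/results.json` layout, the `converged` key, and the helper names `is_done` and `pending_labels` are hypothetical and do not reflect how Quacc stores its results.

```python
import json
import tempfile
from pathlib import Path


def is_done(results_dir: Path, label: str) -> bool:
    """Return True if a readable result exists for `label` and is marked converged."""
    result_file = results_dir / label / "results.json"  # assumed layout
    if not result_file.is_file():
        return False
    try:
        data = json.loads(result_file.read_text())
    except json.JSONDecodeError:
        # A truncated file (e.g. from a crashed job) counts as not done.
        return False
    return isinstance(data, dict) and bool(data.get("converged"))


def pending_labels(results_dir: Path, labels: list) -> list:
    """Keep only the labels that still need to be run, preserving order."""
    return [label for label in labels if not is_done(results_dir, label)]


def demo() -> list:
    """Build a throwaway results folder and show which labels would be rerun."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for label, converged in [("bulk-Cu", True), ("slab-Cu", False)]:
            (root / label).mkdir()
            (root / label / "results.json").write_text(
                json.dumps({"converged": converged})
            )
        # "bulk-Ni" has no folder at all, so it is treated as never run.
        return pending_labels(root, ["bulk-Cu", "slab-Cu", "bulk-Ni"])
```

In the demo, a finished and converged label is skipped, while an unconverged one and one that never ran are both kept for submission. A database lookup could replace the file check without changing the calling code.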
Maybe I am mistaken here and that's not what Quacc was made for? For our practical case, though, this is really problematic, and we often end up spending a lot of time looking through Quacc's folders.