Skip to content

Conversation

@esoteric-ephemera
Copy link
Collaborator

@esoteric-ephemera esoteric-ephemera commented Oct 4, 2024

Summary

Adding NEB workflows and schemas for VASP / ASE / MLFF. This hopefully brings atomate2 to full parity with the feature set of atomate

@hmlli is leading the port of ApproxNEB flows for VASP

Very open to suggestions, particularly for document schema structure

Key updates:

  • Generic NEB schemas (one for single-barrier/hop NEB, one for combined pathways)
  • Different VASP run logic / makers / validators for NEB jobs
  • Parsing of VASP output handled by new NebTaskDoc class in emmet

Miscellaneous

@esoteric-ephemera esoteric-ephemera changed the title [WIP] Add NEB jobs / workflows [WIP] Add NEB, ApproxNEB jobs / workflows Oct 14, 2024
@esoteric-ephemera esoteric-ephemera changed the title [WIP] Add NEB, ApproxNEB jobs / workflows ådd NEB, ApproxNEB jobs / workflows Jul 3, 2025
@esoteric-ephemera esoteric-ephemera changed the title ådd NEB, ApproxNEB jobs / workflows Add NEB, ApproxNEB jobs / workflows Jul 3, 2025
@esoteric-ephemera
Copy link
Collaborator Author

@JaGeo should be ready to merge in. There are a few simple tests of the workflows in the PR. From more extensive tests (especially of the VASP Approx / NEB) in our group, these seem to be performant. Avoiding adding more for now since these are longer-running flows

Over time, we may want to modify the document schema a bit (I've already found at least one spot where metadata could be improved) but I would lean towards merging in since these are only additions

@davidwaroquiers I'll release a new version after merging

Copy link
Contributor

@gpetretto gpetretto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @esoteric-ephemera, thanks a lot for the great work implementing this.
I did not go through the code in details, but testing it I came across a couple of potential issues that I mentioned in the comments below.

As an additional point I wanted to check if I am getting right that at the moment there is no equivalent of the NebFromEndpointsMaker for the ASE NEB. It is easy to create such a flow manually, but it may be convenient to have something like this as well.

"You must pip install `pymatgen-analysis-diffusion` "
"to generate images with IDPP."
) from exc
return IDPPSolver.from_endpoints(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that here an instance of IDPPSolver is returned. I think that the output of this function should be the output of the IDPPSolver.run() method.

Copy link
Collaborator Author

@esoteric-ephemera esoteric-ephemera Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Will add some logic for separating out the IDPPSolver.run and constructor kwargs

"""
if interpolation_method == NebInterpolation.LINEAR:
return endpoints[0].interpolate(
endpoints[1], nimages=num_images, **interpolation_kwargs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen that the pymatgen interpolate method behaves in a somewhat weird way: nimages+1 structures are returned: https://github.com/materialsproject/pymatgen/blob/f9d9fe8e0ce09ef30cc03bcc4e9937d27afd5a6a/src/pymatgen/core/structure.py#L2443
So you get a set of structures which is not nimages in total nor nimages plus the initial and final structure. Which I found quite unexpected.
In addition this is at variance with what the IDPPSolver does. In that case a total of nimages+2 structures is returned.
So the meaning of nimages varies depending on the inteprolation method.

I would suggest passing num_images+1 to interpolate. This would maintain consistency among the methods and would also seem more reasonable.

@JaGeo
Copy link
Member

JaGeo commented Jul 3, 2025

@gpetretto thanks for reviewing and testing!

@esoteric-ephemera
Copy link
Collaborator Author

esoteric-ephemera commented Jul 3, 2025

Thanks @gpetretto! Have drafted up the ASE NEB from endpoints maker, still needs to be incorporated with the forcefields. I'll get back to this next week

@davidwaroquiers
Copy link
Contributor

@JaGeo should be ready to merge in. There are a few simple tests of the workflows in the PR. From more extensive tests (especially of the VASP Approx / NEB) in our group, these seem to be performant. Avoiding adding more for now since these are longer-running flows

Over time, we may want to modify the document schema a bit (I've already found at least one spot where metadata could be improved) but I would lean towards merging in since these are only additions

@davidwaroquiers I'll release a new version after merging

Hi @esoteric-ephemera

Thanks a lot for putting all of this together!

Copy link
Contributor

@gpetretto gpetretto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if this comes as a second round of comments, but I have now tested also the VASP and ApproxNEB and encountered a few issues there as well.

See also the related PR on emmet: materialsproject/emmet#1255

},
}
)
lclimb: bool = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is supposed to be propagated to the NebSetGenerator or if it is a leftover of a previous option, but it looks like it is not used anywhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a holdover from earlier logic, thanks for the catch

}
)
lclimb: bool = True
kpoints_kludge: Kpoints | None = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may be good to add the docstrings for this option, as its use is quite obscure without looking at the code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll move it into the main docstr (the explanation was buried in the maker) - will also change the default behavior to use this fix

num_sites = len(images[0])

tags = [os.getcwd()]
is_force_conv = all(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not try to do the math myself to see what the problem could be with this point, but in one of my tests I had an issue. The optimizer considered the forces converged and stopped the optimization after few loops, but this check was False and thus the task was marked as failed.
More in general, is there a need for this check specifically? optimize.run already returns a bool to specify if the optimization converged or not, why not using that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably has to do with the difference in NEB forces (interatomic + spring) vs plain interatomic forces. I'll take this out - I like the idea of using optimize.run to determine the task state, I'll update the other jobs as well

run_vasp(**self.run_vasp_kwargs)

# parse vasp outputs
task_doc = get_vasp_task_document(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a generic comment on the NebFromImagesMaker, but I wanted to mention that I found somewhat confusing the fact that this job returns a NebTaskDoc, because by construction this will only contain the energies of the intermediate images and not those of the endpoints. As a consequence all the barrier analysis and values reported in the output document are wrong.
Even assuming that this is always used inside the NebFromEndpointsMaker the output database will contain two NebTaskDoc coming from the same flow, one of which is incomplete.

I am afraid I don't have a much better way of proceeding to propose. One potential suggestion would be that the output of this job should only be a minimal document with the list of folders and some information about the vasp calculation, relying on the fact that a collect_neb_output Job will be run as a subsequent step.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe there should be an NebIntermediateImageResult class that has this minimal info. Wouldn't need to be in emmet since this is just for book-keeping on the atomate2 side

"PREC": "Normal",
"NSW": 99,
"LCHARG": False,
"IBRION": 2,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it was just because my example was too trivial, but with the default EDIFF the forces were not converging even for a large number of steps, probably due to the SCF not being properly converged. It instead converged was quite fast as soon as I lowered EDIFF. I don't have many test cases, but if you also encountered this issue it may be worth considering using a lower value as a default.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always hard to say with NEB but EDIFF = 1e-6 or 1e-7 might be a safer default

)
run_vasp_kwargs: dict = field(
default_factory=lambda: {
"job_type": JobType.DOUBLE_RELAXATION,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am missing something but it seems that with set_type="image" the INCAR containts ISIF=2 and thus volume is not changing. So what is the point of the double relaxation here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly for continuity / reproducibility of the original atomate version. Might be good to have a legacy / current split of the input sets, just like we do for the EOS workflows

# compatibility with legacy input (list)
if isinstance(inserted_coords_dict, list):
inserted_coords_dict = dict(enumerate(inserted_coords_dict))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have a check here to verify that inserted_coords_dict and inserted_coords_combo are consistent. I had a typo in my inserted_coords_combo and this caused an error only late in the Flow execution, with no obvious link to the cause of the error.

the mobile species in ApproxNEB
inserted_coords_dict: dict or list
a dictionary containing site coords (endpoints) for working ions
in the simulation cell
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better specify that these should be fractional coordinates. I had to look at the code to know if they had to be fractional or cartesian.

"""
ep_jobs: list[Job] = []
ep_output: dict[str, dict[str, Any]] = {}
for idx, ep in enumerate(end_point_structures):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is mentioned in the docstrings, but I would add a check that only two structures are actually passed in the end_point_structures.
In principle I suppose there would be no particular issue implementing this Flow accepting more than 2 endpoints and calculating all the subsequent jumps. Before checking the docstrings I was expecting it would have been possible to pass more than 2.

ep_relax_output = {}
ep_relax_jobs = []
for ep_index, ep_coords in endpoint_coords.items():
if int(ep_index) in ep_distinct:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is redundant with the other check I am suggesting in CommonApproxNebMaker, but in case this Job ends up being used outside that flow it may be good to make a check here as well. In this loop if one of the keys in ep_distinct is not endpoint_coords the code will just pass without creating the relaxation job for that endpoint, but fail later on.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having both is a good fallback

name: str = "Forcefield NEB from images"
force_field_name: str | MLFF = MLFF.Forcefield

@job(data=_FORCEFIELD_DATA_OBJECTS, schema=NebResult)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I just realized that here and in all the other jobs the schema is set through the schema argument, but it should be output_schema. Jobflow does not raise an error since a job accepts kwargs.

@esoteric-ephemera
Copy link
Collaborator Author

@gpetretto @JaGeo I think I've addressed all comments - any objections to merging?

@esoteric-ephemera
Copy link
Collaborator Author

Also @fraricci I've added the checks you suggested for NPTBerendsen, plus a test for correctly setting the MD kwargs

@esoteric-ephemera
Copy link
Collaborator Author

@JaGeo noticed I had some typos in the MACE + explicit D3 calculator, fixed those, added a better test, and am going to merge this PR

@JaGeo
Copy link
Member

JaGeo commented Aug 11, 2025

Sound good @esoteric-ephemera !

@esoteric-ephemera esoteric-ephemera merged commit 4b0c0c6 into materialsproject:main Aug 11, 2025
21 checks passed
@esoteric-ephemera esoteric-ephemera deleted the neb branch August 11, 2025 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Dispersion correction cannot be added to forcefield calculations with MACE when potentials are stored locally

5 participants