Skip to content

Substitute per_atom for sample_kind, which is more flexible#165

Closed
pfebrer wants to merge 1 commit intomainfrom
sample_kind
Closed

Substitute per_atom for sample_kind, which is more flexible#165
pfebrer wants to merge 1 commit intomainfrom
sample_kind

Conversation

@pfebrer
Copy link

@pfebrer pfebrer commented Feb 26, 2026

This is part of the path to having electronic structure targets. In that case we need to define targets and outputs that are "per-pair of atoms". In metatrain, ModelOutput is used all the time, so we can't introduce per pair targets in metatrain without modifying metatomic.

There will be an accompanying metatrain PR.

This will of course break everything, so we will likely need to bump the major version.

📚 Download documentation for this pull-request

@pfebrer pfebrer marked this pull request as draft February 26, 2026 10:32
@ceriottm
Copy link
Contributor

I think that if we're going to bump the major version number it'd be good to merge this to a "release branch" and work on that branch for all ml-4-es developments. That way we can keep a stable API, and only do the breaking change at once when we are sure everything is working well.

@Luthaf
Copy link
Member

Luthaf commented Feb 26, 2026

I was even thinking of introducing this in metatrain only, and then mapping sample_kind="atom_pair" to per_atom=True when converting from metatrain format to metatomic format.

I'm also fine if you want to do a next branch on this repo and have everything there as clean as possible, but then you'll need to recompile metatomic from source every time.

@pfebrer
Copy link
Author

pfebrer commented Feb 26, 2026

One thing to note is that there is nothing electronic structure specific here or in the metatrain PR that I'm preparing. It is just about enabling data that is not per-atom or per-system. So in that sense, I think the usefulness of this is quite independent of the success of the ml-4-es efforts.

Having this in a metatomic branch (Michele's suggestion) for a long time means that we also need to have a metatrain branch that supports per-pair targets, and base the electronic structure developments on that branch. That metatrain branch will be a pain to maintain up to date with main.

Mocking ModelOutput in metatrain (Guillaume's suggestion) could be done I guess, but it would require some hacky stuff and lots of refactoring that will need to be undone at some point. Because of the reasons in my first paragraph here, I think this would be inflicting unnecessary pain on us 😅 And it will make things more confusing for developers since metatrain wouldn't match metatomic.

In any case, even if we agree not to merge it soon, I'll keep this PR here since it contains the work that needs to be done at some point.

@pfebrer
Copy link
Author

pfebrer commented Feb 26, 2026

Alternatively if you want something that doesn't introduce breaking changes we could temporarily have an extra per_pair field.

I just did sample_kind because this is what Guillaume liked (and I also think that in the long-term this will be the best).

@pfebrer
Copy link
Author

pfebrer commented Feb 26, 2026

Ok, maybe I just try to mock ModelOutput in metatrain for now and see how it goes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants