ordering of lists in Molecule schema #55

@loriab

If the rules are:

  • order of atoms in topology schema is absolute and may not be reshuffled and
  • lists are inherently ordered

then array fields like "masses" are fine b/c atoms are in order.

What of bonds and fragments, then?

    "connectivity": {
        "description": "A list describing bonds within a molecule. Each element is a (atom1, atom2, order) tuple.",
        "type": "array",
        "items": {
            "type": "array",
            "minItems": 3,
            "maxItems": 3,
            "items": {
                "type": "number",
                "minimum": 0,
                "maximum": 5,
            }
        }
    },
    "fragments": {
        "description":
            "(nfr, -1) list of indices (0-indexed) grouping atoms into molecular fragments within the topology.",
        "type": "array",
        "items": {
            "type": "array",
            "items": {
                "type": "number",
                "multipleOf": 1.0
            }
        }
    },

I very much recognize that fragmenters and n-body drivers will have their own systems for defining and indexing fragments, and that these must not be disturbed (fragment_multiplicities and fragment_charges must follow along, too). Still, is there any reason not to require this field be sorted (e.g., [[5, 0], [4, 1, 3], [2]]) for ease of comparison? The same (and, imo, stronger) case applies to sorting the "connectivity" field.

Maybe beyond QC there's a good reason to leave these free-ordered? Should two json molecules whose schemas differ only by [[5, 0], [4, 1, 3], [2]] vs. [[2], [5, 0], [4, 1, 3]] resolve to the same hash?
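To make the question concrete, here is a rough sketch of what "sort before comparing" could look like. Everything in it is an assumption for illustration: the helper names (`canonical_fragments`, `canonical_connectivity`, `molecule_hash`) are hypothetical, the sort rules (atoms ascending within a fragment, fragments ordered by their sorted atom lists, lower atom index first in each bond) are just one possible convention, and SHA-1 over sorted-key JSON is a stand-in for whatever hashing the schema ends up using. Atom order in the topology is never touched.

```python
import hashlib
import json


def canonical_fragments(fragments, fragment_charges=None, fragment_multiplicities=None):
    """Sort atoms inside each fragment, then sort the fragments themselves.

    The per-fragment arrays are permuted with the same ordering so that
    they stay aligned with their fragments (assumed convention).
    """
    inner = [sorted(frag) for frag in fragments]
    order = sorted(range(len(inner)), key=lambda i: inner[i])
    frags = [inner[i] for i in order]
    charges = None if fragment_charges is None else [fragment_charges[i] for i in order]
    mults = (None if fragment_multiplicities is None
             else [fragment_multiplicities[i] for i in order])
    return frags, charges, mults


def canonical_connectivity(connectivity):
    """Put the lower atom index first in each bond, then sort the bond list."""
    return sorted([min(a, b), max(a, b), bond_order] for a, b, bond_order in connectivity)


def molecule_hash(mol):
    """Hash a molecule dict after canonicalizing the free-ordered fields."""
    canon = dict(mol)  # shallow copy; the caller's dict is left unchanged
    if "fragments" in canon:
        frags, charges, mults = canonical_fragments(
            canon["fragments"],
            canon.get("fragment_charges"),
            canon.get("fragment_multiplicities"),
        )
        canon["fragments"] = frags
        if charges is not None:
            canon["fragment_charges"] = charges
        if mults is not None:
            canon["fragment_multiplicities"] = mults
    if "connectivity" in canon:
        canon["connectivity"] = canonical_connectivity(canon["connectivity"])
    payload = json.dumps(canon, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


mol_a = {"fragments": [[5, 0], [4, 1, 3], [2]], "fragment_charges": [0, 1, -1]}
mol_b = {"fragments": [[2], [5, 0], [4, 1, 3]], "fragment_charges": [-1, 0, 1]}
print(molecule_hash(mol_a) == molecule_hash(mol_b))
```

Under this convention the two orderings from the question hash identically, provided the per-fragment charges travel with their fragments. Whether canonicalization belongs in the schema itself or only in the hashing step is exactly the open question.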
