TKanX commented Jan 15, 2026

Purpose

I am a high school student currently working at Caltech, leading the development of several Rust-based libraries for computational chemistry and biology. As the Rust ecosystem for scientific computing grows, we are seeing an increasing number of high-performance libraries that do not fit neatly into the existing bioinformatics (which focuses on genomics/omics) or general simulation categories.

To better organize these projects and foster a dedicated community, I propose expanding the [science] category with two new major sections: Chemistry and Structural Biology. These additions aim to provide elegant and abstract classifications for libraries dealing with molecular dynamics, quantum chemistry, and biomolecular structure analysis.

Proposed Changes

1. New Category: Chemistry

This category captures the study of matter, bridging physics and chemical informatics.

[science.categories.chemistry]
name = "Chemistry"
description = "Crates related to the study of matter, computational chemistry, and cheminformatics."

[science.categories.chemistry.categories.physical]
name = "Physical Chemistry"
description = "Molecular dynamics, quantum chemistry, and statistical thermodynamics."

[science.categories.chemistry.categories.informatics]
name = "Cheminformatics"
description = "Chemical graph theory, molecular descriptors, and file formats."

2. New Subcategory: Structural Biology

This is added under bioinformatics to distinguish 3D structural work from sequence-based bioinformatics.

[science.categories.bioinformatics.categories.structural]
name = "Structural Biology"
description = "Analysis, modeling, and simulation of 3D biomolecular structures."

Motivation & Examples

The following existing projects in the ecosystem (developed by myself and peers) would find a natural home in these new categories:

Physical Chemistry

  • cheq: Dynamic partial charge calculation using the QEq method.
  • sto-ns: Exact evaluation of Slater-Type Orbital integrals.
  • lumol: Universal extensible molecular simulation engine.
  • zelll: Implementation of the cell lists algorithm for particle simulations.
  • potentials: High-performance classical molecular dynamics potentials (LJ, bonds, angles).

Cheminformatics

  • dreid-typer: DREIDING atom typing and topology perception.
  • dreid-forge: Automated force field parameterization.
  • ffcharge: Force field partial charge assignment.

Structural Biology

  • molar: Molecular analysis and modeling library for trajectories.
  • bio-forge: Preparation and standardization of biomolecular structures (PDB/mmCIF).
  • screampp: Protein side-chain prediction and placement.

Acknowledgements

This proposal is motivated by the work of the following contributors to the Rust scientific ecosystem:

yesint commented Jan 15, 2026

This is definitely a good idea, and there are probably many more Rust libraries that do not fall into existing categories. I'm not sure about the structural biology tag, however. It historically refers to software for resolving structures from X-ray, NMR, or cryo-EM data, while the crates you listed are closer to computational biology and/or computational chemistry.

TKanX marked this pull request as draft January 15, 2026 09:18

Luthaf (Contributor) commented Jan 15, 2026

Another related category to consider would be "materials science" (most of my work these days is closer to materials science than to chemistry or biology).

- Structural Bioinformatics: Structural bioinformatics is the branch of bioinformatics that is related to the analysis and prediction of the three-dimensional structure of biological macromolecules such as proteins, RNA, and DNA.
- Structural Biology: Structural biology deals with structural analysis of living material (formed, composed of, and/or maintained and refined by living cells) at every level of organization. The most prominent techniques are X-ray crystallography, nuclear magnetic resonance, and electron microscopy.

Co-authored-by: yesint <[email protected]>

TKanX (Author) commented Jan 15, 2026

> This is definitely a good idea, and there are probably many more Rust libraries that do not fall into existing categories. I'm not sure about the structural biology tag, however. It historically refers to software for resolving structures from X-ray, NMR, or cryo-EM data, while the crates you listed are closer to computational biology and/or computational chemistry.

Good catch, @yesint. You’re right! Historically, "Structural Biology" is heavily tied to the experimental side (X-ray/Cryo-EM).

I purposefully avoided the "Computational" prefix because it feels like a tautology in a software ecosystem; every crate is computational by definition. However, I agree we need better precision.

Looking at the distinction between Structural Biology and Structural Bioinformatics, the latter is definitely the more accurate home for these 3D tools. I’ll update the category name to reflect that. Thanks for the nudge!

Fixed: c745940

TKanX (Author) commented Jan 15, 2026

> Another related category to consider would be "materials science" (most of my work these days is closer to materials science than to chemistry or biology).

c903826

@Luthaf

TKanX marked this pull request as ready for review January 15, 2026 10:12
Updates descriptions in chemistry and bioinformatics categories to explicitly mention "crates".

yesint commented Jan 15, 2026

@TKanX we are entering the thin ice of terminology here, but to me structural bioinformatics sounds somewhat weird because, again, historically bioinformatics is about genes, sequences and evolution. I'd recommend sticking to computational biology / computational chemistry since these are established terms, which are natural to search for.

TKanX marked this pull request as draft January 15, 2026 20:04
Turbo87 moved this to "For next meeting" in crates.io team meetings Jan 16, 2026

Turbo87 (Member) commented Jan 19, 2026

FWIW we discussed this PR at the crates.io team meeting last week and we are generally open to adding such categories. let me know once you have reached consensus on which categories to add :)

fncnt commented Jan 20, 2026

I agree that these are good suggestions!
I hope my two cents don't complicate the matter:

> @TKanX we are entering the thin ice of terminology here, but to me structural bioinformatics sounds somewhat weird because, again, historically bioinformatics is about genes, sequences and evolution. I'd recommend sticking to computational biology / computational chemistry since these are established terms, which are natural to search for.

I'm comfortable with the term structural bioinformatics but I guess that's a matter of one's background and I agree that the listed crates might fit other categories better or at least just as well.
There are at least two crates that would fit this category (but are not concerned with 3D structures):

Maybe two crates alone don't justify a subcategory, but one could make the case for it, since there are already a couple of bioinformatics subcategories and the structural component is not yet represented.

In the end, it's up to crate authors anyway and I think it's often hard enough to categorize software accurately in these fields.
Personally, I would be happy to have some categories to choose from in order to mix and match.
E.g. if there wasn't a structural bioinformatics subcategory for librna-sys, I wouldn't mind categorizing it as structural biology and bioinformatics; there's always the option to use keywords to make it more precise.

Also, I'm not sure whether I would consider computational biology a subcategory of bioinformatics or vice versa.
That being said, I don't really have objections to any of this. Thanks for taking the initiative, @TKanX!
There's really no end to finding the best taxonomy :)
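
To illustrate the mix-and-match idea, here is a minimal sketch of a crate manifest that pairs an existing category with keywords. The package name and the keywords are hypothetical, and `science::bioinformatics` is assumed to be the slug that corresponds to the existing `[science.categories.bioinformatics]` table (nested keys joined with `::`).

```toml
# Cargo.toml (hypothetical crate, for illustration only)
[package]
name = "example-rna-folding"   # made-up name, not a real crate
version = "0.1.0"
edition = "2021"

# Categories have to match slugs that crates.io already knows about.
# Assumption: the existing bioinformatics table maps to this slug.
categories = ["science::bioinformatics"]

# Keywords are free-form, so they can carry the finer distinction
# (here: structure-related work that is not about 3D structures).
keywords = ["rna", "secondary-structure", "folding"]
```

Since both lists are short (crates.io accepts at most five entries in each), a broad category plus a few precise keywords is probably the most practical combination until a dedicated subcategory exists.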

TKanX (Author) commented Jan 21, 2026

Thanks everyone.

We have consensus on Chemistry and Materials Science. The only blocker is the Biology category name.

The Goal:
We need a sub-category to distinguish 3D/Physics-based tools (MD, protein folding, PDB handling) from the existing 1D/Sequence-based tools (genomics, string parsing).

The Conflict:

  • Computational Biology: Suggested by @yesint.
    • Pro: Standard terminology.
    • Con: Too broad. Technically, the entire parent category is "Computational Biology." It doesn't help users filter 3D tools from sequence tools.
  • Structural Biology:
    • Pro: Clearly targets 3D structure and folding.
    • Con: As noted, historically implies experimental work (X-ray/CryoEM) rather than software.
  • Or perhaps as a subtype of chemistry?
  • Or any other suggestions? Alternatively, we can take a middle ground: change Structural Biology to Structural Bioinformatics:
    • Structural Biology (Wikipedia): Structural biology deals with structural analysis of living material (formed, composed of, and/or maintained and refined by living cells) at every level of organization. The most prominent techniques are X-ray crystallography, nuclear magnetic resonance, and electron microscopy.
    • Structural Bioinformatics (Wikipedia): Structural bioinformatics is the branch of bioinformatics that is related to the analysis and prediction of the three-dimensional structure of biological macromolecules such as proteins, RNA, and DNA.

I need a decision from the team: Do we prioritize precise categorization (Structural/Structural Bioinformatics) or broad terminology (Computational)?

@Turbo87 @Luthaf @yesint @fncnt please advise so I can update and merge.

fncnt commented Jan 23, 2026

Here's a suggestion:

Introduce Computational Biology at the same level as Bioinformatics and (optionally) the subcategory Structure Modeling (rather than Structural Biology, which carries a wet-lab connotation):

[science.categories.compbio]
name = "Computational Biology"
description = "..."

[science.categories.compbio.categories.structural]
name = "Structure Modeling"
description = "..."

I believe Structure Modeling is broad enough to include structure prediction and structure determination.

Also optionally, introduce Structural Bioinformatics as a bioinformatics subcategory.
IMO, the description should reflect that this is not only about 3D structures (e.g. there is RNA secondary structure prediction).
This is optional because you could argue that structural bioinformatics belongs to structure modeling, in which case I would suggest omitting this new subcategory to simplify the decision.

[science.categories.bioinformatics.categories.structural]
name = "Structural Bioinformatics"
description = "..."

I think this would address many of your concerns. There's a new category that's broad enough and potentially two new subcategories which are more precise.

TKanX (Author) commented Jan 25, 2026

Hi everyone,

I've updated the PR to add the proposed categories and subcategories based on your feedback.

Here are the changes I've made:

# ...

[science]
name = "Science"
description = """
Crates related to solving problems involving physics, chemistry,
biology, machine learning, geoscience, and other scientific fields.
"""

# ...

[science.categories.computational-chemistry]
name = "Computational Chemistry"
description = """
Crates for computational methods in chemistry, including
electronic-structure calculations, molecular simulation, and
cheminformatics.
"""

[science.categories.computational-chemistry.categories.electronic-structure]
name = "Electronic Structure"
description = """
Crates for quantum chemistry and electronic-structure methods such as
DFT, ab initio, and correlated techniques.
"""

[science.categories.computational-chemistry.categories.molecular-simulation]
name = "Molecular Simulation"
description = """
Crates for molecular dynamics, Monte Carlo, force fields, and
statistical mechanics simulations.
"""

[science.categories.computational-chemistry.categories.cheminformatics]
name = "Cheminformatics"
description = """
Crates for molecular representations, descriptors, chemical graph
algorithms, file format parsing, and QSAR tooling.
"""

[science.categories.computational-biology]
name = "Computational Biology"
description = """
Crates for computational modeling and simulation of biological systems,
including structural modeling and systems-level modeling.
"""

[science.categories.computational-biology.categories.structural-modeling]
name = "Structural Modeling"
description = """
Crates for protein and biomolecular structure prediction, docking,
model refinement, and physics-based biomolecular simulation.
"""

[science.categories.computational-biology.categories.systems-biology]
name = "Systems Biology"
description = """
Crates for network modeling, pathway and metabolic modeling, and
whole-system simulations.
"""

[science.categories.computational-biology.categories.structural-informatics]
name = "Structural Informatics"
description = """
Crates for representing, parsing, analyzing, and manipulating
macromolecular structure data, including PDB/mmCIF formats,
biomolecular topologies, and assembly models.
"""

[science.categories.materials]
name = "Materials Science"
description = """
Crates for the study, characterization, and simulation of condensed matter
and materials, including crystallography and solid-state physics.
"""

# ...

Please review the changes, confirm their accuracy, and let me know if any additional adjustments are needed. Once approved, @Turbo87 can merge this PR.
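
For anyone checking the names from a crate author's point of view, the following is a rough sketch of how a crate might opt in to the new categories. It assumes the tables above are merged unchanged and that slugs are formed by joining the nested keys with `::`, as with the existing categories. The package name is hypothetical.

```toml
# Cargo.toml (hypothetical crate, for illustration only)
[package]
name = "example-md-engine"   # made-up name, not a real crate
version = "0.1.0"
edition = "2021"

# Assumed slugs, derived from the proposed table keys:
#   [science.categories.computational-chemistry.categories.molecular-simulation]
#   [science.categories.materials]
categories = [
    "science::computational-chemistry::molecular-simulation",
    "science::materials",
]
```

These slugs would only become valid once the PR is merged and deployed, so the sketch is meant for reviewing the naming, not for use in a published manifest today.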

Thank you all for your valuable feedback and for helping improve this proposal!

@yesint @Luthaf @fncnt

TKanX changed the title from "Add Chemistry and Structural Biology Categories to Science" to "Add Computational Chemistry, Computational Biology, and Materials Science Categories to Science" Jan 25, 2026

fncnt commented Jan 26, 2026

Personally, I would not add this subcategory:

[science.categories.computational-biology.categories.structural-informatics]
name = "Structural Informatics"
description = """
Crates for representing, parsing, analyzing, and manipulating
macromolecular structure data, including PDB/mmCIF formats,
biomolecular topologies, and assembly models.
"""

I think this would be too fine-grained and could lead to confusion when deciding between subcategories.
The term structural informatics is not very common, I believe, and I would consider what the description covers to be part of structural modeling anyway.
If in doubt, subcategories can always be added in future PRs.

Other than that, I'm okay with it (although I don't have much insight into computational chemistry adjacent terminology). 👍

TKanX (Author) commented Jan 26, 2026

> Personally, I would not add this subcategory:
>
> [science.categories.computational-biology.categories.structural-informatics]
> name = "Structural Informatics"
> description = """
> Crates for representing, parsing, analyzing, and manipulating
> macromolecular structure data, including PDB/mmCIF formats,
> biomolecular topologies, and assembly models.
> """
>
> I think this would be too fine-grained and could lead to confusion when deciding between subcategories. The term structural informatics is not very common, I believe, and I would consider what the description covers to be part of structural modeling anyway. If in doubt, subcategories can always be added in future PRs.
>
> Other than that, I'm okay with it (although I don't have much insight into computational chemistry adjacent terminology). 👍

Great suggestion!

874ec72

@fncnt

TKanX (Author) commented Jan 26, 2026

@yesint @Luthaf Any suggestions?

TKanX marked this pull request as ready for review January 28, 2026 03:44

TKanX (Author) commented Jan 31, 2026

@Turbo87 Ready for review! 🚀

Thank you!

Comment on lines +543 to +548
[science.categories.computational-biology]
name = "Computational Biology"
description = """
Crates for computational modeling and simulation of biological systems,
including structural modeling and systems-level modeling.
"""

Member

hmm, we already have the "Bioinformatics" category below. having a "Computational Biology" category next to it seems a little confusing to me... 😅

is there some way we can resolve this? can we merge them? rename the new one in some way?

Author

I actually started with that exact approach (see 140fbf3). However, during the discussion, we realized it’s better to distinguish between sequence biology (1D) and structural biology (3D).

As noted in this feedback:

> @TKanX we are entering the thin ice of terminology here, but to me structural bioinformatics sounds somewhat weird because, again, historically bioinformatics is about genes, sequences and evolution. I'd recommend sticking to computational biology / computational chemistry since these are established terms, which are natural to search for.

@yesint @Luthaf Do you have any suggestions?

Author

@fncnt @Turbo87 Do you have any suggestions?
