[SUBMISSION] ChemInformant – Python client for PubChem #254

Submitting Author: Zhiang He (@HzaCode)
All current maintainers: Zhiang He (@HzaCode)
Package Name: ChemInformant
One-Line Description of Package: A robust, high-throughput Python client for retrieving chemical information from the PubChem API; it returns analysis-ready Pandas/SQL outputs, handles caching, rate-limiting and retries, and includes convenient CLI tools.
Repository Link: https://github.com/HzaCode/ChemInformant
Version submitted: 2.4.0 (released 2025-07-30)
EiC: TBD
Editor: TBD
Reviewer 1: TBD
Reviewer 2: TBD
Archive: TBD
JOSS DOI: TBD
Version accepted: TBD
Date accepted (month/day/year): TBD


Code of Conduct & Commitment to Maintain Package

Description

ChemInformant is a workflow-centric Python client for the PubChem REST API.
It transforms raw API responses into clean Pandas DataFrames or direct SQL tables, supports mixed identifiers (CIDs, names, SMILES), provides persistent on-disk caching, automatic rate-limiting and retries, and exposes both an object-oriented API and terminal-ready CLI tools (chemfetch, chemdraw).
The library uses Pydantic v2 for runtime data validation and is designed to feed machine-learning or QSAR pipelines that require large-scale descriptor retrieval.
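
To make the "automatic rate-limiting and retries" behaviour concrete, here is a minimal sketch of retrying a flaky call with exponential backoff and jitter. It is not ChemInformant's implementation: `TransientError`, `call_with_retries` and the default values are hypothetical names and numbers chosen only for illustration, and it uses just the standard library.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a busy server)."""


def call_with_retries(
    func: Callable[[], T],
    max_attempts: int = 4,      # assumed default, not taken from ChemInformant
    base_delay: float = 0.5,    # assumed default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on TransientError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles on each attempt: base, 2*base, 4*base, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without actually waiting.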

Scope

  • Data retrieval
  • Data extraction
  • Data processing/munging
  • Data deposition
  • Data validation and testing
  • Data visualization[^1]
  • Workflow automation
  • Citation management and bibliometrics
  • Scientific software wrappers
  • Database interoperability

Domain Specific

  • Geospatial
  • Education

Community Partnerships

If your package is associated with an existing community please check below:

Scope Explanation

  • Target audience & applications: Computational chemists, cheminformatics researchers, drug-discovery scientists and data-science practitioners who require reliable, programmatic access to PubChem for large-scale descriptor retrieval and downstream ML/QSAR workflows.
  • Comparison to similar packages: Existing clients such as PubChemPy or ChemSpiPy lack built-in caching, mixed-identifier handling and robust fault-tolerance; in benchmarks, ChemInformant achieves up to a 48× speed-up versus PubChemPy through batched queries and cache reuse.
  • Pre-submission enquiry: None.
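
The batching idea behind the speed-up claim above can be sketched as follows. This is an illustration only, not ChemInformant's code: it groups CIDs into comma-separated PUG REST property requests so that many compounds share one HTTP round trip. The helper names and the batch size of 100 are assumptions; the URL layout follows the public PUG REST pattern `compound/cid/<cids>/property/<properties>/JSON`. No network access is performed.

```python
from typing import Iterator, List, Sequence

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def chunked(items: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batched_property_urls(
    cids: Sequence[int],
    properties: Sequence[str],
    batch_size: int = 100,  # assumed value for illustration
) -> List[str]:
    """Build one PUG REST property URL per batch of unique CIDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Drop duplicate CIDs while keeping first-seen order.
    unique = list(dict.fromkeys(cids))
    props = ",".join(properties)
    return [
        f"{PUG_REST_BASE}/compound/cid/{','.join(map(str, batch))}"
        f"/property/{props}/JSON"
        for batch in chunked(unique, batch_size)
    ]
```

With this layout, N compounds cost roughly N / batch_size requests instead of N, and a cache keyed on CID can remove repeat lookups entirely.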

Technical checks

This package:

  • does not violate the Terms of Service of any service it interacts with.
  • uses an OSI approved license (MIT).
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a tutorial with examples of its essential functions and uses.
  • has a test suite (pytest) located in tests/.
  • has continuous integration set up via GitHub Actions.

Publication Options

JOSS Checks (leave blank since box above is unchecked)

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

  • Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following by checking the box.

  • I have read the author guide.
  • I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

Please fill out our survey

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.
