This is the official fork and continuation of the ChemIC project, which was originally developed by Dr. Aleksei Krasnov. The original repository can be found at https://github.com/ontochem/ChemIC
- Project Description
- Requirements
- Prepare Workspace Environment with Conda
- Model Construction
- Models Download
- Usage: Web Service for Chemical Image Classification
- Jupyter Notebook
- Author
- Citation
- References
- License
You can try out the user frontend web interface at https://chemic-ai.streamlit.app/
The Chemical Image Classifier (ChemIC) project provides a solution for classifying chemical images using a Convolutional Neural Network (CNN). The model categorizes images into one of four predefined classes:
- Images containing a single chemical structure.
- Images depicting chemical reactions.
- Images featuring multiple chemical structures.
- Images with no identifiable chemical structures.
The package consists of three main components:
A) CNN Model for Image Classification (chemic_train_eval.py)
- Trains a deep learning model to classify images into the four predefined classes.
- Utilizes a pre-trained ResNet-50 model and includes steps for data preparation, model training, evaluation, and testing.
B) Web Service for Chemical Image Classification (app.py)
- Provides a FastAPI web application for classifying chemical images using the trained ResNet-50 model.
- Exposes an endpoint, /classify_images, for accepting chemical images and returning the predicted class.
C) Image Classification Client (client.py)
- Interacts with the ChemIC web server. The client can send to the server:
  - The path to an individual image file
  - The path to a directory with multiple images
  - Base64 encoded image data
The server classifies the images and returns the recognition results to the client.
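As a minimal sketch of the third option, the snippet below base64-encodes image bytes with Python's standard library. The helper names encode_image_bytes and encode_image_file are hypothetical and not part of ChemIC; the sketch assumes the client accepts the encoded data as a bytes object.

```python
import base64
from pathlib import Path


def encode_image_bytes(raw: bytes) -> bytes:
    """Return the base64 encoding of raw image bytes (hypothetical helper)."""
    return base64.b64encode(raw)


def encode_image_file(path: str) -> bytes:
    """Read an image file from disk and return its base64-encoded content."""
    return encode_image_bytes(Path(path).read_bytes())


# The 8-byte PNG file signature encodes to b'iVBORw0KGgo=',
# which is why base64-encoded PNG images start with 'iVBORw0KGgo'.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
print(encode_image_bytes(PNG_SIGNATURE))
```

The resulting bytes could then be passed as the image data instead of a file path.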
# 1. Create and activate the conda environment
conda create --name chemic "python<3.13"
conda activate chemic
# 2. Install ChemIC-ml
# 2.1 From PyPI
pip install ChemIC-ml
# 2.2 Or, install from the GitHub repository
pip install git+https://github.com/alexey-krasnov/ChemIC.git
# 2.3 Or, install in editable mode from the GitHub repository
git clone https://github.com/alexey-krasnov/ChemIC.git
cd ChemIC
pip install -r requirements.txt
pip install -e .
- Where -e means "editable" mode.
First, download the archive with manually labeled images, available as part of the supplementary materials from Zenodo: dataset_for_image_classifier.zip. Unzip the archive:
unzip dataset_for_image_classifier.zip
To perform model training, validation, and testing, and to save your trained model, run the following command in the CLI:
python chemic_train_eval.py --dataset_dir /path/to/data --checkpoint_path /path/to/checkpoint.pth --models_dir /path/to/models
- --dataset_dir: Directory containing the dataset (with train, test, and validation subdirectories).
- --checkpoint_path: Path to the existing model checkpoint file.
- --models_dir: Directory to save newly trained models.
This command executes the training and evaluation using the specified paths.
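Before starting a long training run, it may be worth checking that the dataset directory has the expected layout. The sketch below is not part of ChemIC: the function name missing_splits is hypothetical, and the folder names train, test, and validation are assumptions based on the description of --dataset_dir, so adjust them to match the unzipped archive.

```python
from pathlib import Path

# Assumed subdirectory names; verify them against the unzipped dataset.
EXPECTED_SPLITS = ("train", "test", "validation")


def missing_splits(dataset_dir: str, splits=EXPECTED_SPLITS) -> list[str]:
    """Return the expected split subdirectories that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in splits if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_splits("/path/to/data")
    if missing:
        print(f"Missing dataset subdirectories: {missing}")
    else:
        print("Dataset layout looks complete.")
```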
Download the pre-trained models from Zenodo as an archive: models.zip.
Unzip it into the chemic/models directory. The models directory should contain the pre-trained model chemical_image_classifier_resnet50.pth for chemical image classification.
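A quick way to confirm the archive was unpacked into the right place is to look for the model file by name. This is a small sketch using only the standard library; locate_model is a hypothetical helper, and the default path assumes the command is run from the repository root.

```python
from pathlib import Path

MODEL_FILENAME = "chemical_image_classifier_resnet50.pth"


def locate_model(models_dir: str = "chemic/models") -> Path:
    """Return the path to the pre-trained model, or raise if it is missing."""
    model_path = Path(models_dir) / MODEL_FILENAME
    if not model_path.is_file():
        raise FileNotFoundError(f"Expected model file not found: {model_path}")
    return model_path
```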
Run the following command in a terminal:
uvicorn chemic.app:app --host 127.0.0.1 --port 5010 --workers 1 --timeout-keep-alive 3600
- --workers 1: Specifies the number of worker processes. Adjust based on your server's capabilities.
- --host 127.0.0.1 --port 5010: Binds the application to the specified address and port. Modify as needed.
- --timeout-keep-alive 3600: Closes Keep-Alive connections if no new data is received within this many seconds. Adjust as necessary.
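To confirm that the server is accepting connections before launching the frontend or the client, a plain TCP check is often enough. The sketch below uses only the standard library; is_port_open is a hypothetical helper, and the host and port are the values from the uvicorn command above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print(is_port_open("127.0.0.1", 5010))
```

This only shows that something is listening on the port; the client's healthcheck method is the better test that the ChemIC service itself is responding.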
In another terminal window, run the following command:
streamlit run chemic_frontendapp.py --server.address=0.0.0.0 --server.port=5009
This command will launch the ChemIC user web interface.
python chemic/client.py --image_path /path/to/images --export_dir /path/to/export
OR
python chemic/client.py --image_data <base64_encoded_string> --export_dir /path/to/export
- --image_path is the path to the image file or directory with images for classification.
- --image_data is the base64 encoded image data.
- --export_dir is the export directory for the results.
from chemic.client import ChemClassifierClient
client = ChemClassifierClient(server_url='http://127.0.0.1:5010')
# Check the health of the server
health_status = client.healthcheck().get('status')
print(f"Health Status: {health_status}")
# Use image path or directory. Replace with the actual path to your image file
image_path = '<path to the image file or directory with images for classification>'
recognition_results = client.classify_images(image_path)
# OR use base64-encoded image data. Replace with your base64-encoded image data:
base64_data = b'iVBORw0KGgoAAAANSUhEUgA....'
recognition_results = client.classify_images(image_data=base64_data)
# Recognition results will be returned in the form of a list of dictionaries
print(recognition_results)
[
    {
        'image_id': 'image_name_1.png',
        'predicted_label': 'single chemical structure',
        'classifier_package': 'ChemIC-ml_1.3.1',
        'classifier_model': 'ResNet_50',
    },
    {
        'image_id': 'image_name_2.png',
        'predicted_label': 'multiple chemical structures',
        'classifier_package': 'ChemIC-ml_1.3.1',
        'classifier_model': 'ResNet_50',
    },
    ...
]
The client_image_classifier.ipynb notebook in the notebooks directory provides an easy-to-use interface for classifying images. Follow the steps outlined in the notebook to perform image classification.
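Because the recognition results come back as a list of dictionaries, they are easy to post-process. As an illustrative sketch (count_labels is a hypothetical helper, and the sample data merely mimics the structure of the output shown earlier), the snippet below tallies how many images received each predicted label.

```python
from collections import Counter


def count_labels(results: list[dict]) -> Counter:
    """Tally how many images received each predicted label."""
    return Counter(item["predicted_label"] for item in results)


# Made-up sample shaped like the classifier output; not real results.
sample = [
    {"image_id": "image_name_1.png", "predicted_label": "single chemical structure"},
    {"image_id": "image_name_2.png", "predicted_label": "multiple chemical structures"},
    {"image_id": "image_name_3.png", "predicted_label": "single chemical structure"},
]

for label, count in count_labels(sample).most_common():
    print(f"{label}: {count}")
```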
Dr. Aleksei Krasnov dr.aleksei.krasnov@gmail.com
- A. Krasnov, S. Barnabas, T. Böhme, S. Boyer, L. Weber, Comparing software tools for optical chemical structure recognition, Digital Discovery (2024). https://doi.org/10.1039/D3DD00228D
- L. Weber, A. Krasnov, S. Barnabas, T. Böhme, S. Boyer, Comparing Optical Chemical Structure Recognition Tools, ChemRxiv. (2023). https://doi.org/10.26434/chemrxiv-2023-d6kmg-v2
- A. Krasnov, Images dataset for Chemical Images Classifier model. https://zenodo.org/records/13378718
- A. Krasnov, Chemical Image Classifier Model. https://zenodo.org/records/10709886
This project is licensed under the MIT License; see the LICENSE.md file for details.