AI for Antibiotics (AI4AB)

Deep learning recognises antibiotic mode of action from brightfield images

Overview

This repository contains the source code to reproduce the analysis from "Deep learning recognises antibiotic mode of action from brightfield images".

Installation

Install dependencies within conda environment

First, create and activate a conda environment:

conda create -n ai4ab_env python=3.9
conda activate ai4ab_env

Then, clone this repository and install the required dependencies:

git clone https://github.com/krentzd/ai4ab.git
cd ai4ab
pip install torch==1.12.1 torchvision==0.13.1  # Install PyTorch
pip install -r requirements.txt

Singularity image

Alternatively, you can build a Singularity image using the provided recipe as follows:

git clone https://github.com/krentzd/ai4ab.git
cd ai4ab
singularity build ai4ab.sif ai4ab.def

After building the Singularity image, you can run the training and testing scripts directly from the terminal:

singularity exec --bind PATH_TO_AI4AB:PATH_TO_AI4AB --nv ai4ab.sif python model/run_training.py --data_dir DATA_DIR --save_dir SAVE_DIR --train_dir TRAIN_DIR --test_dir TEST_DIR

# or

singularity exec --bind PATH_TO_AI4AB:PATH_TO_AI4AB --nv ai4ab.sif python model/run_testing.py --save_dir SAVE_DIR

Usage

Dataset preparation

Your dataset must follow this folder structure:

DATA_DIR
├── Plate_1
│   ├── Compound_1_Concentration_1
│   │   ├── img_1.tiff
│   │   ├── img_2.tiff
│   │   └── ...
│   ├── ...
│   └── Compound_N_Concentration_M
├── ...
└── Plate_K
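A quick way to sanity-check this layout before preprocessing is a small validation script. The `validate_dataset` helper below is illustrative and not part of this repository; it only assumes the Plate/Condition/TIFF nesting shown above:

```python
from pathlib import Path
import tempfile

def validate_dataset(data_dir):
    """Check that data_dir follows DATA_DIR/Plate_*/Compound_Concentration/*.tiff."""
    data_dir = Path(data_dir)
    plates = [p for p in data_dir.iterdir() if p.is_dir()]
    assert plates, f"No plate folders found in {data_dir}"
    for plate in plates:
        conditions = [c for c in plate.iterdir() if c.is_dir()]
        assert conditions, f"No condition folders in {plate}"
        for cond in conditions:
            tiffs = list(cond.glob("*.tiff")) + list(cond.glob("*.tif"))
            assert tiffs, f"No TIFF images in {cond}"
    return len(plates)

# Build a tiny example tree and validate it
root = Path(tempfile.mkdtemp())
img_dir = root / "Plate_1" / "Compound_1_Concentration_1"
img_dir.mkdir(parents=True)
(img_dir / "img_1.tiff").touch()
n_plates = validate_dataset(root)
print(n_plates)  # 1
```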

Use the following script to preprocess a TIFF dataset acquired on a Revvity Opera Phenix high-content screening system into the folder structure described above:

# IMAGE_DIR must contain the TIFF files and an Index.xml file
# PLATE_MAP.csv must contain 'cond' and 'Destination well' columns
python preprocessing/dataset_preprocessing.py \
    --im_dir IMAGE_DIR \
    --target_dir DATA_DIR \
    --plate_map_path PLATE_MAP.csv
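For reference, a minimal plate map with the two required columns can be generated with the standard library. The compound names and well IDs below are made up for illustration:

```python
import csv
import io

# Each row maps a condition (compound + concentration) to a well on the plate
rows = [
    {"cond": "Compound_1_Concentration_1", "Destination well": "B2"},
    {"cond": "Compound_2_Concentration_1", "Destination well": "B3"},
    {"cond": "Untreated", "Destination well": "B4"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["cond", "Destination well"])
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```

Writing `csv_text` to `PLATE_MAP.csv` yields a file the preprocessing script can consume, assuming your conditions and wells replace the placeholders.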

Model training

To train a model from scratch, run the following command within your conda environment:

python model/run_training.py \
    --data_dir DATA_DIR \
    --save_dir SAVE_DIR \
    --train_dir Plate_1 Plate_2 \
    --test_dir Plate_N
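Since `--train_dir` and `--test_dir` split the data at the plate level, a leave-one-plate-out evaluation can be scripted by rotating the held-out plate. This sketch is an assumption based on the model replicates described below (different training/testing plates), not a utility shipped with this repository:

```python
# Leave-one-plate-out splits: each plate serves as the test set once,
# while the remaining plates form the training set.
plates = ["Plate_1", "Plate_2", "Plate_3"]

splits = [
    {"train_dir": [p for p in plates if p != test_plate], "test_dir": [test_plate]}
    for test_plate in plates
]

# Print the argument pairs to pass to model/run_training.py
for s in splits:
    print("--train_dir", *s["train_dir"], "--test_dir", *s["test_dir"])
```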

Model testing

To test the model on the plate(s) specified via test_dir, run the following command within your conda environment:

# --ckpt -1 selects the checkpoint with the lowest validation loss
python model/run_testing.py \
    --save_dir SAVE_DIR \
    --ckpt -1

Inference on a different dataset

To obtain embeddings and predictions on a different dataset, run the following command within your conda environment:

# DATA_DIR is the path to the data directory
# test_dir specifies which plate(s) should be tested
# --ckpt -1 selects the checkpoint with the lowest validation loss
python model/run_testing.py \
    --save_dir SAVE_DIR \
    --data_dir DATA_DIR \
    --test_dir Plate_1 Plate_2 \
    --ckpt -1

Tracking model training

You can track model training with Tensorboard by running:

tensorboard --logdir=SAVE_DIR/tensorboard

Loading a pretrained model

Clone this repository with git clone https://github.com/krentzd/ai4ab.git and navigate to the model folder. You can then reuse a model trained on images of drug-treated E. coli bacteria like this:

import torch
from models import AvgPoolCNN
from data import get_test_transforms

model = AvgPoolCNN.from_pretrained()

img_tensor = torch.rand(1, 2160, 2160)              # Image of size (C, H, W)
input_tensor = get_test_transforms()(img_tensor).unsqueeze(0) # Tensor of size (batch_size, n_crops, C, H, W)

pred = model.predict(input_tensor)                  # Input: (batch_size, n_crops, C, H, W) → Output: Predicted class as list of str
feat_vecs = model.feat_vecs(input_tensor)           # Output: feature vector as numpy.ndarray with dimensions (batch_size, 1280) 

Specific models trained on different sets of plates, imaging channels, or bacterial species can be loaded from Hugging Face like this:

model = AvgPoolCNN.from_pretrained(
            species='Ecoli',                       # Select bacterial species
            channels=['BF', 'Hoechst'],            # Select imaging channels
            replicate=2,                           # Model replicate (different training/testing plates)
        )

Reproducing figures from the manuscript

  1. Download embedding data here
  2. Unzip file and move embedding data to directory DATA in ai4ab
  3. Run analysis notebooks in the analysis folder

How to cite

@article{krentzel2025deep,
  title={Deep learning recognises antibiotic modes of action from brightfield images},
  author={Krentzel, Daniel and Kho, Kelvin and Petit, Julienne and Mahtal, Nassim and Chiaravalli, Jeanne and Shorte, Spencer L and Wehenkel, Anne Marie and Boneca, Ivo G and Zimmer, Christophe},
  journal={bioRxiv},
  pages={2025--03},
  year={2025},
  publisher={Cold Spring Harbor Laboratory}
}
