The Blackbird is a high-throughput phenomics imaging platform developed through a collaboration of scientists and engineers at Cornell AgriTech, the USDA-ARS Grape Genetics Research Unit (GGRU), and Moblanc Robotics. Most code/scripts in this repository build on Tian Qiu's Grape PM Saliency mapping repository (used for this paper).
This repo is still in progress, as I'm actively improving our code and models; in the meantime, feel free to email me with any questions or clarifications: wisemami@oregonstate.edu
The code in this repository primarily uses PyTorch pretrained models to train classifiers and subsequently make inferences on leaf disks with or without powdery mildew.
Overview of the training and inference process:
CUDA is required for GPU usage; it is only available on machines with an NVIDIA GPU. Please check your GPU to figure out which CUDA version you need. If running on Apple Silicon, MPS is necessary to take advantage of accelerated PyTorch.
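The usual device-selection pattern covering all three cases (CUDA, MPS, CPU fallback) looks like this; a minimal sketch, separate from however the repo's own scripts pick their device:

```python
import torch

# Pick the fastest available backend: NVIDIA CUDA, Apple Silicon MPS, or CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

print(f"Using device: {device}")
```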
Package Requirements:
To install the required packages via conda, simply run `conda env create -f environment.yml` and then `conda activate mildewVision` to activate the environment.
If running on Google Colab, select a GPU runtime (preferably an A100 or better when training) and run `!pip install optuna==3.1.0 termcolor`; the other required packages should already be installed (as of 11/25/2025).
To train your own model, you need:

- A labeled image patch dataset to build the necessary train/test/val `.hdf5` files.
  - You can make image patches using `code/preprocessing/makePatches.py`. It's easiest to sort these patches into different directories according to the label (e.g., infected patches go in the `infected` directory, uninfected patches in the `healthy` directory).
  - For subsequent models, I would make patches using the `--save_infected`, `--save_healthy`, or `--save_discarded` flags when running `plot_leaf_sal.py`; this way I could correct previously misclassified patches and add them to my new dataset, in hopes the next model iteration would learn the features better.
  - You can then make train/test/val `.hdf5` files (or k-fold splits) using `code/preprocessing/images_to_test_train_hdf5.py`.
- Per-channel mean RGB values for your train/test/val sets. Determine them using `code/preprocessing/get_mean_std.py` and plug them into your `code/scripts/train.sh` script under `--means` and `--stds` (super important: this dramatically affects your model performance).
- Customized training parameters, such as the model, learning rate, etc., within the `code/scripts/train.sh` script. See the argparse section in `code/classification/run.py` for the full list of customizable variables.
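The per-channel normalization statistics computed by `get_mean_std.py` boil down to the following idea; this is a minimal sketch (not the repo's actual implementation), with random arrays standing in for real image patches:

```python
import numpy as np

def channel_mean_std(images):
    """Per-channel mean/std over a list of HxWx3 uint8 arrays,
    scaled to [0, 1] as PyTorch normalization transforms expect."""
    pixels = np.concatenate([img.reshape(-1, 3) for img in images], axis=0) / 255.0
    return pixels.mean(axis=0), pixels.std(axis=0)

# Synthetic example patches; replace with your real patch dataset.
rng = np.random.default_rng(0)
patches = [rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8) for _ in range(4)]
means, stds = channel_mean_std(patches)
print(means.round(3), stds.round(3))
```

The resulting three means and three stds are what you would paste into `--means` and `--stds` in `train.sh`.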
Note: You can start with the default values, but your model will likely perform better if you try different base models and hyperparameter values (e.g., by using Optuna hyperparameter optimization). Always cross-validate and test to make sure you're not overfitting, though.
Once you have downloaded our example two-class hop powdery mildew model to `results/ResNet_Feb14_15-53-04_2024`, you should be able to activate the conda environment (`conda activate mildewVision`) and run either `bash ./code/scripts/plot_leaf_correlation_all.sh` or `bash ./code/scripts/plot_leaf_sal_map.sh` as a minimal working example.
Once that's working, you can customize the argparse arguments in the `plot_leaf_correlation_all.sh` bash script to run inference on multiple datasets in parallel (adjust the `MAX_JOBS` parameter according to your computational resources). In the example here, I have included commands for calling either `plot_leaf_sal_map.py` or `leaf_correlation.py`. Both scripts return the same `.csv` file that provides metadata about your run parameters, disease severity estimates, saliency metrics, etc., but `plot_leaf_sal_map.py` also returns visual outputs of patch disease severity as well as saliency maps (if you include the optional saliency flags; see the example below). If you are running standard inference, you may opt to call the `code/leaf_correlation.py` script instead, as it runs 5-10x faster.
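The `MAX_JOBS` throttling pattern used by the bash script can be sketched as below. The dataset names and the echoed command are placeholders; substitute your real inference call (e.g., `python code/leaf_correlation.py` with your own arguments):

```shell
#!/usr/bin/env bash
# Sketch of MAX_JOBS job throttling (requires bash >= 4.3 for `wait -n`).
MAX_JOBS=2
datasets=(datasetA datasetB datasetC datasetD)   # placeholder dataset names

for ds in "${datasets[@]}"; do
  # Block until a slot frees up, so at most MAX_JOBS inferences run at once.
  while [ "$(jobs -rp | wc -l)" -ge "$MAX_JOBS" ]; do
    wait -n
  done
  ( echo "processed $ds" ) &   # stand-in for the real inference command
done
wait   # let the last background jobs finish
```

Raising `MAX_JOBS` only helps up to the point where GPU memory or CPU cores become the bottleneck.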
Example raw and DeepLIFT saliency map (`--sal_deeplift`) output from `plot_leaf_sal_map.py`:

Coming soon...
Coming soon...
1 cm leaf disks were excised using ethanol-disinfested leather punches and subsequently arrayed adaxial side up on 1% water agar plates. Image acquisition was performed using the Blackbird CNC Imaging Robot (version 1, "Blackbird-Green", developed by Cornell University, the USDA-ARS Grape Genetics Research Unit, and Moblanc Robotics). The Blackbird is a G-code-driven CNC that positions a Nikon Z 7II mirrorless camera, equipped with a 2.5x zoom ultra-macro lens (Venus Optics Laowa 25mm), at each X/Y position; the camera then captures a z-stack, with one image every 200 µm of Z-height. Blackbird datasheets can be prepared using the `generateBlackbirdDatasheet.py` script. The image stacking process is automated using the `stackPhotosParallel.py` Python script. Helicon Focus software (Helicon Software, version 8.1) was used to perform the focus stacking, with the parameters set to method B (depth map radius: 1, smoothing radius: 4, sharpness: 2).
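The z-stack geometry above implies the camera visits evenly spaced Z positions at each X/Y stop. A small sketch of that arithmetic (the function name and the 10-11 mm focus range are illustrative, not from the repo):

```python
def zstack_positions(z_start_mm, z_end_mm, step_um=200):
    """Z positions (in mm) for a focus stack captured every `step_um` micrometers."""
    step_mm = step_um / 1000.0
    n_slices = int(round((z_end_mm - z_start_mm) / step_mm)) + 1
    return [round(z_start_mm + i * step_mm, 3) for i in range(n_slices)]

# A 1 mm focus range at the Blackbird's 200 µm step yields 6 slices:
print(zstack_positions(10.0, 11.0))
```

Helicon Focus then collapses each such stack of slices into a single all-in-focus image.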
Example images can be viewed here. Models, images, and training data to be released with manuscript.

