This repository contains fine-tuned YOLO models for detecting vials in images and is part of the interface detection program in the autoHSP project. The models are trained on a custom dataset of vials with two object categories: `vial` and `vial_body`. Check the releases for more details.
This repository also includes a Flask app to host an API for detecting vials in images. The app can be run locally or deployed on a server.
See the autoHSP project for citation information.
```
.
├── datasets/                 # custom vial dataset, not shared
├── runs/                     # the pre-trained model weights
│   ├── README.md
│   ├── yolov8n.pt            # needs download
│   └── yolo{...}.pt          # needs download
├── templates/                # static HTML files for the Flask app
├── vial_detection/           # the main code for vial detection
│   ├── __init__.py
│   └── detect.py             # the main detection code
├── .gitignore
├── download.py               # script to download models to `runs/`
├── Flask_utils.py            # utility functions for the Flask app
├── Flask.py                  # the Flask app
├── info.py                   # configuration for the Flask app
├── LICENSE
├── README.md
├── requirements.txt          # Python dependencies
├── run.sh                    # script to run the Flask app (API hosting)
├── train_vial_detection.py   # train script (single model)
├── train_vial_detection.sh   # train script (multiple models)
└── webcam_vial_detection.py  # script to run a live webcam detection
```
Python 3.12 is recommended for this project. The following instructions assume you have `conda` installed; other Python package managers will also work.
- Clone the repository:

  ```bash
  git clone https://github.com/SijieFu/vial-detection.git
  cd vial-detection
  ```

- Create a new conda environment:

  ```bash
  conda create -n vial-detection python==3.12.10 -y
  conda activate vial-detection
  ```

- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```

- Download the pre-trained YOLO weights:

  ```bash
  pip install gdown==5.2.0
  python download.py --model yolov8n
  ```

  Use `python download.py --help` to see all available options for downloading models, and `python download.py --all` to download all available models. If you have previously downloaded the resources, you can skip this step. Otherwise, you may need to use `--overwrite` to overwrite the existing files.
The function `vial_detection.detect.detect_vials` is the main function for detecting vials in images. It returns the results as a dictionary:

```python
{
    "vials": [
        {
            "xyxy": [x1, y1, x2, y2],  # bounding box coordinates
            "confidence": 0.95,  # confidence score
            "name": "vial",  # class name
        },
        ...
    ],
    "nvials": 2,  # number of vials detected
    "xyxy": [x1, y1, x2, y2],  # bounding box to include all vials
    "dimensions": [height, width],  # dimensions of the image
    "parameters": {
        "model": "yolov8n.pt",  # model used for detection
        "confidence": 0.7,  # confidence threshold
        ...
    },
}
```
To host the detection function as an API, you can use the provided Flask app in `Flask.py`. The app is configured to run on port 5002 by default. The `run.sh` script can be used to start the Flask app:

```bash
bash run.sh  # or "./run.sh"
```
POST requests can be made to the root endpoint (`/`) with the following format:

- **Endpoint**: `POST http://localhost:5002/`
- **Headers**: `Content-Type: multipart/form-data`
- **Body**: should include the following fields:
  - File upload
    - Key: `file`
    - Value: the image file to be processed (e.g., `image.jpg`)
  - Optional parameters
    - `filename`: custom filename for the uploaded image.
    - `returnjson`: whether to respond with JSON data. Default is `True`.
    - `returnimage`: whether to return the analyzed/annotated image. If `returnjson` is `False` and `returnimage` is `True`, the response will be an annotated image in base64 format (e.g., `image/png`).
    - `weights`: the model weights to use for detection. Default is `yolov8n.pt`.
    - `conf`: confidence threshold for detection. Default is `0.7`.
    - `iou`: IoU threshold for non-max suppression. Default is `0.5`.
    - `max_det`: maximum number of detections to return. Default is `300`.
When `returnjson` is `True`, the response will be a JSON object containing the detection results, similar to the output of `detect_vials`. The structure of the JSON response is as follows:

```python
{
    "vials": [
        {
            "xyxy": [x1, y1, x2, y2],
            "confidence": 0.95,
            "name": "vial",
        },
        ...
    ],
    "nvials": 2,
    "xyxy": [x1, y1, x2, y2],
    "dimensions": [height, width],
    "parameters": {
        "model": "yolov8n.pt",
        "confidence": 0.7,
        ...
    },
    "md5_hash": "abc123",  # a custom MD5 hash without EXIF data
    "annotated_image": {
        "file": "base64_encoded_image_data",
        "filename": "annotated_image.jpg",
        "filetype": "image/jpg",
        "mimetype": "image/jpg",
    },  # optional, only if returnimage is True
}
```
`templates/img_upload_get.html` is a simple HTML form to upload images and test the API. You can access it by navigating to `http://localhost:5002/` in your web browser after starting the Flask app.
To run a live webcam detection, you can use the script `webcam_vial_detection.py`. This script captures video from your webcam and performs real-time vial detection. Run `python webcam_vial_detection.py --help` to see the available options.

```bash
python webcam_vial_detection.py --model yolov8n --webcam 0
```

If you are testing the script in WSL2, you need to install `usbipd` in Windows and attach the webcam to WSL2. You might also need to rebuild the WSL2 kernel with additional driver support. See the guide here.
The custom vial dataset for fine-tuning the YOLO models is not shared in this repository. To fine-tune models yourself, you can use the script `train_vial_detection.py` for a single model or `train_vial_detection.sh` for multiple models.

You will need to prepare your own dataset in the YOLO format. For example, the dataset structure should look like this:
```
datasets/  # under the repository root
├── images/
│   ├── train/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   └── val/
│       ├── image1.jpg
│       ├── image2.jpg
│       └── ...
├── labels/
│   ├── train/
│   │   ├── image1.txt
│   │   ├── image2.txt
│   │   └── ...
│   └── val/
│       ├── image1.txt
│       ├── image2.txt
│       └── ...
└── vial.yaml  # dataset configuration file
```
In the `vial.yaml` file, you need to specify the paths to the training and validation images and labels, as well as the class names. For example:

```yaml
names:
  0: vial
  1: vial_body
nc: 2
path: .
train: ./images/train
val: ./images/val
```
The dataset only contains images of vials in relatively controlled environments, such as a lab or a photo booth. The detections are not limited to a specific type of vial (e.g., glass scintillation vials), but rather to vials/small cylindrical containers in general.
False positives may occur since the model is trained on a limited dataset; the live webcam demo is likely to show some. For deployment, tune the confidence threshold and the maximum number of detections to reduce false positives.
This project is licensed under the AGPL-3.0 License, in accordance with Ultralytics' license. See the LICENSE file for details.