UCL-CHME0039-24-25/blender_colonoscopy

Project 3: Using synthetic surgical data from Blender to train a polyp detection model

Project owner: Ruaridh Gollifer

Blender Randomiser Add-on tutorials on Moodle

These are the areas to focus on for the Blender part of the project:

Example of how to render data using Blender by running the run_RandomiserBlender.sh script:

bash run_RandomiserBlender.sh -blend ../../datasets/blend_files/colon.blend -json_in ../input_files/input_bounds_hackathon.json -json_out ../output_files/test_docker.json -seed 32 -frame 10 -basename ../output_files/docker_output_frame -render_main
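For batch rendering, the invocation above can be wrapped in a small Python helper. The flags mirror the example command; the helper functions themselves (`build_render_command`, `render`) are an illustrative sketch and assume the script is run from the same directory as `run_RandomiserBlender.sh`, with Blender 3.4.1 installed.

```python
import subprocess

def build_render_command(blend_file, json_in, json_out, basename,
                         seed=32, frame=10):
    # Assemble the argument list for run_RandomiserBlender.sh
    # (flags taken from the example invocation above).
    return [
        "bash", "run_RandomiserBlender.sh",
        "-blend", blend_file,
        "-json_in", json_in,
        "-json_out", json_out,
        "-seed", str(seed),
        "-frame", str(frame),
        "-basename", basename,
        "-render_main",
    ]

def render(blend_file, json_in, json_out, basename, seed=32, frame=10):
    # Run the render headlessly; raises CalledProcessError if Blender fails.
    subprocess.run(
        build_render_command(blend_file, json_in, json_out, basename,
                             seed, frame),
        check=True,
    )
```

Looping `render` over a range of seeds and frames is one way to produce a dataset of varied renders from a single blend file.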

Project details

Image Guided Surgery (IGS) researchers use machine learning in some form (registration, segmentation, stereo reconstruction, classification, etc.); polyp detection in image-guided colonoscopy (project 3) is one example. However, there is often limited real data available to train machine learning models in IGS research, and any large dataset would require extensive, time-consuming manual labelling by a trained clinician [1].

One potential solution to this clinical challenge is synthetic data generation, which can provide large amounts of varied, realistic ground truth data that would be very challenging to produce from real data, or would require extensive, time-consuming manual labelling. Blender is free and open-source software for 3D geometry modelling and rendering, and one of its uses is generating synthetic datasets. By creating large amounts of synthetic but realistic data, we can improve model performance on tasks such as polyp detection in image-guided colonoscopy. Synthetic data generation has further advantages: tools like Blender give us more control, and we can generate a variety of labelled ground truth data, from segmentation masks to optic flow fields. We can also often scale up synthetic datasets easily by randomising parameters of the modelled 3D geometry [2].
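The randomisation idea can be sketched in plain Python: sample each scene parameter from a bounded range and emit one configuration per dataset sample. The parameter names and bounds below are hypothetical; the real schema expected by the Randomiser add-on (e.g. input_bounds_hackathon.json) may differ.

```python
import json
import random

# Hypothetical parameter bounds for illustration only; the add-on's
# actual input JSON schema may use different names and structure.
BOUNDS = {
    "camera_focal_length_mm": (18.0, 35.0),
    "light_energy_w": (50.0, 200.0),
    "polyp_scale": (0.5, 1.5),
}

def sample_scene_parameters(seed):
    """Draw one reproducible, randomised set of scene parameters."""
    rng = random.Random(seed)
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}

if __name__ == "__main__":
    # One config per seed; a renderer could consume each in turn.
    print(json.dumps(sample_scene_parameters(seed=32), indent=2))
```

Seeding the generator makes every sample reproducible, which matters when you need to regenerate a specific frame or debug a rendering artefact.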

The synthetic data provided may need additional modification (geometry, texture, lighting, and camera settings) to make it more realistic. The challenge is to generate synthetic data that is realistic enough to be useful for training models, while also generating enough data to train the models effectively. A Blender add-on (or plug-in) developed at UCL in 2023, in a collaboration between ARC and WEISS, for the purpose of data generation will be the starting point for the project, with the aim of checking data quality and pre-training models for polyp detection, evaluated on open-source datasets.

Learning objectives

  1. Learn to practice agile software development in research.
  2. Learn to apply ML techniques using synthetic data.

Learning components

  1. Facilitating collaborative research through agile methodologies.
  2. Using GitHub workflows to follow best practices in research software engineering.
  3. Using Python for data manipulation, machine learning, and AI applications.
  4. Using Python for generating synthetic data with Blender.

Expected outcomes

  • Deliver a project consisting of four stages, and gain experience working with both synthetic data and open-source real-world data. The model evaluation phase will incorporate benchmarks for pre-trained models (YOLOv7), and hence ways to improve models with synthetic data.

Project stages

The project stages are: prerequisites (data, software, and hardware), model selection, training and evaluation, interface development, and presentation.

Prerequisites

  • Skills: Students need to be comfortable with (or willing to self-teach) the following: Git for version control and Python for data manipulation and machine learning. Blender may also be required for generating synthetic data, which can be done through the Blender interface or through scripting in Python.

  • Data request requirements: This project requires access to publicly available synthetic surgical data generated with Blender, including synthetic colonoscopy and laparoscopic data that can be modified in Blender to produce more realistic data. The datasets and their sizes are:

  • blender.zip (~61 MB) contains colon.blend files that can be loaded into Blender and modified
  • polyps.zip (~11.56 GB) contains training data as .png images
  • examples.blend (~15.67 MB) contains .avi videos of colonoscopy and laparoscopy data

Open-source real world datasets are also required for evaluation.

  • Data preparation and cleaning:
    The synthetic data provided by Blender needs to be rendered in .png format, and for the polyp detection task two files must be generated per video frame:
  1. The full rendered colonoscopy image with polyps
  2. The segmentation mask of the polyps
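Once frames are rendered, it is worth verifying that every image has its matching mask before training. The sketch below pairs the two files per frame; the suffix naming convention (`_img.png` / `_mask.png`) is an assumption, so adapt it to however your renders are named.

```python
from pathlib import Path

def pair_frames(render_dir, image_suffix="_img.png", mask_suffix="_mask.png"):
    """Match each rendered colonoscopy image with its polyp segmentation mask.

    Returns (pairs, missing): pairs where both files exist, and image/mask
    tuples where the mask is absent. Suffixes are assumed, not prescribed.
    """
    render_dir = Path(render_dir)
    pairs, missing = [], []
    for img in sorted(render_dir.glob(f"*{image_suffix}")):
        mask = img.with_name(img.name.replace(image_suffix, mask_suffix))
        (pairs if mask.exists() else missing).append((img, mask))
    return pairs, missing
```

Frames that land in `missing` should be re-rendered or excluded, since a detection model cannot learn from an image without its ground-truth mask.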

Blender version 3.4.1 needs to be downloaded and installed on your laptop according to your operating system, i.e. Windows 8.1, 10, or 11 (64-bit required), macOS 10.13 or later (macos-arm64.dmg for Apple Silicon M1/M2/M3 and macos-x64.dmg for Intel Macs), or any modern 64-bit Linux distribution.

You may want to download the latest long-term support version of Blender (3.6 LTS) or the latest release (4.3). However, these versions may not be compatible with the Blender Randomiser add-on and the baseline synthetic data provided. If students want to test these more up-to-date versions, they can report any issues in the Blender Randomiser add-on repository.

Additionally, the Blender Randomiser plug-in should be installed if you need to modify data; it also requires Blender version 3.4.1.

NOTE: Here are some notebooks and instructions that serve as a great starting point for preparing model fine-tuning and evaluation for polyp detection.

  • Hardware and infrastructure specifications:
    • For laptops without a GPU, consider using Google Colab's free service as an alternative. Change runtime by going to Edit > Notebook settings > T4 GPU, which provides a Tesla T4 GPU with 15.0 GB of memory, 12.7 GB of system RAM, and 112.6 GB of disk. See details using !nvidia-smi. NOTE: "In the version of Colab that is free of charge notebooks can run for at most 12 hours, depending on availability and your usage patterns." (Google Colab documentation)
    • For laptops equipped with a GPU and CUDA drivers:
      • Ensure you have sufficient hard drive space to store data and models.
      • Confirm that your GPU has the necessary CUDA drivers installed.

Hardware Requirements for Blender:

  • Minimum Requirements:
    • CPU: Dual-core 64-bit processor (Intel or AMD).
    • RAM: 4 GB (8 GB recommended).
    • GPU: Integrated graphics or a discrete GPU with 1 GB VRAM.
    • Disk Space: Approximately 500 MB for installation; additional space required for projects and assets.
  • Recommended Requirements for Smooth Performance:
    • CPU: Quad-core 64-bit processor.
    • RAM: 16 GB or more.
    • CPU rendering (no suitable GPU): Blender can use your CPU for rendering, but it will be slower.
    • GPU (GPU rendering):
      • NVIDIA: GeForce GTX 10xx series or newer, CUDA compute capability 3.0 or higher.
      • AMD: RDNA architecture or newer.
      • macOS: Apple Silicon (M1/M2) or newer with Metal support.
    • Disk Space: 20+ GB for large projects and libraries.
  • High-End Recommendations (Heavy Workloads like Simulations or Cycles Rendering):
    • 32+ GB RAM, NVIDIA RTX-series GPUs, and NVMe SSD for storage.

Model training and model evaluation

  1. Pre-train a YOLOv7 model using synthetic data
  2. Evaluate the metrics reported by YOLOv7, i.e. precision, recall, and mean average precision (mAP), on real-world datasets
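To make the evaluation metrics concrete, here is a minimal sketch of precision, recall, and average precision from detection counts. This is a simplified single-class illustration, not YOLOv7's actual evaluation code (which matches predictions to ground truth at IoU thresholds before computing these quantities).

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts
    at a fixed confidence threshold."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def average_precision(recalls, precisions):
    """Area under the monotone-interpolated precision-recall curve.
    mAP is the mean of this value over classes (here: just 'polyp')."""
    pts = sorted(zip(recalls, precisions))
    rs = [0.0] + [r for r, _ in pts]
    ps = [0.0] + [p for _, p in pts]
    # Precision envelope: replace each precision with the max to its right.
    for i in range(len(ps) - 2, -1, -1):
        ps[i] = max(ps[i], ps[i + 1])
    # Integrate precision over recall.
    return sum((rs[i] - rs[i - 1]) * ps[i] for i in range(1, len(rs)))
```

Sweeping the confidence threshold yields one (recall, precision) point per threshold, and `average_precision` integrates over those points.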

Interface development

To present reporting results, we recommend developing a Python-based interface using either Streamlit for a web-based solution or a simple command-line interface with Click or another suitable tool.
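As one concrete starting point, the command-line route can be sketched with the standard library's argparse (standing in for Click here); the metrics-file format is an assumption for illustration.

```python
import argparse
import json

def build_parser():
    """CLI for reporting results; argparse is used as the 'other suitable
    tool' mentioned above, and is easily swapped for Click or Streamlit."""
    parser = argparse.ArgumentParser(
        description="Report polyp-detection metrics.")
    parser.add_argument("results_json",
                        help='JSON file of metrics, e.g. {"mAP": 0.62}')
    parser.add_argument("--metric", default="mAP",
                        help="which metric to print (default: mAP)")
    return parser

def main(argv=None):
    args = build_parser().parse_args(argv)
    with open(args.results_json) as f:
        metrics = json.load(f)
    print(f"{args.metric}: {metrics[args.metric]:.3f}")

if __name__ == "__main__":
    main()
```

The same `metrics` dictionary could equally be rendered in a Streamlit page for the web-based option.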

Team members and their roles

Roles for each stage will rotate based on each student's expertise, with students encouraged to propose their own roles and rotation schedule before starting the project. This will allow students to take on the positions of Project Manager, Clinical Lead / Principal Investigator (PI), Data Scientist(s), and AI/ML Engineer(s).

Presentation and report

Each group will make a group presentation and each student will write an individual report.

Group work and presentation (20%)

At the end of the project, you will be asked to deliver a 15-minute group presentation assessing the project.

Written Report (80%)

A 1,500-word individual report documenting the project, reporting the results and the individual's contribution.

Assignment submission deadline for the Written Report: 30 April 2025, 16:00 BST.

Allocation of marks

As a general guide, marks for the presentation will be allocated according to the following weighting:

Clear statement of goals 4%

Appropriate use of algorithms and tools 4%

Clear statement of results 4%

Evidence of professional approach 4%

Clarity of presentation 4%

As a general guide, marks for the report will be allocated according to the following weighting:

Problem Statement: Background and review 15%

Detailed Project Plan 20%

Summary of Testing, Results 20%

Conclusions & key lessons 15%

Presentation 10%

References

  1. Dowrick, Thomas, Long Chen, João Ramalhinho, Juana González-Bueno Puyal, and Matthew J. Clarkson. "Procedurally generated colonoscopy and laparoscopy data for improved model training performance." In MICCAI Workshop on Data Engineering in Medical Imaging, pp. 67-77. Cham: Springer Nature Switzerland, 2023.

  2. Gollifer, Ruaridh and Minano, Sofia. "Randomising Blender scene properties for semi-automated data generation." Centre for Advanced Research Computing Blogpost, 2023

  3. Blender

  4. Blender Randomiser add-on

  5. Blender synthetic datasets
