A hands-on Jupyter notebook tutorial for machine-learning-based image processing of electrochemical and scientific microscopy data, developed for the DPG 2026 AKPIK session, Dresden, March 2026.
- Introduction
- Credits / Contributors
- References
- Tutorial Notebooks
- Dependencies
- Getting Started
- License
## Introduction

DPG2026 is a tutorial repository built around Jupyter notebooks, covering the full machine-learning pipeline for scientific image analysis — from raw data to segmented results.
The tutorial is structured around three main themes:
- Image Preprocessing — normalization, denoising, contrast adjustment, edge detection, binarization, and morphological operations.
- Synthetic Image Generation — classical augmentation, physics-based synthesis, DCGAN-based generation, and Stable Diffusion-based image-to-image synthesis.
- Image Segmentation — U-Net training and inference, Segment Anything Model (SAM v1 & v2), NASA MicroNet, and particle tracking with TrackPy.
The tutorial targets electrochemical imaging applications (e.g., SEM, EBC, battery-material microscopy) but the methods generalize to any scientific imaging domain.
Event: DPG 2026, AKPIK session — Dresden, March 2026.
## Credits / Contributors

| Name | Affiliation | Contact |
|---|---|---|
| Amir Omidvarnia | Forschungszentrum Jülich | a.omidvarnia@fz-juelich.de |
| Simone Koecher | Forschungszentrum Jülich | s.koecher@fz-juelich.de |
| Mobina Azimi | Forschungszentrum Jülich | m.azimi@fz-juelich.de |
## References

- Kaggle Microscopy / SEM Datasets — https://www.kaggle.com/datasets
- NASA Pretrained Microscopy Models (MicroNet) — https://github.com/nasa/pretrained-microscopy-models
  M. Peirce et al., "Pretrained Microscopy Models," NASA, 2022. Available: https://github.com/nasa/pretrained-microscopy-models
- PyTorch — https://pytorch.org/
  A. Paszke et al., "PyTorch: An Imperative Style, High-Performance Deep Learning Library," Advances in Neural Information Processing Systems, vol. 32, 2019.
- TensorFlow / Keras — https://www.tensorflow.org/
  M. Abadi et al., "TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems," 2015. [Online]. Available: https://www.tensorflow.org/
- Segmentation Models PyTorch — https://github.com/qubvel-org/segmentation_models.pytorch
  P. Yakubovskiy, "Segmentation Models PyTorch," GitHub, 2020. Available: https://github.com/qubvel-org/segmentation_models.pytorch
- Segment Anything Model (SAM v1) — https://github.com/facebookresearch/segment-anything
  A. Kirillov et al., "Segment Anything," in Proc. IEEE/CVF ICCV, pp. 4015–4026, 2023. DOI: 10.1109/ICCV51070.2023.00371
- SAM 2 — https://github.com/facebookresearch/sam2
  N. Ravi et al., "SAM 2: Segment Anything in Images and Videos," arXiv preprint, 2024. DOI: 10.48550/arXiv.2408.00714
- OpenCV — https://opencv.org/
  G. Bradski, "The OpenCV Library," Dr. Dobb's Journal of Software Tools, 2000. Available: https://opencv.org/
- scikit-image — https://scikit-image.org/
  S. van der Walt et al., "scikit-image: Image processing in Python," PeerJ, vol. 2, p. e453, 2014. DOI: 10.7717/peerj.453
- TrackPy — https://github.com/soft-matter/trackpy
  D. B. Allan et al., "soft-matter/trackpy: Fast, Flexible Particle-Tracking Toolkit," Zenodo, 2021. DOI: 10.5281/zenodo.4682814
- Diffusers (Stable Diffusion) — https://huggingface.co/docs/diffusers
  P. von Platen et al., "Diffusers: State-of-the-art diffusion models," GitHub, 2022. Available: https://github.com/huggingface/diffusers
- IOPaint — https://github.com/Sanster/IOPaint
## Tutorial Notebooks

The tutorial is split into three notebook groups, intended to be followed in order:
### 1. Image Preprocessing

| Notebook | Description |
|---|---|
| `preprocessing_basics.ipynb` | Fundamentals of digital image processing: normalization, denoising, contrast enhancement, edge detection, binarization, morphology, and connected-component labeling. |
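To make the preprocessing chain concrete, here is a minimal sketch using only NumPy and SciPy (the notebook itself works with OpenCV and scikit-image). The `preprocess` helper, the fixed threshold, and the synthetic test frame are illustrative assumptions, not the notebook's actual code:

```python
import numpy as np
from scipy import ndimage

def preprocess(image, threshold=0.5):
    """Normalize to [0, 1], binarize, and label connected components."""
    img = image.astype(np.float64)
    value_range = img.max() - img.min()
    # Min-max normalization; guard against a constant image
    norm = (img - img.min()) / value_range if value_range > 0 else np.zeros_like(img)
    # Global binarization with a fixed threshold (the notebook also covers
    # adaptive and Otsu thresholding)
    binary = norm > threshold
    # Connected-component labeling with 8-connectivity
    labels, n_components = ndimage.label(binary, structure=np.ones((3, 3)))
    return norm, binary, labels, n_components

# Synthetic test frame: two bright square "particles" on a dark background
frame = np.zeros((32, 32))
frame[4:10, 4:10] = 200.0
frame[20:28, 18:26] = 255.0
norm, binary, labels, n_components = preprocess(frame)
print(n_components)  # → 2
```

The same normalize → binarize → label pattern recurs throughout the later notebooks, with the library-specific denoising and morphology steps slotted in between.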
### 2. Synthetic Image Generation

| Notebook | Description |
|---|---|
| `example_Aug.ipynb` | Classical augmentation: geometric and intensity transforms to expand labeled datasets. |
| `example_PB.ipynb` | Physics-based synthesis: combining real backgrounds with parameterized particle models. |
| `example_DCGAN.ipynb` | Deep Convolutional GAN (DCGAN) for generating realistic synthetic microscopy images. |
| `example_SDiff.ipynb` | Stable Diffusion image-to-image synthesis for creating new realistic sample variations. |
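The key idea behind classical augmentation is that geometric transforms must be applied identically to image and mask so the labels stay aligned, while intensity jitter touches the image only. A minimal NumPy sketch, with all names illustrative (the notebook's actual transform set may differ):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def augment_pair(image, mask):
    """Apply one random geometric + intensity transform to an image/mask pair.

    Geometric operations hit image and mask identically so the annotation
    stays aligned; brightness jitter is applied to the image only.
    """
    k = int(rng.integers(0, 4))          # random multiple of 90 degrees
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    if rng.random() < 0.5:               # random horizontal flip
        image, mask = np.fliplr(image), np.fliplr(mask)
    gain = rng.uniform(0.8, 1.2)         # +/-20 % brightness jitter
    image = np.clip(image * gain, 0.0, 1.0)
    return image, mask

image = rng.random((64, 64))
mask = (image > 0.5).astype(np.uint8)
aug_image, aug_mask = augment_pair(image, mask)
print(aug_image.shape, aug_mask.shape)   # shapes are preserved
```

Libraries such as Albumentations or torchvision implement the same pattern with far richer transform catalogues, but the pairing discipline above is the part that matters for segmentation datasets.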
### 3. Image Segmentation

| Notebook | Description |
|---|---|
| `example_unet.ipynb` | Train and evaluate a U-Net segmentation model on synthetic/real pairs. |
| `example_sam1.ipynb` | Zero-shot and prompt-based segmentation using SAM v1. |
| `example_sam2.ipynb` | Segmentation in images and video using SAM 2. |
| `example_nasa_micronet.ipynb` | Apply NASA MicroNet pretrained models to electrochemical microscopy data. |
| `example_trackpy.ipynb` | Particle detection, trajectory linking, drift correction, and motion analysis using TrackPy. |
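To give a feel for what trajectory linking means before opening the TrackPy notebook, here is a deliberately simplified greedy nearest-neighbour linker in plain NumPy. TrackPy's real algorithm is considerably more robust (it handles memory gaps, subnetwork resolution, and drift); everything below is an illustrative sketch:

```python
import numpy as np

def link_greedy(prev_pts, next_pts, search_range):
    """Greedily match each particle in frame t to its nearest unclaimed
    neighbour in frame t+1, rejecting matches beyond `search_range` pixels.

    Returns a list of (index_in_prev, index_in_next) pairs.
    """
    links, taken = [], set()
    for i, p in enumerate(prev_pts):
        # Distances from particle i to every candidate in the next frame
        d = np.linalg.norm(next_pts - p, axis=1)
        for j in np.argsort(d):
            if d[j] > search_range:
                break                     # remaining candidates are farther still
            if j not in taken:
                taken.add(j)
                links.append((i, int(j)))
                break
    return links

# Two particles drifting slightly between frames; a third appears far away
frame0 = np.array([[10.0, 10.0], [40.0, 40.0]])
frame1 = np.array([[11.0, 10.5], [41.0, 39.0], [90.0, 90.0]])
print(link_greedy(frame0, frame1, search_range=5.0))  # → [(0, 0), (1, 1)]
```

In TrackPy the equivalent workflow is `tp.locate` per frame followed by `tp.link` across frames; the `search_range` concept carries over directly.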
### Configuration and directory layout

Most tutorial notebooks share a common configuration and directory layout driven by a single YAML file and a small helper module:
- Central configuration: All generation and segmentation notebooks load parameters from a YAML file at the repository root (`tutorial_parameters.yaml`) using `ConfigLoader` from `src/synth_data_module`. Typical patterns look like:

  ```python
  config_path = repo_root / 'tutorial_parameters.yaml'
  config = ConfigLoader(config_path)
  ```

  Scalars and paths are then accessed as dictionary-style keys, e.g. `config['pretrained_models_dir']` or `config.get_dataset_params('EBC1')`.
- Synthetic data notebooks (`notebooks/synth_data/*`):
  - `ConfigLoader` and `PreparationManager` read dataset- and method-specific blocks (e.g. `PB`, `Aug`, `DCGAN`, `SDiff`) from the YAML file.
  - New folders for preprocessed images, masks, and generated synthetic data are created under a configurable base directory (e.g. inside `preprocessed_data/` and method-specific subfolders such as `PB_EBC1`, `PB_SEM`, etc.).
  - For each method, the notebooks initialise a `PreparationManager` and a `SynthDataGenerator` with `repo_root`, `dataset`, and `method_name`; these classes internally create and manage:
    - training and validation image/mask directories,
    - synthetic image and binary-mask output folders,
    - optional labelled-mask and augmentation folders (for advanced workflows).
- Segmentation notebooks (`notebooks/segmentation/*`):
  - All segmentation notebooks (U-Net, SAM1, SAM2, NASA MicroNet, TrackPy) load the same `tutorial_parameters.yaml` via `ConfigLoader` to obtain shared paths for models and pretrained weights.
  - From this YAML, they derive repository-relative output locations, for example:

    ```python
    pretrained_model_dir = os.path.join(repo_root, config['pretrained_models_dir'], 'segment_anything1_META')
    pretrained_model_dir = os.path.join(repo_root, config['pretrained_models_dir'], 'segment_anything2_META')
    pretrained_model_dir = os.path.join(repo_root, config['pretrained_models_dir'], 'NASA_Micronet')
    ```

  - Each notebook then creates additional experiment-specific folders as needed, such as:
    - U-Net: `output_dir` for saving training curves, predictions, and masks.
    - SAM1/SAM2: model checkpoint directories under `pretrained_models_dir` for downloaded `.pth`/`.pt` files.
    - NASA MicroNet: model snapshots and fine-tuned checkpoints under the NASA-specific subfolder.
In practice, this design means you only have to edit the YAML file once (for data paths, model roots, and high-level hyperparameters); all notebooks pick up consistent settings and write their outputs into predictable, method-specific subdirectories under `preprocessed_data/`, `Pretrained_models/`, and the synthetic data folders.
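For orientation, a minimal stand-in for `ConfigLoader` might look like the following. This is a hypothetical re-implementation assuming a flat YAML layout with a top-level `datasets` block and the PyYAML package; the real class in `src/synth_data_module` may differ in both structure and API:

```python
from pathlib import Path
import tempfile
import yaml  # PyYAML


class ConfigLoader:
    """Hypothetical minimal stand-in for the tutorial's ConfigLoader."""

    def __init__(self, config_path):
        with open(config_path) as fh:
            self._cfg = yaml.safe_load(fh)

    def __getitem__(self, key):
        # Dictionary-style access, e.g. config['pretrained_models_dir']
        return self._cfg[key]

    def get_dataset_params(self, name):
        # Assumes dataset blocks live under a top-level 'datasets' key
        return self._cfg['datasets'][name]


# Write a toy YAML file and read it back the way the notebooks do
example = """
pretrained_models_dir: Pretrained_models
datasets:
  EBC1:
    image_size: 512
"""
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / 'tutorial_parameters.yaml'
    config_path.write_text(example)
    config = ConfigLoader(config_path)
    print(config['pretrained_models_dir'])      # Pretrained_models
    print(config.get_dataset_params('EBC1'))    # {'image_size': 512}
```

The point of the pattern is that every notebook resolves paths through one object loaded from one file, so switching datasets or model roots never requires touching notebook code.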
## Dependencies

The tutorial targets Python 3.11. Key dependencies include:
- Deep learning: PyTorch 2.x, TensorFlow 2.x / Keras
- Segmentation: segmentation-models-pytorch, segment-anything, sam2
- Image processing: OpenCV, scikit-image, Pillow, tifffile
- Generative models: Diffusers, IOPaint
- Tracking: TrackPy, PIMS
- Scientific computing: NumPy, SciPy, pandas, scikit-learn
- Visualization: Matplotlib, Seaborn
Full dependency list: `requirements.txt`
## Getting Started

### Install VS Code

- Visit https://code.visualstudio.com/ and download the version for your OS.
- Install the Python and Jupyter extensions from the VS Code marketplace.
### Install uv

uv is a fast Python package installer and resolver.

Linux / macOS:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Windows:

```powershell
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
```

Verify:

```bash
uv --version
```

### Clone the repository

```bash
git clone https://jugit.fz-juelich.de/iet-1/dpg2026_mlip_tutorial.git
cd dpg2026_mlip_tutorial
```
### Create the environment and install dependencies

```bash
uv venv .dpg2026 --python 3.11
```

Activate it — Linux / macOS:

```bash
source .dpg2026/bin/activate
```

Windows:

```powershell
.dpg2026\Scripts\activate
```

Then install the requirements:

```bash
uv pip install --upgrade pip
uv pip install -r requirements.txt
```

Note on NASA MicroNet: Install manually with:

```bash
pip install git+https://github.com/nasa/pretrained-microscopy-models
```

Note on TrackPy: If you encounter compatibility issues, install from source:

```bash
uv pip install https://github.com/soft-matter/trackpy/archive/master.zip
```

Alternatively, clone the TrackPy repository and run `pip install -e .` inside it.

### Select the interpreter in VS Code

- Open the Command Palette: `Ctrl+Shift+P` (macOS: `Cmd+Shift+P`)
- Select "Python: Select Interpreter"
- Choose `./.dpg2026/bin/python` (or browse to the path)
To set the Jupyter kernel:

- Open a `.ipynb` notebook in VS Code
- Click "Select Kernel" (top-right corner)
- Choose "Python Environments" → select `.dpg2026`
To run notebooks outside VS Code:

```bash
source .dpg2026/bin/activate
jupyter notebook --no-browser --ip=0.0.0.0 --port=8888
```

### GPU notes

Some clusters may use AMD GPUs. In this case, load the ROCm module before starting Python:
```bash
module purge
module load rocm
module load rocm/6.4
```

Install tensorflow-rocm instead of standard TensorFlow:

```bash
pip install tensorflow-rocm
```

For PyTorch with ROCm support:

```bash
pip install --index-url https://download.pytorch.org/whl/rocm6.1 torch torchvision torchaudio
```

On NVIDIA systems, ensure the CUDA toolkit is loaded:

```bash
module load cuda
```

Use the standard packages:
```bash
pip install tensorflow
```

## License

This project is licensed under the GNU General Public License v3.0.
See the full license text at: https://www.gnu.org/licenses/gpl-3.0.html
