This project builds machine learning models to automatically detect and classify protein complexes in cryo-electron tomography (cryoET) images, enabling scalable analysis of cellular structures and supporting advanced biological and medical research.
The pipeline is a deep-learning workflow for identifying protein complexes in 3D cryoET tomograms, designed for clarity and reproducible research use.
Cryo-electron tomography generates high-resolution 3D views of cellular environments.
These volumes contain thousands of tightly packed protein complexes that need automated identification for biological and medical research.
This project provides a complete machine-learning workflow for:
- 3D tomogram preprocessing
- Feature extraction
- Multi-class model training
- Inference and predictions
- A clean, reproducible research structure
Contents:
- Features
- Project Structure
- Dataset
- Modeling Approach
- Notebook
- Usage
- Model Performance
- References
- License
✔️ 3D tomogram ingestion & preprocessing
✔️ Deep-learning-based multi-class identification
✔️ Modular and extendable source code
✔️ Kaggle-compatible training & inference
✔️ GPU-ready workflow
✔️ Clear, research-friendly architecture
🔗 CZII Cryo-ET Dataset (Kaggle)
https://www.kaggle.com/competitions/czii-cryo-et-object-identification/data
Dataset Includes:
| Component | Description |
|---|---|
| Tomograms | High-resolution 3D cryo-ET volumes |
| Masks | Protein complex segmentations |
| Metadata | Tomogram & annotation info |
| Classes | 5 professionally curated protein complex categories |
Preprocessing:
- Volume normalization
- 3D grid or patch generation
- Noise reduction
- Augmentation (optional)
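The volume normalization and patch generation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names `normalize_volume` and `extract_patches`, the z-score normalization, the patch size, and the stride are all assumptions, and a synthetic array stands in for a real tomogram.

```python
import numpy as np


def normalize_volume(volume: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Scale a 3D volume to zero mean and unit variance (z-score)."""
    volume = volume.astype(np.float32)
    return (volume - volume.mean()) / (volume.std() + eps)


def extract_patches(volume: np.ndarray, patch: int, stride: int) -> np.ndarray:
    """Cut cubic patches from a (z, y, x) volume on a regular grid.

    Positions that would run past the volume edge are skipped (no padding).
    """
    z, y, x = volume.shape
    patches = [
        volume[i:i + patch, j:j + patch, k:k + patch]
        for i in range(0, z - patch + 1, stride)
        for j in range(0, y - patch + 1, stride)
        for k in range(0, x - patch + 1, stride)
    ]
    return np.stack(patches)


# Synthetic stand-in for a tomogram; real data would be loaded from disk.
rng = np.random.default_rng(0)
tomogram = rng.normal(loc=5.0, scale=2.0, size=(32, 64, 64))

normalized = normalize_volume(tomogram)
patches = extract_patches(normalized, patch=32, stride=16)
print(patches.shape)  # (9, 32, 32, 32)
```

Overlapping patches (stride smaller than the patch size) are one common way to avoid cutting particles at patch borders; whether the notebook uses overlap is not stated here.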
Model:
- 3D-CNN / adapted YOLO-3D style detector
- Multi-class classifier
- Spatial feature extraction
Training:
- Balanced dataloading
- Weighted loss for rare classes
- GPU-trained models
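To illustrate what a weighted loss for rare classes might look like, here is a small framework-free sketch. It assumes inverse-frequency weights of the form `n_samples / (n_classes * count)`, a common "balanced" heuristic; the weighting scheme actually used in the notebook may differ, and the helper names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalized by the sum of weights."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


# Toy example: class 1 is rare (2 of 10 samples) and predicted poorly.
labels = [0] * 8 + [1] * 2
probs = [[0.9, 0.1]] * 8 + [[0.6, 0.4]] * 2

weights = inverse_frequency_weights(labels)
print(weights)  # {0: 0.625, 1: 2.5}

uniform = {0: 1.0, 1: 1.0}
unweighted_loss = weighted_cross_entropy(probs, labels, uniform)
weighted_loss = weighted_cross_entropy(probs, labels, weights)
# Mistakes on the rare class now contribute more to the loss.
print(weighted_loss > unweighted_loss)  # True
```

In a deep-learning framework the same per-class weights would typically be passed to the loss function rather than applied by hand.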
Evaluation metrics:
- Accuracy
- F1-score
- IoU
- Per-class recall
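For reference, the metrics listed above can all be derived from per-class true positive, false positive, and false negative counts. The sketch below uses plain Python and a toy set of labels; the function name `per_class_metrics` is hypothetical, and treating IoU as a label-wise ratio `TP / (TP + FP + FN)` is an assumption about how it is applied in this project.

```python
def per_class_metrics(y_true, y_pred, classes):
    """Compute recall, F1, and IoU for each class from paired label lists."""
    metrics = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
        metrics[c] = {"recall": recall, "f1": f1, "iou": iou}
    return metrics


# Toy labels for three classes; class 2 is never predicted correctly.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
m = per_class_metrics(y_true, y_pred, classes=[0, 1, 2])

print(round(accuracy, 3))  # 0.667
print(m[0]["iou"])         # 0.5
print(m[2]["recall"])      # 0.0
```

Reporting recall per class alongside overall accuracy makes it visible when a rare class, like class 2 in this toy example, is being missed entirely.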
📓 Full notebook implementation (training + inference):
https://www.kaggle.com/code/eeshakhanzadi/object-czii-competition?scriptVersionId=219427511
Includes:
| Stage | Status |
|---|---|
| Data Loading | ✔️ |
| Preprocessing | ✔️ |
| Model Training | ✔️ |
| Eval Metrics | ✔️ |
| Prediction Pipeline | ✔️ |
git clone https://github.com/YOUR_USERNAME/CryoET-Object-Identification.git
cd CryoET-Object-Identification
python src/train.py
python src/inference.py --input sample_data/example_tomo.mrc
CryoET Dataset: https://www.kaggle.com/competitions/czii-cryo-et-object-identification/data
Training Notebook: https://www.kaggle.com/code/eeshakhanzadi/object-czii-competition
Licensed under the MIT License.