For comprehensive technical documentation, see PROJECT_INFO.md
Imaging Data:
tags:
- X-ray
- Wrist
- Segmentation
- Classification
license: bigscience-openrail-m
Benchmark code is available at https://github.com/maxQterminal/Rheumatoid-arthritis.
Please run the following command to download RAM-W600 image data:
git clone https://huggingface.co/datasets/TokyoTechMagicYang/RAM-W600
Numerical Data: data
Four tabs:
- Lab Assessment: Input 6 biomarkers → Get RA diagnosis
- X-ray Analysis: Upload hand X-ray → Get erosion classification
- Combined Results: See both predictions together
- Model Performance: View model accuracy, comparison, and augmentation strategy
Important: The trained models (EfficientNet-B3, XGBoost) are already in the models/ folder. No additional setup is needed!
Input: Blood tests (6 biomarkers) + Hand X-ray image
Output: RA diagnosis (Healthy / Seropositive / Seronegative) + Erosion status
Accuracy: 89% (blood tests) + 85.83% (X-ray with augmentation strategy)
USER INTERACTION (UI)
Tab 1: Lab Assessment
- Input 6 biomarkers:
  - Age (years)
  - Gender (M/F)
  - RF (IU/mL)
  - Anti-CCP (IU/mL)
  - CRP (mg/L)
  - ESR (mm/hr)
- Click: "Get Diagnosis"
Tab 2: X-ray Analysis
- Upload hand X-ray image:
  - JPG/PNG/BMP format
  - 224×224 or larger
- Click: "Analyze X-ray"
DATA PREPROCESSING (Backend)
Numeric data (blood tests):
- Input: [Age, Gender, RF, ...]
- Step 1: StandardScaler normalization (subtract mean, divide by std)
- Result: normalized values ready for model input
Image data (X-ray):
- Input: image pixels
- Step 1: Resize to 224×224
- Step 2: Convert to 3-channel RGB
- Step 3: Apply ImageNet normalization
- Result: preprocessed image ready for model input
See PROJECT_INFO.md "Data Preprocessing" section for details
MODEL INFERENCE (Prediction)
Path 1: Numeric model
- Input: preprocessed biomarkers
- Model: XGBoost classifier (100 trees)
- Multiclass output:
  - P(Healthy) = 0.15
  - P(Seroneg) = 0.25
  - P(Seropos) = 0.60 (max)
- Prediction: "SEROPOSITIVE" (60% confident)
Path 2: Imaging model
- Input: preprocessed image
- Model: EfficientNet-B3 CNN (10.3M params)
- Binary output: P(Erosive) = 0.72 (72% confident)
- Threshold: 0.5 (default)
- Since 0.72 > 0.5, predict: "EROSIVE"
- Confidence = 0.72
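The two decision rules in the inference stage can be sketched in a few lines. This is a minimal illustration rather than the project's own code: the function names `predict_multiclass` and `predict_binary` are hypothetical, and the threshold is left as a parameter because the text quotes 0.5 as the default while the X-ray tab example shows 0.35.

```python
from typing import Dict, Tuple


def predict_multiclass(probs: Dict[str, float]) -> Tuple[str, float]:
    """Pick the class with the highest probability (argmax)."""
    label = max(probs, key=probs.get)
    return label, probs[label]


def predict_binary(p_erosive: float, threshold: float = 0.5) -> str:
    """Label the X-ray as erosive when the probability exceeds the threshold."""
    return "EROSIVE" if p_erosive > threshold else "NON-EROSIVE"


# Probabilities taken from the example in the text.
lab_probs = {"Healthy": 0.15, "Seronegative": 0.25, "Seropositive": 0.60}
lab_label, lab_confidence = predict_multiclass(lab_probs)
xray_label = predict_binary(0.72)

print(lab_label, lab_confidence)  # Seropositive 0.6
print(xray_label)                 # EROSIVE
```

With a probability of 0.72 the result is the same under either threshold; the choice only matters for scores between 0.35 and 0.5.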
USER OUTPUT (UI Display)
Lab Assessment tab shows:
- Diagnosis result: SEROPOSITIVE RA
- Confidence: 60%
- Breakdown:
  - P(Healthy) = 15%
  - P(Seroneg) = 25%
  - P(Seropos) = 60%
- Clinical action:
  - Start DMARD therapy
  - Monitor closely
  - Follow-up in 6 wks
X-ray Analysis tab shows:
- X-ray classification: EROSIVE
- Confidence: 72%
- Decision: Threshold = 0.35
- Interpretation: "Joint erosions are present"
- Clinical action:
  - Confirm with radiologist
  - Adjust treatment
Combined Results tab shows:
- OVERALL RA DIAGNOSIS SUMMARY
- Blood tests: SEROPOSITIVE (60%)
- Hand X-rays: EROSIVE (72%)
- Combined assessment: HIGH RA LIKELIHOOD
  - Positive autoimmune markers
  - Visible joint erosions
- Recommendation:
  - Advanced RA suspected
  - Aggressive treatment indicated
  - Consider rheumatology referral
| Stage | Input | Processing | Output |
|---|---|---|---|
| User Input | Biomarkers or X-ray image | Enter via UI | Raw data |
| Preprocessing | Raw values/pixels | Normalize, resize, format | Ready for model |
| Model Inference | Preprocessed data | Neural net / Tree ensemble | Probability scores |
| Decision | Probabilities | Apply threshold | Class prediction |
| UI Display | Prediction + confidence | Format for display | Clinical summary |
data/raw_data/
├── numeric/
│   ├── train_pool.csv (3,848 original samples)
│   ├── train_numeric.csv (2,658 training samples)
│   ├── val_numeric.csv (570 validation)
│   ├── test_numeric.csv (570 test)
│   ├── healthy.csv (synthetic)
│   └── seronegative.csv (synthetic)
│
└── imaging/RAM-W600/
    ├── JointLocationDetection/images/ (800 X-ray images)
    ├── splits/
    │   ├── train.csv (560 training)
    │   ├── val.csv (120 validation)
    │   └── test.csv (120 test)
    └── SvdHBEScoreClassification/
        └── JointBE_SvdH_GT.json (erosion labels)
models/
├── xgb_model.joblib (1.1 MB - blood test classifier)
├── efficientnet.pth (41.3 MB - X-ray classifier - PRIMARY MODEL)
├── resnet50.pth (41.3 MB - X-ray classifier - alternative)
└── vit.pth (328 MB - X-ray classifier - alternative)
src/
├── app/
│   ├── app_medical_dashboard.py (Main app)
│   └── demo_predict.py (Test predictions)
└── data/
    └── synth_and_numeric.py (Data preprocessing)
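A quick way to confirm that a checkout matches this layout is to test for a few of the listed files. The sketch below is only an illustration: `missing_files` and `EXPECTED_FILES` are hypothetical names, and the paths are copied from the layout above, so adjust them if your copy differs.

```python
from pathlib import Path
from typing import Iterable, List

# A small selection of paths taken from the directory layout above.
EXPECTED_FILES = [
    "models/xgb_model.joblib",
    "models/efficientnet.pth",
    "src/app/app_medical_dashboard.py",
    "data/raw_data/numeric/train_numeric.csv",
]


def missing_files(root: str, expected: Iterable[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected relative paths that do not exist under root."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")  # run from the project root
    print("All files present" if not missing else f"Missing: {missing}")
```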
Problem Solved: Severe class imbalance (4.59:1 - 82% Erosive vs 18% Non-Erosive)
Solution Applied:
- WeightedRandomSampler: Balances batch-level sampling to 1:1 ratio
- Focal Loss (γ=2.0): Focuses training on hard-to-learn minority class examples
- Progressive Augmentation: Flips, rotations ±15°, color jitter, Gaussian blur
- F1-based Early Stopping: Monitors erosive class F1 (not validation loss)
- Optimized for M4 Metal GPU: Float32 dtype, batch size 16
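To make the first two items concrete, here is a small library-free sketch of the focal loss formula and of inverse-frequency sample weights. It is an illustration under assumptions, not the training code: the helper names are hypothetical, the class-weighting (alpha) term sometimes used with focal loss is omitted because only γ is given, and the label list is made up to mirror the stated 82% / 18% split. In a PyTorch pipeline, weights like these are the kind of input one would pass to `torch.utils.data.WeightedRandomSampler`.

```python
import math
from collections import Counter
from typing import List, Sequence


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    p_true is the probability the model assigns to the correct class.
    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def balanced_sample_weights(labels: Sequence[str]) -> List[float]:
    """Inverse-frequency weight per sample, so each class carries equal total weight."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Hypothetical label list mirroring the reported 82% / 18% imbalance.
labels = ["erosive"] * 82 + ["non_erosive"] * 18
weights = balanced_sample_weights(labels)

# A confidently correct example contributes far less loss than a badly missed one.
easy, hard = focal_loss(0.9), focal_loss(0.1)
print(f"easy example loss: {easy:.4f}, hard example loss: {hard:.4f}")
```

Because each class's weights sum to the same total, sampling in proportion to them draws both classes about equally often, which is the 1:1 batch ratio described above.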
Model Comparison (all trained with identical augmentation pipeline):
| Model | Accuracy | F1 Erosive | F1 Non-Erosive | Status |
|---|---|---|---|---|
| EfficientNet-B3 | 85.83% | 91.63% | 54.05% | PRIMARY |
| ResNet50 | 82.50% | 89.45% | 48.78% | Alternative |
| ViT-B/16 | 80.00% | 87.23% | 53.85% | Alternative |
Selected Model: EfficientNet-B3
- Highest overall accuracy (85.83%, +5.83pp vs ViT)
- Best minority class F1 (54.05%, handles early RA detection)
- Optimal erosive recall (95.04%, catches most erosion cases)
- Fast inference (200-500 ms) vs ViT (slower, larger memory)
- See reports/image/model_comparison_all_models.png for visualizations
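Since the comparison reports per-class F1 alongside accuracy, a short worked example may help show how the two can diverge on imbalanced data. The confusion counts below are hypothetical and are not the project's test results; `f1_from_counts` is an illustrative helper.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical 120-image test set: 100 erosive, 20 non-erosive.
erosive_correct = 95      # erosive images predicted erosive
erosive_missed = 5        # erosive images predicted non-erosive
non_erosive_correct = 10  # non-erosive images predicted non-erosive
non_erosive_missed = 10   # non-erosive images predicted erosive

total = erosive_correct + erosive_missed + non_erosive_correct + non_erosive_missed
accuracy = (erosive_correct + non_erosive_correct) / total

# For each class, the other class's misses are this class's false positives.
f1_erosive = f1_from_counts(erosive_correct, non_erosive_missed, erosive_missed)
f1_non_erosive = f1_from_counts(non_erosive_correct, erosive_missed, non_erosive_missed)

print(f"accuracy={accuracy:.3f} f1_erosive={f1_erosive:.3f} f1_non_erosive={f1_non_erosive:.3f}")
```

In this made-up case accuracy and majority-class F1 both look strong while the minority-class F1 is much lower, which is why the table lists F1 for each class separately.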
- Input: 6 blood test biomarkers
- Output: Healthy / Seropositive RA / Seronegative RA
- Accuracy: 89.28%
- F1-Score: 85.77%
- ROC-AUC: 93.21%
- Speed: 15-50 ms
- Why this model: Best for tabular data, fast, interpretable, handles mixed feature types
The train/validation/test split is critical for understanding why our models are trustworthy:
| Set | Size | Purpose | Model Learns? | Accuracy |
|---|---|---|---|---|
| Training | 2,658 | Model learns patterns | Yes | 90-95% |
| Validation | 570 | Detect overfitting | No | 87-89% |
| Test | 570 | Final honest score | No | 85-89% |
Why this matters for patients:
- Without a proper split: the model can report 95% on data it has already seen yet reach only 40% on new patients, which leads to wrong diagnoses
- With proper splits: the reported 89% is measured on unseen data, so a doctor can trust it
train_pool.csv (3,848 samples): Original raw data before splitting. We split this 70/15/15 to create train/val/test. Kept for reproducibility.
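A stratified 70/15/15 split can be sketched without any libraries as follows. This is an assumption about the general technique, not the project's script: `stratified_split` is a hypothetical helper, the label counts are made up, and in practice scikit-learn's `train_test_split` with its `stratify` argument is a common way to do the same thing.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[str],
    fractions: Tuple[float, float, float] = (0.70, 0.15, 0.15),
    seed: int = 42,
) -> Tuple[List[int], List[int], List[int]]:
    """Split row indices into train/val/test, one class at a time."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    by_class: Dict[str, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train: List[int] = []
    val: List[int] = []
    test: List[int] = []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        test.extend(indices[n_train + n_val:])  # remainder goes to test
    return train, val, test


# Hypothetical labels; the real per-class counts are not stated here.
labels = ["Healthy"] * 100 + ["Seropositive"] * 60 + ["Seronegative"] * 40
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))  # 140 30 30
```

Splitting per class keeps each class's share roughly the same in all three sets, and no row index appears in more than one set.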
Blood Test Features:
- Age: Patient age
- Gender: Male/Female
- RF: Rheumatoid factor (autoimmune antibody)
- Anti-CCP: Anti-cyclic citrullinated peptide antibody (RA-specific)
- CRP: C-reactive protein (inflammation marker)
- ESR: Erythrocyte sedimentation rate (inflammation indicator)
X-ray Analysis:
- Detects hand bone erosions (joint damage)
- Uses SvdH (Sharp/van der Heijde) scoring
- Binary: Erosive (damage present) or Non-erosive (no damage)
Data Processing: Each data type goes through specific preprocessing before model input:
Numeric Data:
- Normalization: StandardScaler (subtract mean, divide by std)
- Handles missing values with forward-fill + mean imputation
- Stratified split maintains class proportions
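The normalization and mean-imputation steps can be illustrated for a single column as below. This is a library-free sketch, not the project's pipeline: the helper names and the CRP values are hypothetical, forward-fill is left out for brevity, and the population standard deviation is used because that is what scikit-learn's `StandardScaler` computes.

```python
from statistics import fmean, pstdev
from typing import List, Optional, Sequence, Tuple


def fit_scaler(column: Sequence[Optional[float]]) -> Tuple[float, float]:
    """Learn mean and population std from the training column, ignoring missing values."""
    observed = [x for x in column if x is not None]
    mean = fmean(observed)
    std = pstdev(observed) or 1.0  # constant column: fall back to 1 to avoid dividing by zero
    return mean, std


def transform(column: Sequence[Optional[float]], mean: float, std: float) -> List[float]:
    """Fill missing values with the training mean, then standardize."""
    return [((mean if x is None else x) - mean) / std for x in column]


# Hypothetical CRP readings in mg/L, one of them missing.
train_crp = [2.0, 4.0, None, 6.0]
mean, std = fit_scaler(train_crp)
scaled_train = transform(train_crp, mean, std)

# Validation and test data reuse the statistics learned from training data.
scaled_new = transform([4.0, 8.0], mean, std)
print(scaled_train, scaled_new)
```

Fitting the mean and standard deviation on the training set only, and reusing them elsewhere, keeps information from the validation and test sets out of training.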
Image Data:
- Resize to 224×224 pixels
- Convert grayscale to 3-channel RGB (model requirement)
- ImageNet normalization (mean/std from pre-training)
- Data augmentation during training (rotations, flips, scaling)
→ See PROJECT_INFO.md - Data Preprocessing for complete technical details
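The grayscale-to-RGB conversion and ImageNet normalization can be shown on a tiny image without an imaging library. This is a sketch under assumptions: `gray_to_normalized_rgb` is a hypothetical helper, resizing is omitted, and the mean and std constants are the standard ImageNet values used with torchvision's pretrained models. A real pipeline would more likely use Pillow and torchvision transforms.

```python
from typing import List, Sequence

# Standard per-channel ImageNet statistics (R, G, B) on a 0-1 scale.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def gray_to_normalized_rgb(gray: Sequence[Sequence[int]]) -> List[List[List[float]]]:
    """Turn an HxW grayscale image (0-255) into a 3xHxW normalized array.

    The single gray channel is copied into R, G and B, scaled to 0-1,
    then each channel is shifted and scaled by its ImageNet mean and std.
    """
    channels = []
    for mean, std in zip(IMAGENET_MEAN, IMAGENET_STD):
        channels.append([[(px / 255.0 - mean) / std for px in row] for row in gray])
    return channels


# A made-up 2x2 "X-ray" patch.
patch = [[0, 128], [200, 255]]
tensor_like = gray_to_normalized_rgb(patch)
print(len(tensor_like), len(tensor_like[0]), len(tensor_like[0][0]))  # 3 2 2
```

The three channels start from the same pixel values but end up slightly different after normalization, because each channel has its own mean and std.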
PROJECT_INFO.md covers:
- Complete train/validation/test split explanation (with clinical implications)
- How train_pool.csv relates to train/val/test
- Full project architecture and technical specifications
- How each model works (XGBoost, EfficientNet-B3)
- Performance metrics (accuracy, F1, ROC-AUC)
- Preprocessing steps with code examples
- Training details and hyperparameters
- How to make predictions programmatically
- Exactly where training data comes from
- Why data is organized this way
- Data flow diagrams
- File verification commands
- Python 3.8+
- pip or conda
pip install -r requirements.txt
# Make sure you're in the project root directory
streamlit run src/app/app_medical_dashboard.py
Opens at http://localhost:8501
This project is fully portable! You can run it on any system (Windows, Mac, Linux) because:
- All paths are relative - No hardcoded machine-specific paths
- Auto-detects project structure - ROOT = os.path.dirname(...) finds models anywhere
- Works from any directory - Just cd to project root and run
- All dependencies in requirements.txt - One command to install everything
- Models included - models/xgb_model.joblib and models/EfficientNet-B3_best.pth already in repo
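One common way to get this kind of location-independent path handling is to resolve everything relative to the running script. The sketch below is an assumption about the approach, not the app's actual code (the arguments to `os.path.dirname` are elided above); `project_root`, `levels_up`, and the example checkout location are hypothetical.

```python
import os


def project_root(script_path: str, levels_up: int = 2) -> str:
    """Walk up from a script file to the repository root.

    A script at src/app/<name>.py sits two directories below the root,
    so levels_up defaults to 2.
    """
    root = os.path.dirname(os.path.abspath(script_path))  # directory holding the script
    for _ in range(levels_up):
        root = os.path.dirname(root)
    return root


# Example with a made-up checkout location; inside the app one would pass __file__.
script = os.path.join(
    os.sep, "home", "user", "Rheumatoid-arthritis", "src", "app", "app_medical_dashboard.py"
)
ROOT = project_root(script)
xgb_path = os.path.join(ROOT, "models", "xgb_model.joblib")
print(xgb_path)
```

Because every path is built from the script's own location, the same code works wherever the repository is cloned.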
To clone and run on another machine:
# 1. Clone repository
git clone https://github.com/maxQterminal/Rheumatoid-arthritis.git
cd Rheumatoid-arthritis
# 2. Install dependencies
pip install -r requirements.txt
# 3. Run app (works immediately, no configuration needed!)
streamlit run src/app/app_medical_dashboard.py
That's it! No paths to update, no files to move. The app finds everything automatically.
Production Ready: All models trained, optimized, and tested
Documentation: Complete and comprehensive
Performance: 89.28% accuracy (numeric/blood tests) + 85.83% accuracy (imaging/X-rays)
Data Processing: Comprehensive preprocessing pipeline (see PROJECT_INFO.md)
Version: 1.0 | Last Updated: November 18, 2025