Dataset & Preprocessing
Sheshank Singh edited this page Nov 3, 2025 · 1 revision
The model uses two publicly available chest X-ray datasets from the NIH Tuberculosis CXR Collection:
- Shenzhen Hospital CXR Set (China)
  - Collected by Shenzhen No. 3 People’s Hospital.
  - Includes normal and TB-positive X-rays with manually segmented masks.
- Montgomery County CXR Set (USA)
  - Collected by the Department of Health and Human Services, Montgomery County.
  - Contains TB-affected and healthy lungs with expert-labeled left and right masks.
These datasets are standard benchmarks for TB detection and segmentation tasks.
Each image is preprocessed as follows:
- Grayscale conversion
- Resizing to 256×256 pixels
- Normalization of pixel intensity values to the 0–1 range
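The three preprocessing steps can be sketched with NumPy alone. This is a minimal illustration, not the project's actual code: the function name `preprocess`, the BT.601 luma weights, the nearest-neighbour resize, and the assumption of 8-bit input are all choices made here for the example. In practice an image library such as OpenCV or Pillow would normally handle loading and interpolation.

```python
import numpy as np

TARGET_SIZE = 256  # side length stated in the text


def preprocess(image: np.ndarray, size: int = TARGET_SIZE) -> np.ndarray:
    """Return a (size, size) float32 array with values in [0, 1].

    Hypothetical helper: assumes an 8-bit input of shape (H, W) or (H, W, 3+).
    """
    img = image.astype(np.float32)

    # 1. Grayscale conversion: collapse RGB with ITU-R BT.601 luma weights
    #    (assumed here; the source does not say which conversion is used).
    if img.ndim == 3:
        weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
        img = img[..., :3] @ weights

    # 2. Resize to size x size by nearest-neighbour sampling
    #    (assumed; the source does not name an interpolation method).
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    img = img[rows[:, None], cols]

    # 3. Normalize 8-bit intensities to the 0-1 range; clip guards against
    #    tiny float32 rounding overshoot from the weighted sum.
    return np.clip(img / 255.0, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_xray = rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8)
    out = preprocess(fake_xray)
    print(out.shape, out.dtype)
```

If segmentation masks are resized the same way, nearest-neighbour sampling has the side benefit of keeping them strictly binary, whereas smoother interpolation would introduce intermediate values at the lung boundary.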
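To improve generalization and robustness, random geometric and intensity augmentations are applied with Albumentations:
import albumentations as A

transform = A.Compose([
    # Random shift (up to 5% of image size), scale (±10%) and rotation (±10°), applied with probability 0.8
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1, rotate_limit=10, p=0.8),
    # Random brightness and contrast change (each within ±0.2), applied with probability 0.5
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
])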