Dataset & Preprocessing

Datasets Used

The model uses two publicly available chest X-ray datasets from the NIH Tuberculosis CXR Collection:

Shenzhen Hospital CXR Set (China)
- Collected by Shenzhen No. 3 People’s Hospital.
- Includes normal and TB-positive X-rays with manually segmented masks.
Montgomery County CXR Set (USA)
- Collected by the Department of Health and Human Services, Montgomery County.
- Contains X-rays of TB-affected and healthy lungs, with expert-labeled masks for the left and right lungs.

Both datasets are widely used benchmarks for TB detection and lung segmentation.

Preprocessing Steps

- Grayscale conversion
- Resizing to 256×256 pixels
- Normalization of pixel intensities to the 0–1 range

Augmentation

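To improve generalization and robustness, the following Albumentations pipeline is used:

import albumentations as A

# Small geometric jitter plus brightness/contrast variation.
augment = A.Compose([
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1, rotate_limit=10, p=0.8),
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
])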
