The CAISR system is designed to streamline tasks related to sleep data analysis by using Docker containers for ease of use. The primary method of using this system is by downloading pre-built Docker images from our website. However, the Python code used to create these images is also available here for transparency and customization purposes. Users who wish to adapt the system to their own datasets or analysis preferences can rebuild the Docker images with minimal effort.
- Docker: Ensure Docker is installed on your machine. You can download it from the official Docker website.
- Python: Python 3.7+ should be installed.
- Required Libraries: Ensure the `docker` Python library is installed. The `subprocess` module is part of the Python standard library and does not need to be installed separately. Install `docker` using:

    pip install docker

The CAISR system requires a specific directory structure to function correctly. Here’s how you should organize your files and folders:
your_project_directory/
│
├── dockers/ # Contains all the Docker image tar.gz files
│ ├── caisr_preprocess.tar.gz
│ ├── caisr_stage.tar.gz
│ ├── caisr_arousal.tar.gz
│ ├── caisr_resp.tar.gz
│ ├── caisr_limb.tar.gz
│ └── caisr_report.tar.gz
│
├── data/raw/ # Input data folder that will be processed
│ ├── file1.edf
│ ├── file2.edf
│ └── ... (other files)
│
├── data/ # H5 files after basic preprocessing/resampling
│ ├── file1.h5
│ ├── file2.h5
│
├── caisr_output/ # Output folder where results will be stored
│ ├── stage/
│ ├── arousal/
│ ├── resp/
│ ├── limb/
│ └── report/
│
└── caisr.py # The main script provided in this readme
- Place the `*.tar.gz` Docker files in the `dockers/` folder.
  - These will be loaded automatically by the script if not already installed.
- Place your input data files (e.g., `.edf` files) in the `data/raw/` folder.
- The script will automatically create necessary subfolders in the `caisr_output/` folder based on the tasks you run.
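
The layout described above can be prepared and checked before the first run. The following is a minimal sketch and not part of `caisr.py`: the helper names `prepare_layout` and `missing_images` are hypothetical, while the folder and archive names are taken from the directory tree above.

```python
from pathlib import Path

# Archive names as listed in the directory tree above.
IMAGE_ARCHIVES = [
    "caisr_preprocess.tar.gz",
    "caisr_stage.tar.gz",
    "caisr_arousal.tar.gz",
    "caisr_resp.tar.gz",
    "caisr_limb.tar.gz",
    "caisr_report.tar.gz",
]


def prepare_layout(project_dir):
    """Create the expected input/output folders if they do not exist yet."""
    root = Path(project_dir)
    for sub in ("dockers", "data/raw", "caisr_output"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


def missing_images(project_dir):
    """Return the archive names that are not present in dockers/."""
    dockers = Path(project_dir) / "dockers"
    return [name for name in IMAGE_ARCHIVES if not (dockers / name).is_file()]
```

Calling `prepare_layout("your_project_directory")` followed by `missing_images("your_project_directory")` lists the archives that still need to be downloaded. Images that are already installed in Docker do not need an archive, so a non-empty result is not necessarily an error.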
You can either supply .edf files or .h5 files.
CAISR expects a certain minimum number of channels to be available. The algorithm accounts for some basic channel renaming and imputation of missing channels. However, it’s necessary to follow the channel naming conventions below for optimal performance:
- EEG (Electroencephalogram):
  - At least one EEG is required, named one of: `f3-m2`, `f4-m1`, `c3-m2`, `c4-m1`, `o1-m2`, `o2-m1`, or `eeg`.
  - CAISR performs best when six EEG channels are provided: `f3-m2`, `f4-m1`, `c3-m2`, `c4-m1`, `o1-m2`, `o2-m1`.
- EOG (Electrooculogram):
  - At least one EOG is required, named either: `e2-m1` or `e1-m2`.
- Chin EMG (Electromyogram):
  - A chin EMG channel `chin1-chin2` is required.
- Respiratory Channels:
  - Either `ptaf` (pressure transducer airflow) or `cflow` (flow measured during CPAP, Continuous Positive Airway Pressure) is required. If both are set to zero, `abd` + `chest` will be used as the primary breathing trace.
  - Both abdominal and thoracic effort belt signals are required, named: `abd` (abdominal) and `chest` (thoracic).
  - Oxygen saturation signal is required, named: `spo2`.
  - Optional Respiratory Channels for Optimal Performance: `airflow`, `cflow`, `cpap_on`.
    Note: `airflow` measured via thermistor is ideal, yet optional, when `ptaf` or `cflow` is provided. `ptaf` or `cflow` should be set to all zeros unless it's a titration/split-night study. In the case of a titration/split-night study, ideally a binary indicator `cpap_on` (e.g., 0000001111111) is provided.
- Leg EMG (Electromyogram):
  - Two leg EMGs are required for limb movement analysis, named: `lat` (left leg) and `rat` (right leg).
- Optional:
  - Heart rate channel: `hr`.
  - Body position channel: `position` (used in the report but not in the analysis).
- Sampling Rate: Any sampling rate can be provided, as CAISR will resample the data to 200 Hz.
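
The naming rules above can be checked against a recording's channel list before running CAISR. This is a minimal sketch under stated assumptions, not CAISR's own renaming logic: the function name `check_channels` is hypothetical, and lower-casing the names is a convenience of this sketch (the README does not state whether CAISR matches names case-sensitively).

```python
# Accepted names, taken from the channel conventions listed above.
EEG_NAMES = {"f3-m2", "f4-m1", "c3-m2", "c4-m1", "o1-m2", "o2-m1", "eeg"}
EOG_NAMES = {"e1-m2", "e2-m1"}
FLOW_NAMES = {"ptaf", "cflow"}
ALWAYS_REQUIRED = ["chin1-chin2", "abd", "chest", "spo2", "lat", "rat"]


def check_channels(channel_names):
    """Return a list of problems; an empty list means the minimum set is present."""
    # ASSUMPTION: names are compared case-insensitively after trimming whitespace.
    names = {name.strip().lower() for name in channel_names}
    problems = []
    if not names & EEG_NAMES:
        problems.append("no EEG channel found")
    if not names & EOG_NAMES:
        problems.append("no EOG channel found")
    if not names & FLOW_NAMES:
        problems.append("neither ptaf nor cflow found")
    for required in ALWAYS_REQUIRED:
        if required not in names:
            problems.append(f"missing required channel: {required}")
    return problems
```

For example, `check_channels(["eeg", "abd", "chest"])` reports the missing EOG, flow, chin EMG, `spo2`, and leg EMG channels. The sketch only checks the minimum set, so the optional channels (`airflow`, `cpap_on`, `hr`, `position`) are ignored.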
You can provide .h5 files instead of .edf files. This skips the channel renaming logic that occurs with .edf files. Therefore, the .h5 files need to match the required format exactly.
- Expected Sampling Rate: 200 Hz.
- Required Channels: 'f3-m2', 'f4-m1', 'c3-m2', 'c4-m1', 'o1-m2', 'o2-m1', 'e1-m2', 'e2-m1', 'chin1-chin2', 'abd', 'chest', 'spo2', 'ecg', 'lat', 'rat'
- Optional Channels: 'airflow', 'cpap_on', 'hr', 'position'
The Python code to create a suitable .h5 file is:

    import h5py

    def save_prepared_data(path, signal):
        # `signal` is a table with one column per channel (e.g. a pandas DataFrame).
        with h5py.File(path, 'w') as f:
            f.attrs['sampling_rate'] = 200
            f.attrs['unit_voltage'] = 'V'
            group_signals = f.create_group('signals')
            # Store each channel as a gzip-compressed float32 column vector.
            for name in signal.columns:
                group_signals.create_dataset(name, data=signal[name], shape=(len(signal), 1),
                                             maxshape=(len(signal), 1), dtype='float32', compression="gzip")

- In the `caisr.py` file, modify the `tasks` list to include the tasks you want to execute.
- Available tasks are:
  - `preprocess`: Preprocessing of raw data.
  - `stage`: Sleep stage classification.
  - `arousal`: Arousal detection.
  - `resp`: Respiratory analysis.
  - `limb`: Limb movement analysis.
  - `report`: Generate a comprehensive report based on the analysis.
- Preprocessing is optional. If your data is already preprocessed, you can remove it from the list.

    # Example of task configuration
    tasks = ['preprocess', 'stage', 'arousal', 'resp', 'limb', 'report']

- Execute the main script using Python:

    python caisr.py

- The script will automatically load Docker images, run the specified tasks, and output the results into the `caisr_output/` folder.
While the CAISR system is optimized for standard .edf files and report formats, users with unique datasets or specific reporting needs can customize the preprocessing and reporting scripts. This flexibility allows you to handle non-standard .edf files or tailor the output reports to better suit your research requirements.
After making your desired changes to the preprocessing or report generation code, you can easily rebuild the Docker images by running:

    python create_caisr_dockers.py

This automated script ensures that your customized Docker containers are quickly generated and ready for use.
The script will first list all available Docker images on your system.
The script will match the Docker images to the specified tasks. If a required Docker image is missing, it will be automatically loaded from the dockers/ folder.
For each task:
- The script will run the corresponding Docker container.
- It will mount the `data/` folder as input and the `caisr_output/` folder as output.
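
The load-and-run flow described above can be sketched as the Docker commands a script might issue for one task. This is an illustration under stated assumptions, not the code in `caisr.py`: the helper names are hypothetical, the image name `caisr_<task>` is inferred from the archive file names, and the container-side mount points `/data` and `/caisr_output` are placeholders, since the README does not state them.

```python
from pathlib import Path


def load_command(task, dockers_dir="dockers"):
    """Command that loads a task's image from its archive in dockers/."""
    archive = Path(dockers_dir) / f"caisr_{task}.tar.gz"
    # `docker load -i` reads an image from a tar archive (gzip is accepted).
    return ["docker", "load", "-i", str(archive)]


def run_command(task, project_dir, image=None):
    """Command that runs a task's container with input and output mounted."""
    project = Path(project_dir).resolve()  # bind mounts need absolute host paths
    image = image or f"caisr_{task}"       # ASSUMPTION: image named after the archive
    return [
        "docker", "run", "--rm",
        "-v", f"{project / 'data'}:/data",                  # hypothetical container path
        "-v", f"{project / 'caisr_output'}:/caisr_output",  # hypothetical container path
        image,
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Building the commands separately from executing them makes it easy to print and inspect them first.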
Results from each task will be stored in separate subfolders within the caisr_output/ folder. Each subfolder corresponds to a specific task, such as stage, arousal, etc.
The numerical, per-sample output of CAISR is stored in `caisr_output/combined/`. It contains the sleep stage hypnogram, the respiratory events, the arousal events, and the limb movement events. These output CSV files are saved at 2 Hz.
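
Because the per-sample files are saved at 2 Hz, row positions convert directly to time. The arithmetic below is a small sketch with hypothetical helper names. It assumes that row 0 corresponds to the start of the recording and that epochs follow the conventional 30-second scoring length, neither of which is stated in this README.

```python
OUTPUT_RATE_HZ = 2   # per-sample CSV rate stated above
EPOCH_SECONDS = 30   # ASSUMPTION: conventional 30-second scoring epochs


def row_to_seconds(row_index):
    """Time offset of a row, assuming row 0 is the start of the recording."""
    return row_index / OUTPUT_RATE_HZ


def rows_per_epoch():
    """Number of CSV rows that cover one scoring epoch."""
    return OUTPUT_RATE_HZ * EPOCH_SECONDS


def epoch_of_row(row_index):
    """Zero-based epoch number that a row falls into."""
    return row_index // rows_per_epoch()


def expected_rows(recording_hours):
    """Approximate row count for a recording of the given length."""
    return int(round(recording_hours * 3600 * OUTPUT_RATE_HZ))
```

Under these assumptions an 8-hour recording yields about 57,600 rows, and every block of 60 consecutive rows covers one epoch.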
For each input data file, CAISR generates the following output:
- PDF Report: A comprehensive report in `.pdf` format displaying polysomnography signals and CAISR analysis results.
- CSV File: A `.csv` file (`caisr_sleep_metrics_all_studies.csv`) containing summary statistics from the sleep staging tasks, with the following key metrics:
  - Total Sleep Metrics:
    - `TST (h)`: Total Sleep Time in hours
    - `Recording (h)`: Total Recording Time in hours
    - `Eff (%)`: Sleep Efficiency percentage
  - Sleep Stage Distribution:
    - `REM (%)`: Percentage of time spent in REM sleep
    - `N1 (%)`: Percentage of time spent in N1 sleep
    - `N2 (%)`: Percentage of time spent in N2 sleep
    - `N3 (%)`: Percentage of time spent in N3 sleep
  - Sleep Disruption Metrics:
    - `WASO (min)`: Wake After Sleep Onset in minutes
    - `SL (min)`: Sleep Latency in minutes
    - `SFI`: Sleep Fragmentation Index
  - Arousal Metrics:
    - `Arousal I.`: Arousal Index
  - Limb Movement Index (LMI):
    - `LMI`: Limb Movement Index
  - Respiratory Disturbance Metrics:
    - `AHI`: Overall Apnea-Hypopnea Index
    - `AHI NREM`: AHI during NREM sleep
    - `AHI REM`: AHI during REM sleep
    - `RDI`: Respiratory Disturbance Index
    - `OAI`: Obstructive Apnea Index
    - `CAI`: Central Apnea Index
    - `MAI`: Mixed Apnea Index
    - `HYI`: Hypopnea Index
    - `RERAI`: Respiratory Effort-Related Arousal Index
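
To make the summary metrics concrete, the sketch below derives several of them from a hypnogram using common textbook definitions. It is not CAISR's implementation, and the exact definitions CAISR uses may differ. The assumptions are: 30-second epochs, stage labels `W`, `N1`, `N2`, `N3`, `R`, sleep efficiency taken relative to total recording time, stage percentages taken relative to total sleep time, and WASO counted from sleep onset to the end of the recording. The function names are hypothetical.

```python
EPOCH_SECONDS = 30                     # ASSUMPTION: 30-second scoring epochs
SLEEP_STAGES = {"N1", "N2", "N3", "R"}  # ASSUMPTION: label encoding for this sketch


def sleep_metrics(hypnogram):
    """Compute basic sleep metrics from a list of per-epoch stage labels."""
    n_epochs = len(hypnogram)
    asleep = [stage in SLEEP_STAGES for stage in hypnogram]
    n_sleep = sum(asleep)
    epoch_h = EPOCH_SECONDS / 3600
    epoch_min = EPOCH_SECONDS / 60
    # Sleep onset is the first sleep epoch; with no sleep, latency spans the recording.
    onset = asleep.index(True) if n_sleep else n_epochs
    n_wake_after_onset = asleep[onset:].count(False)
    metrics = {
        "TST (h)": n_sleep * epoch_h,
        "Recording (h)": n_epochs * epoch_h,
        "Eff (%)": 100 * n_sleep / n_epochs if n_epochs else 0.0,
        "SL (min)": onset * epoch_min,
        "WASO (min)": n_wake_after_onset * epoch_min,
    }
    for stage, label in (("R", "REM (%)"), ("N1", "N1 (%)"),
                         ("N2", "N2 (%)"), ("N3", "N3 (%)")):
        metrics[label] = 100 * hypnogram.count(stage) / n_sleep if n_sleep else 0.0
    return metrics


def events_per_hour(n_events, tst_hours):
    """Generic index (e.g. AHI or arousal index): events per hour of sleep."""
    return n_events / tst_hours if tst_hours > 0 else float("nan")
```

Indices such as `AHI`, `Arousal I.`, and `LMI` are conventionally event counts divided by hours of sleep, which is what `events_per_hour` illustrates. The stage-specific variants (`AHI REM`, `AHI NREM`) would use the events and sleep time within that stage only.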
You can optionally clean up Docker images and containers after running the tasks. This will require reloading/rebuilding the Docker images for future runs.
- Error Handling: The script includes basic error handling. If a Docker image cannot be found or loaded, the script will exit with an error message.
- Modularity: You can customize the tasks and their order by modifying the `tasks` list in the script.
- Preprocessing: If your data is already preprocessed, you can skip the `preprocess` task.
- Docker Not Found: Ensure Docker is installed and running. Check by running `docker --version`.
- Docker Image Not Found: Make sure all required Docker images are available in the `dockers/` folder or already installed on your system.
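
The first check above can also be done from Python before starting a run. The following is a minimal sketch that uses only the standard library; the function name `docker_available` and its parameters are hypothetical and not part of `caisr.py`.

```python
import shutil
import subprocess


def docker_available(executable="docker", subcommand="--version"):
    """Return True if `<executable> <subcommand>` runs and exits successfully."""
    # Fail fast when the executable is not on PATH at all.
    if shutil.which(executable) is None:
        return False
    try:
        result = subprocess.run(
            [executable, subcommand],
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

Note that `docker --version` only confirms that the command-line client is installed. Calling `docker_available(subcommand="info")` additionally requires a reachable Docker daemon, so it is a closer test of "installed and running".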
This project is free to use for non-commercial purposes. For commercial use, please contact us directly.
Please make sure to cite the following paper if you use CAISR in any of your work:
Nasiri, S., Ganglberger, W., Nassi, T., Meulenbrugge, E. J., Moura Junior, V., Ghanta, M., ... & Westover, M. B. (2025). CAISR: Achieving Human-Level Performance in Automated Sleep Analysis Across All Clinical Sleep Metrics. Sleep, zsaf134.
For support or inquiries, please open an issue on GitHub. If you have any questions or need clarification, the CAISR development team can be contacted via:
- Samaneh Nasiri, PhD
- Wolfgang Ganglberger, PhD
- Thijs-Enagnon Nassi, PhD
- Erik-Jan Meulenbrugge
- Haoqi Sun, PhD
- Robert J Thomas, MD
- M Brandon Westover, MD, PhD

