|
| 1 | +# Building the MIMIC database with SQLite |
| 2 | + |
| 3 | +Either `import.sh` or `import.py` can be used to generate a [SQLite](https://sqlite.org/index.html) database file from the MIMIC-IV demo or full dataset. |
| 4 | + |
| 5 | +`import.sh` is a shell script that works with any POSIX-compliant shell. |
| 6 | +It is memory efficient and does not require loading entire data files |
| 7 | +into memory. It only needs three things to run: |
| 8 | + |
| 9 | +1. A POSIX-compliant shell (e.g., dash, bash, zsh, or ksh) |
| 10 | +2. [SQLite](https://sqlite.org/index.html) |
| 11 | +3. gzip (installed by default on most Linux, BSD, and macOS systems) |
| 12 | + |
| 13 | +**Note:** The `import.sh` script will set all data fields to *text*. |
| 14 | + |
| 15 | +`import.py` is a Python script. It requires the following to run: |
| 16 | + |
| 17 | +1. Python 3 installed |
| 18 | +2. SQLite |
| 19 | +3. [pandas](https://pandas.pydata.org/) |
| 20 | + |
| 21 | +## Step 1: Download the CSV or CSV.GZ files. |
| 22 | + |
| 23 | +- Download the MIMIC-IV dataset from: https://physionet.org/content/mimiciv/ |
| 24 | +- Place `import.sh` or `import.py` into the same folder as the `csv` or `csv.gz` files |
| 25 | + |
| 26 | +Your folder structure should resemble: |
| 27 | + |
| 28 | +``` |
| 29 | +path/to/mimic-iv/ |
| 30 | +├── import.sh |
| 31 | +├── import.py |
| 32 | +├── hosp |
| 33 | +│ ├── admissions.csv.gz |
| 34 | +│ ├── ... |
| 35 | +│ └── transfers.csv.gz |
| 36 | +└── icu |
| 37 | + ├── chartevents.csv.gz |
| 38 | + ├── ... |
| 39 | + └── procedureevents.csv.gz |
| 40 | +``` |
| 41 | +
|
| 42 | +## Step 2: Edit the script if needed. |
| 43 | +
|
| 44 | +`import.sh` does **not** need edits to work with either the demo or full dataset. |
| 45 | +Please continue to Step 3. |
| 46 | +
|
| 47 | +If you are using `import.py`, |
| 48 | +you may need to make minor edits to the script. For example: |
| 49 | +
|
| 50 | +- If your files are `.csv` rather than `.csv.gz`, you will need to change `csv.gz` to `csv` in the script. |
| 51 | +
|
| 52 | +## Step 3: Generate the SQLite file |
| 53 | +
|
| 54 | +To generate the SQLite file: |
| 55 | +
|
| 56 | +If you are using `import.sh`, run on the command line: |
| 57 | +
|
| 58 | +``` |
| 59 | +$ ./import.sh |
| 60 | +``` |
| 61 | +
|
| 62 | +If you are using `import.py`, run on the command line: |
| 63 | +
|
| 64 | +``` |
| 65 | +$ python import.py |
| 66 | +``` |
| 67 | +
|
| 68 | +Loading the full dataset will take some time, |
| 69 | +particularly for the `CHARTEVENTS` table. |
| 70 | +
|
| 71 | +Both scripts generate a SQLite database file called `mimic4.db`. |
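
Once `mimic4.db` exists, a quick sanity check can confirm that the tables loaded. The following is a minimal sketch using Python's built-in `sqlite3` module. The helper names (`list_tables`, `count_rows`, `max_as_number`) are hypothetical and are not part of `import.sh` or `import.py`, and the sketch assumes `mimic4.db` sits in the current directory. Because `import.sh` stores every field as text, the last helper casts a column to a number before aggregating, since a plain `MAX` on a text column compares values as strings.

```python
import sqlite3
from pathlib import Path


def list_tables(conn):
    """Return the names of all tables in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def count_rows(conn, table):
    """Count the rows in one table after checking that the table exists."""
    if table not in list_tables(conn):
        raise ValueError(f"unknown table: {table}")
    # Identifiers cannot be bound as SQL parameters, so the validated
    # name is quoted and interpolated instead.
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


def max_as_number(conn, table, column):
    """Return the maximum of a column after casting its values to REAL.

    Useful when the column was imported as text (as import.sh does).
    """
    if table not in list_tables(conn):
        raise ValueError(f"unknown table: {table}")
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    if column not in columns:
        raise ValueError(f"unknown column: {column}")
    query = f'SELECT MAX(CAST("{column}" AS REAL)) FROM "{table}"'
    return conn.execute(query).fetchone()[0]


if __name__ == "__main__":
    db_path = Path("mimic4.db")  # assumption: database is in the current directory
    if db_path.exists():
        conn = sqlite3.connect(db_path)
        for table in list_tables(conn):
            print(table, count_rows(conn, table))
        conn.close()
    else:
        print(f"{db_path} not found; run import.sh or import.py first")
```

Running the file prints each table name with its row count. Counting rows in the largest tables of the full dataset may itself take a while.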