Commit 0b0a5bc

authored
Update omop-ohdsi-resources.md
1 parent 8f536b7 commit 0b0a5bc

1 file changed

+345
-18
lines changed

docs/omop-ohdsi-resources.md

Lines changed: 345 additions & 18 deletions
@@ -1,27 +1,354 @@
1-
# OMOP/OHDSI Resources
1+
---
2+
title: OMOP/OHDSI Resources
3+
---
24

3-
This chapter contains practical resources for working with OHDSI and OMOP, including an interactive OMOP data dictionary, code snippets for common analytic tasks, and examples of observational research projects suited to OMOP/OHDSI frameworks.
5+
This chapter contains practical resources for working with the Observational Health Data Sciences and Informatics (OHDSI) community and with Observational Medical Outcomes Partnership (OMOP) data, including an interactive OMOP data dictionary, code snippets for common analytic tasks in R, SQL, and Python, and examples of observational research projects suited to OMOP/OHDSI frameworks. For an overview of OHDSI/OMOP, see the resources in the chapter "New to RWE? Start Here."
46

5-
## OMOP CDM Basic Data Dictionary
6-
(Click-through resource in the original.)
7+
# OMOP Scavenger Hunt
78

8-
## Projects Best Suited for Observational Research and OHDSI Network Studies
9-
- Analytic use cases and examples (Clinical Characterization, Treatment Utilization, Outcome Incidence, etc.)
9+
In addition to the essential resources listed on the "New to RWE? Start Here" page, jump-start your expertise by exploring these resources early in your OHDSI journey.
1010

11-
## Commonly Used CDM Tables Overview
12-
Person, Visit Occurrence, Condition Occurrence, Drug Exposure, Measurement, Procedure Occurrence, Observation, Device Exposure, Death.
11+
---
12+
## Join the Community
13+
### [Introduce yourself on the OHDSI Forums](https://forums.ohdsi.org/) — “Welcome to OHDSI” thread
14+
### [Follow OHDSI on LinkedIn](https://www.linkedin.com/company/ohdsi)
15+
### [Subscribe to the OHDSI Newsletter](https://ohdsi.org/subscribe-to-our-newsletter/)
16+
### [Learn about OHDSI Workgroups](https://ohdsi.org/upcoming-working-group-calls/)
17+
### [Attend an OHDSI Community Call](https://ohdsi.org/community-calls/)
18+
### [Review past & upcoming OHDSI events](https://www.ohdsi.org/ohdsi2025/)
19+
### [Join the OHDSI Microsoft Teams environment](https://forms.office.com/Pages/ResponsePage.aspx?id=lAAPoyCRq0q6TOVQkCOy1ZyG6Ud_r2tKuS0HcGnqiQZUQ05MOU9BSzEwOThZVjNQVVFGTDNZRENONiQlQCN0PWcu)
1320

14-
## OMOP Data Quality
15-
Pointers to The Book of OHDSI and Kahn et al. (2016).
21+
## Reading and Reference
22+
### [Bookmark the Book of OHDSI](https://ohdsi.github.io/TheBookOfOhdsi/)
23+
### [Bookmark NIH All of Us OMOP documentation](https://support.researchallofus.org/hc/en-us/articles/360039585391-How-the-Observational-Medical-Outcomes-Partnership-OMOP-vocabulary-are-structured)
1624

17-
## ETL Basics and Steps
18-
High-level steps from profiling and mapping to testing and deployment.
25+
## OMOP Training & Tutorials
26+
### [Enroll in the EHDEN Academy](https://academy.ehden.eu/)
27+
### [Watch OHDSI tutorials & workshops](https://youtube.com/playlist?list=PLpzbqK7kvfeXRQktX0PV-cRpb3EFA2e7Z)
1928

20-
## OHDSI Analysis Tools, Data Science Handbook, Data Management Tools & Resources, Programming Resources
21-
Multiple references listed in the original.
29+
## The OMOP CDM
30+
### [Bookmark the OMOP Common Data Model site](https://ohdsi.github.io/CommonDataModel/index.html)
31+
### [Read the OMOP CDM FAQ](https://ohdsi.github.io/CommonDataModel/faq.html)
2232

23-
## Special Topic: Clinical Registries Using OHDSI
24-
Includes slides link.
33+
## Standardized Vocabularies
34+
### [Search vocabularies in Athena](https://athena.ohdsi.org/search-terms/start) — look up individual concepts of interest to you
35+
36+
## Data
37+
### [Download the MIMIC-IV demo OMOP dataset](https://physionet.org/content/mimic-iv-demo-omop/0.9/1_omop_data_csv/)
38+
39+
## Software & Tools
40+
### [Review OHDSI software tools](https://ohdsi.org/software-tools/)
41+
### [Explore the Atlas Demo](https://atlas-demo.ohdsi.org/)
42+
43+
---
44+
45+
## 📊 OMOP CDM Basic Data Dictionary
46+
47+
For a sample interactive OMOP data dictionary detailing the fields in the OMOP CDM, please click on the thumbnail below. For the specific ARC study data dictionary, visit the [Neuromine Data Portal](https://data.answerals.org/home).
48+
49+
[![OMOP Data Dictionary Thumbnail](assets/DDthumbnail.png)](assets/OMOPDD.html)
50+
51+
## 📈 Projects Best Suited for Observational Research and OHDSI Network Studies
52+
53+
---
54+
55+
<h3>🧪 Analytic Use Cases and Examples</h3>
56+
57+
<style>
58+
table.use-case-table {
59+
width: 100%;
60+
border-collapse: collapse;
61+
margin-top: 1em;
62+
}
63+
64+
table.use-case-table th,
65+
table.use-case-table td {
66+
border: 1px solid #aaa;
67+
padding: 8px;
68+
text-align: left;
69+
vertical-align: top;
70+
}
71+
72+
table.use-case-table th {
73+
background-color: #f2f2f2;
74+
}
75+
</style>
76+
77+
<table class="use-case-table">
78+
<thead>
79+
<tr>
80+
<th>Analytic Use Case</th>
81+
<th>Type</th>
82+
<th>Structure</th>
83+
<th>Example</th>
84+
</tr>
85+
</thead>
86+
<tbody>
87+
<tr>
88+
<td rowspan="3"><strong>Clinical Characterization</strong></td>
89+
<td>Disease Natural History</td>
90+
<td>Amongst patients who are diagnosed with &lt;insert your disease of interest&gt;, what are the patients’ characteristics from their medical history?</td>
91+
<td>Amongst patients with rheumatoid arthritis, what are their demographics (age, gender), prior conditions, medications, and health service utilization behaviors?</td>
92+
</tr>
93+
<tr>
94+
<td>Treatment Utilization</td>
95+
<td>Amongst patients who have &lt;insert your disease of interest&gt;, which treatments were patients exposed to amongst &lt;list of treatments for disease&gt; and in which sequence?</td>
96+
<td>Amongst patients with depression, which treatments were patients exposed to amongst SSRI, SNRI, TCA, bupropion, and esketamine, and in which sequence?</td>
97+
</tr>
98+
<tr>
99+
<td>Outcome Incidence</td>
100+
<td>Amongst patients who are new users of &lt;insert your drug of interest&gt;, how many patients experienced &lt;insert your known adverse event of interest from the drug profile&gt; within &lt;time horizon following exposure start&gt;?</td>
101+
<td>Amongst patients who are new users of methylphenidate, how many patients experienced psychosis within 1 year of initiating treatment?</td>
102+
</tr>
103+
<tr>
104+
<td rowspan="2"><strong>Population-level Effect Estimation</strong></td>
105+
<td>Safety Surveillance</td>
106+
<td>Does exposure to &lt;insert your drug of interest&gt; increase the risk of experiencing &lt;insert an adverse event&gt; within &lt;time horizon following exposure start&gt;?</td>
107+
<td>Does exposure to an ACE inhibitor increase the risk of experiencing angioedema within 1 month after exposure start?</td>
108+
</tr>
109+
<tr>
110+
<td>Comparative Effectiveness</td>
111+
<td>Does exposure to &lt;insert your drug of interest&gt; have a different risk of experiencing &lt;insert any outcome (safety or benefit)&gt; within &lt;time horizon following exposure start&gt;, relative to &lt;insert your comparator treatment&gt;?</td>
112+
<td>Does exposure to an ACE inhibitor have a different risk of experiencing acute myocardial infarction while on treatment, relative to a thiazide diuretic?</td>
113+
</tr>
114+
<tr>
115+
<td rowspan="3"><strong>Patient-level Prediction</strong></td>
116+
<td>Disease Onset and Progression</td>
117+
<td>For a given patient who is diagnosed with &lt;insert your disease of interest&gt;, what is the probability that they will go on to have &lt;another disease or related complication&gt; within &lt;time horizon from diagnosis&gt;?</td>
118+
<td>For a given patient who is newly diagnosed with atrial fibrillation, what is the probability that they will go on to have an ischemic stroke in the next 3 years?</td>
119+
</tr>
120+
<tr>
121+
<td>Treatment Response</td>
122+
<td>For a given patient who is a new user of &lt;insert your chronically-used drug of interest&gt;, what is the probability that they will &lt;insert desired effect&gt; in &lt;time window&gt;?</td>
123+
<td>For a given patient with T2DM who starts on metformin, what is the probability that they will maintain HbA1C &lt;6.5% after 3 years?</td>
124+
</tr>
125+
<tr>
126+
<td>Treatment Safety</td>
127+
<td>For a given patient who is a new user of &lt;insert your drug of interest&gt;, what is the probability that they will experience &lt;insert adverse event&gt; within &lt;time horizon following exposure&gt;?</td>
128+
<td>For a given patient who is a new user of warfarin, what is the probability that they will have a GI bleed within 1 year?</td>
129+
</tr>
130+
</tbody>
131+
</table>
132+
133+
**Source:** OHDSI. *(2023).* [Save Our Sisyphus Challenge Slides (PDF)](https://www.ohdsi.org/wp-content/uploads/2023/01/SOS-challenge-intro-24jan2023.pdf)
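
The "Outcome Incidence" template in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the person IDs, dates, and the helper name `count_outcomes_within` are hypothetical, and a real study would draw index and outcome dates from OMOP tables such as `drug_exposure` and `condition_occurrence`.

```python
from datetime import date, timedelta

# Hypothetical data: first exposure (index) date per person, and outcome events.
first_exposures = {
    1: date(2020, 1, 10),
    2: date(2020, 3, 5),
    3: date(2021, 6, 1),
}
outcomes = [
    (1, date(2020, 7, 1)),   # inside the 1-year window
    (2, date(2022, 1, 1)),   # after the window closes
    (3, date(2021, 5, 1)),   # before the index date
]


def count_outcomes_within(first_exposures, outcomes, horizon_days):
    """Count new users with at least one outcome in [index, index + horizon]."""
    cases = set()
    for person_id, outcome_date in outcomes:
        index_date = first_exposures.get(person_id)
        if index_date is None:
            continue  # outcome for someone who is not a new user
        window_end = index_date + timedelta(days=horizon_days)
        if index_date <= outcome_date <= window_end:
            cases.add(person_id)
    return len(cases), len(first_exposures)


n_cases, n_new_users = count_outcomes_within(first_exposures, outcomes, 365)
print(f"{n_cases} of {n_new_users} new users had the outcome within 1 year")
```

Collecting cases in a set means a person with several qualifying events is counted once, which matches the "how many patients" wording of the question.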
134+
---
135+
---
136+
137+
## 🧭 Current CDM
138+
139+
![CDM54 Image](assets/cdm54.png)
140+
141+
*Source: [OHDSI Common Data Model](https://ohdsi.github.io/CommonDataModel/index.html)*
142+
143+
144+
145+
- 🔗 **Interactive (Select) OMOP Data Dictionary**
146+
https://github.com/DBJHU/DBJHU.github.io/blob/main/SelectOMOPDataDictionaryInteractivev2.html
147+
148+
149+
150+
---
151+
152+
## 🗂️ Commonly Used CDM Tables Overview
153+
154+
The OMOP Common Data Model (CDM) is a relational data model whose tables are linked to each other by foreign keys (XXXX_ID values; e.g., PERSON_ID or PROVIDER_ID). The OMOP tables in your data export are as follows:
155+
156+
| Table | Description |
157+
|----------------------|-------------|
158+
| **Person** | Contains basic demographic information describing a participant, including biological sex, birth date, race, and ethnicity. |
159+
| **Visit_occurrence** | Captures encounters with healthcare providers or similar events. Contains the type of visit a person has (outpatient care, inpatient care, or long-term care), as well as the date and duration information. Rows in other tables can reference this table, for example, condition_occurrences related to a specific visit. |
160+
| **Condition_occurrence** | Indicates the presence of a disease or medical condition stated as a diagnosis, a sign, or symptom, which is either observed by a provider or reported by the patient. |
161+
| **Drug_exposure** | Captures records about the utilization of a medication. Drug exposures include prescription and over-the-counter medicines, vaccines, and large-molecule biologic therapies. Radiological devices ingested or applied locally do not count as drugs. Drug exposure is inferred from clinical events associated with orders, prescriptions written, pharmacy dispensing, procedural administrations, and other patient-reported information. |
162+
| **Measurement** | Contains both orders and results of a systematic and standardized examination or testing of a participant or participant's sample, including laboratory tests, vital signs, quantitative findings from pathology reports, etc. |
163+
| **Procedure_occurrence** | Contains records of activities or processes ordered by or carried out by a healthcare provider on the patient to have a diagnostic or therapeutic purpose. |
164+
| **Observation** | Captures clinical facts about a person obtained in the context of an examination, questioning, or a procedure. Any data that cannot be represented by another domain, such as social and lifestyle facts, medical history, and family history, are recorded here. |
165+
| **Device_exposure** | Captures information about a person's exposure to a foreign physical object or instrument which is used for diagnostic or therapeutic purposes. Devices include implantable objects, blood transfusions, medical equipment and supplies, other instruments used in medical procedures, and material used in clinical care. |
166+
| **Death** | Contains the clinical events surrounding how and when a participant dies. |
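
As a rough illustration of how these tables relate through `person_id`, the sketch below builds two tiny tables in an in-memory SQLite database and joins them. The column names follow the OMOP CDM, but the tables are heavily trimmed, and the rows and concept IDs (9000001, 9000002) are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    person_id     INTEGER PRIMARY KEY,
    year_of_birth INTEGER
);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id               INTEGER REFERENCES person(person_id),
    condition_concept_id    INTEGER,
    condition_start_date    TEXT
);
-- Hypothetical rows; the concept IDs are placeholders, not real concepts.
INSERT INTO person VALUES (1, 1960), (2, 1985);
INSERT INTO condition_occurrence VALUES
    (10, 1, 9000001, '2021-02-03'),
    (11, 1, 9000002, '2022-05-06'),
    (12, 2, 9000001, '2020-11-30');
""")

# Join each person to their condition records through the person_id foreign key.
rows = conn.execute("""
    SELECT p.person_id, p.year_of_birth, COUNT(*) AS n_conditions
    FROM person AS p
    JOIN condition_occurrence AS co ON co.person_id = p.person_id
    GROUP BY p.person_id, p.year_of_birth
    ORDER BY p.person_id
""").fetchall()

for person_id, year_of_birth, n_conditions in rows:
    print(person_id, year_of_birth, n_conditions)
```

The same join pattern applies to the other clinical event tables listed above, since each carries a `person_id` column.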
167+
168+
169+
170+
---
171+
172+
## ✅ OMOP Data Quality
173+
174+
- [The Book of OHDSI — Chapter 15: Data Quality](https://ohdsi.github.io/TheBookOfOhdsi/DataQuality.html)
175+
- [Kahn et al. (2016): A Harmonized Data Quality Assessment Terminology and Framework](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5051581/)
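
The Kahn et al. framework groups data quality checks into conformance, completeness, and plausibility. A minimal sketch of one check per category on `year_of_birth` might look like the following; the sample rows, the function name `check_persons`, and the 1850 lower bound are assumptions made for illustration, not rules taken from any OHDSI tool.

```python
# Hypothetical person rows; only the fields needed for the checks are shown.
persons = [
    {"person_id": 1, "year_of_birth": 1970},
    {"person_id": 2, "year_of_birth": None},    # missing value
    {"person_id": 3, "year_of_birth": "1980"},  # wrong data type
    {"person_id": 4, "year_of_birth": 2150},    # implausible value
]


def check_persons(persons, current_year, earliest_year=1850):
    """Return (person_id, category, message) tuples for failed checks."""
    issues = []
    for p in persons:
        yob = p.get("year_of_birth")
        if yob is None:
            issues.append((p["person_id"], "completeness", "year_of_birth is missing"))
        elif not isinstance(yob, int):
            issues.append((p["person_id"], "conformance", "year_of_birth is not an integer"))
        elif not earliest_year <= yob <= current_year:
            issues.append((p["person_id"], "plausibility", "year_of_birth is out of range"))
    return issues


issues = check_persons(persons, current_year=2025)
for person_id, category, message in issues:
    print(person_id, category, message)
```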
176+
177+
---
178+
179+
## 🔧 ETL Basics
180+
181+
- PDF: https://www.ohdsi.org/wp-content/uploads/2019/09/OMOP-Common-Data-Model-Extract-Transform-Load.pdf
182+
- Book: https://ohdsi.github.io/TheBookOfOhdsi/ExtractTransformLoad.html
183+
184+
185+
186+
---
187+
188+
## 🛠️ ETL Steps
189+
190+
1. **Dataset profiling and documentation**
191+
- Create data model documentation, sample data, data dictionaries, code lists, and other relevant information (23-Aug)
192+
- Execute database profiling scan (WhiteRabbit) on source database
193+
- Prepare mapping approach/documents based on scan reports from database profiling scan
194+
195+
2. **Generation of the ETL Design**
196+
- Mapping workshop with all relevant parties to:
197+
1. Understand the source
198+
2. Define the scope of source data to be transformed
199+
3. Define acceptance criteria for OMOP output
200+
**Output:** draft mapping document
201+
- Finalize mapping document:
202+
- Integrate all notes/documentation from workshop
203+
- Work through mappings and verify, update, fill in gaps
204+
- Meetings/emails with data contact/technical contact (TC) as needed
205+
206+
3. **Source Data Integrations and Semantic Mapping**
207+
- Source Code mapping:
208+
- Identify which codes are already mapped to standard vocabulary
209+
- Identify code types for codes that need to be mapped
210+
- Translation of code description/phrases to English, if/as needed
211+
- Create proposed code mappings
212+
- Generate mappings for data coming out of flowsheets (together with consortium)
213+
- Review/approval of code mappings (often by medical experts with the Data Owner)
214+
- Identify imaging & waveform data; map using consortium-defined guidelines
215+
- Use OHNLP to extract OMOP data from unstructured sources
216+
217+
4. **Technical architecture design**
218+
- CI/CD strategy & version control
219+
- OHDSI ecosystem needs & infrastructure design
220+
221+
5. **Technical ETL Development**
222+
- Implement ETL (preferred language/structure)
223+
- Update ETL based on testing/QA/feedback (see steps 8 and 9)
224+
225+
6. **Setting up Infrastructure**
226+
- Deploy core servers and services based on the technical architecture design (step 4)
227+
228+
7. **Install OHDSI tools**
229+
- Database server, Achilles/DQD/Ares, Atlas/WebAPI, RStudio Server, HADES, notebooks & other site-specific tools
230+
231+
8. **ETL Testing and Validation**
232+
- Test ETL on sample/dev data, then on Data Owner (DO) data
233+
- Verify & document QA
234+
- Submit Achilles/DQD/AresIndexer results regularly
235+
- Plan & manage ETL development
236+
237+
9. **Data Quality Assessment**
238+
- QA/Acceptance testing for mapping accuracy & completeness
239+
- Review & approval by Data Owner
240+
241+
10. **Documentation**
242+
- Mapping Documentation, Themis checks, and technical/transform documentation
243+
244+
11. **Project Management Throughout**
245+
- Organize tasks, milestones, and follow-up
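
The source code mapping work in step 3 can be pictured with a small Python sketch. It assumes a hypothetical lookup from (vocabulary, code) pairs to standard concept IDs; the IDs 9000001 and 9000002 and the `LOCAL` vocabulary are placeholders. Following the OMOP convention, codes without a mapping receive concept ID 0, and the sketch tallies them so they can be queued for review.

```python
from collections import Counter

# Hypothetical lookup of already-mapped source codes (placeholder concept IDs).
source_to_standard = {
    ("ICD10CM", "E11.9"): 9000001,
    ("ICD10CM", "I10"): 9000002,
}

# Hypothetical source records awaiting transformation.
source_records = [
    {"person_id": 1, "vocabulary": "ICD10CM", "code": "E11.9"},
    {"person_id": 2, "vocabulary": "ICD10CM", "code": "I10"},
    {"person_id": 3, "vocabulary": "LOCAL", "code": "DM2"},
    {"person_id": 4, "vocabulary": "LOCAL", "code": "DM2"},
]


def map_records(records, lookup):
    """Attach a concept ID to each record and tally codes that still need mapping."""
    mapped = []
    unmapped = Counter()
    for rec in records:
        key = (rec["vocabulary"], rec["code"])
        concept_id = lookup.get(key, 0)  # 0 marks "no matching concept"
        if concept_id == 0:
            unmapped[key] += 1
        mapped.append({**rec, "condition_concept_id": concept_id})
    return mapped, unmapped


mapped, unmapped = map_records(source_records, source_to_standard)
for key, count in unmapped.most_common():
    print(f"needs mapping: {key} ({count} records)")
```

Sorting unmapped codes by frequency is one way to prioritize review effort toward the codes that affect the most records.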
246+
247+
---
248+
249+
## 🧪 OHDSI Analysis Tools
250+
251+
R, SQL, Python, or any preferred data analysis software.
252+
**Reference:** [The Book of OHDSI — Chapter 9: SQL and R](https://ohdsi.github.io/TheBookOfOhdsi/SqlAndR.html)
253+
254+
255+
256+
---
257+
258+
## 📘 Data Science Handbook
259+
260+
[Open, rigorous and reproducible research: A practitioner’s handbook](https://datascience.stanford.edu/programs/stanford-data-science-scholars-program/data-science-handbook) — Stanford Data Science
261+
262+
263+
264+
---
265+
266+
## 🧰 Data Management Tools & Resources
267+
268+
- DMP Tool: https://dmptool.org/
269+
- NIH DMS Policy Planning: https://sharing.nih.gov/data-management-and-sharing-policy/planning-and-budgeting-for-data-management-and-sharing/writing-a-data-management-and-sharing-plan#after
270+
271+
---
272+
273+
## 💻 Programming Resources (Jupyter, Python, SQL, R)
274+
275+
- [Project Jupyter](https://jupyter.org/)
276+
- [What is the Jupyter Notebook?](https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/what_is_jupyter.html)
277+
- [NIAID NIH Informatics resources](https://bioinformatics.niaid.nih.gov/resources)
278+
279+
**Software Carpentry (free lessons):**
280+
- [Programming with Python](http://swcarpentry.github.io/python-novice-inflammation/)
281+
- [Programming with R](http://swcarpentry.github.io/r-novice-inflammation/)
282+
- [Databases and SQL](http://swcarpentry.github.io/sql-novice-survey/)
283+
284+
**Additional resources:**
285+
- [DataCamp](http://www.datacamp.com/)
286+
- [Khan Academy — SQL Basics](https://www.khanacademy.org/computing/computer-programming/sql/sql-basics/v/welcome-to-sql)
287+
- [Codecademy — Learn Python 2](https://www.codecademy.com/learn/learn-python)
288+
- [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/)
289+
- [R for Data Science](https://r4ds.had.co.nz/)
290+
- NIH “All of Us” documentation:
291+
[Jupyter & programming](https://support.researchallofus.org/hc/en-us/articles/360039690191-Jupyter-Notebooks-and-programming)
292+
293+
294+
295+
---
296+
297+
## 🌐 OHDSI Resources
298+
299+
- [OHDSI Forums](https://forums.ohdsi.org/): *Introduce yourself on the “Welcome to OHDSI” thread!*
300+
- [The Book of OHDSI](https://ohdsi.github.io/TheBookOfOhdsi/)
301+
- [OMOP CDM FAQ](https://ohdsi.github.io/CommonDataModel/faq.html)
302+
- [OHDSI Microsoft Teams](https://forms.office.com/Pages/ResponsePage.aspx?id=lAAPoyCRq0q6TOVQkCOy1ZyG6Ud_r2tKuS0HcGnqiQZUQ05MOU9BSzEwOThZVjNQVVFGTDNZRENONiQlQCN0PWcu)
303+
- [MIMIC-IV demo OMOP dataset](https://physionet.org/content/mimic-iv-demo-omop/0.9/1_omop_data_csv/)
304+
- [EHDEN Academy](https://academy.ehden.eu/)
305+
- [Atlas Demo](https://atlas-demo.ohdsi.org/) and [Athena](https://athena.ohdsi.org/search-terms/start)
306+
- [OHDSI YouTube: tutorials & workshops](https://youtube.com/playlist?list=PLpzbqK7kvfeXRQktX0PV-cRpb3EFA2e7Z)
307+
- [OHDSI Community Dashboard](https://dash.ohdsi.org/)
308+
- [OMOP Common Data Model (docs)](https://ohdsi.github.io/CommonDataModel/index.html)
309+
- [Learn GitHub](https://docs.github.com/en/get-started/quickstart/hello-world)
310+
- [Community Calls](https://ohdsi.org/community-calls/) and [Workgroups](https://ohdsi.org/upcoming-working-group-calls/)
311+
- Follow OHDSI: [Twitter](https://twitter.com/OHDSI) and [LinkedIn](https://www.linkedin.com/company/ohdsi)
312+
- [Subscribe to the OHDSI Newsletter](https://ohdsi.org/subscribe-to-our-newsletter/)
313+
- [OHDSI software](https://ohdsi.org/software-tools/)
314+
- [NIH All of Us — OMOP documentation](https://support.researchallofus.org/hc/en-us/articles/360039585391-How-the-Observational-Medical-Outcomes-Partnership-OMOP-vocabulary-are-structured)
315+
316+
317+
318+
---
319+
320+
## ⭐ Special Topic: Clinical Registries Using OHDSI
321+
322+
[![OHDSI and Clinical Registries: Sanity for Health Systems (Aug. 22 Community Call)](http://img.youtube.com/vi/DPatSxFkIpI/0.jpg)](https://youtu.be/DPatSxFkIpI?si=VOqE4VTlzIcxuWdP)
323+
324+
- Slides: [Clinical Registries in OHDSI — September 2022 (PDF)](https://www.ohdsi.org/wp-content/uploads/2022/09/OHDSI-Clinical_Registries.pdf)
325+
326+
## 💻 OMOP Code Snippets
327+
328+
We provide a publicly available set of OMOP code snippets used in the [I-LEARN Course](https://ilearn.tuftsctsi.org/product?catalog=D1RS_2025_18) to help learners explore and analyze OMOP Common Data Model datasets using tools like **R**, **SQL**, and **Python**.
329+
330+
🔗 **Repository**: [BoyceLab/OMOP-Code-Snippets-for-I-LEARN-Course](https://github.com/BoyceLab/OMOP-Code-Snippets-for-I-LEARN-Course)
331+
332+
### 🧰 What You'll Find in this Repository
333+
The repository contains example scripts and templates to:
334+
335+
- Query OMOP data using **SQL**
336+
- Analyze OMOP-mapped data using **R**
337+
- Connect and run queries via **RPostgreSQL**
338+
- Explore how standard concepts relate to source codes
339+
340+
### 📂 Folder Highlights
341+
342+
- `SQL/`: Ready-to-use SQL queries for common OMOP domains (e.g., drug exposure, observation).
343+
- `R/`: R scripts that demonstrate how to load, analyze, and visualize OMOP data.
344+
- `concepts/`: Examples for working with concept_id and concept_relationship tables.
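
To give a flavor of the concept work described above, here is a minimal sketch (not taken from the repository) that follows a `Maps to` relationship from a source concept to a standard concept. The table and column names follow the OMOP vocabulary tables, but the tables are trimmed, and every row, including the vocabulary names and concept IDs, is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id       INTEGER PRIMARY KEY,
    concept_name     TEXT,
    vocabulary_id    TEXT,
    concept_code     TEXT,
    standard_concept TEXT
);
CREATE TABLE concept_relationship (
    concept_id_1    INTEGER,
    concept_id_2    INTEGER,
    relationship_id TEXT
);
-- Hypothetical rows: one source concept mapped to one standard concept.
INSERT INTO concept VALUES
    (9000101, 'Example source diagnosis', 'ExampleSource', 'X1', NULL),
    (9000001, 'Example standard condition', 'ExampleStandard', 'S1', 'S');
INSERT INTO concept_relationship VALUES (9000101, 9000001, 'Maps to');
""")

# Look up the standard concept(s) that a given source code maps to.
query = """
    SELECT src.concept_code, std.concept_id, std.concept_name
    FROM concept AS src
    JOIN concept_relationship AS cr
      ON cr.concept_id_1 = src.concept_id
     AND cr.relationship_id = 'Maps to'
    JOIN concept AS std
      ON std.concept_id = cr.concept_id_2
     AND std.standard_concept = 'S'
    WHERE src.vocabulary_id = ? AND src.concept_code = ?
"""
mappings = conn.execute(query, ("ExampleSource", "X1")).fetchall()
print(mappings)
```

Passing the vocabulary and code as query parameters, rather than formatting them into the SQL string, keeps the lookup safe to reuse with arbitrary source codes.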
345+
346+
### 📘 Use Cases
347+
348+
These snippets are designed for:
349+
- Learners in the **Tufts CTSI I-LEARN course**
350+
- Researchers new to **OHDSI/OMOP**
351+
- Analysts working with **OMOP-formatted ALS datasets**
352+
353+
---
25354

26-
## OMOP Code Snippets
27-
Repository link in the original.
