A Python tool for the structured analysis of clinical trial eligibility criteria by extracting and organizing atomic criteria into logical structures.
ClearMatch processes clinical trial data from ClinicalTrials.gov, extracting structured information about eligibility criteria. It performs three key steps:
- Identification – Extracts atomic criteria from raw text.
- Logical Structuring – Organizes these criteria using logical relationships (AND, OR, NOT, XOR, CONDITIONAL).
- Matching Patients to Oncological Clinical Trials – (Planned but not yet implemented).
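To make the logical relationships above concrete, here is a minimal sketch of how atomic criteria could be combined into a tree. The class names (`Atomic`, `And`, `Conditional`, and so on) and the `evaluate` helper are hypothetical and chosen for illustration only; ClearMatch's actual schema may differ, and the XOR and CONDITIONAL semantics shown are assumptions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atomic:
    """A single, indivisible criterion, e.g. 'age >= 18'."""
    text: str


@dataclass(frozen=True)
class Not:
    operand: "Node"


@dataclass(frozen=True)
class And:
    operands: tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    operands: tuple["Node", ...]


@dataclass(frozen=True)
class Xor:
    operands: tuple["Node", ...]


@dataclass(frozen=True)
class Conditional:
    condition: "Node"
    consequence: "Node"


Node = Union[Atomic, Not, And, Or, Xor, Conditional]


def evaluate(node: Node, facts: dict[str, bool]) -> bool:
    """Evaluate a criteria tree against known truth values of atomic criteria."""
    if isinstance(node, Atomic):
        return facts[node.text]
    if isinstance(node, Not):
        return not evaluate(node.operand, facts)
    if isinstance(node, And):
        return all(evaluate(n, facts) for n in node.operands)
    if isinstance(node, Or):
        return any(evaluate(n, facts) for n in node.operands)
    if isinstance(node, Xor):
        # Assumption: XOR means "exactly one operand holds".
        return sum(evaluate(n, facts) for n in node.operands) == 1
    if isinstance(node, Conditional):
        # Assumption: "if condition then consequence" (material implication).
        return (not evaluate(node.condition, facts)) or evaluate(node.consequence, facts)
    raise TypeError(f"Unknown node type: {type(node).__name__}")


# Illustrative criteria, not taken from a real trial.
expr = And((
    Atomic("age >= 18"),
    Not(Atomic("pregnant")),
    Or((Atomic("ECOG 0"), Atomic("ECOG 1"))),
))
facts = {"age >= 18": True, "pregnant": False, "ECOG 0": False, "ECOG 1": True}
print(evaluate(expr, facts))
```

The `evaluate` function is there only to pin down what each operator means; patient matching itself is still planned.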
- ✅ Fetches clinical trial data from the ClinicalTrials.gov API.
- ✅ Extracts and structures eligibility criteria into logical expressions.
- ✅ Persists processed data as JSON files for further analysis.
- 🚧 Upcoming: Automated patient matching system.
- 🚧 Upcoming: UI.
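As a rough illustration of the fetching step, the sketch below pulls one study from the public ClinicalTrials.gov API (v2) and reads its free-text eligibility criteria. The function names are hypothetical, only the standard library is used, and ClearMatch's own client code may be organized differently.

```python
import json
import urllib.request

# Base URL of the ClinicalTrials.gov API v2 studies endpoint.
API_BASE = "https://clinicaltrials.gov/api/v2/studies"


def study_url(nct_id: str) -> str:
    """Build the URL for a single study, e.g. an ID of the form NCT01234567."""
    return f"{API_BASE}/{nct_id}"


def fetch_study(nct_id: str, timeout: float = 30.0) -> dict:
    """Download one study record as a dict (requires network access)."""
    with urllib.request.urlopen(study_url(nct_id), timeout=timeout) as response:
        return json.load(response)


def eligibility_text(study: dict) -> str:
    """Return the raw eligibility criteria text, or '' if the field is absent."""
    return (
        study.get("protocolSection", {})
        .get("eligibilityModule", {})
        .get("eligibilityCriteria", "")
    )
```

Calling `eligibility_text(fetch_study(nct_id))` with a real NCT ID would return the raw text that the identification step then breaks into atomic criteria.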
- Python 3.13+
- OpenAI API key (for GPT-4o access). 📌 Get your API key from the OpenAI API Keys page.
1️⃣ Clone the repository
gh repo clone judacas/Clinical-Trial-Prompts
This uses the GitHub CLI. If you don't have it, use:
git clone https://github.com/judacas/Clinical-Trial-Prompts.git
Then change into the repository's root directory:
cd Clinical-Trial-Prompts
2️⃣ Set up a virtual environment (optional but recommended)
python -m venv .venv
source .venv/bin/activate # macOS/Linux
.venv\Scripts\activate # Windows
3️⃣ Install dependencies
pip install -r requirements.txt
4️⃣ Set up environment variables
Copy the example file sample.env to .env:
cp src/sample.env src/.env # macOS/Linux
copy src\sample.env src\.env # Windows
To edit the .env file in the terminal, use:
nano src/.env # Linux/macOS
notepad src\.env # Windows
Then, add your OpenAI API key:
OPENAI_API_KEY="your-api-key-here"
📌 Note: The .env file is ignored by Git to prevent accidental key exposure.
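If you want to check that the key is picked up, the following standard-library sketch reads `OPENAI_API_KEY` from the process environment and falls back to parsing `src/.env`. Both helper names are hypothetical, the parser handles only simple `KEY="value"` lines, and ClearMatch itself may load the file differently (for example through a dotenv library).

```python
import os
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes such as "your-api-key-here".
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_openai_key(env_path: Path = Path("src/.env")) -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key and env_path.exists():
        key = load_env_file(env_path).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to src/.env")
    return key
```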
Run ClearMatch using:
python -m src.main
Follow the command-line instructions to process and structure clinical trial data.
- Fetch raw trial data from ClinicalTrials.gov.
- Identify the atomic eligibility criteria in the selected trials.
- Structure criteria using logical operators (AND, OR, etc.).
- Store results as structured JSON files in the output/ subdirectory for further use.
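The last step above, writing results to disk, could look roughly like this. The function name, the one-file-per-trial naming scheme, and the shape of the `structured` dict are assumptions made for the example, not ClearMatch's actual output format.

```python
import json
from pathlib import Path


def save_structured_trial(
    nct_id: str, structured: dict, output_dir: Path = Path("output")
) -> Path:
    """Write one trial's structured criteria to <output_dir>/<nct_id>.json."""
    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / f"{nct_id}.json"
    path.write_text(
        json.dumps(structured, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return path


# Illustrative record; the field names here are hypothetical.
example = {
    "operator": "AND",
    "operands": [
        {"criterion": "age >= 18"},
        {"operator": "NOT", "operands": [{"criterion": "pregnant"}]},
    ],
}
```

Calling `save_structured_trial("NCT00000000", example)` would create `output/NCT00000000.json`, which can be loaded again with `json.loads`.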
- 🔹 Add automated patient-trial matching.
- 🔹 Implement an API to allow external applications to query structured trial data.
- 🔹 Optimize logical structuring for better accuracy.
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the MIT License.