Commit 2b67c92

first commit
Author: wikiselev
0 parents, commit 2b67c92

23 files changed, +2294 -0 lines changed

.github/workflows/docs.yml

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
name: Build and publish Alethio Therapeutics documentation

on:
  push:
    branches: [ '**' ] # This will match all branches
    tags:
      - 'v*'
    # You can specify paths if you only want to trigger on certain file changes
    paths:
      - 'src/**'
      - 'pyproject.toml'
      - 'README.md'
  workflow_dispatch: # This enables manual triggering
    inputs:
      version:
        description: 'Override version tag (optional)'
        required: false
        default: ''

# security: restrict permissions for CI jobs.
permissions:
  contents: read

jobs:
  # Build the documentation and upload the static HTML files as an artifact.
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
        with:
          persist-credentials: false
      - uses: actions/setup-python@v6
        with:
          python-version: '3.10'
      - run: pip install -e .
      - run: pip install pdoc
      - run: pdoc -o docs alethiotx --docformat restructuredtext

      - uses: actions/upload-pages-artifact@v4
        with:
          path: docs/

  # Deploy the artifact to GitHub pages.
  # This is a separate job so that only actions/deploy-pages has the necessary permissions.
  deploy:
    needs: build
    runs-on: ubuntu-latest
    permissions:
      pages: write
      id-token: write
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - id: deployment
        uses: actions/deploy-pages@v4

.github/workflows/pypi-publish.yml

Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
name: Publish to PyPI

on:
  push:
    branches: [ '**' ] # This will match all branches
    tags:
      - 'v*'
    # You can specify paths if you only want to trigger on certain file changes
    paths:
      - 'src/**'
      - 'pyproject.toml'
      - 'README.md'
  workflow_dispatch: # This enables manual triggering
    inputs:
      version:
        description: 'Override version tag (optional)'
        required: false
        default: ''

jobs:
  build-and-publish:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write # for PyPI (future trusted publisher)
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"

      - name: Install build backend
        run: |
          python -m pip install --upgrade pip
          pip install build twine

      - name: Build distributions
        run: python -m build

      - name: Verify metadata
        run: twine check dist/*

      - name: Publish to PyPI
        run: twine upload --skip-existing dist/*

.gitignore

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
dist
*.egg-info
.venv/
__pycache__/
*.csv
*.tsv
*.txt
*.pkl

LICENSE

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Alethio Therapeutics

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 216 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,216 @@
# alethiotx

[![Python Version](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**Alethio Therapeutics Python Toolkit** - A growing collection of open-source computational tools used by Alethio Therapeutics.

## Overview

`alethiotx` is a modular Python package providing specialized tools for therapeutic research and drug discovery. Currently, the package features the **Artemis** module for drug target prioritization using public knowledge graphs. Additional modules and capabilities will be added in future releases.

### Current Modules

#### Artemis Module (`alethiotx.artemis`)

The Artemis module enables accessible and scalable drug target prioritization by integrating clinical trial data, the Therapeutic Target Database (TTD), pathway information, and machine learning models. It leverages public knowledge graphs to prioritize therapeutic targets across multiple disease areas.

### Artemis Module Features

- **Clinical Trials**: Query and analyze clinical trials data from ClinicalTrials.gov
- **TTD**: Match clinical interventions with TTD drug information and targets
- **Pathway Genes**: Retrieve and analyze pathway genes using the GeneShot API
- **Target Scoring**: Calculate clinical target scores for drug targets based on trial phases and approvals
- **Machine Learning Pipeline**: Built-in cross-validation pipeline for target prediction
- **Multi-Disease Support**: Pre-configured for breast, lung, prostate, melanoma, bowel cancer, diabetes, and cardiovascular disease

### Future Modules

Additional modules for various aspects of drug discovery and therapeutic research are planned for future releases. Stay tuned!

## Installation

```bash
pip install alethiotx
```

## Quick Start

> **Note:** The examples below demonstrate the **Artemis** module functionality. As new modules are added to the package, they will have their own usage examples.

### 1. Retrieve Clinical Trials Data

```python
from alethiotx.artemis import get_clinical_trials, ttd, get_clinical_scores

# Query clinical trials for a specific indication
breast_trials = get_clinical_trials(search='Breast Cancer', last_6_years=True)

# Match trials with TTD to get target information
ttd_data = ttd(breast_trials)

# Calculate clinical development scores
scores = get_clinical_scores(ttd_data, include_approved=True)
print(scores.head())
```
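To make "scores based on trial phases and approvals" concrete, here is a minimal pure-Python sketch of a phase-weighted sum per target. It is not the `get_clinical_scores()` implementation: the `PHASE_WEIGHTS` table, the `score_targets` helper, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical weights: later phases and approvals count more.
PHASE_WEIGHTS = {"PHASE1": 1, "PHASE2": 2, "PHASE3": 3, "APPROVED": 4}


def score_targets(records, include_approved=True):
    """Sum phase weights per target gene.

    `records` is an iterable of (target_gene, phase) pairs.
    """
    scores = defaultdict(int)
    for target, phase in records:
        if phase == "APPROVED" and not include_approved:
            continue  # optionally ignore approved drugs
        scores[target] += PHASE_WEIGHTS.get(phase, 0)
    return dict(scores)


# Toy records, invented for illustration only.
toy_records = [
    ("ERBB2", "PHASE3"),
    ("ERBB2", "APPROVED"),
    ("CDK4", "PHASE2"),
    ("CDK4", "PHASE1"),
]
toy_scores = score_targets(toy_records)
```

Under these assumed weights, a target with many late-stage trials accumulates a higher score than one seen only in early phases.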

### 2. Load Pre-computed Clinical Scores

```python
from alethiotx.artemis import load_clinical_scores

# Load clinical scores for multiple diseases
breast, lung, prostate, melanoma, bowel, diabetes, cardio = load_clinical_scores(date='2025-11-11')
```

### 3. Pathway Gene Analysis

```python
from alethiotx.artemis import get_pathway_genes, load_pathway_genes

# Query GeneShot for disease-associated genes
aml_genes = get_pathway_genes("acute myeloid leukemia")
print(aml_genes.loc["FLT3", ["gene_count", "rank"]])

# Get top pathway genes for diseases
breast_pg, lung_pg, prostate_pg, melanoma_pg, bowel_pg, diabetes_pg, cardio_pg = load_pathway_genes(n=100)
```

### 4. Machine Learning Pipeline

```python
from alethiotx.artemis import pre_model, cv_pipeline, roc_curve
import pandas as pd

# X (knowledge graph features), y (clinical scores), and pathway_genes
# are assumed to be prepared beforehand
result = pre_model(X, y, pathway_genes=pathway_genes, bins=3)

# Run cross-validation pipeline
scores = cv_pipeline(X, y, n_iterations=10, scoring='roc_auc')
print(f"Mean AUC: {sum(scores)/len(scores):.3f}")

# Generate ROC curves
mean_auc = roc_curve(result['X'], result['y_binary'], n_splits=5, classifier='rf')
```
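The `scoring='roc_auc'` argument suggests the reported metric is the area under the ROC curve, which equals the probability that a randomly chosen positive example is ranked above a randomly chosen negative one. The pure-Python sketch below computes that quantity directly; the `roc_auc` helper and the toy labels are illustrative assumptions, not part of `alethiotx`, and this is not necessarily how `cv_pipeline()` computes its scores.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, invented for illustration only.
labels = [1, 0, 1, 0]
predicted = [0.9, 0.1, 0.4, 0.6]
auc = roc_auc(labels, predicted)
```

Here three of the four positive/negative pairs are ordered correctly, so `auc` is 0.75. A value of 0.5 corresponds to random ranking.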

### 5. Visualize Gene Overlaps with UpSet Plots

```python
from alethiotx.artemis import (
    load_clinical_scores,
    load_pathway_genes,
    prepare_upset,
    create_upset_plot,
)

# Load clinical scores or pathway genes for multiple diseases
breast, lung, prostate, melanoma, bowel, diabetes, cardio = load_clinical_scores()

# Prepare data for UpSet plot (mode='ct' for clinical targets)
upset_data = prepare_upset(breast, lung, prostate, melanoma, bowel, diabetes, cardio, mode='ct')

# Create and display the UpSet plot
plot = create_upset_plot(upset_data, min_subset_size=5)
plot.plot()

# For pathway genes, use mode='pg'
breast_pg, lung_pg, prostate_pg, melanoma_pg, bowel_pg, diabetes_pg, cardio_pg = load_pathway_genes(n=100)
upset_data_pg = prepare_upset(breast_pg, lung_pg, prostate_pg, melanoma_pg, bowel_pg, diabetes_pg, cardio_pg, mode='pg')
plot_pg = create_upset_plot(upset_data_pg, min_subset_size=10)
plot_pg.plot()
```
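An UpSet plot typically draws one bar per exact combination of sets, sized by the number of items that belong to that combination and no other. The sketch below counts those combinations in plain Python. The `exclusive_intersections` helper and the toy gene sets are hypothetical and only illustrate the idea; they do not reproduce what `prepare_upset()` does internally.

```python
from collections import Counter


def exclusive_intersections(gene_sets):
    """Count genes by the exact combination of sets they belong to."""
    membership = {}
    for name, genes in gene_sets.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(name)
    # Each gene contributes to exactly one combination of set names.
    return Counter(frozenset(names) for names in membership.values())


# Toy gene sets, invented for illustration only.
toy_sets = {
    "breast": {"ERBB2", "ESR1", "TP53"},
    "lung": {"EGFR", "TP53"},
    "prostate": {"AR", "TP53", "ESR1"},
}
counts = exclusive_intersections(toy_sets)
```

A `min_subset_size`-style filter would then simply drop combinations whose count falls below the threshold.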

## Supported Disease Indications (Artemis Module)

The Artemis module includes built-in support for:

- **Myeloproliferative Neoplasm (MPN)**
- **Breast Cancer**
- **Lung Cancer**
- **Prostate Cancer**
- **Bowel Cancer (Colorectal)**
- **Melanoma**
- **Diabetes Mellitus Type 2**
- **Cardiovascular Disease**

## Artemis Module API Reference

### Data Loading & Processing

- `get_clinical_trials()` - Retrieve clinical trials from ClinicalTrials.gov
- `ttd()` - Match trials with TTD drug/target data
- `get_clinical_scores()` - Calculate per-target clinical development scores
- `load_clinical_scores()` - Load pre-computed clinical scores from S3
- `get_pathway_genes()` - Query Ma'ayan Lab's GeneShot API for gene associations
- `load_pathway_genes()` - Retrieve pathway gene data

### Data Preparation

- `get_all_targets()` - Extract unique target genes from score lists
- `cut_clinical_scores()` - Filter scores by threshold
- `find_overlapping_genes()` - Identify genes present in multiple datasets
- `uniquify_clinical_scores()` - Remove overlapping genes from clinical scores
- `uniquify_pathway_genes()` - Remove overlapping genes from pathway lists
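
The overlap-related helpers above can be pictured with a small stand-alone sketch. It assumes that "overlapping" means "present in more than one dataset"; the `find_overlaps` and `drop_overlaps` functions are hypothetical stand-ins, and the library's own functions may take different inputs or apply different rules.

```python
from collections import Counter


def find_overlaps(gene_lists):
    """Return genes that appear in more than one list."""
    # set(genes) ensures a gene repeated within one list is counted once.
    counts = Counter(gene for genes in gene_lists for gene in set(genes))
    return {gene for gene, n in counts.items() if n > 1}


def drop_overlaps(gene_lists):
    """Remove every overlapping gene from each list, keeping original order."""
    shared = find_overlaps(gene_lists)
    return [[g for g in genes if g not in shared] for genes in gene_lists]


# Toy gene lists, invented for illustration only.
toy_lists = [["TP53", "ERBB2"], ["TP53", "EGFR"], ["AR"]]
shared_genes = find_overlaps(toy_lists)
unique_lists = drop_overlaps(toy_lists)
```

After this step each remaining gene belongs to exactly one disease list, which avoids the same gene acting as a label for several indications.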
149+
150+
### Machine Learning
151+
152+
- `pre_model()` - Prepare datasets for ML model training
153+
- `cv_pipeline()` - Cross-validation pipeline with customizable classifiers
154+
155+
### Visualization
156+
157+
- `prepare_upset()` - Prepare disease-related data for UpSet plot visualization
158+
- `create_upset_plot()` - Create UpSet plots for visualizing gene set intersections across diseases
159+
160+
## Data Storage (Artemis Module)
161+
162+
The Artemis module uses AWS S3 for storing pre-computed data:
163+
164+
```
165+
s3://alethiotx-artemis/data/
166+
├── clinical_targets/{date}/{disease}.csv
167+
├── pathway_genes/{date}/{disease}.csv
168+
└── ttd/{date}
169+
```
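
As a rough illustration of how the layout above maps to object paths, the sketch below fills in the `{date}` and `{disease}` placeholders. The helper names and the `"breast"` file stem are assumptions made for this example; the actual file names used in the bucket are not stated here.

```python
S3_ROOT = "s3://alethiotx-artemis/data"


def clinical_targets_path(date, disease):
    """Build the path of a pre-computed clinical targets file."""
    return f"{S3_ROOT}/clinical_targets/{date}/{disease}.csv"


def pathway_genes_path(date, disease):
    """Build the path of a pre-computed pathway genes file."""
    return f"{S3_ROOT}/pathway_genes/{date}/{disease}.csv"


# "breast" is a hypothetical file stem used only for illustration.
example_path = clinical_targets_path("2025-11-11", "breast")
```

With `s3fs` installed, pandas can generally read such `s3://` URLs directly through `pd.read_csv`, provided the caller has access to the bucket.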

## Requirements

- Python >= 3.9
- requests
- scikit-learn
- pandas
- numpy
- matplotlib
- setuptools
- fsspec
- s3fs
- upsetplot

## Citation

If you use the Artemis module in your research, please cite:

```
Artemis: public knowledge graphs enable accessible and scalable drug target discovery
Vladimir Kiselev, Alethio Therapeutics
```

For other modules, citation information will be provided as they are released.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Author

**Vladimir Kiselev**
Email: vlad.kiselev@alethiomics.com

## Links

- **Homepage**: https://github.com/alethiotx/pypi
- **Issues**: https://github.com/alethiotx/pypi/issues

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

---

**Current Focus:** Artemis - Enabling accessible and scalable drug target discovery through public knowledge graphs.
**Coming Soon:** Additional modules for expanded drug discovery capabilities.

pyproject.toml

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "alethiotx"
version = "2.1.1"
description = "Alethio Therapeutics Python Toolkit"
readme = "README.md"
license = {file = "LICENSE"}
authors = [{name = "Vladimir Kiselev", email = "vlad.kiselev@alethiomics.com"}]
keywords = ["alethiotx", "artemis"]
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent"
]
requires-python = ">=3.9"
dependencies = [
    "requests",
    "scikit-learn",
    "pandas",
    "numpy",
    "matplotlib",
    "setuptools",
    "upsetplot",
    "chembl-downloader"
]

[project.urls]
Homepage = "https://github.com/alethiotx/pypi"
Issues = "https://github.com/alethiotx/pypi/issues"

[tool.setuptools]
package-dir = {"" = "src"}

[tool.setuptools.packages.find]
where = ["src"]
