Commit e56312a

Initial commit: chrXatlas chromosome X evidence atlas
Pipeline (src/xatlas + scripts 00-14), curated panel configs, scoring weights, docs, test suite, and interactive frontend (site/ with homepage, panel detail, trait detail, search).
0 parents  commit e56312a

65 files changed, with 9282 additions and 0 deletions

.gitignore

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+# Environment
+.env
+.venv/
+
+# Python
+__pycache__/
+*.pyc
+*.egg-info/
+dist/
+build/
+
+# Data (large, not tracked)
+data/
+site/data/
+
+# Tabix index files
+*.tbi
+*.bgz.tbi
+
+# Caches
+.pytest_cache/
+.ruff_cache/
+.mypy_cache/
+.cache/
+
+# OS
+.DS_Store
+
+# Playwright / screenshots
+.playwright-mcp/
+*.png
+
+# Agent / plan / working notes (not for repo)
+do/
+CODEX_HANDOFF.md
+CLAUDE.md
+.claude/
+*.plan
+*.plan.md
+*handoff*
+*todo*
+uv.lock
+*.tbi

Makefile

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+PYTHON ?= python
+
+setup:
+	$(PYTHON) -m venv .venv
+	. .venv/bin/activate && pip install -r requirements.txt && pip install -e .
+
+fetch-manifests:
+	$(PYTHON) scripts/00_fetch_panukb_manifests.py
+	$(PYTHON) scripts/04_prepare_eqtl_index.py
+
+select-traits:
+	$(PYTHON) scripts/01_select_seed_traits.py --panel config/panel_small.csv
+
+extract-x:
+	$(PYTHON) scripts/02_extract_panukb_x.py
+
+genes:
+	$(PYTHON) scripts/03_build_x_gene_catalog.py
+
+loci:
+	$(PYTHON) scripts/05_call_x_loci.py
+
+eqtl:
+	$(PYTHON) scripts/06_fetch_eqtl_region_hits.py
+
+map-genes:
+	$(PYTHON) scripts/07_map_loci_to_genes.py
+
+release:
+	$(PYTHON) scripts/08_export_release_tables.py
+
+test:
+	$(PYTHON) -m pytest
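
The Makefile above declares no prerequisites between targets, so nothing in it enforces a stage order. As a rough sketch only, the stages could be chained from Python as shown here. The order used is simply the order in which the targets appear in the Makefile, which is an assumption and not something the commit states; `plan_commands` and `run_pipeline` are hypothetical helpers, not part of the repository.

```python
import subprocess

# Make targets copied from the Makefile, in the order they appear there.
# Assumption: this listing order is also a valid execution order.
PIPELINE_TARGETS = [
    "fetch-manifests",
    "select-traits",
    "extract-x",
    "genes",
    "loci",
    "eqtl",
    "map-genes",
    "release",
]


def plan_commands(targets=PIPELINE_TARGETS, start_at=None):
    """Hypothetical helper: build one `make <target>` argv list per stage."""
    if start_at is not None:
        # Resume from a given stage; raises ValueError for an unknown target.
        targets = targets[targets.index(start_at):]
    return [["make", target] for target in targets]


def run_pipeline(start_at=None):
    """Run each stage in turn and stop at the first failing one."""
    for argv in plan_commands(start_at=start_at):
        subprocess.run(argv, check=True)
```

Calling `plan_commands(start_at="loci")` returns only the commands from `loci` onward, which may be useful when the earlier download stages have already completed.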

README.md

Lines changed: 338 additions & 0 deletions
Large diffs are not rendered by default.

config/eqtl_priority_studies.csv

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+study_id,priority,reason
+GTEx,1,Imported broad tissue resource with chrX support
+CommonMind,2,Brain tissue with chrX support
+Braineac2,3,Brain tissue with chrX support
+Young_2019,4,Microglia chrX support
+CEDAR,5,Immune cell / tissue chrX support
+Fairfax_2012,6,B-cell chrX support
+Fairfax_2014,7,Monocyte chrX support
+Naranbhai_2015,8,Neutrophil chrX support
+FUSION,9,Muscle and adipose nonPAR support
+iPSCORE,10,iPSC nonPAR support
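
A minimal sketch of how a file shaped like config/eqtl_priority_studies.csv might be read into an ordered study list. It assumes that a lower `priority` number means the study is preferred; the diff does not show how the pipeline actually consumes this file, and `load_study_priority` is a hypothetical name.

```python
import csv
import io

# Three rows copied from config/eqtl_priority_studies.csv, deliberately shuffled.
SAMPLE = """study_id,priority,reason
CommonMind,2,Brain tissue with chrX support
GTEx,1,Imported broad tissue resource with chrX support
Braineac2,3,Brain tissue with chrX support
"""


def load_study_priority(text):
    """Hypothetical helper: return study IDs sorted by ascending priority.

    Assumption: priority 1 is the most preferred study.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    # Priorities are stored as text in the CSV, so compare them as integers.
    rows.sort(key=lambda row: int(row["priority"]))
    return [row["study_id"] for row in rows]


ORDERED = load_study_priority(SAMPLE)
```

Sorting on the integer value matters once the file has ten or more rows, because a plain string sort would place `10` before `2`.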
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+query_id,domain,priority,query,preferred_trait_type,preferred_pheno_sex,notes
+albumin,biochemistry,1,Albumin,biomarkers,both_sexes,High-yield core protein marker from the independent-set build
+alanine_aminotransferase,biochemistry,2,Alanine aminotransferase,biomarkers,both_sexes,High-yield liver-adjacent biochemistry marker
+alkaline_phosphatase,biochemistry,3,Alkaline phosphatase,biomarkers,both_sexes,Strong chrX biochemistry signal
+apolipoprotein_a,lipoproteins,4,Apolipoprotein A,biomarkers,both_sexes,Supported lipid transport marker
+apolipoprotein_b,lipoproteins,5,Apolipoprotein B,biomarkers,both_sexes,Supported atherogenic lipid marker
+c_reactive_protein,inflammation,6,C-reactive protein,biomarkers,both_sexes,Inflammation marker with broad support
+calcium,mineral,7,Calcium,biomarkers,both_sexes,Supported mineral marker
+cholesterol,lipoproteins,8,Cholesterol,biomarkers,both_sexes,General lipid anchor
+creatinine,kidney,9,Creatinine,biomarkers,both_sexes,Renal biochemistry anchor
+cystatin_c,kidney,10,Cystatin C,biomarkers,both_sexes,Creatinine-independent renal marker
+gamma_glutamyltransferase,liver,11,Gamma glutamyltransferase,biomarkers,both_sexes,Strong liver enzyme signal
+glycated_haemoglobin_hba1c,glycemic,12,Glycated haemoglobin (HbA1c),biomarkers,both_sexes,Glycemic axis anchor
+hdl_cholesterol,lipoproteins,13,HDL cholesterol,biomarkers,both_sexes,Supported protective lipid marker
+igf_1,endocrine,14,IGF-1,biomarkers,both_sexes,Strong endocrine growth-axis marker
+ldl_direct,lipoproteins,15,LDL direct,biomarkers,both_sexes,Supported LDL anchor
+phosphate,mineral,16,Phosphate,biomarkers,both_sexes,Smaller but clean supported mineral signal
+shbg,endocrine,17,SHBG,biomarkers,both_sexes,Supported endocrine transport marker
+total_protein,biochemistry,18,Total protein,biomarkers,both_sexes,Supported circulating protein marker
+triglycerides,lipoproteins,19,Triglycerides,biomarkers,both_sexes,Strong lipid metabolism signal
+urea,kidney,20,Urea,biomarkers,both_sexes,Strong renal/metabolic marker
+vitamin_d,micronutrients,21,Vitamin D,biomarkers,both_sexes,Supported micronutrient anchor

config/panel_expanded.csv

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+query_id,domain,priority,query,preferred_trait_type,preferred_pheno_sex,notes
+bmi,anthropometrics,1,body mass index,continuous,both_sexes,Core anthropometric trait
+height,anthropometrics,1,standing height,continuous,both_sexes,Core anthropometric trait
+waist,anthropometrics,2,waist circumference,continuous,both_sexes,Adiposity distribution
+hip,anthropometrics,3,hip circumference,continuous,both_sexes,Adiposity distribution
+body_fat_mass,body_composition,1,Whole body fat mass,continuous,both_sexes,Whole-body adiposity complement to BMI waist hip
+grip_strength,musculoskeletal,1,Hand grip strength (left),continuous,both_sexes,Objective strength trait
+heel_bmd,bone_health,1,Heel bone mineral density (BMD),continuous,both_sexes,Skeletal trait with strong ascertainment
+sbp,cardiometabolic,1,systolic blood pressure,continuous,both_sexes,Cardiovascular trait
+dbp,cardiometabolic,2,diastolic blood pressure,continuous,both_sexes,Cardiovascular trait
+current_smoking,substance_use,3,Current tobacco smoking,continuous,both_sexes,Exposure trait with observed chrX loci in probe panel
+ldl,lipids,1,LDL direct,biomarkers,both_sexes,Lipid trait
+hdl,lipids,2,HDL cholesterol,biomarkers,both_sexes,Lipid trait
+tg,lipids,3,triglycerides,biomarkers,both_sexes,Lipid trait
+apolipoprotein_a,lipids,4,Apolipoprotein A,biomarkers,both_sexes,Additional lipid transport biology
+apolipoprotein_b,lipids,5,Apolipoprotein B,biomarkers,both_sexes,Atherogenic lipid marker
+hba1c,glycemic,1,glycated haemoglobin,biomarkers,both_sexes,Glycemic trait
+glucose,glycemic,2,glucose,biomarkers,both_sexes,Glycemic trait
+t2d,disease,1,type 2 diabetes,phecode,both_sexes,Common disease anchor
+coronary_atherosclerosis,disease,2,Coronary atherosclerosis,phecode,both_sexes,Disease anchor with observed chrX loci in probe panel
+bilirubin_total,liver,5,Total bilirubin,biomarkers,both_sexes,Observed chrX-signal biomarker replacement for zero-locus seed
+hypothyroidism,disease,4,Hypothyroidism,phecode,both_sexes,Common endocrine disease anchor
+hb,hematology,1,Haemoglobin concentration,continuous,both_sexes,Hematologic trait
+rbc,hematology,2,red blood cell (erythrocyte) count,continuous,both_sexes,Hematologic trait
+plt,hematology,3,platelet count,biomarkers,both_sexes,Hematologic trait
+lymphocyte,hematology,4,Lymphocyte count,continuous,both_sexes,Immune cell count anchor
+crp,inflammation,1,C-reactive protein,biomarkers,both_sexes,Inflammation anchor
+alt,liver,1,alanine aminotransferase,biomarkers,both_sexes,Liver biomarker
+alp,liver,2,alkaline phosphatase,biomarkers,both_sexes,Liver biomarker
+ast,liver,3,Aspartate aminotransferase,biomarkers,both_sexes,Complementary liver enzyme
+ggt,liver,4,Gamma glutamyltransferase,biomarkers,both_sexes,Complementary liver enzyme
+albumin,inflammation_nutrition,1,Albumin,biomarkers,both_sexes,Nutritional / inflammatory biomarker
+creatinine,kidney,1,creatinine,biomarkers,both_sexes,Kidney biomarker
+urate,kidney,2,urate,biomarkers,both_sexes,Metabolic / renal
+egfr,kidney,3,"Estimated glomerular filtration rate, serum creatinine",continuous,both_sexes,Renal function estimate
+cystatin_c,kidney,4,Cystatin C,biomarkers,both_sexes,Creatinine-independent filtration marker
+urea,kidney,5,Urea,biomarkers,both_sexes,Renal / metabolic complement
+testosterone,endocrine,1,testosterone,biomarkers,both_sexes,Endocrine trait
+shbg,endocrine,2,SHBG,biomarkers,both_sexes,Exact SHBG manifest label to avoid prior mis-selection
+igf_1,endocrine,3,IGF-1,biomarkers,both_sexes,Growth / aging endocrine axis
+vitamin_d,micronutrients,1,Vitamin D,biomarkers,both_sexes,Micronutrient biology
+fev1,respiratory,1,Forced expiratory volume in 1-second (FEV1),continuous,both_sexes,Core lung function trait
+fvc,respiratory,2,Forced vital capacity (FVC),continuous,both_sexes,Complementary lung function trait
+insomnia,neuropsychiatric,1,Sleeplessness / insomnia,continuous,both_sexes,Neuropsychiatric proxy
+depression,neuropsychiatric,2,Frequency of depressed mood in last 2 weeks,continuous,both_sexes,Neuropsychiatric proxy
+neuroticism,behavioral,1,neuroticism score,continuous,both_sexes,Behavioral trait
+fluid_intelligence,cognitive,1,fluid intelligence score,continuous,both_sexes,Cognitive trait
+menarche,reproductive,1,Age when periods started (menarche),continuous,both_sexes,Sex-specific trait
+menopause,reproductive,2,age at menopause,continuous,females,Sex-specific trait
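
All panel files in this commit share the same seven-column header, and at least one `query` value (the `egfr` row) contains a comma inside quotes. The sketch below shows one way such a panel could be loaded and checked with the standard `csv` module. `load_panel` and `EXPECTED_COLUMNS` are hypothetical names introduced for illustration; the repository's own loader under src/xatlas is not shown in this diff and may work differently.

```python
import csv
import io

# Header shared by every panel CSV shown in this commit.
EXPECTED_COLUMNS = [
    "query_id", "domain", "priority", "query",
    "preferred_trait_type", "preferred_pheno_sex", "notes",
]

# Three rows copied from config/panel_expanded.csv.
SAMPLE = '''query_id,domain,priority,query,preferred_trait_type,preferred_pheno_sex,notes
bmi,anthropometrics,1,body mass index,continuous,both_sexes,Core anthropometric trait
egfr,kidney,3,"Estimated glomerular filtration rate, serum creatinine",continuous,both_sexes,Renal function estimate
menopause,reproductive,2,age at menopause,continuous,females,Sex-specific trait
'''


def load_panel(text):
    """Hypothetical helper: parse a panel CSV and validate its shape."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    for line_no, row in enumerate(reader, start=2):
        # DictReader stores surplus fields under the key None and fills
        # missing fields with the value None; both mean a malformed row.
        if None in row or None in row.values():
            raise ValueError(f"wrong field count on line {line_no}")
        row["priority"] = int(row["priority"])
        rows.append(row)
    return rows


PANEL = load_panel(SAMPLE)
```

Using `csv` instead of splitting each line on commas keeps the quoted `egfr` query intact as a single field.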

config/panel_mind_focus.csv

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+query_id,domain,priority,query,preferred_trait_type,preferred_pheno_sex,notes
+fluid_intelligence,cognitive,1,fluid intelligence score,continuous,both_sexes,Core reasoning trait
+reaction_time,cognitive,1,Mean time to correctly identify matches,continuous,both_sexes,Processing speed trait
+prospective_memory,cognitive,2,Prospective memory result,continuous,both_sexes,Memory task summary
+numeric_memory,cognitive,2,Maximum digits remembered correctly,continuous,both_sexes,Numeric memory task
+neuroticism,behavioral,1,neuroticism score,continuous,both_sexes,Aggregate affect trait
+risk_taking,behavioral,2,Risk taking,categorical,both_sexes,Risk preference proxy
+mood_swings,behavioral,3,Mood swings,categorical,both_sexes,Mood instability proxy
+irritability,behavioral,4,Irritability,categorical,both_sexes,Aggression-adjacent proxy
+depressed_mood,mental_health,1,Frequency of depressed mood in last 2 weeks,continuous,both_sexes,Mood symptom
+trouble_relaxing,mental_health,2,Recent trouble relaxing,continuous,both_sexes,Anxiety symptom
+worry_cannot_stop,mental_health,3,Frequency of inability to stop worrying during worst period of anxiety,continuous,both_sexes,Worry severity
+anxious_month,mental_health,4,"Ever felt worried, tense, or anxious for most of a month or longer",categorical,both_sexes,Anxiety propensity
+insomnia,sleep,1,Sleeplessness / insomnia,continuous,both_sexes,Sleep disruption trait
+sleep_duration,sleep,2,Sleep duration,continuous,both_sexes,Core sleep trait
+current_smoking,substance_use,1,Current tobacco smoking,continuous,both_sexes,Behavioral exposure
+alcohol_frequency,substance_use,2,Alcohol intake frequency,continuous,both_sexes,Alcohol behavior
+heavy_drinking,substance_use,3,Frequency of consuming six or more units of alcohol,continuous,both_sexes,Binge-drinking proxy

config/panel_mind_risk_core.csv

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+query_id,domain,priority,query,preferred_trait_type,preferred_pheno_sex,notes
+reaction_time,cognitive,1,Mean time to correctly identify matches,continuous,both_sexes,Best-performing direct cognitive anchor
+fluid_intelligence,cognitive,2,fluid intelligence score,continuous,both_sexes,Keep one direct reasoning trait even though current support is weak
+neuroticism,behavioral,1,neuroticism score,continuous,both_sexes,Aggregate affect trait with usable chrX support
+risk_taking,behavioral,2,Risk taking,categorical,both_sexes,Risk preference proxy
+mood_swings,behavioral,3,Mood swings,categorical,both_sexes,Mood instability proxy
+irritability,behavioral,4,Irritability,categorical,both_sexes,Aggression-adjacent proxy with usable signal
+depressed_mood,mental_health,1,Frequency of depressed mood in last 2 weeks,continuous,both_sexes,Mood symptom with prior support
+insomnia,sleep,1,Sleeplessness / insomnia,continuous,both_sexes,Behaviorally relevant sleep-disruption trait
+current_smoking,substance_use,1,Current tobacco smoking,continuous,both_sexes,Robust behavior/exposure trait
+alcohol_frequency,substance_use,2,Alcohol intake frequency,continuous,both_sexes,Best-performing alcohol behavior trait

config/panel_replacement_probe.csv

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+query_id,domain,priority,query,preferred_trait_type,preferred_pheno_sex,notes
+coronary_atherosclerosis,disease,1,Coronary atherosclerosis,phecode,both_sexes,Probe replacement for myocardial infarction
+rheumatoid_arthritis,disease,2,Rheumatoid arthritis,phecode,both_sexes,Probe inflammatory disease anchor
+psoriasis,disease,3,Psoriasis,phecode,both_sexes,Probe dermatologic disease anchor
+sleep_duration,sleep,1,Sleep duration,continuous,both_sexes,Probe behavioral replacement for pulse rate
+current_smoking,substance_use,2,Current tobacco smoking,continuous,both_sexes,Probe exposure trait replacement for pulse rate
+bilirubin_total,liver,1,Total bilirubin,biomarkers,both_sexes,Probe biomarker replacement with likely chrX loci
+copd,respiratory,1,Doctor diagnosed COPD (chronic obstructive pulmonary disease),categorical,both_sexes,Probe respiratory disease replacement for asthma
+wheeze,respiratory,2,Wheeze or whistling in the chest in last year,categorical,both_sexes,Probe symptom-level respiratory replacement for asthma

config/panel_sleep_risk_focus.csv

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+query_id,domain,priority,query,preferred_trait_type,preferred_pheno_sex,notes
+insomnia,sleep,1,Sleeplessness / insomnia,continuous,both_sexes,Direct sleep-disruption trait
+sleep_duration,sleep,2,Sleep duration,continuous,both_sexes,Core sleep quantity trait
+risk_taking,behavioral,1,Risk taking,categorical,both_sexes,Risk preference proxy
+mood_swings,behavioral,2,Mood swings,categorical,both_sexes,Mood instability proxy
+irritability,behavioral,3,Irritability,categorical,both_sexes,Aggression-adjacent proxy
+trouble_relaxing,mental_health,1,Recent trouble relaxing,continuous,both_sexes,Anxiety symptom
+worry_cannot_stop,mental_health,2,Frequency of inability to stop worrying during worst period of anxiety,continuous,both_sexes,Worry severity
+depressed_mood,mental_health,3,Frequency of depressed mood in last 2 weeks,continuous,both_sexes,Mood symptom
+current_smoking,substance_use,1,Current tobacco smoking,continuous,both_sexes,Behavioral exposure
+alcohol_frequency,substance_use,2,Alcohol intake frequency,continuous,both_sexes,Alcohol behavior
+heavy_drinking,substance_use,3,Frequency of consuming six or more units of alcohol,continuous,both_sexes,Binge-drinking proxy
