Commit fad65f0

2 parents 0d6a223 + afb81a3 commit fad65f0

26 files changed, +3419 -1712 lines changed

docs/fields.md

Lines changed: 9 additions & 0 deletions
@@ -13,6 +13,12 @@ residue number of the last amino acid in the peptide
 ### sequence (str)
 fasta sequence of the peptide

+### protein (str)
+protein name or identifier
+
+HDExaminer name: Protein
+DynamX name: Protein
+
 ### state (str)
 state label

@@ -93,6 +99,9 @@ These fields are derived from other fields defined in the above sections.
 added after data aggregation
 Total number of replicates that were aggregated together

+### n_charges
+Total number of different charged states that were aggregated together
+
 ### n_clusters
 added after data aggregation
 Total number of isotopic clusters that were aggregated together. When replicates include multiple isotopic clusters (different charged states), this value will be larger than n_replicates.
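
The relationship between `n_replicates`, `n_charges` and `n_clusters` described in docs/fields.md can be illustrated with a small sketch. It assumes made-up cluster records with hypothetical `replicate` and `charge` keys; it is not the library's data model or its aggregation code, only a counting example in plain Python.

```python
from collections import defaultdict

# Hypothetical isotopic-cluster records (made-up values, illustrative keys).
clusters = [
    {"start": 10, "end": 20, "exposure": 30.0, "replicate": 1, "charge": 2},
    {"start": 10, "end": 20, "exposure": 30.0, "replicate": 1, "charge": 3},
    {"start": 10, "end": 20, "exposure": 30.0, "replicate": 2, "charge": 2},
    {"start": 10, "end": 20, "exposure": 30.0, "replicate": 2, "charge": 3},
    {"start": 25, "end": 31, "exposure": 30.0, "replicate": 1, "charge": 2},
]


def aggregation_counts(records):
    """Count replicates, charge states and clusters per (start, end, exposure)."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["start"], rec["end"], rec["exposure"])].append(rec)

    counts = {}
    for key, recs in groups.items():
        counts[key] = {
            # distinct replicate labels in the group
            "n_replicates": len({r["replicate"] for r in recs}),
            # distinct charge states in the group
            "n_charges": len({r["charge"] for r in recs}),
            # every record is one isotopic cluster
            "n_clusters": len(recs),
        }
    return counts


counts = aggregation_counts(clusters)
```

In this toy data the first peptide has two replicates with two charge states each, so its `n_clusters` (4) exceeds its `n_replicates` (2), which is the situation the `n_clusters` description refers to.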

docs/hd_examiner_files/HDX export file test.csv

Lines changed: 290 additions & 0 deletions
Large diffs are not rendered by default.

docs/hd_examiner_formats.md

Lines changed: 19 additions & 0 deletions
@@ -159,6 +159,25 @@ FD control: 'MAX' (older version)
 Comments:


+### Kingfisher HD examiner example
+
+File: HDX export file test.csv
+Source: https://github.com/juan2089/Kingfisher-HDX/blob/Kingfisher-v1.1/www/HDX%20export%20file%20test.csv
+
+Columns:
+The first line is a header with exposure times.
+
+The second line has the column names, starting with:
+'State,Protein,Start,End,Sequence,Search RT,Charge,Max D,'
+
+Followed by repeating blocks of:
+'Start RT,End RT,#D,%D,#D right,%D right,Score,Conf,'
+Format: (almost!) HD examiner summary file
+
+This is a HD examiner 'peptide pool' file
+
+
+
 ## HD Examiner manual on exporting data

 **Peptide Pool Results / Uptake Summary Table**
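
The two-line header layout described for the Kingfisher example (an exposure-time line, then the fixed columns followed by repeating eight-column blocks) could be checked with a sketch like the one below. The column names are taken from the description above. The sample content, the exposure labels `10s` and `100s`, and the assumption that every non-empty cell of the first line is one exposure label are all made up for illustration. This is not the parser used by the repository.

```python
import csv
import io

# Column names as listed in the format description.
FIXED = ["State", "Protein", "Start", "End", "Sequence", "Search RT", "Charge", "Max D"]
BLOCK = ["Start RT", "End RT", "#D", "%D", "#D right", "%D right", "Score", "Conf"]


def split_header(text):
    """Return (exposure_labels, n_blocks) for a two-line header."""
    reader = csv.reader(io.StringIO(text))
    first = next(reader)   # line 1: exposure times
    names = next(reader)   # line 2: column names

    # A trailing comma yields an empty last cell; drop such cells.
    while names and names[-1] == "":
        names.pop()

    if names[: len(FIXED)] != FIXED:
        raise ValueError("unexpected fixed columns")

    rest = names[len(FIXED):]
    if len(rest) % len(BLOCK) != 0:
        raise ValueError("incomplete repeating block")
    blocks = [rest[i : i + len(BLOCK)] for i in range(0, len(rest), len(BLOCK))]
    if any(block != BLOCK for block in blocks):
        raise ValueError("unexpected repeating block")

    # Assumption: each non-empty cell on line 1 is one exposure label.
    exposures = [cell.strip() for cell in first if cell.strip()]
    return exposures, len(blocks)


# Made-up header with two exposure blocks.
sample = (
    ",".join([""] * 8 + ["10s"] + [""] * 7 + ["100s"] + [""] * 7) + "\n"
    + ",".join(FIXED + BLOCK * 2) + ",\n"
)
exposures, n_blocks = split_header(sample)
```

A real export may place or format the exposure labels differently, so the first-line handling in particular should be checked against the actual file.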

examples/from_hxms_file.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 from typing import Optional

 from hdxms_datasets.database import populate_known_ids, submit_dataset
-from hdxms_datasets.loader import (
+from hdxms_datasets.reader import (
     read_hxms,
 )
 from hdxms_datasets.models import (

examples/from_zip_file.py

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+from hdxms_datasets import load_dataset
+from pathlib import Path
+
+DATA_ID = "HDX_C1198C76" # SecA DynamX state data
+DATA_ID = "HDX_D9096080" # SecB DynamX state data
+
+fname = "HDX_3BAE2080.zip" # Example dataset in a zip file
+
+# %%
+test_pth = Path(__file__).parent.parent / "tests"
+database_dir = test_pth / "datasets"
+
+dataset = load_dataset(database_dir / fname) # Should load the dataset from the zip file
+
+print(dataset.states)
+
+# %%

examples/load_local_dynamx_cluster.py

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@
 plot_peptides(selected, domain=(0, 1), value="frac_max_uptake")

 # %%
-peptides = dataset.states[0].peptides[0]
+peptides = dataset.states[0].peptides[0].load()
 StructureView(dataset.structure).peptide_coverage(peptides)

 # %%

examples/load_local_dynamx_state.py

Lines changed: 5 additions & 5 deletions
@@ -44,9 +44,9 @@
 # load the partially deuterated peptides
 df = state.peptides[0].load(
     convert=True,
-    aggregate=True,
-    # sort_rows=True,
-    # sort_columns=True,
+    aggregate=None, # dynamx state data is already aggregated
+    sort_rows=True,
+    sort_columns=True,
 )
 print(df.columns)
 # > ['start', 'end', 'sequence', 'state', 'exposure', 'centroid_mz', 'rt', 'rt_sd', 'uptake', 'uptake_sd']
@@ -112,12 +112,12 @@
 # %%
 # show a single peptide
 start, end = processed["start", "end"].row(10)
-view = StructureView(dataset.structure).color_peptide(start, end, chain=["A"])
+view = StructureView(dataset.structure).color_peptide(start, end)
 view

 # %%
 # select a set of peptides for further viusualization
-peptides = dataset.states[0].peptides[0]
+peptides = dataset.states[0].peptides[0].load()

 # %%
 # show regions of the structure that are covered by peptides

examples/load_local_hdexaminer.py

Lines changed: 0 additions & 1 deletion
@@ -35,7 +35,6 @@
 selected = processed.filter(nw.col("exposure") == exposure_value)
 plot_peptides(selected.to_polars(), value="frac_max_uptake", domain=(0, 1))
 # %%
-# %%

 peptides = dataset.states[0].peptides[0]
 StructureView(dataset.structure).peptide_coverage(selected)

examples/read_files.py

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+# %%
+
+from pathlib import Path
+
+from hdxms_datasets import identify_format
+
+# %%
+
+cwd = Path(__file__).parent
+
+# %%
+
+# read a hxms file
+f = cwd / "test_data" / "ecDHFR" / "ecDHFR_2025-09-23_APO.hxms"
+
+fmt_spec = identify_format(f)
+# read to dataframe
+df = fmt_spec.read(f)
+
+# convert to open-hdx format
+df_converted = fmt_spec.convert(df)
+df_converted.to_native()
+
+# %%
+# read an dynamx file
+f = cwd / "test_data" / "ecSecB" / "ecSecB_apo.csv"
+fmt_spec = identify_format(f)
+# read to dataframe
+df = fmt_spec.read(f)
+
+# convert to open-hdx format
+df_converted = fmt_spec.convert(df)
+df_converted.to_native()
+# %%
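
The identify-then-read pattern used in examples/read_files.py can be pictured with a toy sketch. Everything in it is hypothetical: `ToyFormatSpec`, `TOY_SPECS`, `toy_identify_format`, the required column sets and the sample text are invented for illustration and say nothing about how `hdxms_datasets.identify_format` is actually implemented. The sketch only shows the general idea of picking a format specification from header columns and then letting that specification read the file.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyFormatSpec:
    """Hypothetical format description: where the header is, what it must contain."""

    name: str
    header_row: int       # 0-based index of the row holding column names
    required: frozenset   # column names that must be present in that row

    def matches(self, rows):
        # True if the expected header row exists and has all required columns.
        return len(rows) > self.header_row and self.required <= set(rows[self.header_row])

    def read(self, text):
        # Return one dict per data row, keyed by the header names.
        rows = list(csv.reader(io.StringIO(text)))
        header = rows[self.header_row]
        return [dict(zip(header, row)) for row in rows[self.header_row + 1 :]]


# Illustrative specs only; the column sets are assumptions.
TOY_SPECS = [
    ToyFormatSpec("two_line_header", 1, frozenset({"State", "Protein", "Start", "End", "Sequence"})),
    ToyFormatSpec("single_header", 0, frozenset({"Start", "End", "Sequence", "Exposure"})),
]


def toy_identify_format(text):
    """Pick the first toy spec whose header requirements are met."""
    rows = list(csv.reader(io.StringIO(text)))[:2]
    for spec in TOY_SPECS:
        if spec.matches(rows):
            return spec
    raise ValueError("unrecognised format")


# Made-up file contents.
single = "Start,End,Sequence,Exposure,Uptake\n1,9,MKTAYIAKQ,0.5,1.2\n"
two_line = ",,,,,10s\nState,Protein,Start,End,Sequence,#D\napo,SecB,1,9,MKTAYIAKQ,1.2\n"

single_spec = toy_identify_format(single)
two_line_spec = toy_identify_format(two_line)
single_rows = single_spec.read(single)
two_line_rows = two_line_spec.read(two_line)
```

Keeping the matching rule and the reader on the same object means the calling code stays identical for every format, which is the shape of the two cells in the example file.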

examples/test_data/ecDHFR/notes.md

Lines changed: 4 additions & 0 deletions
@@ -4,3 +4,7 @@ the HXMS file format manuscript.
 Correct source:
 https://www.biorxiv.org/content/10.1101/2025.10.14.682397v1.supplementary-material

+
+
+ecDHFR tutorial.csv
+Source: https://huggingface.co/spaces/glasgow-lab/PFLink
