Commit 5b6455c

Merge branch 'main' of https://github.com/compomics/psm_utils into pepxml-fixes
2 parents 3271702 + 854bc60 commit 5b6455c

51 files changed

+4378
-1754
lines changed

.github/workflows/publish.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@ jobs:
1919
- name: Set up Python
2020
uses: actions/setup-python@v5
2121
with:
22-
python-version: "3.9"
22+
python-version: "3.10"
2323

2424
- name: Install dependencies
2525
run: |

.github/workflows/test.yml

Lines changed: 17 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -13,25 +13,30 @@ jobs:
1313
steps:
1414
- uses: actions/checkout@v4
1515

16-
- name: Set up Python 3.9
16+
- name: Set up Python
1717
uses: actions/setup-python@v5
1818
with:
19-
python-version: "3.9"
19+
python-version: "3.10"
2020

21-
- name: Install dependencies
22-
run: |
23-
python -m pip install --upgrade pip
24-
pip install ruff
21+
- name: Lint with Ruff
22+
uses: astral-sh/ruff-action@v3
23+
with:
24+
args: check --exclude docs,tests
2525

26-
- name: Check with Ruff
27-
run: ruff check --output-format=github .
26+
- name: Check formatting with Ruff
27+
uses: astral-sh/ruff-action@v3
28+
with:
29+
args: format --check --diff --exclude docs,tests
2830

2931
- name: Install package and its dependencies
30-
run: pip install --editable .[dev,idxml]
32+
run: pip install --editable .[dev,io]
33+
34+
- name: Static type checking with mypy
35+
run: mypy --non-interactive
3136

3237
- name: Test with pytest and codecov
3338
run: |
34-
pytest --cov=psm_utils --cov-report=xml tests/
39+
pytest --cov=psm_utils --cov-report=xml tests/
3540
3641
- name: Upload coverage reports to Codecov
3742
uses: codecov/codecov-action@v3
@@ -46,7 +51,7 @@ jobs:
4651
runs-on: ubuntu-latest
4752
strategy:
4853
matrix:
49-
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13"]
54+
python-version: ["3.10", "3.11", "3.12", "3.13"]
5055
steps:
5156
- uses: actions/checkout@v4
5257

@@ -62,7 +67,7 @@ jobs:
6267
6368
- name: Install optional dependencies that might not be available
6469
continue-on-error: true
65-
run: pip install .[idxml]
70+
run: pip install .[io]
6671

6772
- name: Test imports
6873
run: python -c "import psm_utils"

CHANGELOG.md

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,62 @@ All notable changes to this project will be documented in this file.
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [1.5.0] - 2025-10-27
9+
10+
### Added
11+
12+
- `io`: Read/write support for **JSON** and **CBOR** formats. (#125)
13+
- `io.percolator`: Added support for Comet-style N- and C-terminal modifications (#121 and #131 by @ATPs)
14+
15+
### Changed
16+
17+
- ♻️ `stats.qvalues`: Set the **regular target–decoy formula** explicitly in Pyteomics when `remove_decoy=False` and apply the **+1 correction** (probability that the first excluded decoy out-scores the threshold PSM). This yields less conservative q-values (e.g., on `example_files/msms.txt`). (#128)
18+
- 🏷️ **Typing**: Adopted full **MyPy** typing across the codebase. (#125)
19+
- 👷 **CI**: Replaced file-hash–based tests for `io.idxml` with unit tests; added formatting checks. (#125)
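
The `stats.qvalues` entry above refers to target-decoy q-value estimation with a +1 correction. As a rough illustration only, the sketch below computes q-values in plain Python as `(decoys + 1) / targets` followed by a running minimum. The function name `qvalues_target_decoy` and its `correction` parameter are hypothetical, and this is not the Pyteomics or psm_utils implementation.

```python
def qvalues_target_decoy(scores, is_decoy, correction=1):
    """Return q-values in input order, assuming a higher score is better.

    Hypothetical helper: FDR at each threshold is estimated as
    (n_decoys + correction) / n_targets, capped at 1.0.
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    fdrs = []
    n_targets = 0
    n_decoys = 0
    for i in order:
        if is_decoy[i]:
            n_decoys += 1
        else:
            n_targets += 1
        fdrs.append(min(1.0, (n_decoys + correction) / max(n_targets, 1)))

    # The q-value is the lowest FDR at this or any more permissive threshold.
    qvalues = [0.0] * len(scores)
    running_min = 1.0
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        qvalues[order[rank]] = running_min
    return qvalues


scores = [10, 9, 8, 7, 6]
decoys = [False, False, False, True, False]
print(qvalues_target_decoy(scores, decoys))                # with the +1 correction
print(qvalues_target_decoy(scores, decoys, correction=0))  # uncorrected, for comparison
```

Comparing the two printed lists shows how the correction keeps the estimate above zero even when no decoy has been seen yet.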
20+
21+
### Removed
22+
23+
- 💥 Dropped support for **Python 3.9**. (#125)
24+
25+
### Fixed
26+
27+
- 🐛 `io.mzid`: Treat **MS:1001460 “unknown modification”** as a **delta-mass–designated** modification in peptidoforms so mass calculations remain possible. Previously rendered as `[unknown modification]`. (#126 by @levitsky)
28+
- 🐛 `io.fragpipe`: Build more comprehensive **ProForma** strings using “Assigned Modifications” from FragPipe output. (fixes #123; #124 by @levitsky)
29+
- 🐛 `peptidoform`: Allow **residue `X` with a MassModification** to indicate a **gap of known mass** per ProForma §4.2.7; resolves failures computing theoretical mass for non-natural residues (fixes #127). (#130)
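
To make the `peptidoform` entry above concrete: a gap residue `X` has no mass of its own, so its mass modification supplies the whole contribution. The following is a minimal sketch under simplifying assumptions, not the psm_utils code path. It handles only single-letter residues with optional `[+mass]` or `[-mass]` tags, the residue table is deliberately tiny, and `theoretical_mass` is a hypothetical name.

```python
import re

# Monoisotopic residue masses in Da for a few amino acids (illustrative subset).
# X is a placeholder residue that contributes no mass by itself.
RESIDUE_MASS = {
    "G": 57.021464,
    "A": 71.037114,
    "S": 87.032028,
    "K": 128.094963,
    "X": 0.0,
}
WATER_MASS = 18.010565

_TOKEN = re.compile(r"([A-Z])(?:\[([+-]\d+(?:\.\d+)?)\])?")
_FULL = re.compile(r"(?:[A-Z](?:\[[+-]\d+(?:\.\d+)?\])?)+")


def theoretical_mass(proforma):
    """Sum residue masses, mass-shift tags, and one water (simplified parser)."""
    if not _FULL.fullmatch(proforma):
        raise ValueError(f"Unsupported sequence notation: {proforma!r}")
    total = WATER_MASS
    for residue, delta in _TOKEN.findall(proforma):
        if residue == "X" and not delta:
            # Without a mass tag, the gap has no defined mass.
            raise ValueError("X requires a mass modification to have a defined mass")
        total += RESIDUE_MASS[residue]  # KeyError for residues outside the subset
        if delta:
            total += float(delta)
    return total


print(round(theoretical_mass("GAX[+100.5]K"), 6))
```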
30+
31+
## [1.4.1] - 2025-04-15
32+
33+
### Fixed
34+
35+
- Restored compatibility with older Sage versions that have no ion mobility columns (introduced in v1.4.0) (by @rodvrees in #120)
36+
37+
## [1.4.0] - 2025-03-06
38+
39+
### Added
40+
41+
- `io.sage`: Add parsing of ion mobility values (PR #113)
42+
43+
### Fixed
44+
45+
- 🐛 `io.percolator`: Fix bug in `PercolatorTabWriter` where style parameter was not propagated (fixes #114, PR #117)
46+
- 📝 Docs: Explicitly set Sphinx configuration path for Read the Docs (fixes #115, PR #118)
47+
48+
## [1.3.0] - 2025-01-20
49+
50+
### Added
51+
52+
- `io.idxml`: Parse ion mobility from idXML files if present.
53+
- 🐍 Added support for Python 3.12 and 3.13
54+
55+
### Removed
56+
57+
- 🐍 Removed support for Python 3.7
58+
59+
### Fixed
60+
61+
- 🐛 Fix bug introduced in #102 where dtypes were not coerced anymore by Numpy, which led to unexpected behavior downstream (e.g., `psm_list["is_decoy"]` would return an array of objects instead of bools)
62+
- 🩹 Fix potential downstream issues because pepxml-read PSM had `rescoring_features=None` (partially fixes #108)
63+
864
## [1.2.0] - 2024-11-19
965

1066
### Added

README.rst

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -90,10 +90,12 @@ Supported file formats
9090
File format psm_utils tag Read support Write support Comments
9191
===================================================================================================================== ======================== =============== =============== ==========
9292
`AlphaDIA precursors TSV <https://alphadia.readthedocs.io/en/latest/quickstart.html#output-files>`_ ``alphadia`` ✅ ❌
93+
`CBOR <https://psm-utils.readthedocs.io/en/stable/api/psm_utils.io#module-psm_utils.io.cbor>`_ ``cbor`` ✅ ✅
9394
`DIA-NN TSV <https://github.com/vdemichev/DiaNN#output>`_ ``diann`` ✅ ❌
9495
`FlashLFQ generic TSV <https://github.com/smith-chem-wisc/FlashLFQ/wiki/Identification-Input-Formats>`_ ``flashlfq`` ✅ ✅
9596
`FragPipe PSM TSV <https://fragpipe.nesvilab.org/docs/tutorial_fragpipe_outputs.html#psmtsv/>`_ ``fragpipe`` ✅ ❌
9697
`ionbot CSV <https://ionbot.cloud/>`_ ``ionbot`` ✅ ❌
98+
`JSON <https://psm-utils.readthedocs.io/en/stable/api/psm_utils.io#module-psm_utils.io.json>`_ ``json`` ✅ ✅
9799
`OpenMS idXML <https://www.openms.de/>`_ ``idxml`` ✅ ✅ Requires the optional ``openms`` dependency (``pip install psm-utils[idxml]``)
98100
`MaxQuant msms.txt <https://www.maxquant.org/>`_ ``msms`` ✅ ❌
99101
`MS Amanda CSV <https://ms.imp.ac.at/?goto=msamanda>`_ ``msamanda`` ✅ ❌

docs/source/api/psm_utils.io.rst

Lines changed: 21 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -7,10 +7,18 @@ psm_utils.io
77

88

99

10-
psm_utils.io.alphapept
11-
##################
10+
psm_utils.io.alphadia
11+
#####################
1212

13-
.. automodule:: psm_utils.io.alphapept
13+
.. automodule:: psm_utils.io.alphadia
14+
:members:
15+
:inherited-members:
16+
17+
18+
psm_utils.io.cbor
19+
#################
20+
21+
.. automodule:: psm_utils.io.cbor
1422
:members:
1523
:inherited-members:
1624

@@ -32,7 +40,7 @@ psm_utils.io.flashlfq
3240

3341

3442
psm_utils.io.fragpipe
35-
##################
43+
#####################
3644

3745
.. automodule:: psm_utils.io.fragpipe
3846
:members:
@@ -56,6 +64,15 @@ psm_utils.io.ionbot
5664

5765

5866

67+
psm_utils.io.json
68+
#################
69+
70+
.. automodule:: psm_utils.io.json
71+
:members:
72+
:inherited-members:
73+
74+
75+
5976
psm_utils.io.maxquant
6077
#####################
6178

1.53 KB
Binary file not shown.
Lines changed: 89 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,89 @@
1+
[
2+
{
3+
"peptidoform": "SSYGSSSNDDSYGSSNNDDSYGSSNK/3",
4+
"spectrum_id": "71876",
5+
"run": "LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01",
6+
"spectrum": "71876",
7+
"is_decoy": false,
8+
"score": 136.160126,
9+
"qvalue": 0.0,
10+
"pep": 0.0,
11+
"precursor_mz": 894.33783,
12+
"retention_time": 2800.518555,
13+
"ion_mobility": 1e-06,
14+
"protein_list": [
15+
"P18899"
16+
],
17+
"rank": 1,
18+
"source": "AlphaDIA",
19+
"provenance_data": {
20+
"alphadia_filename": "C:\\Users\\ralfg\\git\\psm_utils\\example_files\\alphadia.precursors.tsv"
21+
},
22+
"metadata": {},
23+
"rescoring_features": {
24+
"rt_observed": 2800.518555,
25+
"mobility_observed": 1e-06,
26+
"mz_observed": 894.33783,
27+
"charge": 3.0,
28+
"delta_rt": 452.909424
29+
}
30+
},
31+
{
32+
"peptidoform": "SSQGSSSSTQSAPSETASASK/2",
33+
"spectrum_id": "41978",
34+
"run": "LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01",
35+
"spectrum": "41978",
36+
"is_decoy": false,
37+
"score": 122.27832,
38+
"qvalue": 0.0,
39+
"pep": 0.0,
40+
"precursor_mz": 986.440491,
41+
"retention_time": 1647.208252,
42+
"ion_mobility": 1e-06,
43+
"protein_list": [
44+
"Q9ULU4"
45+
],
46+
"rank": 1,
47+
"source": "AlphaDIA",
48+
"provenance_data": {
49+
"alphadia_filename": "C:\\Users\\ralfg\\git\\psm_utils\\example_files\\alphadia.precursors.tsv"
50+
},
51+
"metadata": {},
52+
"rescoring_features": {
53+
"rt_observed": 1647.208252,
54+
"mobility_observed": 1e-06,
55+
"mz_observed": 986.440491,
56+
"charge": 2.0,
57+
"delta_rt": -23.25415
58+
}
59+
},
60+
{
61+
"peptidoform": "SSQTSGTNEQSSAIVSAR/2",
62+
"spectrum_id": "68554",
63+
"run": "LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01",
64+
"spectrum": "68554",
65+
"is_decoy": false,
66+
"score": 152.012512,
67+
"qvalue": 0.0,
68+
"pep": 0.0,
69+
"precursor_mz": 905.432312,
70+
"retention_time": 2678.317139,
71+
"ion_mobility": 1e-06,
72+
"protein_list": [
73+
"O60763"
74+
],
75+
"rank": 1,
76+
"source": "AlphaDIA",
77+
"provenance_data": {
78+
"alphadia_filename": "C:\\Users\\ralfg\\git\\psm_utils\\example_files\\alphadia.precursors.tsv"
79+
},
80+
"metadata": {},
81+
"rescoring_features": {
82+
"rt_observed": 2678.317139,
83+
"mobility_observed": 1e-06,
84+
"mz_observed": 905.432312,
85+
"charge": 2.0,
86+
"delta_rt": 31.525879
87+
}
88+
}
89+
]
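
The example file above is a plain JSON array with one object per PSM, so it can be inspected with the standard library alone. The sketch below assumes only that record shape; it does not use the `psm_utils.io.json` reader, whose API is not shown in this diff. The `split_peptidoform` helper is hypothetical and treats the trailing `/N` as the precursor charge.

```python
import json

# Two abbreviated records in the same shape as the example file (fields trimmed).
RAW = """
[
  {"peptidoform": "SSQTSGTNEQSSAIVSAR/2", "spectrum_id": "68554",
   "is_decoy": false, "score": 152.012512, "qvalue": 0.0,
   "protein_list": ["O60763"], "rescoring_features": {"charge": 2.0}},
  {"peptidoform": "SSQGSSSSTQSAPSETASASK/2", "spectrum_id": "41978",
   "is_decoy": false, "score": 122.27832, "qvalue": 0.0,
   "protein_list": ["Q9ULU4"], "rescoring_features": {"charge": 2.0}}
]
"""


def split_peptidoform(peptidoform):
    """Split 'SEQUENCE/charge' into (sequence, charge); charge is None if absent."""
    sequence, sep, charge = peptidoform.rpartition("/")
    if not sep:
        return peptidoform, None
    return sequence, int(charge)


records = json.loads(RAW)

# Pick the highest-scoring record and unpack its peptidoform string.
best = max(records, key=lambda record: record["score"])
sequence, charge = split_peptidoform(best["peptidoform"])
print(best["spectrum_id"], sequence, charge)
```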
Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
peptidoform spectrum_id run collection is_decoy score qvalue pep precursor_mz retention_time ion_mobility protein_list rank source provenance:alphadia_filename rescoring:rt_observed rescoring:mobility_observed rescoring:mz_observed rescoring:charge rescoring:delta_rt
2+
SSYGSSSNDDSYGSSNNDDSYGSSNK/3 71876 LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01 False 136.160126 0.0 0.0 894.33783 2800.518555 1e-06 ['P18899'] 1 AlphaDIA C:\Users\ralfg\git\psm_utils\example_files\alphadia.precursors.tsv 2800.518555 1e-06 894.33783 3.0 452.909424
3+
SSQGSSSSTQSAPSETASASK/2 41978 LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01 False 122.27832 0.0 0.0 986.440491 1647.208252 1e-06 ['Q9ULU4'] 1 AlphaDIA C:\Users\ralfg\git\psm_utils\example_files\alphadia.precursors.tsv 1647.208252 1e-06 986.440491 2.0 -23.25415
4+
SSQTSGTNEQSSAIVSAR/2 68554 LFQ_Orbitrap_AIF_Condition_A_Sample_Alpha_01 False 152.012512 0.0 0.0 905.432312 2678.317139 1e-06 ['O60763'] 1 AlphaDIA C:\Users\ralfg\git\psm_utils\example_files\alphadia.precursors.tsv 2678.317139 1e-06 905.432312 2.0 31.525879
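
The header row above flattens nested fields into prefixed column names such as `provenance:alphadia_filename` and `rescoring:delta_rt`. A rough sketch of regrouping those columns into nested dictionaries follows. It assumes the file is tab-delimited (the delimiter is not visible in this view), the helper `nest_prefixed_columns` is hypothetical, and values are left as strings.

```python
import csv
import io


def nest_prefixed_columns(row, prefixes=("provenance", "rescoring")):
    """Group 'prefix:key' columns under row[prefix]; keep other columns flat."""
    nested = {prefix: {} for prefix in prefixes}
    flat = {}
    for column, value in row.items():
        prefix, sep, key = column.partition(":")
        if sep and prefix in nested:
            nested[prefix][key] = value
        else:
            flat[column] = value
    flat.update(nested)
    return flat


# Abbreviated sample with a subset of the columns shown in the example file.
SAMPLE = (
    "peptidoform\tspectrum_id\tprovenance:alphadia_filename\t"
    "rescoring:charge\trescoring:delta_rt\n"
    "SSQTSGTNEQSSAIVSAR/2\t68554\talphadia.precursors.tsv\t2.0\t31.525879\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
rows = [nest_prefixed_columns(row) for row in reader]
print(rows[0]["rescoring"])
```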

online/Home.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,8 @@
44

55

66
class StreamlitPageHome(StreamlitPage):
7+
"""Streamlit page for the home section."""
8+
79
def _main_page(self):
810
pass
911

online/_utils.py

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@ class ECDF:
1111
----------
1212
x : array_like
1313
Observations
14+
1415
"""
1516

1617
def __init__(self, x):
@@ -101,15 +102,15 @@ def pp_plot(psm_df):
101102

102103
def fdr_plot(psm_df, fdr_threshold):
103104
"""Plot number of identifications as a function of FDR threshold."""
104-
df = (
105+
target_psm_df = (
105106
psm_df[~psm_df["is_decoy"]]
106107
.reset_index(drop=True)
107108
.sort_values("qvalue", ascending=True)
108109
.copy()
109110
)
110-
df["count"] = (~df["is_decoy"]).cumsum()
111+
target_psm_df["count"] = (~target_psm_df["is_decoy"]).cumsum()
111112
fig = px.line(
112-
df,
113+
target_psm_df,
113114
x="qvalue",
114115
y="count",
115116
log_x=True,
