Commit 97490ab

Merge pull request #26 from MannLabs/development
2 parents c9e1189 + 99ecf97 commit 97490ab

17 files changed

+176
-95
lines changed

.github/workflows/pip_installation.yml

Lines changed: 4 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -13,7 +13,8 @@ jobs:
1313
runs-on: ${{ matrix.os }}
1414
strategy:
1515
matrix:
16-
os: [ubuntu-latest, macOS-latest, windows-latest]
16+
# os: [ubuntu-latest, macOS-latest, windows-latest]
17+
os: [ubuntu-latest, windows-latest]
1718
steps:
1819
- uses: actions/checkout@v2
1920
- uses: conda-incubator/setup-miniconda@v2
@@ -38,7 +39,8 @@ jobs:
3839
runs-on: ${{ matrix.os }}
3940
strategy:
4041
matrix:
41-
os: [ubuntu-latest, macOS-latest, windows-latest]
42+
# os: [ubuntu-latest, macOS-latest, windows-latest]
43+
os: [ubuntu-latest, windows-latest]
4244
steps:
4345
- uses: actions/checkout@v2
4446
- uses: conda-incubator/setup-miniconda@v2

.github/workflows/publish_and_release.yml

Lines changed: 32 additions & 32 deletions
@@ -73,38 +73,38 @@ jobs:
7373
asset_path: release/one_click_linux_gui/dist/pydiaid_gui_installer_linux.deb
7474
asset_name: pydiaid_gui_installer_linux.deb
7575
asset_content_type: application/octet-stream
76-
Create_MacOS_Release:
77-
runs-on: macos-latest
78-
needs: Create_Draft_On_GitHub
79-
steps:
80-
- name: Checkout code
81-
uses: actions/checkout@v2
82-
- uses: conda-incubator/setup-miniconda@v2
83-
with:
84-
auto-update-conda: true
85-
python-version: ${{ matrix.python-version }}
86-
- name: Conda info
87-
shell: bash -l {0}
88-
run: conda info
89-
- name: Creating installer for MacOS
90-
shell: bash -l {0}
91-
run: |
92-
cd release/one_click_macos_gui
93-
. ./create_installer_macos.sh
94-
- name: Test installer for MacOS
95-
shell: bash -l {0}
96-
run: |
97-
sudo installer -pkg release/one_click_macos_gui/dist/pydiaid_gui_installer_macos.pkg -target /
98-
- name: Upload MacOS Installer
99-
id: upload-release-asset
100-
uses: actions/upload-release-asset@v1
101-
env:
102-
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
103-
with:
104-
upload_url: ${{ needs.Create_Draft_On_GitHub.outputs.upload_url }}
105-
asset_path: release/one_click_macos_gui/dist/pydiaid_gui_installer_macos.pkg
106-
asset_name: pydiaid_gui_installer_macos.pkg
107-
asset_content_type: application/octet-stream
76+
# Create_MacOS_Release:
77+
# runs-on: macos-latest
78+
# needs: Create_Draft_On_GitHub
79+
# steps:
80+
# - name: Checkout code
81+
# uses: actions/checkout@v2
82+
# - uses: conda-incubator/setup-miniconda@v2
83+
# with:
84+
# auto-update-conda: true
85+
# python-version: ${{ matrix.python-version }}
86+
# - name: Conda info
87+
# shell: bash -l {0}
88+
# run: conda info
89+
# - name: Creating installer for MacOS
90+
# shell: bash -l {0}
91+
# run: |
92+
# cd release/one_click_macos_gui
93+
# . ./create_installer_macos.sh
94+
# - name: Test installer for MacOS
95+
# shell: bash -l {0}
96+
# run: |
97+
# sudo installer -pkg release/one_click_macos_gui/dist/pydiaid_gui_installer_macos.pkg -target /
98+
# - name: Upload MacOS Installer
99+
# id: upload-release-asset
100+
# uses: actions/upload-release-asset@v1
101+
# env:
102+
# GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
103+
# with:
104+
# upload_url: ${{ needs.Create_Draft_On_GitHub.outputs.upload_url }}
105+
# asset_path: release/one_click_macos_gui/dist/pydiaid_gui_installer_macos.pkg
106+
# asset_name: pydiaid_gui_installer_macos.pkg
107+
# asset_content_type: application/octet-stream
108108
Create_Windows_Release:
109109
runs-on: windows-latest
110110
needs: Create_Draft_On_GitHub

README.md

Lines changed: 1 addition & 1 deletion
@@ -165,7 +165,7 @@ In case of issues, check out the following links:
165165
---
166166
## Citations
167167

168-
Check out the [dia-PASEF publication](https://doi.org/10.1016/j.mcpro.2022.100279) and [synchro-PASEF publication](https://doi.org/10.1016/j.mcpro.2022.100489).
168+
Check out the [optimal dia-PASEF](https://doi.org/10.1016/j.mcpro.2022.100279), [synchro-PASEF](https://doi.org/10.1016/j.mcpro.2022.100489) and [PASEF workflows and py_diAID](https://doi.org/10.1038/s41596-024-01104-w) publications.
169169

170170
---
171171
## How to contribute

misc/bumpversion.cfg

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
[bumpversion]
2-
current_version = 0.0.30
2+
current_version = 0.0.40
33
commit = True
44
tag = False
55
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+)(?P<build>\d+))?

pydiaid/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
22

33

44
__project__ = "pydiaid"
5-
__version__ = "0.0.30"
5+
__version__ = "0.0.40"
66
__license__ = "Apache"
77
__description__ = "An open-source Python package of the AlphaPept ecosystem"
88
__author__ = "Mann Labs"

pydiaid/gui.py

Lines changed: 1 addition & 1 deletion
@@ -481,7 +481,7 @@ def __init__(self, start_server=False):
481481
name="py_diAID",
482482
github_url='https://github.com/MannLabs/pydiaid',
483483
)
484-
self.project_description = """#### py_diAID is a Python tool that automatically and optimally places DIA (Data-Independent Acquisition) window schemes for efficient precursor coverage. Using pre-acquired precursor information, it generates dia-PASEF, synchro-PASEF, and Orbitrap Astral DIA methods. The name diAID stands for Automated Isolation Design for DIA.\n <i style="font-size: 0.8em; display: block"> Please cite: Skowronek, … , Mann, MCP, 2022 for dia-PASEF and \n Skowronek, … , Willems, Raether, Mann, MCP, 2023 for synchro-PASEF.</i>"""
484+
self.project_description = """#### py_diAID is a Python tool that automatically and optimally places DIA (Data-Independent Acquisition) window schemes for efficient precursor coverage. Using pre-acquired precursor information, it generates dia-PASEF, synchro-PASEF, and Orbitrap Astral DIA methods. The name diAID stands for Automated Isolation Design for DIA.\n <i style="font-size: 0.8em; display: block"> Please cite: Skowronek et al., Mann, MCP, 2022 for optimal dia-PASEF methods, \n Skowronek et al., Willems, Raether, Mann, MCP, 2023 for synchro-PASEF, \n Skowronek et al., Mann, Nat Protoc, 2025 for PASEF workflows and py_diAID.</i>"""
485485

486486
self.manual_path = os.path.join(
487487
DOCS_PATH,

pydiaid/loader.py

Lines changed: 101 additions & 33 deletions
@@ -31,7 +31,11 @@ def load_library(
3131
try:
3232
# Special case for DIANN single-run which needs the file path directly
3333
if analysis_software == 'DIANN single-run':
34-
return __parse_diann_single_run(library_name, ptm_list, require_im)
34+
if library_name.split(".")[-1] == "parquet":
35+
analysis_software = 'DIANN library'
36+
else:
37+
return __parse_diann_single_run(library_name, ptm_list, require_im)
38+
3539

3640
# For all other software, load the dataframe first
3741
dataframe = __load_dataframe_from_file(library_name)
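
Review note: the rerouting in this hunk amounts to a small decision on the file extension. The sketch below restates that rule in isolation. `resolve_parser` is a hypothetical helper, not part of pydiaid, and it uses `os.path.splitext` instead of `split(".")` purely for illustration.

```python
import os


def resolve_parser(analysis_software: str, library_name: str) -> str:
    """Return the parser label to use for a software choice and a file name.

    Hypothetical helper: mirrors the rule in the hunk above that a
    'DIANN single-run' selection pointing at a .parquet file is handled
    like a 'DIANN library'. Every other combination is left unchanged.
    """
    extension = os.path.splitext(library_name)[1]  # e.g. ".parquet"
    if analysis_software == "DIANN single-run" and extension == ".parquet":
        return "DIANN library"
    return analysis_software


# File names below are made up for the example.
print(resolve_parser("DIANN single-run", "report.parquet"))
print(resolve_parser("DIANN single-run", "report.tsv"))
```

Like the committed code, the comparison is case-sensitive, so a file ending in `.PARQUET` would not be rerouted.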
@@ -73,9 +77,67 @@ def __load_dataframe_from_file(
7377

7478
if library_name.split(".")[-1] == "csv":
7579
return pd.read_csv(library_name, sep=',')
80+
if library_name.split(".")[-1] == "parquet":
81+
return pd.read_parquet(library_name, engine='fastparquet')
7682
else:
7783
return pd.read_csv(library_name, sep='\t') # .xls, .tsv, .txt
7884

85+
class ColumnMapper:
86+
"""Handles column name mapping across different software versions."""
87+
88+
# Define all possible column variants as class attributes
89+
COLUMN_VARIANTS = {
90+
'decoy': ['decoy', 'Decoy', 'is_decoy'],
91+
'qvalue': ['QValue', 'Q.Value', 'q_value', 'Q_Value'],
92+
'mobility': [
93+
'PrecursorIonMobility',
94+
'IonMobility',
95+
'Ion Mobility',
96+
'ion_mobility',
97+
'IM',
98+
'Mobility',
99+
'1/K0'
100+
],
101+
'mz': ['PrecursorMz', 'Precursor.Mz', 'Mz', 'PrecursorMZ', 'Calibrated Observed M/Z'],
102+
'charge': ['PrecursorCharge', 'Precursor.Charge', 'Charge'],
103+
'protein': ['ProteinId', 'ProteinName', 'Protein.Names', 'Protein', 'Protein ID'],
104+
'modified_peptide': [
105+
'ModifiedPeptideSequence',
106+
'ModifiedPeptide',
107+
'Modified.Sequence',
108+
'Modified Sequence',
109+
'Modified Peptide'
110+
],
111+
'peptide': ['Peptide', 'PeptideSequence', 'Sequence']
112+
}
113+
114+
def __init__(self, dataframe: pd.DataFrame):
115+
"""Initialize with a dataframe and map its columns."""
116+
self.df = dataframe
117+
self.column_map = self._create_column_map()
118+
119+
def _create_column_map(self) -> dict:
120+
"""Create mapping of standard names to actual column names in dataframe."""
121+
column_map = {}
122+
for standard_name, variants in self.COLUMN_VARIANTS.items():
123+
found_col = next((col for col in variants if col in self.df.columns), None)
124+
column_map[standard_name] = found_col
125+
return column_map
126+
127+
def get_column(self, standard_name: str) -> str:
128+
"""Get the actual column name for a standard column identifier."""
129+
return self.column_map.get(standard_name)
130+
131+
def validate_required_columns(self, required_columns: list) -> None:
132+
"""Validate that all required columns exist."""
133+
missing = [col for col in required_columns if self.get_column(col) is None]
134+
if missing:
135+
raise ValueError(f"Required columns missing: {', '.join(missing)}")
136+
137+
def has_column(self, standard_name: str) -> bool:
138+
"""Check if a standard column exists in the dataframe."""
139+
return self.get_column(standard_name) is not None
140+
79141

80142
def __parse_alpha_pept(
81143
dataframe: pd.DataFrame,
@@ -175,8 +237,8 @@ def __parse_ms_fragger(
175237
columns.
176238
177239
Parameters:
178-
dataframe (pd.DataFrame): imported output file from the analysis software
179-
"MSFragger".
240+
dataframe (pd.DataFrame): imported library or psm file from the analysis software
241+
"MSFragger". Required columns (supports multiple naming variants)
180242
File format: .tsv, required columns: 'PrecursorMz', 'PrecursorIonMobility',
181243
'PrecursorCharge', 'ProteinId', 'ModifiedPeptideSequence'.
182244
ptm_list (list): a list with identifiers used for filtering a specific dataframe column.
@@ -186,19 +248,28 @@ def __parse_ms_fragger(
186248
pd.DataFrame: returns a pre-filtered data frame with unified column names.
187249
"""
188250

189-
im_col = 'PrecursorIonMobility' if 'PrecursorIonMobility' in dataframe.columns else None
251+
mapper = ColumnMapper(dataframe)
190252

191-
if require_im and im_col is None:
192-
raise Exception("Ion mobility data required but not found in MSFragger output")
253+
required_columns = ['mz', 'charge', 'protein', 'modified_peptide']
254+
if require_im:
255+
required_columns.append('mobility')
256+
257+
mapper.validate_required_columns(required_columns)
258+
259+
if mapper.has_column('peptide'):
260+
peptide_col = mapper.get_column('peptide')
261+
mod_peptide_col = mapper.get_column('modified_peptide')
262+
dataframe[mod_peptide_col] = dataframe[mod_peptide_col].replace('', pd.NA)
263+
dataframe[mod_peptide_col] = dataframe[mod_peptide_col].fillna(dataframe[peptide_col])
193264

194265
return library_loader(
195266
dataframe,
196267
ptm_list,
197-
mz='PrecursorMz',
198-
im=im_col,
199-
charge='PrecursorCharge',
200-
protein='ProteinId',
201-
modified_peptide='ModifiedPeptideSequence'
268+
mz=mapper.get_column('mz'),
269+
im=mapper.get_column('mobility'),
270+
charge=mapper.get_column('charge'),
271+
protein=mapper.get_column('protein'),
272+
modified_peptide=mapper.get_column('modified_peptide')
202273
)
203274

204275
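
Review note: the new MSFragger branch fills empty modified-peptide entries with the unmodified sequence before handing the frame to `library_loader`. The following is a plain-Python analogue of the `replace('', pd.NA)` plus `fillna(...)` pair, written without pandas so it runs anywhere. The function name and the sequences are invented for the example.

```python
from typing import List, Optional


def fill_missing_modified(
    modified: List[Optional[str]], unmodified: List[str]
) -> List[str]:
    """Use the unmodified sequence wherever the modified one is empty or missing."""
    return [
        plain if mod is None or mod == "" else mod
        for mod, plain in zip(modified, unmodified)
    ]


# Rows 2 and 3 have no modified sequence and fall back to the plain peptide.
filled = fill_missing_modified(
    ["PEPT[79.97]IDE", "", None],
    ["PEPTIDE", "ACDK", "LMNR"],
)
print(filled)  # ['PEPT[79.97]IDE', 'ACDK', 'LMNR']
```

The fallback only runs when a `peptide` column is found, so input files without one are passed through as before.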

@@ -313,40 +384,37 @@ def __parse_diann_lib(
313384
314385
Parameters:
315386
dataframe (pd.DataFrame): imported library file from the analysis software
316-
"DIANN". Required columns:
317-
'PrecursorMz',
318-
'IonMobility',
319-
'PrecursorCharge',
320-
'ProteinName',
321-
'ModifiedPeptide',
322-
'decoy',
323-
'QValue'.
387+
"DIANN". Required columns (supports multiple naming variants)
324388
ptm_list (list): a list with identifiers used for filtering a specific dataframe column.
325389
require_im (bool): if True, requires ion mobility data; if False, makes ion mobility optional.
326390
327391
Returns:
328392
pd.DataFrame: returns a pre-filtered data frame with unified column names.
329393
"""
330-
# Filter out decoys and apply Q-value thresholdt
394+
# Initialize column mapper
395+
mapper = ColumnMapper(dataframe)
396+
397+
# Check required columns
398+
required_columns = ['decoy', 'qvalue', 'mz', 'charge', 'protein', 'modified_peptide']
399+
if require_im:
400+
required_columns.append('mobility')
401+
402+
mapper.validate_required_columns(required_columns)
403+
404+
# Filter dataframe
331405
filtered_dataframe = dataframe[
332-
(dataframe['decoy'] == 0) & # Remove decoy entries
333-
(dataframe['QValue'] <= 0.01) # Filter for 1% FDR
406+
(dataframe[mapper.get_column('decoy')] == 0) &
407+
(dataframe[mapper.get_column('qvalue')] <= 0.01)
334408
]
335-
336-
# Check if IM column exists
337-
im_col = 'IonMobility' if 'IonMobility' in dataframe.columns else None
338409

339-
if require_im and im_col is None:
340-
raise Exception("Ion mobility data required but not found in DIANN library")
341-
342410
return library_loader(
343411
filtered_dataframe,
344412
ptm_list,
345-
mz='PrecursorMz',
346-
im=im_col,
347-
charge='PrecursorCharge',
348-
protein='ProteinName',
349-
modified_peptide='ModifiedPeptide'
413+
mz=mapper.get_column('mz'),
414+
im=mapper.get_column('mobility'),
415+
charge=mapper.get_column('charge'),
416+
protein=mapper.get_column('protein'),
417+
modified_peptide=mapper.get_column('modified_peptide')
350418
)
351419

352420
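
Review note: the lookup that `ColumnMapper` performs is a first-match search over a list of known header spellings. The sketch below restates it on a plain list of column names, with a trimmed-down variant table, so the behaviour can be checked without pandas. `create_column_map` and `missing_columns` are illustrative stand-ins for `_create_column_map` and `validate_required_columns`, not the package's API.

```python
from typing import Dict, Iterable, List, Optional

# Trimmed-down variant table for illustration; the table in the commit is larger.
COLUMN_VARIANTS: Dict[str, List[str]] = {
    "mz": ["PrecursorMz", "Precursor.Mz", "Mz"],
    "charge": ["PrecursorCharge", "Precursor.Charge", "Charge"],
    "mobility": ["PrecursorIonMobility", "IonMobility", "IM"],
}


def create_column_map(columns: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each standard name to the first variant present, or None if absent."""
    present = set(columns)
    return {
        standard: next((name for name in variants if name in present), None)
        for standard, variants in COLUMN_VARIANTS.items()
    }


def missing_columns(
    column_map: Dict[str, Optional[str]], required: List[str]
) -> List[str]:
    """Return the required standard names that could not be resolved."""
    return [name for name in required if column_map.get(name) is None]


# Header names in the dotted style; no ion mobility column is present.
column_map = create_column_map(["Precursor.Mz", "Precursor.Charge", "Protein.Names"])
print(column_map)
print(missing_columns(column_map, ["mz", "charge", "mobility"]))
```

Because the search stops at the first hit, the order of each variant list sets the priority when a file contains more than one spelling of the same column.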

pydiaid/oadia/method_generator.py

Lines changed: 3 additions & 11 deletions
@@ -901,17 +901,9 @@ def adjust_bin_boundaries(bins, phospho_enriched=False):
901901
start_value = bins[i][0]
902902
end_value = bins[i][1]
903903

904-
# Adjust start value (except for first bin)
905-
if i == 0:
906-
adjusted_start = start_value
907-
else:
908-
adjusted_start = find_closest_forbidden_zone(start_value, phospho_enriched)
909-
910-
# Adjust end value (except for last bin)
911-
if i == len(bins) - 1:
912-
adjusted_end = end_value
913-
else:
914-
adjusted_end = find_closest_forbidden_zone(end_value, phospho_enriched)
904+
# Adjust start and end values (including the first and last bin)
905+
adjusted_start = find_closest_forbidden_zone(start_value, phospho_enriched)
906+
adjusted_end = find_closest_forbidden_zone(end_value, phospho_enriched)
915907

916908
adjusted_bins.append([adjusted_start, adjusted_end])
917909
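
Review note: the behavioural change here is that the outer edges of the first and last bin are now adjusted too. The sketch below contrasts the two variants. `find_closest_forbidden_zone` is not shown in this diff, so `snap_to_half` is a hypothetical stand-in that moves a value to the x.5 point of its integer interval, and the `phospho_enriched` argument is left out. The bin values are invented.

```python
import math
from typing import Callable, List

Bins = List[List[float]]


def snap_to_half(value: float) -> float:
    """Hypothetical stand-in for find_closest_forbidden_zone."""
    return math.floor(value) + 0.5


def adjust_inner_only(bins: Bins, snap: Callable[[float], float]) -> Bins:
    """Old behaviour: keep the first start and the last end untouched."""
    last = len(bins) - 1
    adjusted = []
    for i, (start, end) in enumerate(bins):
        new_start = start if i == 0 else snap(start)
        new_end = end if i == last else snap(end)
        adjusted.append([new_start, new_end])
    return adjusted


def adjust_all(bins: Bins, snap: Callable[[float], float]) -> Bins:
    """New behaviour: adjust every boundary, outer edges included."""
    return [[snap(start), snap(end)] for start, end in bins]


bins = [[380.2, 450.7], [450.7, 520.1], [520.1, 980.4]]
print(adjust_inner_only(bins, snap_to_half))
print(adjust_all(bins, snap_to_half))
```

With this stand-in only the outermost values differ between the two results (380.2 vs 380.5 and 980.4 vs 980.5); the shared inner boundaries are identical.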

pydiaid/synchropasef/method_creator.py

Lines changed: 15 additions & 4 deletions
@@ -373,6 +373,15 @@ def calculate_scan_area(
373373

374374

375375
return df_scan_area
376+
377+
378+
def check_slope(params_list):
379+
"""Print the slope and parameters of each window (debugging aid for rounding issues)."""
380+
for i, params in enumerate(params_list):
381+
col0, col1, col2, col3, col4 = params
382+
slope = (col4 - col1) / (col3 - col0)
383+
print(slope)
384+
print(params)
376385

377386

378387
def generate_isolation_windows(
@@ -438,17 +447,19 @@ def generate_isolation_windows(
438447
"2"
439448
)
440449

450+
rounding_factor = 0
441451
list_method_parameters = list()
442452
for index in range(len(mz_start_lower_IM)):
443453
list_temp = [
444454
df_scan_area["lower_IM"].iloc[0],
445-
round(mz_start_lower_IM[index], 1),
446-
round(mz_start_lower_IM[index]+mz_width_lower_IM[index], 1),
455+
np.round(mz_start_lower_IM[index], rounding_factor),
456+
np.round(mz_start_lower_IM[index]+mz_width_lower_IM[index], rounding_factor),
447457
df_scan_area["upper_IM"].iloc[0],
448-
round(mz_start_upper_IM[index], 1)
458+
np.round(mz_start_upper_IM[index], rounding_factor)
449459
]
450460
list_method_parameters.append(list_temp)
451-
list_method_parameters
461+
462+
# check_slope(list_method_parameters)
452463

453464
df_method_parameters = pd.DataFrame(
454465
list_method_parameters,
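
Review note: `check_slope` prints `(col4 - col1) / (col3 - col0)` for every parameter row, which is the m/z change per ion-mobility unit along a window's lower edge. The sketch below is a variant that returns the slopes instead of printing them and guards against a zero ion-mobility span. `window_slopes` is a hypothetical name, the numbers are invented, and the built-in `round` stands in for `np.round` to keep the example dependency-free.

```python
from typing import List, Optional, Sequence


def window_slopes(params_list: Sequence[Sequence[float]]) -> List[Optional[float]]:
    """Return the slope of each window edge, or None if the IM span is zero.

    Each row is [lower_IM, mz_start_lower, mz_end_lower, upper_IM, mz_start_upper],
    matching the layout of list_temp in the hunk above.
    """
    slopes: List[Optional[float]] = []
    for lower_im, mz_start_lower, _mz_end_lower, upper_im, mz_start_upper in params_list:
        im_span = upper_im - lower_im
        if im_span == 0:
            slopes.append(None)
        else:
            slopes.append((mz_start_upper - mz_start_lower) / im_span)
    return slopes


rounding_factor = 0  # same value as in the commit: round to whole m/z units
raw_rows = [
    [0.7, 400.04, 425.04, 1.3, 520.04],
    [0.7, 425.04, 450.04, 1.3, 545.04],
]
rounded_rows = [
    [lower_im, round(a, rounding_factor), round(b, rounding_factor),
     upper_im, round(c, rounding_factor)]
    for lower_im, a, b, upper_im, c in raw_rows
]
print(window_slopes(rounded_rows))
```

Both rows give a slope of about 200 m/z per IM unit, so this kind of check can confirm that rounding has not made the windows diverge from a common slope.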

release/one_click_linux_gui/control

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
Package: pydiaid
2-
Version: 0.0.30
2+
Version: 0.0.40
33
Architecture: all
44
Maintainer: Mann Labs <opensource@alphapept.com>
55
Description: py_diAID
