Merge branch 'master' of https://github.com/DineshNehra29/pyopenms-docs

DineshNehra29 · DineshNehra29 · commit cb0e219fc16e · 2025-03-24T20:24:57.000+05:30
diff --git a/.github/workflows/code-blocks-linting.yaml b/.github/workflows/code-blocks-linting.yaml
@@ -16,7 +16,7 @@ jobs:
 
     - name: Get changed files
       id: changed-files
-      uses: tj-actions/changed-files@v41
+      uses: tj-actions/changed-files@v46
       with:
         dir_names_max_depth: 0
         files: |
diff --git a/.github/workflows/test-pr.yaml b/.github/workflows/test-pr.yaml
@@ -19,7 +19,7 @@ jobs:
 
     - name: Get changed files
       id: changed-files
-      uses: tj-actions/changed-files@v45
+      uses: tj-actions/changed-files@v46
       with:
           dir_names_max_depth: 0
           files: |
diff --git a/docs/source/img/OpenMS.svg b/docs/source/img/OpenMS.svg
diff --git a/docs/source/user_guide/quantitative_data.rst b/docs/source/user_guide/quantitative_data.rst
@@ -27,7 +27,7 @@ analysis.
         masstrace.push_back(p)
 
 Usually, the quantitative features would be produced by a so-called
-:py:class:`~.FeatureFinder` algorithm, which we will discuss in the `next chapter <feature_detection.html>`_. The
+`FeatureFinder` algorithm, which we will discuss in the `next chapter <feature_detection.html>`_. The
 features can be stored in a :py:class:`~.FeatureMap` and written to disk.
 
 .. code-block:: python
diff --git a/docs/source/user_guide/reading_raw_ms_data.rst b/docs/source/user_guide/reading_raw_ms_data.rst
@@ -121,11 +121,14 @@ chromatograms as they are read from the disk. A simple implementation could look
 
 which can the be used as follows:
 
-.. code-block:: output
+.. code-block:: python
 
     filename = b"test.mzML"
     consumer = MSCallback()
     oms.MzMLFile().transform(filename, consumer)
+	
+.. code-block:: output
+	
     Read a spectrum
     Read a spectrum
     Read a spectrum
@@ -138,50 +141,61 @@ spectrum or chromatogram is read from disk, the function ``consumeSpectrum`` or
 ``consumeChromatogram`` is called and a specific action is performed. We can
 use this to implement a simple filtering function for mass spectra:
 
-.. code-block:: output
+.. code-block:: python
 
-    class FilteringConsumer:
-        """
-        Consumer that forwards all calls the internal consumer (after
-        filtering)
-        """
+	import os
+	import pyopenms as oms
+	from urllib.request import urlretrieve
 
-        def __init__(self, consumer, filter_string):
-            self._internal_consumer = consumer
-            self.filter_string = filter_string
+	gh = "https://raw.githubusercontent.com/OpenMS/pyopenms-docs/master"
+	urlretrieve(gh + "/src/data/tiny.mzML", "test.mzML")
 
-        def setExperimentalSettings(self, s):
-            self._internal_consumer.setExperimentalSettings(s)
+	print("Current Working Directory where all files are stored:", os.getcwd())
 
-        def setExpectedSize(self, a, b):
-            self._internal_consumer.setExpectedSize(a, b)
+	class FilteringConsumer:
+		"""
+		Consumer that forwards all calls the internal consumer (after
+		filtering out spectra with less than 'min_spec_size' peaks)
+		"""
 
-        def consumeChromatogram(self, c):
-            if c.getNativeID().find(self.filter_string) != -1:
-                self._internal_consumer.consumeChromatogram(c)
+		def __init__(self, consumer, min_spec_size = 0):
+			self._internal_consumer = consumer
+			self._min_spec_size = min_spec_size
 
-        def consumeSpectrum(self, s):
-            if s.getNativeID().find(self.filter_string) != -1:
-                self._internal_consumer.consumeSpectrum(s)
+		def setExperimentalSettings(self, s):
+			self._internal_consumer.setExperimentalSettings(s)
+
+		def setExpectedSize(self, a, b):
+			self._internal_consumer.setExpectedSize(a, b)
+
+		def consumeChromatogram(self, c):
+			# just forward; do nothing to chromatograms
+			self._internal_consumer.consumeChromatogram(c)
+
+		def consumeSpectrum(self, s):
+			print("Spec has size: ", s.size())
+			if s.size() >= self._min_spec_size:
+				print(" --> keep it")
+				self._internal_consumer.consumeSpectrum(s)
+			else:
+				print(" --> discard it")
 
-    ###################################
-    filter_string = "DECOY"
-    inputfile = "in.mzML"
-    outputfile = "out.mzML"
-    ###################################
 
-    consumer = oms.PlainMSDataWritingConsumer(outputfile)
-    consumer = FilteringConsumer(consumer, filter_string)
+	min_spec_size = 11  ## we will keep spectra with 11 or more peaks and discard the others
+	inputfile = "test.mzML"
+	outputfile = "out.mzML"
 
-    oms.MzMLFile().transform(inputfile, consumer)
+	consumer = oms.PlainMSDataWritingConsumer(outputfile)
+	consumer = FilteringConsumer(consumer, min_spec_size)
 
+	oms.MzMLFile().transform(inputfile, consumer)
 
-where the spectra and chromatograms are filtered by their native ids. It is
+, where the spectra are filtered by their size. It is
 similarly trivial to implement filtering by other attributes. Note how the data
 are written to disk using the :py:class:`~.PlainMSDataWritingConsumer` which is one of
 multiple available consumer classes -- this specific class will simply take the
-spectrum ``s`` or chromatogram ``c`` and write it to disk (the location of the
-output file is given by the ``outfile`` variable).
+:py:class:`~.MSSpectrum` ``s`` or :py:class:`~.MSChromatogram` ``c`` and write it to disk (the location of the
+output file is given by the ``outputfile`` variable).
 
 Note that this approach is memory efficient in cases where computation should
 only occur on part of the data or the whole data may not fit into memory.