Commit 56e28d8

Merge branch 'master' into copilot/add-chromatogram-support
2 parents eee1eb8 + 77f199b commit 56e28d8

File tree

7 files changed (+1164, -87 lines)


.github/copilot-instructions.md

Lines changed: 224 additions & 0 deletions
@@ -0,0 +1,224 @@
# GitHub Copilot Instructions for openms-python

## Repository Overview

`openms-python` is a Pythonic wrapper around pyOpenMS for mass spectrometry data analysis. The goal is to provide an intuitive, Python-friendly interface that makes working with mass spectrometry data feel natural for Python developers and data scientists.

**Key Principle**: Make pyOpenMS more Pythonic by wrapping verbose C++ bindings with intuitive Python APIs.

## Code Style and Conventions

### Python Style
- Follow PEP 8 conventions
- Use Black formatter with 100 character line length (configured in `pyproject.toml`)
- Target Python 3.8+ compatibility
- Use type hints for better IDE support and code clarity
- Prefer clear, descriptive names over abbreviations

### Wrapper Design Patterns

1. **Properties over getters/setters**: Use `@property` decorators instead of verbose get/set methods
```python
# Good
spec.retention_time
# Avoid
spec.getRT()
```

2. **Pythonic iteration**: Support Python's iteration protocols (`__iter__`, `__len__`, `__getitem__`)
```python
for spec in experiment.ms1_spectra():
    print(spec.retention_time)
```

3. **Method chaining**: Return `self` from mutation methods to enable fluent interfaces
```python
exp.filter_by_ms_level(1).filter_by_rt(100, 500)
```

4. **DataFrame integration**: Provide `to_dataframe()` and `from_dataframe()` methods for pandas interoperability
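A minimal sketch of the round trip (column names follow the README; the two-peak values here are illustrative):
```python
import pandas as pd

from openms_python import Py_MSSpectrum

# Build a spectrum from a DataFrame, then convert it back
df = pd.DataFrame({"mz": [100.0, 200.0], "intensity": [10.0, 20.0]})
spec = Py_MSSpectrum.from_dataframe(df, retention_time=60.5)
df_again = spec.to_dataframe()
```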

5. **Context managers**: Support `with` statements for file I/O operations
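One way such support might be implemented (a hypothetical sketch; `Py_MzMLReader` is an illustrative name, not an existing class):
```python
from openms_python import Py_MSExperiment


class Py_MzMLReader:  # hypothetical illustration of the pattern
    def __init__(self, filepath):
        self.filepath = filepath
        self._exp = None

    def __enter__(self):
        self._exp = Py_MSExperiment.from_file(self.filepath)
        return self._exp

    def __exit__(self, exc_type, exc, tb):
        self._exp = None  # drop the reference; do not suppress exceptions
        return False
```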

6. **Mapping interface for metadata**: Classes wrapping `MetaInfoInterface` should support dict-like access
```python
feature["label"] = "sample_a"
```

### Class Naming Convention
- Wrapper classes use the `Py_` prefix (e.g., `Py_MSExperiment`, `Py_FeatureMap`)
- This distinguishes them from pyOpenMS classes while maintaining recognizability

### File Organization
- Core wrapper classes: `py_*.py` files (e.g., `py_msexperiment.py`, `py_featuremap.py`)
- I/O utilities: `io.py` and `_io_utils.py`
- Helper utilities: `_meta_mapping.py` for metadata handling
- Workflow helpers: `workflows.py` for high-level pipelines
- Example data: `examples/` directory contains sample files like `small.mzML`

## Testing Requirements

### Test Structure
- All tests in `tests/` directory
- Test files follow `test_*.py` naming convention
- Use pytest as the testing framework
- Aim for good coverage of wrapper functionality

### Running Tests
```bash
# Install development dependencies
pip install -e ".[dev]"

# Run all tests
pytest -v

# Run with coverage
pytest -v --cov=openms_python --cov-report=term-missing
```

### Test Patterns
- Test basic wrapper functionality (properties, methods)
- Test DataFrame conversions (to/from)
- Test file I/O (load/store operations)
- Test iteration and filtering
- Test method chaining
- Use `conftest.py` for shared fixtures
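A representative test might look like this (a sketch; the fixture name and peak values are illustrative):
```python
import pandas as pd
import pytest

from openms_python import Py_MSSpectrum


@pytest.fixture
def simple_spectrum():
    df = pd.DataFrame({"mz": [100.0, 200.0], "intensity": [10.0, 20.0]})
    return Py_MSSpectrum.from_dataframe(df, retention_time=60.5, ms_level=1)


def test_dataframe_roundtrip(simple_spectrum):
    df = simple_spectrum.to_dataframe()
    assert list(df["mz"]) == [100.0, 200.0]
    assert simple_spectrum.retention_time == pytest.approx(60.5)
```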

## Development Setup

### Installation
```bash
git clone https://github.com/openms/openms-python.git
cd openms-python
pip install -e ".[dev]"
```

### Dependencies
- **Core**: pyopenms (>=3.0.0), pandas (>=1.3.0), numpy (>=1.20.0)
- **Dev**: pytest, pytest-cov, black, flake8, mypy

### Code Formatting
```bash
# Format code with Black
black openms_python tests

# Check style with flake8
flake8 openms_python tests
```

## Key Architecture Patterns

### 1. Wrapper Pattern
Most classes wrap a corresponding pyOpenMS class and delegate to it while providing Pythonic interfaces:
```python
import pyopenms as oms

class Py_MSSpectrum:
    def __init__(self, spec=None):
        self._spec = spec if spec is not None else oms.MSSpectrum()

    @property
    def retention_time(self):
        return self._spec.getRT()
```

### 2. Factory Methods
Use class methods for alternative constructors:
```python
@classmethod
def from_file(cls, filepath):
    # Load from file and return new instance
    ...

@classmethod
def from_dataframe(cls, df):
    # Create from pandas DataFrame
    ...
```

### 3. Smart Filtering
Provide multiple ways to filter data:
- Method-based: `filter_by_rt(min_rt, max_rt)`
- Property-based: `rt_filter[min:max]`
- Iterator-based: `ms1_spectra()`, `ms2_spectra()`
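For example, all three styles can express the same MS1/RT selection (a sketch; `exp` is an assumed `Py_MSExperiment`):
```python
# Method-based, chainable
subset = exp.filter_by_ms_level(1).filter_by_rt(100, 500)

# Property/slice-based
subset = exp.rt_filter[100:500]

# Iterator-based
ms1_in_window = [s for s in exp.ms1_spectra() if 100 <= s.retention_time <= 500]
```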

### 4. Metadata Handling
Classes that wrap `MetaInfoInterface` should implement mapping protocol:
- `__getitem__`, `__setitem__`, `__delitem__`
- `__contains__`, `__iter__`, `__len__`
- `get()`, `pop()`, `update()` methods
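A minimal sketch of such a mixin (the real helper lives in `_meta_mapping.py` and may differ; `self._obj` is an assumed attribute holding the wrapped pyOpenMS object):
```python
class MetaMappingMixin:
    """Dict-like access to a wrapped pyOpenMS MetaInfoInterface object."""

    def __getitem__(self, key):
        if not self._obj.metaValueExists(key):
            raise KeyError(key)
        return self._obj.getMetaValue(key)

    def __setitem__(self, key, value):
        self._obj.setMetaValue(key, value)

    def __delitem__(self, key):
        if not self._obj.metaValueExists(key):
            raise KeyError(key)
        self._obj.removeMetaValue(key)

    def __contains__(self, key):
        return self._obj.metaValueExists(key)

    def __iter__(self):
        keys = []
        self._obj.getKeys(keys)  # pyOpenMS fills the list with byte-string keys
        return (k.decode() if isinstance(k, bytes) else k for k in keys)

    def __len__(self):
        return sum(1 for _ in self)

    def get(self, key, default=None):
        return self[key] if key in self else default
```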

## Common Tasks

### Adding a New Wrapper Class
1. Create a new `py_<classname>.py` file
2. Wrap the corresponding pyOpenMS class
3. Add Pythonic properties for common getters/setters
4. Implement `__len__`, `__iter__`, `__getitem__` if applicable
5. Add `to_dataframe()` and `from_dataframe()` if appropriate
6. Add `load()` and `store()` methods for file I/O
7. Write comprehensive tests in `tests/test_py_<classname>.py`
8. Update `__init__.py` to export the new class
9. Add examples to README.md
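A skeleton for steps 1-3 might look like this (illustrative only; `Py_Precursor` is a hypothetical example class, not part of the package):
```python
import pyopenms as oms


class Py_Precursor:
    """Pythonic wrapper around pyopenms.Precursor (hypothetical example)."""

    def __init__(self, precursor=None):
        self._precursor = precursor if precursor is not None else oms.Precursor()

    @property
    def mz(self):
        return self._precursor.getMZ()

    @mz.setter
    def mz(self, value):
        self._precursor.setMZ(float(value))

    @property
    def charge(self):
        return self._precursor.getCharge()

    @charge.setter
    def charge(self, value):
        self._precursor.setCharge(int(value))
```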

### Adding Helper Functions
- High-level workflow functions go in `workflows.py`
- I/O utilities go in `io.py` or `_io_utils.py`
- Metadata utilities go in `_meta_mapping.py`

### Documentation
- Add docstrings to all public classes and methods
- Include usage examples in docstrings
- Update README.md with new features
- Keep API reference section in README current
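For example, a docstring with a usage example might look like this (the signature is illustrative):
```python
def filter_by_rt(self, min_rt, max_rt):
    """Return a copy containing only spectra with min_rt <= RT <= max_rt.

    Example
    -------
    >>> exp = Py_MSExperiment.from_file("examples/small.mzML")
    >>> subset = exp.filter_by_rt(100, 500)
    """
```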

## Special Considerations

### Memory Management
- Be mindful of memory when working with large datasets
- Provide streaming alternatives for large files (see `stream_mzml`)
- Consider using generators for iteration over large collections
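A generator-based streaming helper might be sketched like this (illustrative; the package's actual `stream_mzml` may differ):
```python
import pyopenms as oms


def iter_spectra(path):
    """Yield spectra one at a time without loading the whole file into memory."""
    ondisc = oms.OnDiscMSExperiment()
    if not ondisc.openFile(path):
        raise IOError(f"Could not open mzML file: {path}")
    for i in range(ondisc.getNrSpectra()):
        yield ondisc.getSpectrum(i)
```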

### pyOpenMS Compatibility
- The package depends on pyOpenMS >= 3.0.0
- When wrapping pyOpenMS classes, preserve all functionality
- Add convenience methods but don't remove or break existing capabilities

### Error Handling
- Provide clear, helpful error messages
- Validate inputs before passing to pyOpenMS
- Handle common edge cases (empty containers, missing files, etc.)
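A small validation sketch (the method follows the `from_dataframe` convention above; the delegation body is elided):
```python
@classmethod
def from_dataframe(cls, df, **metadata):
    missing = {"mz", "intensity"} - set(df.columns)
    if missing:
        raise ValueError(f"DataFrame is missing required columns: {sorted(missing)}")
    ...
```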

### Performance
- Wrapper overhead should be minimal
- Avoid unnecessary data copies
- Use NumPy arrays for peak data when possible
- Consider performance implications of DataFrame conversions

## Examples and Documentation

The README.md contains extensive examples. When adding new features:
1. Add code examples showing the improvement over pyOpenMS
2. Use "Before (pyOpenMS)" vs "After (openms-python)" format
3. Include practical use cases
4. Show integration with pandas/numpy when relevant
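For example, a Before/After pair for loading a file and collecting MS1 retention times (a sketch; the openms-python calls follow the API described above):
```python
# Before (pyOpenMS)
import pyopenms as oms

exp = oms.MSExperiment()
oms.MzMLFile().load("examples/small.mzML", exp)
rts = [s.getRT() for s in exp.getSpectra() if s.getMSLevel() == 1]

# After (openms-python)
from openms_python import Py_MSExperiment

exp = Py_MSExperiment.from_file("examples/small.mzML")
rts = [s.retention_time for s in exp.ms1_spectra()]
```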

## CI/CD

The repository uses GitHub Actions for continuous integration:
- Workflow: `.github/workflows/integration-tests.yml`
- Runs on: Python 3.10 (configurable via matrix)
- Tests run automatically on push to main and on pull requests

## Contributing Guidelines

When contributing:
1. Make minimal, focused changes
2. Maintain backward compatibility unless explicitly breaking
3. Add tests for new functionality
4. Format code with Black
5. Ensure all tests pass
6. Update documentation as needed

## Questions or Issues?

- Check existing documentation in README.md
- Review existing wrapper implementations for patterns
- Look at test files for usage examples
- Open a discussion on GitHub for design questions

README.md

Lines changed: 93 additions & 2 deletions
@@ -366,6 +366,79 @@ normalized_tic = chrom.normalize_to_tic()
chrom["sample_id"] = "Sample_A"
chrom["replicate"] = 1
print(chrom.get("sample_id"))
```

### Ion Mobility Support

`openms-python` provides comprehensive support for ion mobility data through float data arrays and mobilograms.

#### Float Data Arrays

Spectra can have additional data arrays (e.g., ion mobility values) associated with each peak:

```python
from openms_python import Py_MSSpectrum
import pandas as pd
import numpy as np

# Create a spectrum with ion mobility data
df = pd.DataFrame({
    'mz': [100.0, 200.0, 300.0],
    'intensity': [50.0, 100.0, 75.0],
    'ion_mobility': [1.5, 2.3, 3.1]
})

spec = Py_MSSpectrum.from_dataframe(df, retention_time=60.5, ms_level=1)

# Access ion mobility values
print(spec.ion_mobility)  # array([1.5, 2.3, 3.1])

# Set ion mobility values
spec.ion_mobility = np.array([1.6, 2.4, 3.2])

# Convert to DataFrame with float arrays
df = spec.to_dataframe(include_float_arrays=True)
print(df)
#       mz  intensity  ion_mobility
# 0  100.0       50.0           1.6
# 1  200.0      100.0           2.4
# 2  300.0       75.0           3.2
```

#### Mobilograms

Mobilograms represent the ion mobility dimension, showing intensity vs. drift time for a specific m/z.

**Note:** OpenMS C++ has a native `Mobilogram` class that may not yet be wrapped in pyopenms. This wrapper uses `MSChromatogram` as the underlying representation for mobilogram data.

```python
from openms_python import Py_Mobilogram
import numpy as np
import pandas as pd

# Create a mobilogram from arrays
drift_times = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
intensities = np.array([100.0, 150.0, 200.0, 180.0, 120.0])

mob = Py_Mobilogram.from_arrays(drift_times, intensities, mz=500.0)

print(f"m/z: {mob.mz}")
print(f"Points: {len(mob)}")
print(f"Base peak drift time: {mob.base_peak_drift_time}")

# Convert to DataFrame
df = mob.to_dataframe()
print(df.head())
#    drift_time  intensity     mz
# 0         1.0      100.0  500.0
# 1         1.5      150.0  500.0
# 2         2.0      200.0  500.0

# Create from DataFrame
df = pd.DataFrame({
    'drift_time': [1.0, 2.0, 3.0],
    'intensity': [50.0, 100.0, 75.0]
})
mob = Py_Mobilogram.from_dataframe(df, mz=600.0)
```

## Workflow helpers
@@ -797,16 +870,34 @@ plt.show()
- `base_peak_mz`: m/z of most intense peak
- `base_peak_intensity`: Intensity of base peak
- `peaks`: Tuple of (mz_array, intensity_array)
- `float_data_arrays`: List of FloatDataArray objects
- `ion_mobility`: Ion mobility values as NumPy array
- `drift_time`: Spectrum-level drift time value

**Methods:**
- `from_dataframe(df, **metadata)`: Create from DataFrame (class method)
- `to_dataframe(include_float_arrays=True)`: Convert to DataFrame
- `filter_by_mz(min_mz, max_mz)`: Filter peaks by m/z
- `filter_by_intensity(min_intensity)`: Filter peaks by intensity
- `top_n_peaks(n)`: Keep top N peaks
- `normalize_intensity(max_value)`: Normalize intensities

### Py_Mobilogram

**Properties:**
- `name`: Name of the mobilogram
- `mz`: m/z value this mobilogram represents
- `drift_time`: Drift time values as NumPy array
- `intensity`: Intensity values as NumPy array
- `peaks`: Tuple of (drift_time_array, intensity_array)
- `total_ion_current`: Sum of intensities
- `base_peak_drift_time`: Drift time of most intense point
- `base_peak_intensity`: Intensity of base peak

**Methods:**
- `from_arrays(drift_time, intensity, mz=None, name=None)`: Create from arrays (class method)
- `from_dataframe(df, **metadata)`: Create from DataFrame (class method)
- `to_dataframe()`: Convert to DataFrame

### Identifications, ProteinIdentifications & PeptideIdentifications

openms_python/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -24,6 +24,7 @@
from .py_msexperiment import Py_MSExperiment
from .py_msspectrum import Py_MSSpectrum
from .py_chromatogram import Py_MSChromatogram
from .py_mobilogram import Py_Mobilogram
from .py_feature import Py_Feature
from .py_featuremap import Py_FeatureMap
from .py_consensusmap import Py_ConsensusMap
@@ -101,6 +102,7 @@ def get_example(name: str, *, load: bool = False, target_dir: Union[str, Path, N
    "Py_MSExperiment",
    "Py_MSSpectrum",
    "Py_MSChromatogram",
    "Py_Mobilogram",
    "Py_Feature",
    "Py_FeatureMap",
    "Py_ConsensusMap",
