IndexError in build_nice_OPdict(src: dict, lipid: Lipid) #457

@mdondrup

Calling build_nice_OPdict(src: dict, lipid: Lipid) fails on some, possibly most, experimental OP
data when an entry is missing the value for the standard error. This results in IndexError: list index out of range
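
The failure can be reproduced without the databank. A minimal sketch, assuming a raw OP entry is a one-element list (value only, no standard error) like the entries in the `src` dump in the traceback; `read_op_entry` is a hypothetical stand-in for the unguarded indexing, not a function in the package:

```python
def read_op_entry(opvals):
    """Mimic the unguarded access: value at index 0, standard error at index 1."""
    return {"OP": opvals[0], "STD": opvals[1]}


# Entry with both value and standard error works (the STD number is illustrative).
print(read_op_entry([-0.1071, 0.002]))

# Entry without a standard error raises IndexError.
try:
    read_op_entry([-0.1071])
except IndexError as err:
    print(f"IndexError: {err}")
```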

To Reproduce
Steps to reproduce the behavior:
Load an experiment (tested with unpublished/ferreiraDOPC for lipid DOPC).

  1. Specify repository version and Data subrepository version.
commit 630c8889c9a37195f165e9a5732830a41db9e01e (HEAD -> main, origin/main, origin/HEAD)
Author: Alexey Nesterenko <comcon1@protonmail.com>
Date:   Wed Feb 4 11:31:42 2026 +0100

    Use mapping name convention in src/fairmd/lipids/molecules.py

    Co-authored-by: Michael Dondrup <mdondrup@users.noreply.github.com>

Python 3.13.5

Here's a rudimentary test, manually setting the data directory to use all data.
I am running this through tox:

def test_build_nice_OPdict_with_all_experiments(self, monkeypatch, tmpdir):
        """Test build_nice_OPdict with all available experiments from ExperimentCollection.
        
        This test:
        1. Loads all OP experiments from the databank
        2. For each experiment and lipid in that experiment
        3. Runs build_nice_OPdict on the raw OP data
        4. Validates the output structure and content
        """
        import fairmd.lipids
        from fairmd.lipids.experiment import ExperimentCollection, ExperimentError
        from fairmd.lipids.auxiliary.opconvertor import build_nice_OPdict
        
        # Point the data path at the full data directory. For proper testing,
        # this should be replaced with a mock directory.
        fairmd.lipids.FMDL_DATA_PATH = "../../BilayerData"
        experiments = ExperimentCollection.load_from_data("OPExperiment")
        
        # Verify we loaded experiments
        assert len(experiments) > 0, "No experiments were loaded"
        
        # Test build_nice_OPdict on all experiments
        results_count = 0
        for exp in experiments:
            # Get experiment data, but skip if no experiment data is present
            try:
                exp_data = exp.data
            except ExperimentError:
                continue
            assert isinstance(exp_data, dict), f"Experiment {exp.exp_id} data is not a dict"
            
            # Process each lipid in the experiment
            for lipid_name, raw_op_data in exp_data.items():
                # Get the lipid object from the experiment
                lipid = exp.lipids[lipid_name]
                
                # Run build_nice_OPdict
                nice_op_dict = build_nice_OPdict(raw_op_data, lipid)
                
                # Validate output structure
                assert isinstance(nice_op_dict, dict), \
                    f"build_nice_OPdict output is not a dict for {lipid_name} in {exp.exp_id}"
                
                # If there's data, validate the structure of fragments
                if nice_op_dict:
                    for fragment_name, fragment_data in nice_op_dict.items():
                        assert isinstance(fragment_data, list), \
                            f"Fragment {fragment_name} data is not a list"
                        
                        # Validate each entry in the fragment
                        for entry in fragment_data:
                            assert isinstance(entry, dict), \
                                f"Entry in fragment {fragment_name} is not a dict"
                            assert "C" in entry, "Entry missing 'C' key"
                            assert "H" in entry, "Entry missing 'H' key"
                            assert "OP" in entry, "Entry missing 'OP' key"
                            assert "STD" in entry, "Entry missing 'STD' key"
                            
                            # Validate data types
                            assert isinstance(entry["C"], str), "C atom name is not string"
                            assert isinstance(entry["H"], str), "H atom name is not string"
                            assert isinstance(entry["OP"], (int, float)) or entry["OP"] is None, "OP is not numeric"
                            assert isinstance(entry["STD"], (int, float)) or entry["STD"] is None, "STD is not numeric"
                
                results_count += 1
        
        # Ensure we actually tested something
        assert results_count > 0, "No lipid data was tested"
        
        # Clean up
        ExperimentCollection.clear_instance()

Expected behavior
All data loads without error. One might argue that the data is at fault here. Either way, more sanity checks
won't hurt.

A hotfix is in PR #456. With it, the test passes:

tests/test_op.py ....                                                                          [100%]

========================================= 4 passed in 14.79s =========================================
  tests-all: OK (17.99=setup[2.67]+cmd[0.12,0.02,15.17] seconds)
  congratulations :) (18.14 seconds)

Actual behavior

tests/test_op.py:115:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/fairmd/lipids/auxiliary/opconvertor.py:96: in build_nice_OPdict
    nice_OPdict: dict = _fragmentize(src, lipid.mapping_dict)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

src = {'M_G1C10_M M_G1C10H1_M': [-0.1071], 'M_G1C11_M M_G1C11H1_M': [-0.0279], 'M_G1C12_M M_G1C12H1_M': [-0.0465], 'M_G1C12_M M_G1C12H2_M': [-0.0465], ...}
mdict = {'M_G1C10H1_M': {'ATOMNAME': 'H43', 'FRAGMENT': 'sn-1'}, 'M_G1C10_M': {'ATOMNAME': 'C43', 'FRAGMENT': 'sn-1'}, 'M_G1C11H1_M': {'ATOMNAME': 'H44', 'FRAGMENT': 'sn-1'}, 'M_G1C11_M': {'ATOMNAME': 'C44', 'FRAGMENT': 'sn-1'}, ...}

    def _fragmentize(src, mdict):
        r = {}
        for apair, opvals in src.items():
            atom_c, atom_h = apair.split(" ")
            frag_c = mdict[atom_c].get("FRAGMENT", "total")
            if frag_c not in r:
                r[frag_c] = []
            r[frag_c].append(
>               {"C": atom_c, "H": atom_h, "OP": opvals[0], "STD": opvals[1]},
                                                                   ^^^^^^^^^
            )
E           IndexError: list index out of range

src/fairmd/lipids/auxiliary/opconvertor.py:91: IndexError
--------------------------- Captured stdout teardown ---------------------------
DBG: Mocking completed
=========================== short test summary info ============================
FAILED tests/test_op.py::TestBuildNiceOPdict::test_build_nice_OPdict_with_all_experiments - IndexError: list index out of range

Labels: bug
