RFdiffusion3 to ProteinMPNN API via AtomArray objects fails with duplicate atom names #123

@kjczarne

Here's another fun issue.

I tried passing the AtomArray outputs from RFdiffusion3 directly to ProteinMPNN:

    import numpy as np

    # MPNNInferenceEngine, weights_path and atom_array are defined earlier (not shown).
    engine = MPNNInferenceEngine(
        model_type="protein_mpnn",
        checkpoint_path=str(weights_path),
        is_legacy_weights=True,
        out_directory=None,
        write_fasta=False,
        write_structures=False,
    )
    # Fix some annotations
    atom_array._annot['coord_to_be_noised'] = np.stack(atom_array._annot['coord_to_be_noised'])  # type: ignore
    outputs = engine.run(atom_arrays=[atom_array], input_dicts=None)
    print(outputs)

The atom_array object above comes directly from RFD3InferenceEngine, so I would expect this to work without issues. However, I get:

Duplicate atom names detected in the same residue -- cannot infer struct_conn. This may happen when a non-polymer is loaded from a CIF file without using `atomworks.io.parser.parse`. 

I have no proposed fix for this at the moment, as I'm unsure whether RFD3 is failing to write correct atom_name annotations to the AtomArray or whether I'm doing something fundamentally wrong.
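
To narrow this down, a small check like the one below could list which residues actually carry repeated atom names. This is only a sketch: `find_duplicate_atom_names` is a hypothetical helper, not part of RFD3, ProteinMPNN or atomworks, and it assumes the AtomArray exposes the standard `chain_id`, `res_id`, `res_name` and `atom_name` annotations. It groups atoms by (chain_id, res_id, res_name) and ignores insertion codes, so two distinct residues sharing those three values would be reported as one.

```python
from collections import Counter


def find_duplicate_atom_names(atom_array):
    """Map (chain_id, res_id, res_name) to the atom names that occur more than once.

    Hypothetical diagnostic helper; it only reads per-atom annotations and
    does not modify the array.
    """
    # Count every (residue key, atom name) combination across all atoms.
    counts = Counter(
        (str(chain), int(res_id), str(res_name), str(atom_name))
        for chain, res_id, res_name, atom_name in zip(
            atom_array.chain_id,
            atom_array.res_id,
            atom_array.res_name,
            atom_array.atom_name,
        )
    )
    # Keep only the atom names that appear more than once in a residue.
    duplicates = {}
    for (chain, res_id, res_name, atom_name), n in counts.items():
        if n > 1:
            duplicates.setdefault((chain, res_id, res_name), []).append(atom_name)
    return duplicates
```

Calling `find_duplicate_atom_names(atom_array)` just before `engine.run(...)` should return an empty dict for a clean structure. Any entries would show whether the repeated names sit in designed residues, ligands, or elsewhere.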

Labels: MPNN (issues related to any version of MPNN), RFdiffusion3 (issues related to RFD3)
