Missing Unresolved Residues in `mmcif_parsing`

Hi All! 

Thank you for your effort in developing the open-source AF3.

#### Issue Description
I have encountered an issue with the `mmcif_parsing` module related to unresolved residues. Because the protein sequence appears to be parsed directly from the Biopython `structure` object, unresolved residues (those that do not appear in the `_atom_site` coordinate section of the mmCIF file) are not included in the `MmcifObject`.

#### Impact
We need the unresolved residues for some computations, such as calculating the relative solvent accessible surface area (RASA) of unresolved residues.

#### Example
For instance, the actual sequence for the protein with PDB ID `7a4d` is:
```
QVQLQESGGGLVQPGGSLRLSCAAPGFRLDNYVIGWFRQAPGKEREGVSCISSSAGSTYYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCATACYSSYVTYWGQGTQVTVSSGRYPYDVPDYGSGRA
```

However, when using `mmcif_parsing`, the parsed sequence is:
```
QLQESGGGLVQPGGSLRLSCAAPGFRLDNYVIGWFRQAPGKEREGVSCISSSAGSTYYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCATACYSSYVTYWGQGTQVTVSSGR
```
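To make the difference between the two sequences explicit, here is a minimal sketch that locates the parsed sequence inside the full one and reports what is missing at each end. The helper name `split_missing_termini` is hypothetical (it is not part of `mmcif_parsing`), and it assumes the parsed sequence is a single contiguous stretch of the full sequence, so it would not handle unresolved loops in the middle of a chain.

```python
from typing import Optional, Tuple


def split_missing_termini(full_seq: str, parsed_seq: str) -> Optional[Tuple[str, str]]:
    """Return (missing N-terminal part, missing C-terminal part).

    Returns None when parsed_seq is not a contiguous substring of full_seq,
    for example when residues are also missing in the middle of the chain.
    """
    start = full_seq.find(parsed_seq)
    if start == -1:
        return None
    end = start + len(parsed_seq)
    return full_seq[:start], full_seq[end:]


# The two sequences quoted above for 7a4d.
full_seq = (
    "QVQLQESGGGLVQPGGSLRLSCAAPGFRLDNYVIGWFRQAPGKEREGVSCISSSAGSTYYADSVKGRFTISRDNAKNTVY"
    "LQMNSLKPEDTAVYYCATACYSSYVTYWGQGTQVTVSSGRYPYDVPDYGSGRA"
)
parsed_seq = (
    "QLQESGGGLVQPGGSLRLSCAAPGFRLDNYVIGWFRQAPGKEREGVSCISSSAGSTYYADSVKGRFTISRDNAKNTVY"
    "LQMNSLKPEDTAVYYCATACYSSYVTYWGQGTQVTVSSGR"
)

# Prints the missing N- and C-terminal stretches, or None if not contiguous.
print(split_missing_termini(full_seq, parsed_seq))
```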

#### Additional Observations
We also noticed that the cached MSAs in `data/pdb_data/data_caches/msa/train_msas/7a4d-assembly1A_protein.a3m` appear to be computed from the latter (parsed) sequence, which excludes the unresolved residues.

#### Request for Assistance
Is there a way to include the unresolved residues in the parsed sequence? Any guidance or help with this issue would be greatly appreciated.
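For reference, one possible workaround (a sketch, not necessarily how `mmcif_parsing` should do it) is to read the full polymer sequence from the `_entity_poly_seq` category of the mmCIF file, which lists every residue of the entity whether or not it has coordinates. The function names below are hypothetical. The sketch assumes a dict shaped like the one returned by Biopython's `MMCIF2Dict`, where each column is a list of strings. It maps non-standard residues to `X` and, where one position lists several residue types, keeps only the first.

```python
from typing import Mapping, Sequence

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def full_sequence_from_mmcif_dict(
    cif: Mapping[str, Sequence[str]], entity_id: str
) -> str:
    """Build the one-letter sequence of an entity from _entity_poly_seq."""
    entity_ids = cif["_entity_poly_seq.entity_id"]
    nums = cif["_entity_poly_seq.num"]
    mon_ids = cif["_entity_poly_seq.mon_id"]

    residues = {}
    for ent, num, mon in zip(entity_ids, nums, mon_ids):
        if ent == entity_id:
            # Keep the first residue type listed for each position.
            residues.setdefault(int(num), mon)

    # Unknown or modified residues become "X" in this sketch.
    return "".join(THREE_TO_ONE.get(residues[n], "X") for n in sorted(residues))


def load_mmcif_dict(path: str):
    """Load an mmCIF file into a dict (requires Biopython)."""
    from Bio.PDB.MMCIF2Dict import MMCIF2Dict

    return MMCIF2Dict(path)
```

With a local copy of the file this could be called as `full_sequence_from_mmcif_dict(load_mmcif_dict("7a4d.cif"), "1")`, where both the file path and the entity ID `"1"` are assumptions to check against the actual entry.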

Best regards,
Shaoning
