Questions about the chain/interface clustering files #307

@zqcai19

@lucidrains @amorehead Hi, thank you very much for your work on reproducing AlphaFold3. I have downloaded the preprocessed mmCIF files and the chain/interface clustering files as described in the README, and I would like to use the clustered test set to evaluate AF3.

Based on my understanding, the json, csv, and fasta files should contain information on the chain IDs, cluster mapping, and sequences. However, I noticed inconsistencies between them and the RCSB PDB. For example, in filtered_all_chain_sequences.json:

  1. 8a14-assembly1: The file only records 2 chains, whereas RCSB shows that it has 6 chains.
  2. 8sza-assembly1: The file does not seem to include ligand information.
  3. The sequences in both cases appear to be cropped compared to the original sequences in RCSB.
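
For reference, a check like this could be sketched as follows. This is only a sketch: it assumes the JSON maps each entry ID to a mapping of chain ID to sequence, which may not match the actual file layout, and `load_chain_sequences` and `summarize_entry` are hypothetical helper names, not part of the repository.

```python
import json
from pathlib import Path


def load_chain_sequences(path):
    """Load the JSON file. The nested layout is an assumption (see above)."""
    with Path(path).open() as handle:
        return json.load(handle)


def summarize_entry(all_sequences, entry_id):
    """Return (chain count, {chain_id: sequence length}) for one entry.

    Assumes all_sequences looks like {entry_id: {chain_id: sequence}}.
    A missing entry yields (0, {}).
    """
    chains = all_sequences.get(entry_id, {})
    lengths = {chain_id: len(seq) for chain_id, seq in chains.items()}
    return len(chains), lengths


# Toy data standing in for the real file (IDs and sequences are made up).
example = {"entry-a": {"A": "MKV", "B": "GG"}}
n_chains, lengths = summarize_entry(example, "entry-a")
```

On the real file this would be called as `summarize_entry(load_chain_sequences("filtered_all_chain_sequences.json"), "8a14-assembly1")`, and the resulting chain count and sequence lengths compared by hand against the RCSB entry page.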

Other entries show similar inconsistencies. Am I missing something here? How should the chain/interface clustering files be used to evaluate AF3?

Thank you in advance for your help!
