A small, highly incomplete list of what to know about the PDB format (at least, what I found relevant but, at the same time, not much highlighted).
WARNING: The PDB format is no longer updated. The current standard is the mmCIF file format, which is quite heavy but coherent.
- If a polymer is listed with the characteristic `Protein/Oligosaccharide` (RCSB), then an amino acid (AA) is bound to another molecule, such as a carbohydrate.
- The unit cell (X-ray experiment) $\neq$ the biological unit. See the guide from RCSB.
- Missing residues are described by the `REMARK 465` record. `SEQRES` describes the full sequence: it includes the missing residues, so it can be compared with `REMARK 465`.
- AAs that have been modified, by the experimentalist, post-translationally or otherwise, are present as `HETATM` records and described in the header with the `MODRES`, `HET` and `FORMUL` records.
- Non-standard AAs are not required to be described by a `MODRES` record. Nevertheless, they appear in the `SEQRES` record.
- `REMARK 610` describes non-polymer residues with missing atoms.
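
As a rough illustration of the `SEQRES` versus `REMARK 465` comparison, here is a minimal sketch using only the Python standard library. The header excerpt is hand-written for the example (it is not taken from a real entry), the function names `read_seqres` and `read_missing_residues` are hypothetical, and the column positions assume the usual fixed-width layout of these two records.

```python
from collections import defaultdict

# Hypothetical, hand-written header excerpt (not a real PDB entry).
PDB_TEXT = """\
SEQRES   1 A    5  MET ALA GLY SER LYS
REMARK 465
REMARK 465 MISSING RESIDUES
REMARK 465   M RES C SSSEQI
REMARK 465     MET A     1
REMARK 465     LYS A     5
"""


def read_seqres(lines):
    """Return {chain_id: [residue names]} collected from SEQRES records."""
    chains = defaultdict(list)
    for line in lines:
        if line.startswith("SEQRES"):
            # Column 12 holds the chain ID; residue names start at column 20.
            chains[line[11]].extend(line[19:].split())
    return dict(chains)


def read_missing_residues(lines):
    """Return {chain_id: [(seq_number, residue name)]} from REMARK 465."""
    missing = defaultdict(list)
    for line in lines:
        if not line.startswith("REMARK 465"):
            continue
        try:
            # Columns 22-26 hold the residue sequence number.
            seq_number = int(line[21:26])
        except ValueError:
            continue  # blank, title or column-header line, not a residue
        # Columns 16-18 hold the residue name, column 20 the chain ID.
        missing[line[19]].append((seq_number, line[15:18].strip()))
    return dict(missing)


lines = PDB_TEXT.splitlines()
seqres = read_seqres(lines)
missing = read_missing_residues(lines)

for chain, residues in seqres.items():
    n_missing = len(missing.get(chain, []))
    print(f"chain {chain}: {len(residues)} residues in SEQRES, "
          f"{n_missing} listed as missing in REMARK 465")
```

Real files can contain insertion codes and multiple models in `REMARK 465`, which this sketch ignores, so a dedicated parser is the safer choice for anything beyond a quick check.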
- `Biopython`: API doc and general description
  - very nicely implemented library, but the documentation is far from complete
  - `PDB` is used to read PDB files, especially `ATOM` records
  - `SeqIO` is used to read PDB headers and sequences. It is used mostly for the latter, hence for bioinformatics applications.
- `MDAnalysis`: used with MD trajectories for IO and analysis. A complete library with a steep learning curve.
- `openmm/pdbfixer`
- `Modeller`: very complete software for a large range of applications, maybe too many, with API documentation that is not well structured.