Releases · project-gemmi/gemmi

23 Nov 14:30

wojdyr

v0.7.4

31d7fe7

0.7.4 Latest

Library

Add calculation of hydrogen bonds according to the venerable DSSP method
(implementation of the whole DSSP method was considered, attempted and postponed)
Add reading of MRC map in mode 12 (float16_t)
(using the half library that's now bundled with gemmi)
bundled third-party libs: updated fast_float
and added https://sourceforge.net/projects/half
Calculating anomalous maps (f") from a model
gemmi <-> mmdb conversion: handle SEQRES
to_mmcif: populate more items in _struct_ref_seq.
to_pdb: preserve original letter casing of modres.mod_id
mmcif: read _em_3d_reconstruction.resolution into Structure::resolution
Add a function populate_structure_from_block() (#383)
cif.hpp: keywords loop_, global_ and stop_ can be part of unquoted strings
(stop_ is a keyword, but stop_it is not)
Added Intensities::merged()
Added 6 PYR residues to the internal ResInfo table
In add_hydrogens_without_positions(): tweaked how occupancies of neighbouring
atoms are considered
Tweaked determine_cutoff_radius() in dencalc.hpp
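
MRC mode 12, mentioned above, stores each voxel as a 16-bit IEEE 754 half-precision float. The sketch below is not gemmi code and does not read an MRC header; it only illustrates, with Python's standard struct module, what decoding such values involves. The helper name decode_half_floats and the little-endian default are assumptions made for this illustration.

```python
import struct

def decode_half_floats(raw: bytes, little_endian: bool = True) -> list:
    """Decode a buffer of 16-bit IEEE 754 half-precision floats (hypothetical helper)."""
    if len(raw) % 2 != 0:
        raise ValueError("buffer length must be a multiple of 2 bytes")
    count = len(raw) // 2
    # 'e' is the struct format code for a half-precision float
    fmt = ("<" if little_endian else ">") + f"{count}e"
    return list(struct.unpack(fmt, raw))

# Round-trip three values that are exactly representable in float16.
payload = struct.pack("<3e", 1.0, 0.5, -2.0)
values = decode_half_floats(payload)
```

Values that are not exactly representable in half precision would come back rounded, which is why a map stored in mode 12 is smaller but less precise than one stored as 32-bit floats.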

Programs

blobs: can be called with a map instead of map coefficients
h: add option --d-fract=FRACT (#385) that adds atoms as H/D mix,
with the given deuterium fraction.
sf2map: add option --pow which outputs, for instance, a Patterson map
contact: add option --asus that lists asymmetric units that are in contact with 1_555,
not individual contacts.
sfcalc: Add support for writing anomalous structure factors (FCanom and PHICanom columns)
to MTZ files alongside regular structure factors.
fix: gemmi convert --add-tls didn't work for mmCIF input

Python

Added a small utility fetch.py to fetch files from the PDB:
can be used as python -m gemmi.fetch 3abc
Added read_structure_string()
Added bindings to misc C++ functions
Added FlatStructure that exposes coordinate data in a single table

05 Jul 18:42

wojdyr

v0.7.3

6b9910a

0.7.3

Library:

A breaking change in Op:

Until now parse_triplet("h,k,l") was equivalent to parse_triplet("x,y,z").
This is not how it's handled in cctbx and in, for example, Pointless.
In the Pointless documentation of the REINDEX keyword there is such a note:

Note that the real and reciprocal space operators correspond to mutually transposed matrices, eg "x-y,-y,-z" corresponds to "h,-h-k,-l".

Gemmi Op was changed to store the notation kind that was used at creation,
so now parse_triplet("h,k,l").triplet() gives "h,k,l", not "x,y,z".
New Op methods have been added: is_hkl(), as_hkl() and as_xyz()
Apart from this, in the case of hkl-Ops, parse_triplet() and triplet() silently transpose the rotation matrix. See #359 for details.
This change was extensively tested with Mtz::reindex() – against Pointless.
It could happen that it causes problems in other scenarios.
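
The relation quoted from the Pointless documentation can be checked by hand. The following is a minimal sketch in plain Python, not gemmi's implementation: it parses only the rotation part of a triplet (integer coefficients, no translations), transposes the matrix, and prints it back in hkl notation. The helper names triplet_to_matrix, matrix_to_triplet and transpose are hypothetical.

```python
import re

def triplet_to_matrix(triplet: str, symbols: str) -> list:
    """Parse e.g. 'x-y,-y,-z' into 3x3 integer rows (rotation part only)."""
    rows = []
    for part in triplet.replace(" ", "").split(","):
        row = [0, 0, 0]
        for sign, coef, sym in re.findall(r"([+-]?)(\d*)\*?([a-z])", part):
            value = int(coef) if coef else 1
            row[symbols.index(sym)] += -value if sign == "-" else value
        rows.append(row)
    return rows

def matrix_to_triplet(matrix: list, symbols: str) -> str:
    """Format 3x3 integer rows as a triplet using the given symbols."""
    parts = []
    for row in matrix:
        text = ""
        for value, sym in zip(row, symbols):
            if value == 0:
                continue
            sign = "-" if value < 0 else ("+" if text else "")
            magnitude = "" if abs(value) == 1 else str(abs(value))
            text += sign + magnitude + sym
        parts.append(text or "0")
    return ",".join(parts)

def transpose(matrix: list) -> list:
    return [list(column) for column in zip(*matrix)]

real = triplet_to_matrix("x-y,-y,-z", "xyz")
recip = matrix_to_triplet(transpose(real), "hkl")
print(recip)  # h,-h-k,-l
```

This reproduces the example from the note: the real-space operator "x-y,-y,-z" and the reciprocal-space operator "h,-h-k,-l" have mutually transposed matrices.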
DDL2: added alternate (deposition) checks
Mtz: added C++ Mtz::write_to_buffer() and Python Mtz.write_to_bytes() (used for streaming MTZ files from web servers)
and C++ size_to_write()
These new C++ functions are currently undocumented, not sure if they are useful.
Changed Mtz::write_to_string() to clear the string instead of appending to it. Probably nobody expected the latter.
Reorganized MTZ reading to avoid fseek() – so we can read a stream or gzipped file, without storing/unpacking it into memory buffer as we did before (i.e. reading huge gzipped MTZ files uses less memory)
added UnitCell::find_nearest_pbc_images()
CIF reading: added optional argument int check_level=1 (see docs for details)
Form factors: support for custom form factors (in the form of sums of 5 Gaussians) – for a Cambridge project on chemistry-dependent form factors.
to_mmcif: write _atom_site_anisotrop.pdbx_PDB_model_num if there are 2+ models
to_pdb: split option use_linkr into use_linkr and use_link_id
the built-in residue list is now partly editable (example in docs)
mmcif: workaround for a problem with reading 7pvv.cif (#369)
occupancy and B_iso_or_equiv are now optional (because of #375)
pdb: added option ignore_ter to PdbReadOptions
started working on secondary structure determination – it is unfinished and not yet usable
renamed interpolate_grid_of_aligned_model2() to interpolate_grid_around_model()
added functions interpolate_points() and interpolate_grid_flexible() (#363), needed in Pandda2

Python:

added python bindings to Ccp4Base::ccp4_header and to some functions related to gemmi::ChemComp

Program

gemmi-validate: new check for monomer files, and new option --depo to check against PDB's deposition criteria (from DDL2)

24 Mar 20:14

wojdyr

v0.7.1

b8ea482

0.7.1

Library

reading mmcif: added reading of TLS information
writing mmcif: added a few new items
using Logger also in classes Ddl and CifToMtz (Logger is now used in all library functions that output warnings or messages)
improved and documented mmCIF validation (with DDL2)
added read_ccp4_header() for reading only map header when a map is a huge file
added a few functions related to TLS (will be documented later)
documented: working with XDS files, normalization of amplitudes (F->E)
calculating merging statistics (R-merge, R-meas, R-pim, CC1/2) in various ways: Gemmi can calculate R-merge (and other R-*) in 3 different ways that are present in the literature and other programs; for CC1/2, the sigma-tau method is used
internal refactoring of file reading
misc bug fixes
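
For orientation, the merging R-factors listed above can be written down in a few lines. This is a sketch of one common unweighted definition only; it does not show which of the three variants Gemmi implements, and it does not cover CC1/2. The function name merging_r_factors and the choice to skip reflections with a single observation are assumptions of this example.

```python
import math

def merging_r_factors(groups):
    """Return (R-merge, R-meas, R-pim) for lists of repeated intensity
    measurements, one inner list per unique reflection (unweighted form)."""
    num_merge = num_meas = num_pim = denom = 0.0
    for obs in groups:
        n = len(obs)
        if n < 2:
            continue  # assumption: single observations are left out entirely
        mean = sum(obs) / n
        deviation = sum(abs(i - mean) for i in obs)
        num_merge += deviation
        num_meas += math.sqrt(n / (n - 1)) * deviation  # multiplicity-corrected
        num_pim += math.sqrt(1 / (n - 1)) * deviation   # precision-indicating
        denom += sum(obs)
    return num_merge / denom, num_meas / denom, num_pim / denom

# Two unique reflections, measured 2 and 3 times (made-up numbers).
groups = [[100.0, 110.0], [50.0, 40.0, 60.0]]
r_merge, r_meas, r_pim = merging_r_factors(groups)
```

Details such as how single observations are treated, or whether weights are applied, are the kind of choices on which definitions in the literature and in other programs differ.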

Python

functions Mtz.filtered() and XdsAscii.filtered()
a number of other additions and a few small changes/fixes
all cif-reading functions consistently read gzipped files (previously, cif.read_file() and gemmi.read_small_structure() didn't)

Program

gemmi fprime supports ranges of energies
gemmi merge – added new options, most importantly --stats to print quality metrics
gemmi convert – option --add-tls to convert "residual" B-factors (from Refmac) to full B-factors
gemmi mask – solvent masking that takes into account alternative conformers and atom occupancy (experimental)

30 Nov 20:02

wojdyr

v0.7.0

32ed576

0.7.0

C++14 (or later) is required to build the library, C++17 (or later) to build Python bindings.
Expect breaking changes, especially in Python bindings.
The lists below are not complete, but should cover most of the changes.

Library

Added unified logging of warnings/errors from various gemmi functions (class Logger)
replaced string Model::name with int Model::num
mmcif: better handling of null auth_comp_id
fixes for mmJSON
Removed deprecated functions:
- UnitCell.fractionalization_matrix and orthogonalization_matrix – use frac.mat and orth.mat
- count_hydrogen_sites() – use has_hydrogen() or count_atom_sites(gemmi.Selection('[H,D]'))
- Grid::resample_to() – use interpolate_grid()
unified API of Grid interpolation functions. They now have parameter order that can be 0 (nearest value), 1 (linear interpolation), or 3 (cubic). In C++ there are also functions such as trilinear_interpolation() to ensure no overhead.
to_pdb: write HET records
Extended selection syntax with: [metals] and [nonmetals].
Added function set_is_metal() intended for debatable metalloids
improved interoperability with MMDB (a CCP4 library)
MonLib: removed read_cif args
mtz: fixed writing BATCH records
hydrogen placement: fixes needed for new files with metals in CCP4 Monomer Library
pdb: fixed reading TLS S tensor
Structure metadata: expanded RefinementInfo
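
To illustrate what the interpolation order parameter mentioned above means, here is a minimal sketch on a periodic grid stored as nested lists. It is not gemmi's code and does not use gemmi's API: the function name interpolate and its signature are made up for this example, and order 3 (cubic) is left out.

```python
import math

def interpolate(grid, fx, fy, fz, order=1):
    """Interpolate a periodic grid[i][j][k] at fractional coordinates.
    order=0: nearest grid point, order=1: trilinear interpolation."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    x, y, z = fx * nx, fy * ny, fz * nz
    if order == 0:
        # floor(v + 0.5) avoids Python's round-half-to-even behaviour
        i, j, k = (math.floor(v + 0.5) for v in (x, y, z))
        return grid[i % nx][j % ny][k % nz]
    if order != 1:
        raise NotImplementedError("only orders 0 and 1 are sketched here")
    i, j, k = math.floor(x), math.floor(y), math.floor(z)
    dx, dy, dz = x - i, y - j, z - k
    total = 0.0
    # weighted sum over the 8 corners of the enclosing grid cell
    for di, wx in ((0, 1 - dx), (1, dx)):
        for dj, wy in ((0, 1 - dy), (1, dy)):
            for dk, wz in ((0, 1 - dz), (1, dz)):
                value = grid[(i + di) % nx][(j + dj) % ny][(k + dk) % nz]
                total += wx * wy * wz * value
    return total

# 2x2x2 grid whose value equals the index along x.
grid = [[[float(i) for _ in range(2)] for _ in range(2)] for i in range(2)]
nearest = interpolate(grid, 0.2, 0.0, 0.0, order=0)
linear = interpolate(grid, 0.2, 0.0, 0.0, order=1)
```

With this grid, the nearest-point lookup returns the value at x index 0, while the linear variant returns a value 40% of the way between the two neighbouring planes.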

Python

Python bindings migrated from pybind11 to nanobind.
- Much lower runtime overhead, faster build times, better error diagnostics.
- Built-in typing stubs.
- Only Python 3.8+.
- Sadly, no support for Buffer Protocol. It was replaced with NumPy __array__ methods.
  For NumPy, you can also use .array properties that were available also in the previous releases.
- No implicit conversions from list to ndarray, and from bytes to string (let me know where it causes problems)
- gemmi.ValueSigmaAsuData.value_array has now shape (N,2)
Added pickling support for Structure, Model, Chain, Residue, Atom, cif.Document, cif.Block.
Added function interpolate_position_array (#323).
Python extension module is now installed into site-packages/gemmi/ (this change should be invisible to the user)

Program

gemmi convert --sifts-num is now more customizable
gemmi sf2map: added option --check (see docs)
gemmi cif2mtz: add a rule to spec to convert pdbx_F_calc_with_solvent to F-model (+phase)
gemmi xds2mtz: handles merged files from XSCALE
gemmi mtz2cif and merge: recognize extension .ahkl as XDS file

06 Sep 20:56

wojdyr

v0.6.7

0da57ac

0.6.7

This is primarily a bug-fix release. New Python bindings are not included yet.

Enhancements:

New subcommand gemmi set for changing coordinates, B-factors and occupancies in coordinate files (mmCIF and PDB). Unlike other tools, it replaces numbers while leaving the rest of the file intact. An alternative to CCP4 PDBSET keywords: BFACTOR, OCCUPANCY, SHIFT, NOISE. Note that gemmi convert offers overlapping capabilities. For instance, gemmi convert --apply-symop=x+0.123,y,z shifts the coordinates similarly to gemmi set --shift='9.3 0 0' (the latter takes the shift in Angstroms).
Improved anisotropic scaling of structure factors. More work is planned in this area.
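
The equivalence between --apply-symop=x+0.123,y,z and --shift='9.3 0 0' in the gemmi set entry above depends on the unit cell: a fractional shift along x corresponds to that fraction of the cell edge a. A quick arithmetic sketch, assuming the usual PDB convention with the a axis along Cartesian X and a hypothetical cell edge of 75.6 Å (the release note does not state the cell):

```python
def fractional_to_angstrom(fraction: float, cell_length: float) -> float:
    """Length in Angstroms of a shift given as a fraction of a cell edge."""
    return fraction * cell_length

a = 75.6  # hypothetical cell edge in Angstroms, chosen to match the example
shift = fractional_to_angstrom(0.123, a)
print(f"{shift:.1f}")  # 9.3
```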

Fixes:

fixed reading of mmCIF files without _atom_site.auth_seq_id
in Topology preparation: fixed a couple of bugs, peptide links are now assumed to be CIS for ω=0±60° (previously, ω=0±30°)
fixed re-assignment of ATOM/HETATM record types (gemmi convert --assign-records)
fixed gemmi convert --sifts-num for UniProt sequence numbers >5000
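
The widened cis criterion from the Topology fix above can be expressed as a one-line test on the ω torsion angle. This is only a sketch: the function name is_cis is hypothetical, and treating the boundary as inclusive is an assumption, not something stated in the release note.

```python
def is_cis(omega_deg: float, tolerance: float = 60.0) -> bool:
    """True if the peptide torsion angle omega lies within 0 +/- tolerance degrees."""
    wrapped = (omega_deg + 180.0) % 360.0 - 180.0  # map to [-180, 180)
    return abs(wrapped) <= tolerance

# An omega of 45 degrees counts as cis with the new window, but not with the old one.
new_rule = is_cis(45.0)
old_rule = is_cis(45.0, tolerance=30.0)
```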

And various minor changes that are hard to describe concisely.

28 May 19:22

wojdyr

v0.6.6

0b23c06

0.6.6

Library:

SmallStructure: changed how the space group is read and accessed.
Relying on H-M space group names alone was not always sufficient. The new mechanism uses the list of operations and Hall symbol in preference to the H-M symbol – the order is configurable.
symmetry triplets: parse decimal fractions (small molecule files may use notation such as x+0.25 instead of x+1/4)
tabulated space groups: a few more settings: B 1 2 1, B 1 21 1, F 1 m 1, F 1 d 1, F 1 2 1
X-ray scattering coefficients: changed the default value of IT92::ignore_charge to true (i.e. charges are now ignored by default; before version 0.6.3 they were always ignored)
cif::Table: added method ensure_loop() that converts tag-value pairs into a loop; might be needed before calling append_row()
place_hydrogens(): fix for NH3-like configurations
improved gemmi->mmdb conversion
Grid: tweaked good_grid_size() to ensure that when creating a grid up to a certain d_min, all reflections up to d_min are in the grid (it matters when no oversampling is applied)
DensityCalculator: deprecated function set_grid_cell_and_spacegroup(), use grid.setup_from()
fixed TNT-compatible reciprocal space ASU calculation for non-standard settings
infer_polymer_end(): complicate the heuristic even more, to detect files that have HETATM incorrectly used for standard residues in a polymer (such files were reported, they are either a result of mutating from non-standard residues, or a buggy program)
added function assign_het_flags() to re-set ATOM/HETATM flags
Model: added functions calculate_b_iso_range() and calculate_b_aniso_range(); the first one can be used to detect if pLDDT is in the range 0-100 (like from AlphaFold) or 0-1 (like from ESMFold)
writing mmCIF: write _entity_poly_seq.hetero
added flag Entity::reflects_microhetero that shows if sequences were read from SEQRES (and don't account for point mutations) or from _entity_poly_seq; new function add_microhetero_to_sequences() changes the former to the latter
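
The pLDDT use case mentioned for calculate_b_iso_range() amounts to a simple check on the minimum and maximum of the B-factor column. A minimal sketch, assuming the file is already known to be a predicted model and that the range is available as two numbers; the function guess_plddt_scale is hypothetical and does not call gemmi:

```python
def guess_plddt_scale(b_min: float, b_max: float) -> str:
    """Guess the pLDDT scale from the range of values in the B-factor column."""
    if b_min < 0:
        return "unknown"  # negative values cannot be pLDDT
    if b_max <= 1.0:
        return "0-1"      # scale attributed to ESMFold in the note above
    if b_max <= 100.0:
        return "0-100"    # scale attributed to AlphaFold in the note above
    return "unknown"

alphafold_like = guess_plddt_scale(31.5, 98.2)
esmfold_like = guess_plddt_scale(0.31, 0.97)
```

The heuristic cannot tell pLDDT from ordinary B-factors, which also often fall between 0 and 100, so it is only meaningful for files known to come from a structure predictor.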

Program:

gemmi sfcalc: added a few more options
gemmi convert: added options --assign-records[=A|H], improved --sifts-num, adding microheterogeneities to _entity_poly_seq when converting from PDB
gemmi cifdiff: added option -t for basic comparison of values for a single tag

Other:

minimal WebAssembly port (C++ code compiled with emscripten) of Structure,
as a proof-of-concept and for reading mmCIF files in UglyMol
examples/to_rdkit.py: example of conversion of gemmi ChemComp to RDKit Mol

and a number of less important changes

17 Feb 13:40

wojdyr

v0.6.5

e471b13

0.6.5

Library:

gemmi can now be built with zlib-ng, a faster fork of zlib (good for working with large, compressed files)
experimental: binary serialization of Structure (contained objects, such as Model, Chain or UnitCell, can also be serialized separately)
finalized handling of 5-character monomer names; uses the tilde-hetnam extension (ABCDE ↔ ~DE) for PDB files
when atom names in the coordinate file match previous names (_chem_comp_atom.alt_atom_id) from the monomer library (the names in the CCD and therefore also in the ML change occasionally), print better diagnostic; added function MonLib::update_old_atom_names() to update the names in a Structure
topology: fixed handling of two bonds between the same two residues
options for handling mmCIF files with incorrect entities (modified add_entity_ids() when called with overwrite=true)
added function Intensities::prepare_merged_mtz()
a few bug fixes (for instance, in handling of negative residue numbers in the selection syntax)

Python bindings:

generating type stubs - see #293
python: cif.Loop.val() has been replaced with __getitem__/__setitem__
fixed Mtz.Batch.ints and Mtz.Batch.floats

Program

subcommand diff has been renamed to cifdiff
subcommand prep has been renamed to crd
validate: more options for checking monomer files
gemmi-grep: added option --extended-regexp
mtz2cif: added column names Iplus/Iminus (used by ccp4i2) to the default conversion spec

Note: this list is meant to show important changes only.

13 Dec 16:20

wojdyr

v0.6.4

84b5803

0.6.4

Library

completely changed build system for Python module, from setuptools to scikit-build-core
optimized electron density calculation: single-precision version is now about 2x faster and slightly less exact; some other grid-based calculations also got optimized in the process
as part of the above optimizations, some of the grid computations require that the model is in the standard orientation (conventional axis directions); in other cases (which are very rare after the remediation of non-standard coordinate frames in the PDB) call standardize_crystal_frame()
CIF output: more flexible formatting
mmCIF writing: category _entity_poly is included by default, with pdbx_strand_id and pdbx_seq_one_letter_code
minor changes in reading mmCIF coordinate files
cif: added functions Loop::add_columns(), Loop::remove_column(), Column::erase()
MRC map format: ORIGIN record is ignored (previously, if ORIGIN was non-zero, Ccp4::full_cell() returned false and some map properties were not set)
new function Grid::symmetrize_avg()
fixed bug in ReciprocalGrid::prepare_asu_data()
added function read_pir_or_fasta() for reading sequences (previously it was undocumented and more limited)
added function pdbx_one_letter_code() which returns a string like AA(MSE)H…, for _entity_poly.pdbx_seq_one_letter_code
new functions expand_one_letter() and expand_one_letter_sequence(), which take ResidueKind.AA/RNA/DNA as an argument, replace expand_protein_one_letter*()
adjusted weights in align_sequence_to_polymer()
added function assign_best_sequences()
PDB reading: added Structure::ter_status flag to indicate if TER records were: absent, present, clearly in wrong places
experimental (not documented yet) new functions: Model::get_cra(), Model::get_parent_of()
Topo::Bond stores a flag for bonds between different symmetry images
ChemComp::Atom: store _chem_comp_atom.alt_atom_id as old_id, use it in new function update_old_atom_names()
riding hydrogens: fixed wrong occupancy of added H in special, rare cases
added Vec3f – Vec3 with single-precision numbers
minor API changes: Binner::setup() doesn't return anything, changed argument types of Scaling::scale_data(), align_sequences()

Program

new tool gemmi-diff that compares categories and tags in two (mm)CIF files
gemmi-align prints vertical list with option --verbose
gemmi-residues has new options: -e, -sss, --chains
gemmi-rmsz: added option --missing to print missing atoms
gemmi-validate: more options for validating monomer files
gemmi-h: more options
gemmi-mtz: prints info about SYMM records

07 Sep 13:08

wojdyr

v0.6.3

28b5670

0.6.3

new: normalization of amplitudes using the so-called "Karle" approach, similar to that in the CCP4 program ECALC
added X-ray scattering coefficients for ions (previously, the charge of atom was ignored)
pdb: reading CONECT records, and an option to also write them
when reading pdb, if any chain has 2+ TER records, all TER records are ignored
more configuration options for writing pdb files
added functions Mtz::expand_to_p1() and Mtz::read_file_gz()
cif::Block::find_value(tag) now returns also value from the corresponding loop if that loop has only one row
changes in gemmi-validate related to validation with DDL2
gemmi-sfcalc: added option --sigma-cutoff
gemmi sf2map --mapmask: if the unit cell in the coordinate file differs from the one in the SF file, use only the latter
improved transform_to_assembly(), expand_ncs() and rename_chain()
cif2mtz: Mtz column for pdbx_DELPHWT has now label PHDELWT (#272)
fixed ensure_asu(): phase-shift (for phases and H-L coefficients) was wrong
fixed UnitCell::find_nearest_image() for non-crystals with NCS
fixed DensityCalculator::requested_grid_spacing()
changes and enhancements in add_chemcomp_to_block(), in solvent masking, in mtz2cif,
and in several other places
added python bindings to MtzToCif, cif::Ddl, PdbWriteOptions, changed how options for PDB writing are passed, more bindings for Mtz::Batch

25 May 17:22

wojdyr

v0.6.2

3f7b8d7

0.6.2

a number of fixes, mostly in topology preparation
support for extended (longer) CCD and PDB codes that are about to be introduced by the PDB
gemmi-convert: added option to rename a monomer
a few changes and additions in cif2mtz, including:
- anomalous data written as separate rows for F+ and F- is now converted as expected
- _refln.F_squared_meas is now a synonym for F_squared_meas
gemmi-grep: new option --only-tags
gemmi-validate: a couple of new checks and options
pdb and mmCIF: convert MODRES <-> _pdbx_struct_mod_residue
cif.Block: blocks with no name (just data_) used to have the name set to "#", now it's " "
