All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Most recent change on top.
- add support for equivariance testing of arbitrary Cartesian tensor outputs
- [Breaking] use entry points for `nequip.extensions` (e.g. for field registration)
- alternate neighborlist support enabled with `NEQUIP_NL` environment variable, which can be set to `ase` (default), `matscipy`, or `vesin`
- Allow `n_train` and `n_val` to be specified as percentages of datasets
- Only attempt training restart if `trainer.pth` file present (prevents unnecessary crashes due to file-not-found errors in some cases)
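A percentage-based `n_train`/`n_val` split could be sketched in the training YAML like this (the values shown are illustrative):

```yaml
# illustrative dataset split: sizes given as percentages of the
# dataset rather than as absolute frame counts
n_train: 80%
n_val: 10%
```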
- [Breaking] `NEQUIP_MATSCIPY_NL` environment variable no longer supported
- Fixed `flake8` install location in `.pre-commit-config.yaml`
- add Tensorboard as logger option
- [Breaking] Refactor overall model logic into `GraphModel` top-level module
- [Breaking] Added `model_dtype`
- `BATCH_PTR_KEY` in `AtomicDataDict`
- `AtomicInMemoryDataset.rdf()` and `examples/rdf.py`
- `type_to_chemical_symbol`
- Pair potential terms
- `nequip-evaluate --output-fields-from-original-dataset`
- Error (or warn) on unused options in YAML that likely indicate typos
- `dataset_*_absmax` statistics option
- `HDF5Dataset` (#227)
- `include_file_as_baseline_config` for simple modifications of existing configs
- `nequip-deploy --using-dataset` to support data-dependent deployment steps
- Support for Gaussian Mixture Model uncertainty quantification (https://doi.org/10.1063/5.0136574)
- `start_of_epoch_callbacks`
- `nequip.train.callbacks.loss_schedule.SimpleLossSchedule` for changing the loss coefficients at specified epochs
- `nequip-deploy build --checkpoint` and `--override` to avoid many largely duplicated YAML files
- matscipy neighborlist support enabled with `NEQUIP_MATSCIPY_NL` environment variable
- Always require explicit `seed`
- [Breaking] Set `dataset_seed` to `seed` if it is not explicitly provided
- Don't log as often by default
- [Breaking] Default nonlinearities are `silu` (e) and `tanh` (o)
- Will not reproduce previous versions' data shuffling order (for all practical purposes this does not matter, the `shuffle` option is unchanged)
- [Breaking] `default_dtype` defaults to `float64` (`model_dtype` default `float32`, `allow_tf32: true` by default; see https://arxiv.org/abs/2304.10061)
- `nequip-benchmark` now only uses `--n-data` frames to build the model
- [Breaking] By default models now use `StressForceOutput`, not `ForceOutput`
- Added `edge_energy` to `ALL_ENERGY_KEYS`, subjecting it to global rescale
- Work with `wandb>=0.13.8`
- Better error for standard deviation with too few data
- `load_model_state` GPU -> CPU
- No negative volumes in rare cases
- [Breaking] `fixed_fields` machinery (`npz_fixed_field_keys` is still supported, but through a more straightforward implementation)
- Default run name/WandB project name of `NequIP`; they must now always be provided explicitly
- [Breaking] Removed `_params` as an allowable subconfiguration suffix (i.e. instead of `optimizer_params` now only `optimizer_kwargs` is valid, not both)
- [Breaking] Removed `per_species_rescale_arguments_in_dataset_units`
- sklearn dependency removed
- `nequip-benchmark` and `nequip-train` report number of weights and number of trainable weights
- `nequip-benchmark --no-compile` and `--verbose` and `--memory-summary`
- `nequip-benchmark --pdb` for debugging model (builder) errors
- More information in `nequip-deploy info`
- GPU OOM offloading mode
- Minimum e3nn is now 0.4.4
--equivariance-testnow prints much more information, especially when there is a failure
- Git utilities when installed as ZIPed `.egg` (#264)
- BETA! Support for stress in training and inference
- `EMTTestDataset` for quick synthetic fake PBC data
- multiprocessing for ASE dataset loading/processing
- `nequip-benchmark` times dataset loading, model creation, and compilation
- `validation_batch_size`
- support multiple metrics on same field with different functionals
- allow custom metrics names
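For example, a metrics section along these lines could request two different functionals on the same field (the exact layout here is a sketch and should be checked against the current documentation):

```yaml
# two metrics on the same field, `forces`, with different functionals (sketch)
metrics_components:
  - - forces
    - mae
  - - forces
    - rmse
```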
- allow `e3nn==0.5.0`
- `--verbose` option to `nequip-deploy`
- print data statistics in `nequip-benchmark`
- `normalized_sum` reduction in `AtomwiseReduce`
- abbreviate `node_features` -> `h` in loss titles
- failure of permutation equivariance tests no longer short-circuits o3 equivariance tests
- `NequIPCalculator` now stores all relevant properties computed by the model regardless of requested `properties`, and does not try to access those not computed by the model, allowing models that only compute energy or forces but not both
- Equivariance testing correctly handles output cells
- Equivariance testing correctly handles one-node or one-edge data
- `report_init_validation` now runs on validation set instead of training set
- crash when unable to find `os.sched_getaffinity` on some systems
- don't incorrectly log per-species scales/shifts when loading model (such as for deployment)
- `nequip-benchmark` now picks data frames deterministically
- useful error message for `metrics_key: training_*` with `report_init_validation: True` (#213)
- `NequIPCalculator` now handles per-atom energies
- Added `initial_model_state_strict` YAML option
- `load_model_state` builder
- fusion strategy support
- `cumulative_wall` for early stopping
- Deploy model from YAML file directly
- Disallow PyTorch 1.9, which has some JIT bugs.
- `nequip-deploy build` now requires `--train-dir` option when specifying the training session
- Minimum Python version is now 3.7
- Better error in `Dataset.statistics` when field is missing
- `NequIPCalculator` now outputs energy as scalar rather than `(1, 1)` array
- `dataset: ase` now automatically adds `key_mapping` keys to `include_keys`, which is consistent with the npz dataset
- fixed reloading models with `per_species_rescale_scales`/`shifts` set to `null`/`None`
- graceful exit for `-n 0` in `nequip-benchmark`
- Strictly correct CSV headers for metrics (#198)
- `nequip-evaluate --repeat` option
- Report number of weights to wandb
- defaults and comments in `example.yaml` and `full.yaml`, in particular longer default training and correct comment for E:F-weighting
- better metrics config in `example.yaml` and `full.yaml`, in particular will use total F-MAE/F-RMSE instead of mean over per-species
- default value for `report_init_validation` is now `True`
- `all_*_*` metrics renamed to `psavg_*_*`
- `avg_num_neighbors` default `None` -> `auto`
- error if both per-species and global shift are used together
- Model builders may now process only the configuration
- Allow irreps to optionally be specified through the simplified keys `l_max`, `parity`, and `num_features`
- `wandb.watch` via `wandb_watch` option
- Allow polynomial cutoff `p` values besides 6.0
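As a sketch, the simplified keys might be used in place of explicit irreps strings (the values shown are illustrative):

```yaml
# simplified irreps specification instead of explicit irreps strings
# (illustrative values)
l_max: 2
parity: true
num_features: 32
```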
- `nequip-evaluate` now sets a default `r_max` taken from the model for the dataset config
- Support multiple rescale layers in trainer
- `AtomicData.to_ase` supports arbitrary fields
- `nequip-evaluate` can now output arbitrary fields to an XYZ file
- `nequip-evaluate` reports which frame in the original dataset was used as input for each output frame
- `minimal.yaml`, `minimal_eng.yaml`, and `example.yaml` now use the simplified irreps options `l_max`, `parity`, and `num_features`
- Default value for `resnet` is now `False`
- Handle one of `per_species_shifts`/`scales` being `null` when the other is a dataset statistic
- `include_frames` now works with ASE datasets
- no training data labels in input_data
- Average number of neighbors no longer crashes sometimes when not all nodes have neighbors (small cutoffs)
- Handle field registrations correctly in `nequip-evaluate`
- `compile_model`
- `NequIPCalculator` can now be built via a `nequip_calculator()` function. This adds a minimal compatibility with vibes
- Added `avg_num_neighbors: auto` option
- Asynchronous IO: during training, models are written asynchronously. Enable this with environment variable `NEQUIP_ASYNC_IO=true`.
- `dataset_seed` to separately control randomness used to select training data (and their order).
- The types may now be specified with a simpler `chemical_symbols` option
- Equivariance testing reports per-field errors
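A minimal sketch of the `chemical_symbols` option (the element list is illustrative):

```yaml
# derive atom types directly from chemical species
# (illustrative elements)
chemical_symbols:
  - H
  - C
  - O
```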
- `--equivariance-test n` tests equivariance on `n` frames from the training dataset
- All fields now have consistent [N, dim] shaping
- Changed default `seed` and `dataset_seed` in example YAMLs
- Equivariance testing can only use training frames now
- Equivariance testing no longer unintentionally skips translation
- Correct cat dim for all registered per-graph fields
- `PerSpeciesScaleShift` now correctly outputs when scales, but not shifts, are enabled; previously it was broken and would only output updated values when both were enabled.
- `nequip-evaluate` outputs correct species to the `extxyz` file when a chemical symbol <-> type mapping exists for the test dataset
- Allow e3nn 0.4.*, which changes the default normalization of `TensorProduct`s; this change should not affect typical NequIP networks
- Deployed models are now frozen on load, rather than compile
- `load_deployed_model` respects global JIT settings
- Support for `e3nn`'s `soft_one_hot_linspace` as radial bases
- Support for parallel dataloader workers with `dataloader_num_workers`
dataloader_num_workers - Optionally independently configure validation and training datasets
- Save dataset parameters along with processed data
- Gradient clipping
- Arbitrary atom type support
- Unified, modular model building and initialization architecture
- Added `nequip-benchmark` script for benchmarking and profiling models
- Add `before` option to `SequentialGraphNetwork.insert`
- Normalize total energy loss by the number of atoms via `PerAtomLoss`
- Model builder to initialize training from previous checkpoint
- Better error when instantiation fails
- Rename `npz_keys` to `include_keys`
- Allow user to register `graph_fields`, `node_fields`, and `edge_fields` via yaml
- Deployed models save the e3nn and torch versions they were created with
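Field registration via YAML might look like the following sketch (the field names below are hypothetical):

```yaml
# register custom fields so they are batched and processed correctly
# (field names here are hypothetical)
graph_fields:
  - total_dipole
node_fields:
  - partial_charge
```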
- Update `example.yaml` to use wandb by default, to only use 100 epochs of training, to set a very large batch logging frequency, and to change `Validation_loss` to `validation_loss`
- Name processed datasets based on a hash of their parameters to ensure only valid cached data is used
- Do not use TensorFloat32 by default on Ampere GPUs until we understand it better
- No atomic numbers in networks
- `dataset_energy_std`/`dataset_energy_mean` to `dataset_total_energy_*`
- `nequip.dynamics` -> `nequip.ase`
- update `example.yaml` and `full.yaml` with better defaults, new loss function, and switched to toluene-ccsd(t) as example data
- `use_sc` defaults to `True`
- `register_fields` is now in `nequip.data`
- Default total energy scaling is changed from global mode to per species mode.
- Renamed `trainable_global_rescale_scale` to `global_rescale_scale_trainable`
- Renamed `trainable_global_rescale_shift` to `global_rescale_shift_trainable`
- Renamed `PerSpeciesScaleShift_` to `per_species_rescale`
- Change default and allowed values of `metrics_key` from `loss` to `validation_loss`. The old default `loss` will no longer be accepted.
- Renamed `per_species_rescale_trainable` to `per_species_rescale_scales_trainable` and `per_species_rescale_shifts_trainable`
- The first 20 epochs/calls of inference are no longer painfully slow for recompilation
- Set global options like TF32, dtype in `nequip-evaluate`
- Avoid possible race condition in caching of processed datasets across multiple training runs
- Removed `allowed_species`
- Removed `--update-config`; start a new training and load old state instead
- Removed dependency on `pytorch_geometric`
- `nequip-train` no longer prints the full config, which can be found in the training dir as `config.yaml`.
- `nequip.datasets.AspirinDataset` & `nequip.datasets.WaterDataset`
- Dependency on `pytorch_scatter`
- `to_ase` method in `AtomicData.py` to convert `AtomicData` object to (list of) `ase.Atoms` object(s)
- `SequentialGraphNetwork` now has insertion methods
- `nn.SaveForOutput`
- `nequip-evaluate` command for evaluating (metrics on) trained models
- `AtomicData.from_ase` now catches `energy`/`energies` arrays
- Nonlinearities now specified with `e` and `o` instead of `1` and `-1`
- Update interfaces for `torch_geometric` 1.7.1 and `e3nn` 0.3.3
- `nonlinearity_scalars` now also affects the nonlinearity used in the radial net of `InteractionBlock`
- Cleaned up naming of initializers
- Fix specifying nonlinearities when wandb enabled
- `Final` backport for <3.8 compatibility
- Fixed `nequip-*` commands when using `pip install`
- Default models rescale per-atom energies, and not just total
- Fixed Python <3.8 backward compatibility with `atomic_save`
- Option for which nonlinearities to use
- Option to save models every n epochs in training
- Option to specify optimization defaults for `e3nn`
- Using `wandb` no longer breaks the inclusion of special objects like callables in configs
- `iepoch` is no longer off-by-one when restarting a training run that hit `max_epochs`
- Builders, and not just sub-builders, use the class name as a default prefix
- `early_stopping_xxx` arguments added to enable early stopping for plateaued values or values that are out of lower/upper bounds
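An early-stopping configuration following the `early_stopping_xxx` pattern might look like this sketch (exact keys and values should be checked against the documentation):

```yaml
# stop when validation loss has plateaued for 50 epochs,
# or when the learning rate falls below a lower bound (sketch)
early_stopping_patiences:
  validation_loss: 50
early_stopping_lower_bounds:
  LR: 1.0e-5
```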
- Sub-builders can be skipped in `instantiate` by setting them to `None`
- More flexible model initialization
- Add MD w/ Nequip-ASE-calculator + run-MD script w/ custom Nose-Hoover
- PBC must be explicit if a cell is provided
- Training now uses atomic file writes to avoid corruption if interrupted
- `feature_embedding` renamed to `chemical_embedding` in default models
- `BesselBasis` now works on GPU when `trainable=False`
- Dataset `extra_fixed_fields` are now added even if `get_data()` returns `AtomicData` objects
- `load_deployed_model` now correctly loads all metadata