Support for MD calculation within PwBaseWorkChain #1200
tsthakur wants to merge 18 commits into aiidateam:main
- Added new MD functions module with initialisation file
- Implemented `concatenate_trajectory` calcfunction for joining multiple trajectory data generated from sequential MD runs
- Implemented `get_structure_from_trajectory` calcfunction to continue MD simulations
  - The function extracts atomic positions and velocities from a given trajectory
  - This supports restarting MD simulations from a given trajectory object instead of from the wavefunction through Quantum ESPRESSO
  - The positions are simply stored as a new StructureData node, while the velocities are stored in the settings Dict node, which is what aiida_quantumespresso expects for MD restarts
- Implemented `get_last_step_from_trajectory` calcfunction for MD workflows
  - Extracts the final step from a thermalized trajectory as a StructureData object for the production phase
  - This is an optional way to separate equilibration from production in MD simulations for better provenance tracking
- Added MD-specific parsing logic using the "Entering Dynamics:" keyword
- Implemented velocity calculation from positions using the velocity-Verlet algorithm used by Quantum ESPRESSO
- Supports both force-based and simple velocity calculations
- Now handles incomplete stdout for MD calculations, allowing structure extraction from a trajectory in cases where restarting from the wavefunction is not possible
- Added MD parameters extraction (dt, iprint) to trajectory metadata
- Added proper unit conversions for velocities and simulation time
- Now marks problematic trajectories with the 'never_concatenate' attribute to avoid issues when combining multiple MD runs
- Backward compatibility is still preserved for non-MD calculations
- Added suggestions for future improvements in comments to preserve compatibility with non-MD use-cases
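For readers unfamiliar with how velocities can be recovered from positions, here is a minimal sketch of the two variants mentioned above (simple and force-based). It is not the parser code from this PR: the function name `velocities_from_positions`, the array shapes, and the assumption that all inputs are in mutually consistent units are hypothetical. The force-based branch uses the velocity-Verlet identity v(t+dt) = (r(t+dt) - r(t)) / dt + a(t+dt) * dt / 2.

```python
import numpy as np


def velocities_from_positions(positions, dt, forces=None, masses=None):
    """Estimate velocities at steps 1..N-1 from positions of shape (N, n_atoms, 3).

    Hypothetical helper for illustration only; units are assumed consistent.
    """
    positions = np.asarray(positions, dtype=float)
    # Simple estimate: backward finite difference, (r(t+dt) - r(t)) / dt.
    velocities = np.diff(positions, axis=0) / dt
    if forces is not None:
        if masses is None:
            raise ValueError('masses are required for the force-based estimate')
        forces = np.asarray(forces, dtype=float)    # shape (N, n_atoms, 3)
        masses = np.asarray(masses, dtype=float)    # shape (n_atoms,)
        accelerations = forces / masses[None, :, None]
        # Velocity-Verlet correction using the acceleration at the later step.
        velocities = velocities + 0.5 * dt * accelerations[1:]
    return velocities
```

For a trajectory with constant acceleration the force-based estimate is exact, which makes it a convenient sanity check.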
Added complete MD workflow capabilities with MD-specific restart criteria and error management.

Major features:
- **MD trajectory restart**: Support for continuing MD simulations from previous trajectories using the `previous_trajectory` input parameter
- **Energy fluctuation monitoring**: Add `total_energy_max_fluctuation` input to monitor and abort simulations with excessive energy drift
- **MD step management**: Automatic tracking of completed vs remaining MD steps across restart cycles with `mdsteps_done` and `mdsteps_todo` counters
- **Trajectory-based restart**: Extract positions and velocities from previous trajectory using the `get_structure_from_trajectory` calcfunction
- **MD-specific validation**: Add `validate_md_parameters()` method to handle MD-specific input validation and setup

Workflow enhancements:
- **Restart handling**: Enhanced `should_run_process()` to consider remaining MD steps and prevent premature termination
- **Structure extraction**: Integrate trajectory processing to restart from final positions/velocities of previous calculations
- **Settings propagation**: Proper handling of velocity settings for MD restarts
- **Step counting**: Track MD progress across multiple calculation restarts

Error handling improvements:
- **Walltime recovery**: Enhanced `handle_out_of_walltime()` for MD calculations with proper trajectory-aware restart logic
- **Energy validation**: New `check_energy_fluctuations()` method to monitor total energy stability during MD runs
- **Progress tracking**: `update_mdsteps()` method to manage MD step counters and determine when to continue or finish the workflow. Note that when restarting from wavefunctions using Quantum ESPRESSO's native `restart_mode`, the steps keyword is ignored by QE, so this progress tracking is important when starting a calculation from a previously terminated MD run.
Builder enhancements:
- Add `total_energy_max_fluctuation` and `previous_trajectory` parameters to the `get_builder_from_protocol()` method for MD workflow initialization
- Support for MD continuation workflows with proper input handling

This implementation enables interruptible MD simulations with automatic restart capabilities beyond what vanilla QE provides, together with energy monitoring and proper trajectory management for production molecular dynamics workflows.
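To make the energy monitoring idea concrete, here is a rough sketch of a fluctuation check. The exact criterion used by `check_energy_fluctuations()` is not spelled out in this commit message, so this assumes the simplest one: the largest absolute deviation of the total energy from its first-step value is compared against `total_energy_max_fluctuation`. The function name `exceeds_energy_fluctuation` is hypothetical.

```python
def exceeds_energy_fluctuation(total_energies, max_fluctuation):
    """Return True if the total energy strays too far from its initial value.

    Assumed criterion (not taken from the PR): max |E(t) - E(0)| > max_fluctuation,
    with both quantities in the same energy unit.
    """
    if len(total_energies) == 0:
        return False  # nothing to check yet
    reference = total_energies[0]
    largest_deviation = max(abs(energy - reference) for energy in total_energies)
    return largest_deviation > max_fluctuation
```

A work chain handler could call such a check after each calculation and abort instead of restarting when it returns True.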
- Ensures non-MD calculations (scf, relax, bands, etc.) use the results from the BaseRestartWorkChain class
- Maintains MD-specific trajectory concatenation and reporting for MD calculations
- Adds clearer reporting messages distinguishing MD vs non-MD workflow completion
- Preserves all existing functionality while fixing output attachment for standard calculations
Minor enhancements to MD workflow capabilities:

**Thermalised trajectory support:**
- Add `thermalised_trajectory` input parameter for equilibration trajectories
- Enable starting production MD runs from pre-equilibrated configurations
- Support separate thermalisation phases that don't affect final trajectory
- Integrate with `get_last_step_from_trajectory` for equilibration endpoints

**Workflow improvements:**
- Add missing imports: WorkflowFactory, get_last_step_from_trajectory, md_utils
- Enhanced trajectory validation with proper structure/formula checking
- Improved preparation logic for both previous and thermalised trajectories
- Better reporting for trajectory source identification

**Input validation and setup:**
- Extended `get_builder_from_protocol()` with thermalised_trajectory parameter
- Enhanced `validate_md_parameters()` for thermalisation trajectory handling
- Added recommendations for thermalisation when no trajectory provided
- Improved trajectory compatibility checking across workflow restarts

**Process preparation enhancements:**
- Dual trajectory handling in `prepare_process()` method
- Priority system: previous_trajectory > thermalised_trajectory > from scratch
- Proper velocity extraction and settings propagation for both trajectory types
- Clear reporting of trajectory source being used for MD initialization

This enables MD workflows with separate equilibration and production phases, which is particularly useful when one wants to first run a canonical MD to thermalise, and then run multiple microcanonical MDs starting from the thermalised configuration.
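The priority system described for `prepare_process()` can be illustrated with a small stand-alone sketch. This is not the work chain code: the function name `select_md_starting_point` and the returned labels are hypothetical, and plain Python objects stand in for the trajectory nodes.

```python
def select_md_starting_point(previous_trajectory=None, thermalised_trajectory=None):
    """Pick the MD starting point: previous_trajectory > thermalised_trajectory > from scratch.

    Returns a (label, trajectory) pair so the caller can report which source is used.
    """
    if previous_trajectory is not None:
        return 'previous_trajectory', previous_trajectory
    if thermalised_trajectory is not None:
        return 'thermalised_trajectory', thermalised_trajectory
    return 'from_scratch', None
```

Returning the label alongside the trajectory makes it easy to report which source was used for MD initialization.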
Bug: Parser was attempting to access 'md_parameters' key from parsed_trajectory
dictionary without checking if it exists, causing KeyError exceptions for all
non-MD calculations (SCF, relax, bands, etc.).
Additionally, PwBaseWorkChain was unconditionally adding MD-specific parameters
to all workflow protocols, including non-MD workflows like PwBandsWorkChain.
Fix:
- Use .pop('md_parameters', False) instead of .pop('md_parameters') in parser
- Add proper validation before accessing md_parameters['dt']
- Only add MD parameters to protocols when calculation type is 'md'/'vc-md'
- Provide sensible defaults when md_parameters is unavailable
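To illustrate the parser fix, the snippet below shows the difference between the two `dict.pop` calls on a dictionary without the `md_parameters` key. The dictionary contents and the fallback value `None` for the time step are made up for the example; they are not the defaults used in the parser.

```python
# A parsed trajectory from a non-MD run (e.g. SCF) has no 'md_parameters' key.
parsed_trajectory = {'positions': [[0.0, 0.0, 0.0]]}

try:
    parsed_trajectory.pop('md_parameters')  # old behaviour: raises KeyError
    raised_key_error = False
except KeyError:
    raised_key_error = True

# New behaviour: a default is returned when the key is missing.
md_parameters = parsed_trajectory.pop('md_parameters', False)

# Validate before accessing md_parameters['dt']; None is a placeholder default.
timestep = md_parameters['dt'] if md_parameters and 'dt' in md_parameters else None
```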
Also updated the test to reflect this change. I recommend using the gamma point only for MD simulations, but I believe it should be left to the user to decide, so here I use a very coarse k-point mesh, as I assume the user will use a reasonably large supercell. Besides this, the only important change is setting up a good thermostat; in this case it is stochastic velocity rescaling with a good time constant.
Forgot to do this before previous commits, so doing all of it now.
Thanks @tsthakur! Great that you make the effort to contribute. 🙏 There's quite a few changes, so will need to block some time to have a closer look. Probably will only be after we do a stable v5.0 release, which is my priority at the moment. Would that be ok? Two questions for now:
Thank you @mbercx! I understand the priorities. Although, can I tempt you into incorporating the MD workflow for the v5.0 release ^_^ I attach a Jupyter notebook here to run MD on a silicon structure.
I was indeed using an older version of QE. But this PR is separate from that workflow. This PR incorporates changes for the latest vanilla QE, and also works for SIRIUS-enabled QE. I was using these changes to run FPMD with QE v7.0 and for running FPMD on LUMI. After testing again today I noticed a couple of typos, so I will fix them in a new commit today.
- Fixed the typos in protocol and trajectory keys
- Improved the logic of concatenating trajectory structures to avoid duplication of the last step when starting a new MD from a previous one
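The duplication issue arises because a restarted run begins from the last frame of the previous one. A minimal sketch of one way to avoid it is shown below, assuming position arrays of shape (n_steps, n_atoms, 3); the function name `concatenate_positions` and the tolerance-based comparison are assumptions for illustration and not necessarily what `concatenate_trajectory` does.

```python
import numpy as np


def concatenate_positions(first, second, atol=1e-8):
    """Join two position arrays, dropping the first frame of `second`
    if it repeats the last frame of `first`.
    """
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    if len(first) and len(second) and np.allclose(first[-1], second[0], atol=atol):
        second = second[1:]  # skip the duplicated restart frame
    return np.concatenate([first, second], axis=0)
```

The same slicing would have to be applied consistently to every per-step array (steps, cells, velocities, and so on) so that they stay aligned.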
Previously the check for an MD calculation was in the `parse_stdout` function, so MD calculations wouldn't be correctly caught when parsing through XML. Now the check is just before the `build_output_trajectory` function is called, so that the MD parameters are always correctly added to the trajectory data whenever relevant. I added an additional argument to this function instead of storing the relevant details, i.e. `md_parameters`, in the trajectory data dictionary, to avoid confusion and make the code cleaner.
XML and stdout parsers generate arrays (steps, cells, positions) under different keywords. The initial `position` values are set from the keyword `atomic_positions_relax` or `atomic_fractionals_relax`, which are only available from the stdout file and not from XML. This is not normally a problem, as the XML values take precedence over the stdout values and the end results are correct. But it leads to a situation where the initial "wrong" positions cause issues when calculating velocities in MD simulations. For MD calculations, explicitly pop the correct arrays (steps, cells, positions) from parsed_trajectory early in the process, before they can be overwritten by inconsistent XML data. Future work should investigate the reason behind this roundabout way of generating TrajectoryData.
Previously, the function correctly computed the updated number of MD steps still to do, but the single line needed to actually set that updated value was missing.
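As an illustration of the kind of bug described here, the sketch below updates the step counters and then writes the remaining steps back into the inputs of the next run. It is a stand-alone approximation, not the `update_mdsteps()` method itself: plain dictionaries stand in for the work chain context and the `pw.x` parameters, and writing the value to `CONTROL.nstep` is an assumption about where the updated number ends up.

```python
def update_mdsteps(state, parameters, steps_completed):
    """Update MD step counters and propagate the remaining steps to the next run.

    `state` holds 'mdsteps_done' and 'mdsteps_todo'; `parameters` mimics the
    pw.x input namelists. Returns True if more MD steps remain.
    """
    state['mdsteps_done'] += steps_completed
    state['mdsteps_todo'] = max(state['mdsteps_todo'] - steps_completed, 0)
    # The easily forgotten line: actually set the updated value for the next run.
    parameters.setdefault('CONTROL', {})['nstep'] = state['mdsteps_todo']
    return state['mdsteps_todo'] > 0
```

Without the assignment to `nstep`, the counters would be correct but the next calculation would still request the original number of steps.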
Thanks for the quick response and notebook! :)
Hehe, I'm not sure. I'm contemplating removing more things from the v5.0 milestone since I don't have so much time at the moment and want to avoid maintaining two branches for too long. I've also blocked requests from others. That said, once v5.0 is out, there is nothing stopping us from merging this and quickly doing a v5.1 release. Minor releases are easy, so I would rather just make another release than block a release because we want to add more features to it, if that makes sense? Especially since I don't want to rush the review of this PR, we're adding quite a bit here.
I see, yes, in that case it is better to aim for the 5.1 release.
Yes, that makes sense. Make the major release polished and essential, then add new features in the minor release, let the community test it, iron out the kinks, and then rinse and repeat. Do you have any rough estimate of when v5.0 would be released, like in the coming weeks or after New Year?
My original timeline was end of next week. I'm mainly waiting for review for the PRs that add deprecation to the So I expect to come back to this PR, and some other open ones, in ~2 weeks, if that's ok? Have some tasks here in Aus that require my focus as well.
Thanks for sharing the planned timeline.
Hehe, thanks for the subtle "merge ping", @tsthakur. As could be expected, there have been some delays, and I haven't been able to spend much time on
Haha, no worries! I understand that delays happen :)
Hello @mbercx, do you have any ETA on the v5.0 release :) I am assuming you are sticking with the original plan of removing things from v5.0 and then doing a minor release with the MD workflow.
Overview
This PR introduces an overhaul of the `PwBaseWorkChain` to incorporate support for first-principles molecular dynamics (MD) simulations using Quantum ESPRESSO. Previously, there was no dedicated logic for running MD calculations with the `PwBaseWorkChain`, which required significant enhancements to handle the unique requirements of MD workflows.
What's New
Complete MD Workflow Support
- `validate_md_parameters()` method to handle MD-specific inputs and trajectory management
New MD-specific Functions
- `concatenate_trajectory()`: Seamlessly combine multiple MD trajectories
- `get_last_step_from_trajectory()`: Extract final configuration for restarts
- `get_structure_from_trajectory()`: Convert trajectory frames to AiiDA structures
Enhanced Error Handling
Key Features
Technical Details
MD-Specific Inputs