Support for MD calculation within PwBaseWorkChain by tsthakur · Pull Request #1200 · aiidateam/aiida-quantumespresso

tsthakur · 2025-10-16T23:00:03Z

Overview

This PR introduces an overhaul of the PwBaseWorkChain to incorporate support for first-principles molecular dynamics (MD) simulations using Quantum ESPRESSO. Previously, there was no dedicated logic for running MD calculations with the PwBaseWorkChain, which required significant enhancements to handle the unique requirements of MD workflows.

What's New

Complete MD Workflow Support

MD-specific validation: New validate_md_parameters() method to handle MD-specific inputs and trajectory management
Trajectory concatenation: Automatic concatenation of multiple MD runs with proper trajectory handling
Energy monitoring: Built-in energy fluctuation checks to detect runaway simulations
Restart capabilities: Support for restarting from previous MD trajectories with position and velocity continuity
Thermalisation support: Option to start MD runs from pre-equilibrated (thermalised) trajectories
New protocol: New protocol with suggested values for running a first-principles MD

New MD-specific Functions

concatenate_trajectory(): Seamlessly combine multiple MD trajectories
get_last_step_from_trajectory(): Extract final configuration for restarts
get_structure_from_trajectory(): Convert trajectory frames to AiiDA structures
Enhanced parser support for MD output with velocity calculations using velocity-Verlet algorithm

Enhanced Error Handling

MD-specific error handlers for common MD failures
Proper handling of interrupted MD runs with trajectory recovery
Energy fluctuation monitoring with configurable thresholds
Graceful handling of incomplete trajectories

Key Features

Previous trajectory support: Seamlessly continue MD runs from previous calculations
Thermalised trajectory input: Start production runs from equilibrated configurations
Energy fluctuation monitoring: Automatic detection of unstable simulations
Step counting and progress tracking: Accurate tracking of completed MD steps
Velocity calculation: Automatic velocity computation using QE's velocity-Verlet algorithm
Trajectory metadata: Rich metadata including simulation time, units, and validation flags
- FPMD protocol: A new extensively tested protocol that has been used in a large scale screening

Technical Details

MD-Specific Inputs

# New workflow inputs for MD calculations
total_energy_max_fluctuation: Float  # Energy stability threshold (eV)
previous_trajectory: TrajectoryData   # Continue from previous MD run
thermalised_trajectory: TrajectoryData # Start from an equilibrated state

- Added new MD functions module with initialisation file - Implemented concatenate_trajectory calcfunction for joining multiple trajectory data that are generated from sequential MD runs

- Implemented `get_structure_from_trajectory` calcfunction to continue MD simulations - The function extracts atomic positions and velocities from a given trajectory - This is to support restarting MD simulations from a given trajectory object instead of wavefunction through Quantum ESPRESSO - The positions are simply stored as a new StructureData node, while the velocities are stored in the settings Dict node which is what aiida_quantumespresso expects for MD restarts

- Implemented `get_last_step_from_trajectory` calcfunction for MD workflows - Extracts the final step from a thermalized trajectory for production phase as a StructureData object - This is an optional way to run MD simulation to separate equilibration from production in MD simulations for better provenance tracking

- Added MD-specific parsing logic using "Entering Dynamics:" keyword - Implemented velocity calculation from positions using velocity-Verlet algorithm used by Quantum ESPRESSO - Supports both force-based and simple velocity calculations - Now handles incomplete stdout for MD calculations allowing structure extraction from a trajectory in cases where restarting from wavefunction is not possible - Added MD parameters extraction (dt, iprint) to trajectory metadata - Added proper unit conversions for velocities and simulation time - Now marks problematic trajectories with 'never_concatenate' attribute to avoid issues when combining multiple MD runs - Backward compatibility is still preserved for non-MD calculations - Added suggestions for future improvements in comments to preserve compatibility with non-MD use-cases

Added complete MD workflow capabilities with MD specific restart criteria and error management. Major features: - **MD trajectory restart**: Support for continuing MD simulations from previous trajectories using `previous_trajectory` input parameter - **Energy fluctuation monitoring**: Add `total_energy_max_fluctuation` input to monitor and abort simulations with excessive energy drift - **MD step management**: Automatic tracking of completed vs remaining MD steps across restart cycles with `mdsteps_done` and `mdsteps_todo` counters - **Trajectory-based restart**: Extract positions and velocities from previous trajectory using `get_structure_from_trajectory` calcfunction - **MD-specific validation**: Add `validate_md_parameters()` method to handle MD-specific input validation and setup Workflow enhancements: - **Restart handling**: Enhanced `should_run_process()` to consider remaining MD steps and prevent premature termination - **Structure extraction**: Integrate trajectory processing to restart from final positions/velocities of previous calculations - **Settings propagation**: Proper handling of velocity settings for MD restarts - **Step counting**: Track MD progress across multiple calculation restarts Error handling improvements: - **Walltime recovery**: Enhanced `handle_out_of_walltime()` for MD calculations with proper trajectory-aware restart logic - **Energy validation**: New `check_energy_fluctuations()` method to monitor total energy stability during MD runs - **Progress tracking**: `update_mdsteps()` method to manage MD step counters and determine when to continue or finish workflow. Note that if restarting from wavefunctions, usig Quantum ESPRESSO's native `restart_mode` the steps keyword is ignored by QE, so this progress tracking is important when starting a calculation from a previously terminated MD run. Builder enhancements: - Add `total_energy_max_fluctuation` and `previous_trajectory` parameters to `get_builder_from_protocol()` method for MD workflow initialization - Support for MD continuation workflows with proper input handling This implementation enables interruptible MD simulations with automatic restart capabilities, beyond what vanilla QE provides with energy monitoring, and proper trajectory management for production molecular dynamics workflows.

- Ensures non-MD calculations (scf, relax, bands, etc.) use the results from BaseRestartWorkChain class - Maintains MD-specific trajectory concatenation and reporting for MD calculations - Adds clearer reporting messages distinguishing MD vs non-MD workflow completion - Preserves all existing functionality while fixing output attachment for standard calculations

Minor enhancements to MD workflow capabilities: **Thermalised trajectory support:** - Add `thermalised_trajectory` input parameter for equilibration trajectories - Enable starting production MD runs from pre-equilibrated configurations - Support separate thermalisation phases that don't affect final trajectory - Integrate with `get_last_step_from_trajectory` for equilibration endpoints **Workflow improvements:** - Add missing imports: WorkflowFactory, get_last_step_from_trajectory, md_utils - Enhanced trajectory validation with proper structure/formula checking - Improved preparation logic for both previous and thermalised trajectories - Better reporting for trajectory source identification **Input validation and setup:** - Extended `get_builder_from_protocol()` with thermalised_trajectory parameter - Enhanced `validate_md_parameters()` for thermalisation trajectory handling - Added recommendations for thermalisation when no trajectory provided - Improved trajectory compatibility checking across workflow restarts **Process preparation enhancements:** - Dual trajectory handling in `prepare_process()` method - Priority system: previous_trajectory > thermalised_trajectory > from scratch - Proper velocity extraction and settings propagation for both trajectory types - Clear reporting of trajectory source being used for MD initialization This enables MD workflows with separate equilibration and production phases, which is particularly useful in case one wants to first run a canonical MD to thermalise, and then run multiple microcanonical MDs starting from the thermalised configuration.

Bug: Parser was attempting to access 'md_parameters' key from parsed_trajectory dictionary without checking if it exists, causing KeyError exceptions for all non-MD calculations (SCF, relax, bands, etc.). Additionally, PwBaseWorkChain was unconditionally adding MD-specific parameters to all workflow protocols, including non-MD workflows like PwBandsWorkChain. Fix: - Use .pop('md_parameters', False) instead of .pop('md_parameters') in parser - Add proper validation before accessing md_parameters['dt'] - Only add MD parameters to protocols when calculation type is 'md'/'vc-md' - Provide sensible defaults when md_parameters is unavailable

Also updated the test to reflect this change. I recommend using gamma point only for MD simulations, but I believe it should be left to the user to decide, so here I use a very coarse k-point mesh as I assume the user will use a reasonably large supercell. Besides this, the only important change is setting up a good thermostat, in this case it is stochastic velocity rescaling with good time constant.

Forgot to do this before previous commits, so doing all of it now.

mbercx · 2025-10-22T00:43:14Z

Thanks @tsthakur! Great that you make the effort to contribute. 🙏 There's quite a few changes, so will need to block some time to have a closer look. Probably will only be after we do a stable v5.0 release, which is my priority at the moment. Would that be ok?

Two questions for now:

Would you have a Jupyter notebook/script that I can use to test these changes, and use as a start to add some documentation?
You were running a specific branch of QE, IIRC? For the flipper calculations? Are the changes here more general, i.e. do they support more recent QE versions?

tsthakur · 2025-10-22T12:36:19Z

Thank you @mbercx ! I understand the priorities. Although can I tempt you into incorporating MD workflow for v5.0 release ^_^

I attach a jupyert notebook here to run MD on a Silicon structure.
FPMD_test.ipynb

You were running a specific branch of QE, IIRC? For the flipper calculations? Are the changes here more general, i.e. do they support more recent QE versions?

I was indeed using an older version of QE. But this PR is separate from that workflow. This PR incorporates changes for the latest vanilla QE, and also works for sirius enabled QE. I was using these changes to run FPMD with QE v7.0 and for running FPMD on LUMI.

After testing again today I noticed a couple of typos, so I will fix them in a new commit by today.

- Fixed the typos in protocol and trajectory keys - Improved the logic of concatenating trajectory structures to avoid duplication of last step, when starting a new MD from previous one

Previously the check MD calculation was in the parse_stdout function, so MD calculations wouldn't be correctly caught when parsing through XML. Now the check if just before the `build_output_trajectory` function is called, so that the MD parameters are always correctly added to the trajectory data whenever relevant. I added an additional argument to this function instead of storing the relevant details i.e. `md_parameters` in the trajectory data dictionary, to avoid confusion and make the code cleaner.

XML and stdout parsers generate arrays (steps, cells, positions) under different keywords. The initial `position` values are set from keyword `atomic_positions_relax` or `atomic_fractionals_relax`, which are only available from stdout file and not from XML. This is not a problem normally as the XML values are taken over the stdout values, and the end results are correct. But this leads to a situation where the initial "wrong" positions cause issues when calculating velocities in MD simulations. For MD calculations, explicitly pop the correct arrays (steps, cells, positions) from parsed_trajectory early in the process, before they can be overwritten by inconsistent XML data. Future work should investigate the reason behind this roundabout way to generating TrajectoryData.

Previously, the function was updating the number of MD steps to do correctly but the single line needed to set the updated number of MD steps to do was missing.

mbercx · 2025-10-24T04:07:44Z

Thanks for the quick response and notebook! :)

Thank you @mbercx ! I understand the priorities. Although can I tempt you into incorporating MD workflow for v5.0 release ^_^

Hehe, I'm not sure. I'm contemplating removing more things from the v5.0 milestone since I don't have so much time at the moment and want to avoid maintaining two branches for too long. I've also blocked requests from others.

That said, once v5.0 is out, there is nothing stopping us from merging this and quickly doing a v5.1 release. Minor releases are easy, so I would rather just make another release than block a release because we want to add more features to it, if that makes sense? Especially since I don't want to rush the review of this PR, we're adding quite a bit here.

tsthakur · 2025-10-24T09:28:57Z

I'm contemplating removing more things from the v5.0 milestone

I see, yes in that case better to aim for 5.1 release.

so I would rather just make another release than block a release because we want to add more features to it

Yes, that makes sense. Make the major release polished and essential, and then add new features in the minor release, let the community test it, iron out the kinks, and then rinse and repeat.

Do you have any rough estimate of when v5.0 would be released, like in the coming weeks or after new years?

mbercx · 2025-10-28T04:05:20Z

Do you have any rough estimate of when v5.0 would be released, like in the coming weeks or after new years?

My original timeline was end of next week. I'm mainly waiting for review for the PRs that add deprecation to the v4.X support branch, and make the breaking changes on main. I'll probably start going ahead without review next week, make a beta release for testing in the field, and then do a stable v5.0.0 a week later.

So I expect to come back to this PR, and some other open ones, in ~2 weeks, if that's ok? Have some tasks here in Aus that require my focus as well.

tsthakur · 2025-10-28T17:05:06Z

Thanks for the sharing the planned timeline.
Yes it looks good! I would just like to wrap this up before the holidays. So looks like there is ample time for that, even if there are some delays.

mbercx · 2025-12-03T08:09:35Z

Hehe, thanks for the subtle "merge ping", @tsthakur. As could be expected, there have been some delays, and I haven't been able to spend much time on aiida-quantumespresso the past weeks. Next week I should have time though, so pinky-swear I'll come back to this PR then as well. :)

tsthakur · 2025-12-03T08:32:46Z

Haha, no worries! I understand that delays happen :)

tsthakur · 2026-02-16T14:30:06Z

Hello @mbercx do you have any ETA on the v5.0 release :) I am assuming you are sticking with the original plan of removing things from v5.0 and then doing a minor release with the MD workflow.
If you are not working on this anymore, could you tag someone else to have a look?
Thanks!

tsthakur and others added 12 commits October 16, 2025 14:28

Added MD trajectory processing functions

73729f9

- Added new MD functions module with initialisation file - Implemented concatenate_trajectory calcfunction for joining multiple trajectory data that are generated from sequential MD runs

Added the missing utility functions for MD runs

5059cbb

Merge branch 'aiidateam:main' into main

fe6297d

Automatic formatting with ruff pre-commit hook

a06a2f3

Forgot to do this before previous commits, so doing all of it now.

tsthakur requested a review from mbercx October 16, 2025 23:09

tsthakur and others added 5 commits October 22, 2025 16:33

Fixed typos and logic of concatenating trajectory

983f989

- Fixed the typos in protocol and trajectory keys - Improved the logic of concatenating trajectory structures to avoid duplication of last step, when starting a new MD from previous one

Added the missing line to update MD steps

5dc8618

Previously, the function was updating the number of MD steps to do correctly but the single line needed to set the updated number of MD steps to do was missing.

Merge branch 'main' into main

5c9d36a

Merge branch 'main' into main

b7efe9c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for MD calculation within PwBaseWorkChain#1200

Support for MD calculation within PwBaseWorkChain#1200
tsthakur wants to merge 18 commits intoaiidateam:mainfrom
tsthakur:main

tsthakur commented Oct 16, 2025

Uh oh!

mbercx commented Oct 22, 2025

Uh oh!

tsthakur commented Oct 22, 2025

Uh oh!

mbercx commented Oct 24, 2025

Uh oh!

tsthakur commented Oct 24, 2025

Uh oh!

mbercx commented Oct 28, 2025

Uh oh!

tsthakur commented Oct 28, 2025

Uh oh!

mbercx commented Dec 3, 2025

Uh oh!

tsthakur commented Dec 3, 2025

Uh oh!

tsthakur commented Feb 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

tsthakur commented Oct 16, 2025

Overview

What's New

Complete MD Workflow Support

New MD-specific Functions

Enhanced Error Handling

Key Features

Technical Details

MD-Specific Inputs

Uh oh!

mbercx commented Oct 22, 2025

Uh oh!

tsthakur commented Oct 22, 2025

Uh oh!

mbercx commented Oct 24, 2025

Uh oh!

tsthakur commented Oct 24, 2025

Uh oh!

mbercx commented Oct 28, 2025

Uh oh!

tsthakur commented Oct 28, 2025

Uh oh!

mbercx commented Dec 3, 2025

Uh oh!

tsthakur commented Dec 3, 2025

Uh oh!

tsthakur commented Feb 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants