Skip to content

Support for MD calculation within PwBaseWorkChain#1200

Open
tsthakur wants to merge 18 commits intoaiidateam:mainfrom
tsthakur:main
Open

Support for MD calculation within PwBaseWorkChain#1200
tsthakur wants to merge 18 commits intoaiidateam:mainfrom
tsthakur:main

Conversation

@tsthakur
Copy link
Collaborator

Overview

This PR introduces an overhaul of the PwBaseWorkChain to incorporate support for first-principles molecular dynamics (MD) simulations using Quantum ESPRESSO. Previously, there was no dedicated logic for running MD calculations with the PwBaseWorkChain, which required significant enhancements to handle the unique requirements of MD workflows.

What's New

Complete MD Workflow Support

  • MD-specific validation: New validate_md_parameters() method to handle MD-specific inputs and trajectory management
  • Trajectory concatenation: Automatic concatenation of multiple MD runs with proper trajectory handling
  • Energy monitoring: Built-in energy fluctuation checks to detect runaway simulations
  • Restart capabilities: Support for restarting from previous MD trajectories with position and velocity continuity
  • Thermalisation support: Option to start MD runs from pre-equilibrated (thermalised) trajectories
  • New protocol: New protocol with suggested values for running a first-principles MD

New MD-specific Functions

  • concatenate_trajectory(): Seamlessly combine multiple MD trajectories
  • get_last_step_from_trajectory(): Extract final configuration for restarts
  • get_structure_from_trajectory(): Convert trajectory frames to AiiDA structures
  • Enhanced parser support for MD output with velocity calculations using velocity-Verlet algorithm

Enhanced Error Handling

  • MD-specific error handlers for common MD failures
  • Proper handling of interrupted MD runs with trajectory recovery
  • Energy fluctuation monitoring with configurable thresholds
  • Graceful handling of incomplete trajectories

Key Features

  • Previous trajectory support: Seamlessly continue MD runs from previous calculations
  • Thermalised trajectory input: Start production runs from equilibrated configurations
  • Energy fluctuation monitoring: Automatic detection of unstable simulations
  • Step counting and progress tracking: Accurate tracking of completed MD steps
  • Velocity calculation: Automatic velocity computation using QE's velocity-Verlet algorithm
  • Trajectory metadata: Rich metadata including simulation time, units, and validation flags
    • FPMD protocol: A new extensively tested protocol that has been used in a large scale screening

Technical Details

MD-Specific Inputs

# New workflow inputs for MD calculations
total_energy_max_fluctuation: Float  # Energy stability threshold (eV)
previous_trajectory: TrajectoryData   # Continue from previous MD run
thermalised_trajectory: TrajectoryData # Start from an equilibrated state

tsthakur and others added 12 commits October 16, 2025 14:28
- Added new MD functions module with initialisation file
- Implemented concatenate_trajectory calcfunction for joining multiple trajectory data that are generated from sequential MD runs
- Implemented `get_structure_from_trajectory` calcfunction to continue MD simulations
- The function extracts atomic positions and velocities from a given trajectory
- This is to support restarting MD simulations from a given trajectory object instead of wavefunction through Quantum ESPRESSO
- The positions are simply stored as a new StructureData node, while the velocities are stored in the settings Dict node which is what aiida_quantumespresso expects for MD restarts
- Implemented `get_last_step_from_trajectory` calcfunction for MD workflows
- Extracts the final step from a thermalized trajectory for production phase as a StructureData object
- This is an optional way to run MD simulation to separate equilibration from production in MD simulations for better provenance tracking
- Added MD-specific parsing logic using "Entering Dynamics:" keyword
- Implemented velocity calculation from positions using velocity-Verlet algorithm used by Quantum ESPRESSO
- Supports both force-based and simple velocity calculations
- Now handles incomplete stdout for MD calculations allowing structure extraction from a trajectory in cases where restarting from wavefunction is not possible
- Added MD parameters extraction (dt, iprint) to trajectory metadata
- Added proper unit conversions for velocities and simulation time
- Now marks problematic trajectories with 'never_concatenate' attribute to avoid issues when combining multiple MD runs
- Backward compatibility is still preserved for non-MD calculations
- Added suggestions for future improvements in comments to preserve compatibility with non-MD use-cases
Added complete MD workflow capabilities with MD specific restart criteria and error management.

Major features:
- **MD trajectory restart**: Support for continuing MD simulations from previous
  trajectories using `previous_trajectory` input parameter
- **Energy fluctuation monitoring**: Add `total_energy_max_fluctuation` input
  to monitor and abort simulations with excessive energy drift
- **MD step management**: Automatic tracking of completed vs remaining MD steps
  across restart cycles with `mdsteps_done` and `mdsteps_todo` counters
- **Trajectory-based restart**: Extract positions and velocities from previous
  trajectory using `get_structure_from_trajectory` calcfunction
- **MD-specific validation**: Add `validate_md_parameters()` method to handle
  MD-specific input validation and setup

Workflow enhancements:
- **Restart handling**: Enhanced `should_run_process()` to consider remaining
  MD steps and prevent premature termination
- **Structure extraction**: Integrate trajectory processing to restart from
  final positions/velocities of previous calculations
- **Settings propagation**: Proper handling of velocity settings for MD restarts
- **Step counting**: Track MD progress across multiple calculation restarts

Error handling improvements:
- **Walltime recovery**: Enhanced `handle_out_of_walltime()` for MD calculations
  with proper trajectory-aware restart logic
- **Energy validation**: New `check_energy_fluctuations()` method to monitor
  total energy stability during MD runs
- **Progress tracking**: `update_mdsteps()` method to manage MD step counters
  and determine when to continue or finish workflow. Note that if restarting from wavefunctions, usig Quantum ESPRESSO's native `restart_mode` the steps keyword is ignored by QE, so this progress tracking is important when starting a calculation from a previously terminated MD run.

Builder enhancements:
- Add `total_energy_max_fluctuation` and `previous_trajectory` parameters
  to `get_builder_from_protocol()` method for MD workflow initialization
- Support for MD continuation workflows with proper input handling

This implementation enables interruptible MD simulations with automatic
restart capabilities, beyond what vanilla QE provides with energy monitoring, and proper trajectory management for production molecular dynamics workflows.
- Ensures non-MD calculations (scf, relax, bands, etc.) use the results from BaseRestartWorkChain class
- Maintains MD-specific trajectory concatenation and reporting for MD calculations
- Adds clearer reporting messages distinguishing MD vs non-MD workflow completion
- Preserves all existing functionality while fixing output attachment for standard calculations
Minor enhancements to MD workflow capabilities:

**Thermalised trajectory support:**
- Add `thermalised_trajectory` input parameter for equilibration trajectories
- Enable starting production MD runs from pre-equilibrated configurations
- Support separate thermalisation phases that don't affect final trajectory
- Integrate with `get_last_step_from_trajectory` for equilibration endpoints

**Workflow improvements:**
- Add missing imports: WorkflowFactory, get_last_step_from_trajectory, md_utils
- Enhanced trajectory validation with proper structure/formula checking
- Improved preparation logic for both previous and thermalised trajectories
- Better reporting for trajectory source identification

**Input validation and setup:**
- Extended `get_builder_from_protocol()` with thermalised_trajectory parameter
- Enhanced `validate_md_parameters()` for thermalisation trajectory handling
- Added recommendations for thermalisation when no trajectory provided
- Improved trajectory compatibility checking across workflow restarts

**Process preparation enhancements:**
- Dual trajectory handling in `prepare_process()` method
- Priority system: previous_trajectory > thermalised_trajectory > from scratch
- Proper velocity extraction and settings propagation for both trajectory types
- Clear reporting of trajectory source being used for MD initialization

This enables MD workflows with separate equilibration and production phases, which is particularly useful in case one wants to first run a canonical MD to thermalise, and then run multiple microcanonical MDs starting from the thermalised configuration.
Bug: Parser was attempting to access 'md_parameters' key from parsed_trajectory
dictionary without checking if it exists, causing KeyError exceptions for all
non-MD calculations (SCF, relax, bands, etc.).

Additionally, PwBaseWorkChain was unconditionally adding MD-specific parameters
to all workflow protocols, including non-MD workflows like PwBandsWorkChain.

Fix:
- Use .pop('md_parameters', False) instead of .pop('md_parameters') in parser
- Add proper validation before accessing md_parameters['dt']
- Only add MD parameters to protocols when calculation type is 'md'/'vc-md'
- Provide sensible defaults when md_parameters is unavailable
Also updated the test to reflect this change.
I recommend using gamma point only for MD simulations, but I believe it should be left to the user to decide, so here I use a very coarse k-point mesh as I assume the user will use a reasonably large supercell.
Besides this, the only important change is setting up a good thermostat, in this case it is stochastic velocity rescaling with good time constant.
Forgot to do this before previous commits, so doing all of it now.
@tsthakur tsthakur requested a review from mbercx October 16, 2025 23:09
@mbercx
Copy link
Member

mbercx commented Oct 22, 2025

Thanks @tsthakur! Great that you make the effort to contribute. 🙏 There's quite a few changes, so will need to block some time to have a closer look. Probably will only be after we do a stable v5.0 release, which is my priority at the moment. Would that be ok?

Two questions for now:

  1. Would you have a Jupyter notebook/script that I can use to test these changes, and use as a start to add some documentation?
  2. You were running a specific branch of QE, IIRC? For the flipper calculations? Are the changes here more general, i.e. do they support more recent QE versions?

@tsthakur
Copy link
Collaborator Author

Thank you @mbercx ! I understand the priorities. Although can I tempt you into incorporating MD workflow for v5.0 release ^_^

I attach a jupyert notebook here to run MD on a Silicon structure.
FPMD_test.ipynb

You were running a specific branch of QE, IIRC? For the flipper calculations? Are the changes here more general, i.e. do they support more recent QE versions?

I was indeed using an older version of QE. But this PR is separate from that workflow. This PR incorporates changes for the latest vanilla QE, and also works for sirius enabled QE. I was using these changes to run FPMD with QE v7.0 and for running FPMD on LUMI.

After testing again today I noticed a couple of typos, so I will fix them in a new commit by today.

tsthakur and others added 5 commits October 22, 2025 16:33
- Fixed the typos in protocol and trajectory keys
- Improved the logic of concatenating trajectory structures to avoid duplication of last step, when starting a new MD from previous one
Previously the check MD calculation was in the parse_stdout function, so MD calculations wouldn't be correctly caught when parsing through XML. Now the check if just before the `build_output_trajectory` function is called, so that the MD parameters are always correctly added to the trajectory data whenever relevant. I added an additional argument to this function instead of storing the relevant details i.e. `md_parameters` in the trajectory data dictionary, to avoid confusion and make the code cleaner.
XML and stdout parsers generate arrays (steps, cells, positions) under different keywords.
The initial `position` values are set from keyword `atomic_positions_relax` or `atomic_fractionals_relax`, which are only available from stdout file and not from XML.
This is not a problem normally as the XML values are taken over the stdout values, and the end results are correct.
But this leads to a situation where the initial "wrong" positions cause issues when calculating velocities in MD simulations.
For MD calculations, explicitly pop the correct arrays (steps, cells, positions) from parsed_trajectory early in the process, before they can be overwritten by inconsistent XML data.
Future work should investigate the reason behind this roundabout way to generating TrajectoryData.
Previously, the function was updating the number of MD steps to do correctly but the single line needed to set the updated number of MD steps to do was missing.
@mbercx
Copy link
Member

mbercx commented Oct 24, 2025

Thanks for the quick response and notebook! :)

Thank you @mbercx ! I understand the priorities. Although can I tempt you into incorporating MD workflow for v5.0 release ^_^

Hehe, I'm not sure. I'm contemplating removing more things from the v5.0 milestone since I don't have so much time at the moment and want to avoid maintaining two branches for too long. I've also blocked requests from others.

That said, once v5.0 is out, there is nothing stopping us from merging this and quickly doing a v5.1 release. Minor releases are easy, so I would rather just make another release than block a release because we want to add more features to it, if that makes sense? Especially since I don't want to rush the review of this PR, we're adding quite a bit here.

@tsthakur
Copy link
Collaborator Author

I'm contemplating removing more things from the v5.0 milestone

I see, yes in that case better to aim for 5.1 release.

so I would rather just make another release than block a release because we want to add more features to it

Yes, that makes sense. Make the major release polished and essential, and then add new features in the minor release, let the community test it, iron out the kinks, and then rinse and repeat.

Do you have any rough estimate of when v5.0 would be released, like in the coming weeks or after new years?

@mbercx
Copy link
Member

mbercx commented Oct 28, 2025

Do you have any rough estimate of when v5.0 would be released, like in the coming weeks or after new years?

My original timeline was end of next week. I'm mainly waiting for review for the PRs that add deprecation to the v4.X support branch, and make the breaking changes on main. I'll probably start going ahead without review next week, make a beta release for testing in the field, and then do a stable v5.0.0 a week later.

So I expect to come back to this PR, and some other open ones, in ~2 weeks, if that's ok? Have some tasks here in Aus that require my focus as well.

@tsthakur
Copy link
Collaborator Author

Thanks for the sharing the planned timeline.
Yes it looks good! I would just like to wrap this up before the holidays. So looks like there is ample time for that, even if there are some delays.

@mbercx
Copy link
Member

mbercx commented Dec 3, 2025

Hehe, thanks for the subtle "merge ping", @tsthakur. As could be expected, there have been some delays, and I haven't been able to spend much time on aiida-quantumespresso the past weeks. Next week I should have time though, so pinky-swear I'll come back to this PR then as well. :)

@tsthakur
Copy link
Collaborator Author

tsthakur commented Dec 3, 2025

Haha, no worries! I understand that delays happen :)

@tsthakur
Copy link
Collaborator Author

Hello @mbercx do you have any ETA on the v5.0 release :) I am assuming you are sticking with the original plan of removing things from v5.0 and then doing a minor release with the MD workflow.
If you are not working on this anymore, could you tag someone else to have a look?
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants