Skip to content

"extract_" protocols take a long time relative to simulationΒ #689

@lilyminium

Description

@lilyminium

During equilibration, by default we simulate 200 ps at a time. This seems to take on the order of a minute.

To check equilibration conditions, I usually set up conditions for Density and PotentialEnergy. The protocol of extracting the data often takes up to 7 minutes, and this happens in sequence. That means up to 90% the compute time is actually taken up by processing data. I believe the work happens in BaseAverageObservable._execute, which runs the following steps:

  • analyses the timeseries to find the equilibration point and statistical inefficiency using pymbar
  • gets uncorrelated points of data
  • bootstraps the observables

Can any time be saved in these steps? For equilibration purposes, we don't need the boostrapping -- potentially we could subclass BaseAverageObservable (or superclass it) and skip that last step.

Example logging with timestamps:

2025-08-05 12:49:00.748 INFO     dhmix_1064373607860850902|equilibration_simulation_mixture finished executing after 53063.88087599771 ms
2025-08-05 12:49:00.866 INFO     Executing dhmix_1064373607860850902|extract_1_Density_mixture
2025-08-05 12:56:02.345 INFO     dhmix_1064373607860850902|extract_1_Density_mixture finished executing after 421478.9291450288 ms
2025-08-05 12:56:02.348 INFO     Executing dhmix_1064373607860850902|multiply_1_Density_mixture
2025-08-05 12:56:02.348 INFO     dhmix_1064373607860850902|multiply_1_Density_mixture finished executing after 0.2952350187115371 ms
2025-08-05 12:56:02.350 INFO     Executing dhmix_1064373607860850902|absolute_1_Density_mixture
2025-08-05 12:56:02.350 INFO     dhmix_1064373607860850902|absolute_1_Density_mixture finished executing after 0.19303098088130355 ms
2025-08-05 12:56:02.745 INFO     Executing dhmix_1064373607860850902|extract_0_PotentialEnergy_mixture
2025-08-05 13:03:46.404 INFO     dhmix_1064373607860850902|extract_0_PotentialEnergy_mixture finished executing after 463659.129380947 ms
2025-08-05 13:03:46.408 INFO     Executing dhmix_1064373607860850902|multiply_0_PotentialEnergy_mixture
2025-08-05 13:03:46.408 INFO     dhmix_1064373607860850902|multiply_0_PotentialEnergy_mixture finished executing after 0.29414903838187456 ms
2025-08-05 13:03:46.410 INFO     Executing dhmix_1064373607860850902|absolute_0_PotentialEnergy_mixture
2025-08-05 13:03:46.410 INFO     dhmix_1064373607860850902|absolute_0_PotentialEnergy_mixture finished executing after 0.200728012714535 ms
2025-08-05 13:03:46.414 INFO     Executing dhmix_1064373607860850902|get_maximum_statistical_inefficiency_mixture

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions