Replies: 2 comments 1 reply
-
|
Hi @M15160058, several hours for a single frame sounds surprisingly slow. I had a look at your code and there are a few things you can do to speed it up:
It looks like what you're currently doing is:
Instead, you should only need to make one call to import MDAnalysis as mda
from MDAnalysis.analysis.hydrogenbonds import HydrogenBondAnalysis
u = mda.Universe("clpb_tetramer_traj1_npt_50.tpr", "concate_cdab.xtc")
chainA = "segid seg_6_Protein_chain_G"
chainB = "segid seg_7_Protein_chain_H"
chainC = "segid seg_8_Protein_chain_I"
chainD = "segid seg_9_Protein_chain_J"
hbonds = HydrogenBondAnalysis(
universe=u,
d_a_cutoff=3.5,
d_h_a_angle_cutoff=120,
between = [
[chainA, chainB],
[chainA, chainC],
[chainA, chainD],
[chainB, chainC],
[chainB, chainD],
[chainC, chainD],
],
)
hbonds.hydrogens_sel = hbonds.guess_hydrogens(f"({chainA}) or ({chainB}) or ({chainC}) or ({chainD})")
hbonds.acceptors_sel = hbonds.guess_acceptors(f"({chainA}) or ({chainB}) or ({chainC}) or ({chainD})")
# run the analysis over all frames and print progress
hbonds.run(verbose=True)
# Plot hbonds over time
plt.plot(hbonds.times, hbonds.count_by_time(), lw=2)I haven't tested this, but it should give you an idea of what to do. I'd also recommend reading through the user guide for |
Beta Was this translation helpful? Give feedback.
-
|
good to hear it works for you!
As you suggested, I think you're comparing atom indices in your results with atom numbers in your gro file, so there's an offset of 1. I would recommend working with the indices as these are guaranteed to be unique (in a gro file, atom numbers repeat once they reach 99999).
Does the current selection include atoms in both the backbone and the side chains? If so, there are two ways you could do this:
|
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Hello everyone,
I am trying to calculate hydrogen bonds using MDAnalysis but it is taking too much time (more than a day).Even for frame 1, it is taking 5or six hours.
Yes the xtc file is large like # Atoms 770667 and 300ns long. But I want to calculate the hbonds for only four chains(3600 atoms only).
`import pandas as pd
import MDAnalysis as mda
from MDAnalysis.analysis.hydrogenbonds import HydrogenBondAnalysis
import matplotlib.pyplot as plt
Load the Universe
u = mda.Universe("clpb_tetramer_traj1_npt_50.tpr", "concate_cdab.xtc")
Select specific parts (e.g., two chains)
selection = u.select_atoms(
"segid seg_6_Protein_chain_G or segid seg_7_Protein_chain_H or segid seg_8_Protein_chain_I or segid seg_9_Protein_chain_J"
)
Use the selection for analysis
print("Number of atoms in selection:", len(selection))
Define chain selections using segid
chains = {
"A": u.select_atoms("segid seg_6_Protein_chain_G"),
"B": u.select_atoms("segid seg_7_Protein_chain_H"),
"C": u.select_atoms("segid seg_8_Protein_chain_I"),
"D": u.select_atoms("segid seg_9_Protein_chain_J"),
}
Define all chain pair combinations (for inter-chain only)
chain_pairs = [
("A", "B"), ("A", "C"), ("A", "D"),
("B", "C"), ("B", "D"), ("C", "D")
]
Pre-calculate donor and acceptor selections
donor_selectors = {}
acceptor_selectors = {}
for chain_id in chains:
donor_selectors[chain_id] = f"segid {chains[chain_id].segids[0]} and (name N or name O)" # Example: Select N and O atoms as donors
acceptor_selectors[chain_id] = f"segid {chains[chain_id].segids[0]} and name O" # Example: Select O atoms as acceptors
File 1: H-Bond details for the first frame
first_frame_data = []
Process the first frame only
u.trajectory[0]
print("Processing the first frame:")
for i, (chain1, chain2) in enumerate(chain_pairs, start=1):
print(f" Chain pair {i}/{len(chain_pairs)}: {chain1}-{chain2}")
hbonds = HydrogenBondAnalysis(
universe=u,
donors_sel=donor_selectors[chain1],
acceptors_sel=acceptor_selectors[chain2],
d_a_cutoff=3.5,
d_h_a_angle_cutoff=120,
)
hbonds.run()
Save the first frame data to a CSV file
first_frame_df = pd.DataFrame(first_frame_data)
first_frame_df.to_csv("first_frame_hbonds.csv", index=False)
print("First frame H-Bond details saved to 'first_frame_hbonds.csv'.")
File 2: Time vs Number of H-Bonds across trajectory
time_data = []
Iterate over all frames in the trajectory
print("\nProcessing all trajectory frames:")
for ts in u.trajectory:
current_time = ts.time
print(f" Frame {u.trajectory.frame}/{len(u.trajectory)} at time {current_time:.2f} ps")
total_hbonds = 0
Save the time vs H-Bond data to a CSV file
time_df = pd.DataFrame(time_data)
time_df.to_csv("time_vs_hbonds.csv", index=False)
print("Time vs Number of H-Bonds saved to 'time_vs_hbonds.csv'.")
Plot the results
plt.figure(figsize=(10, 6))
plt.plot(time_df["Time (ps)"], time_df["Number of H-bonds"], marker='o')
plt.xlabel("Time (ps)")
plt.ylabel("Number of Hydrogen Bonds")
plt.title("Time vs. Number of Hydrogen Bonds")
plt.grid()
plt.savefig("time_vs_hbonds_plot.png")
plt.show()`
Beta Was this translation helpful? Give feedback.
All reactions