
MDAnalysis: Trajectories saved from Readers of the same Universe are incorrect


I have three trajectory replicas (xtc) of a membrane protein in a simulated physiological environment (water, ions, membrane...) loaded into a single MDAnalysis (2.2.0) Universe. I want to save three additional xtc files that contain only the protein atoms, one for each of the original xtc trajectories. When I iterate through each of the three MDAnalysis Readers contained in the Universe, the first saved trajectory seems to be correct, but the other two have the same coordinates in every frame. The original, complete trajectories are correct. Given that my starting point must be a Universe with the three Readers, how do I do this correctly and efficiently?

Code:

import MDAnalysis as mda
u = mda.Universe("11159_dyn_117.pdb", "11156_trj_117.xtc", "11157_trj_117.xtc", "11158_trj_117.xtc")
protein = u.select_atoms("protein")
protein.write("protein.pdb")

for num, reader in enumerate(u.trajectory.readers, 1):
    with mda.Writer(f"{num}.xtc", protein.n_atoms) as w:
        for ts in reader.trajectory:
            w.write(protein.atoms)

# Then check the generated individual trajectories by loading them in 
# Universes and checking the positions array. I checked them in PyMOL.

Files downloadable at: https://submission.gpcrmd.org/dynadb/dynamics/id/117/ (model file and trajectory files)


Solution

  • You can write a trajectory directly from an AtomGroup with the AtomGroup.write(name, frames=trajectory_iterator) method. The frame at which each sub-trajectory of the chained trajectory starts (plus the total frame count as the final entry) is available from the private ChainReader._start_frames attribute. It is not documented, so it may change between versions.

    import MDAnalysis as mda
    
    # example data
    from MDAnalysisTests import datafiles as data
    
    # create a chained trajectory and select some atoms
    u = mda.Universe(data.PSF, [data.DCD, data.DCD])
    protein = u.select_atoms("protein")
    
    # get start/stop frames: 
    # array([  0,  98, 196]) for this example
    sf = u.trajectory._start_frames
    
    # write each subtrajectory of the chained trajectory
    # to a new file in a different format (only containing
    # the atoms of the selected AtomGroup)
    for i, (start, stop) in enumerate(zip(sf[:-1], sf[1:])):
        protein.atoms.write(f"protein_{i}.xtc", frames=u.trajectory[start:stop])
    

    This will produce trajectories protein_0.xtc and protein_1.xtc. If you want to load them, don't forget to create a file that contains a minimal topology for the selection

    protein.write("protein.gro")
    

    so that you can load the new trajectories with

    p0 = mda.Universe("protein.gro", "protein_0.xtc")
    p1 = mda.Universe("protein.gro", "protein_1.xtc")
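
To make the slicing in the loop easier to follow, here is a minimal sketch in plain Python (no MDAnalysis needed) of how consecutive boundaries are paired into half-open (start, stop) frame ranges. The helper name frame_ranges is hypothetical and only for illustration. The input values are the ones quoted in the comment of the example above, and your own trajectories will give different numbers.

```python
def frame_ranges(start_frames):
    """Pair consecutive boundaries into half-open (start, stop) ranges.

    `start_frames` is assumed to look like the array quoted above:
    the first frame of each sub-trajectory, followed by the total
    number of frames as the last entry.
    """
    boundaries = [int(b) for b in start_frames]
    return list(zip(boundaries[:-1], boundaries[1:]))


# Boundary values quoted in the example (two chained copies of the same DCD).
ranges = frame_ranges([0, 98, 196])

for i, (start, stop) in enumerate(ranges):
    # Each range maps to one output file; `stop` itself is excluded,
    # matching Python slice semantics in u.trajectory[start:stop].
    print(f"protein_{i}.xtc: frames {start} to {stop - 1} ({stop - start} frames)")
```

With three chained xtc files, as in the question, the boundary array would have four entries and the same pairing would yield three ranges, one per original trajectory.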
    

    Notes