pdbmdanalysis

Using MDAnalysis to extract coordinates in an array from pdb


I have a pdb file that is a subset of a much larger system. This pdb is special because I have some vectors based on this file's coordinate system. I want to draw these vectors and the basis vector of that system onto the pdb. Eventually I would like to visualize the vector and basis vectors as it moves through some MD simulation where I an update the vector position based on the trajectory over time.

To start, I would like to read a pdb that has coordinates that define the basis vectors that further define the other vectors I want to visualize. Right now I'm using this class in MDAnalysis: https://docs.mdanalysis.org/1.0.0/_modules/MDAnalysis/coordinates/PDB.html#PDBReader

molecule=mda.coordinates.PDB.PDBReader('molecule.pdb')

This works and it reads the pdb just fine, but returns with this variable type

coordinates.PDB.Reader

I suppose this isn't a surprise, but I want to be able to print this variable and get some array of coordinate positions and atom types. I'd love to see the bonds as well but that's not necessary. Now when I print I get

<PDBReader molecule.pdb with 1 frames of 60 atoms>

I want something that would look like

[atomtype1,x1,y1,z1]...[atomtypen,xn,yn,zn]

Thank you,


Solution

  • Loading data: Universe

    To get the coordinates in MDAnalysis you first load a Universe (you don't normally use the coordinate readers directly):

    import MDAnalysis as mda
    u = mda.Universe('molecule.pdb')
    

    The Universe contains "topology" (atom types, bonds if available, etc) and the "trajectory" (i.e., coordinates).

    Accessing per-atom data: Universe.atoms

    All the atoms are stored in u.atoms (they form an AtomGroup). All information about the atoms is available from an AtomGroup. For example, all positions are available as a numpy array with

    u.atoms.positions
    

    The names are

    u.atoms.names
    

    and there are many more attributes. (The same works for residues: u.residues or u.atoms.residues gives the residues. You can also slice/index an AtomGroup to get a new AtomGroup. For example, the residues that belong to the first 20 atoms are u.atoms[:20].residues... AtomGroups are the key to working with MDAnalysis.)

    Extracting atom names and positions

    To build the list that you asked for:

    names_positions = [[at] + list(pos) for at, pos in zip(u.atoms.names, u.atoms.positions)]
    

    For example, with the test file PDB that is included with the MDAnalysisTests package:

    import MDAnalysis as mda
    from MDAnalysisTests.datafiles import PDB
    u = mda.Universe(PDB)
    names_positions = [[at] + list(pos) for at, pos in zip(u.atoms.names, u.atoms.positions)]
    
    # show the first three entries
    print(names_positions[:3])
    

    gives

    [['N', 52.017, 43.56, 31.555], ['H1', 51.188, 44.112, 31.722], ['H2', 51.551, 42.828, 31.039]]
    

    Learning more...

    For a rapid introduction to MDAnalysis have a look at the Quickstart Guide, which explains most of these things in more detail. It also tells you how to select specific atoms and form new AtomGroups.

    Then you can have a look at the rest of the User Guide.

    Bonds are a bit more tricky (you can get them with u.atoms.bonds but if you want to use them you have to learn more about how MDAnalysis represents topologies — I'd suggest you start by asking on user mailing list, see participating in MDAnalysis because this is where MDAnalysis developers primarily answer questions.)