I am working with GROMACS .gro files in PyMOL and running into problems with multi-stranded molecules. .gro files do not have chain identifiers, which PyMOL apparently needs to calculate cartoon representations, so I am trying to find a way to add chain identifiers to each strand in my scene.
PyMOL seems to have some sort of internal molecule representation, because I can click on Mouse > Selection Mode > Molecule and then manually select the molecules in my scene. I can also do select bymolecule id 1 and get the first chain, but I can't find a way to get the other chains without also knowing the relevant atom ids a priori.
So what I need is a series of PyMOL commands which will iterate over all the molecules in a scene and then run alter (sele), chain='A/B/whatever' on each one so I can use the cartoon representation.
Edit:
The following script works if you have a unique residue name which appears exactly once per chain (such as RX5 in RNA molecules). This is not a generally applicable solution because it requires there to be such a unique residue.
model = cmd.get_model("resn *5 & name O3'")
chainid = ord('A')
for a in model.atom:
    cmd.select(f"bymolecule id {a.id}")
    cmd.alter("sele", f"chain='{chr(chainid)}'")
    chainid += 1
Second attempt, using 2ms2_no_short.gro as input:
Great Red Owns Many ACres of Sand
216
1ALA N 1 -0.114 -11.023 7.562
1ALA H1 2 -0.135 -11.077 7.482
1ALA H2 3 -0.021 -10.986 7.555
1ALA H3 4 -0.121 -11.080 7.644
1ALA CA 5 -0.208 -10.913 7.572
1ALA HA 6 -0.302 -10.948 7.581
1ALA CB 7 -0.183 -10.823 7.694
1ALA HB1 8 -0.251 -10.749 7.696
1ALA HB2 9 -0.191 -10.877 7.778
1ALA HB3 10 -0.091 -10.784 7.689
1ALA C 11 -0.174 -10.834 7.444
1ALA O 12 -0.065 -10.849 7.390
2SER N 13 -0.281 -10.779 7.390
2SER H 14 -0.368 -10.804 7.432
2SER CA 15 -0.289 -10.687 7.277
2SER HA 16 -0.223 -10.612 7.278
2SER CB 17 -0.266 -10.751 7.140
2SER HB1 18 -0.260 -10.850 7.152
2SER HB2 19 -0.180 -10.717 7.103
2SER OG 20 -0.360 -10.733 7.035
2SER HG 21 -0.329 -10.782 6.953
2SER C 22 -0.435 -10.665 7.301
2SER O 23 -0.507 -10.764 7.319
3ASN N 24 -0.484 -10.544 7.316
3ASN H 25 -0.427 -10.462 7.307
3ASN CA 26 -0.625 -10.536 7.345
3ASN HA 27 -0.655 -10.619 7.392
3ASN CB 28 -0.650 -10.418 7.436
3ASN HB1 29 -0.607 -10.439 7.524
3ASN HB2 30 -0.749 -10.410 7.448
3ASN CG 31 -0.599 -10.285 7.390
3ASN OD1 32 -0.559 -10.265 7.274
3ASN ND2 33 -0.595 -10.190 7.483
3ASN HD21 34 -0.625 -10.211 7.576
3ASN HD22 35 -0.562 -10.099 7.460
3ASN C 36 -0.696 -10.523 7.214
3ASN O 37 -0.817 -10.507 7.215
4PHE N 38 -0.630 -10.533 7.100
4PHE H 39 -0.531 -10.550 7.100
4PHE CA 40 -0.700 -10.519 6.976
4PHE HA 41 -0.773 -10.451 6.987
4PHE CB 42 -0.602 -10.471 6.873
4PHE HB1 43 -0.550 -10.549 6.840
4PHE HB2 44 -0.539 -10.406 6.917
4PHE CG 45 -0.668 -10.403 6.756
4PHE CD1 46 -0.805 -10.395 6.743
4PHE HD1 47 -0.864 -10.437 6.812
4PHE CE1 48 -0.860 -10.330 6.636
4PHE HE1 49 -0.959 -10.325 6.627
4PHE CZ 50 -0.779 -10.270 6.540
4PHE HZ 51 -0.819 -10.222 6.462
4PHE CE2 52 -0.643 -10.279 6.555
4PHE HE2 53 -0.584 -10.236 6.487
4PHE CD2 54 -0.587 -10.345 6.663
4PHE HD2 55 -0.488 -10.351 6.672
4PHE C 56 -0.761 -10.654 6.937
4PHE O 57 -0.707 -10.727 6.853
5THR N 58 -0.869 -10.692 7.002
5THR H 59 -0.904 -10.632 7.075
5THR CA 60 -0.940 -10.814 6.976
5THR HA 61 -0.889 -10.853 6.899
5THR CB 62 -0.938 -10.900 7.098
5THR HB 63 -1.002 -10.976 7.091
5THR CG2 64 -0.797 -10.951 7.118
5THR HG21 65 -0.794 -11.009 7.200
5THR HG22 66 -0.769 -11.005 7.039
5THR HG23 67 -0.735 -10.874 7.130
5THR OG1 68 -0.987 -10.824 7.209
5THR HG1 69 -0.986 -10.881 7.291
5THR C 70 -1.083 -10.797 6.932
5THR OC1 71 -1.132 -10.876 6.895
5THR OC2 72 -1.141 -10.690 6.940
1ALA N 73 0.988 -11.380 6.601
1ALA H1 74 0.924 -11.449 6.635
1ALA H2 75 0.951 -11.338 6.519
1ALA H3 76 1.076 -11.423 6.580
1ALA CA 77 1.007 -11.278 6.703
1ALA HA 78 1.046 -11.321 6.785
1ALA CB 79 1.106 -11.168 6.655
1ALA HB1 80 1.117 -11.099 6.727
1ALA HB2 81 1.194 -11.209 6.634
1ALA HB3 82 1.069 -11.124 6.572
1ALA C 83 0.867 -11.219 6.727
1ALA O 84 0.766 -11.278 6.683
2SER N 85 0.867 -11.115 6.809
2SER H 86 0.955 -11.086 6.844
2SER CA 87 0.750 -11.039 6.852
2SER HA 88 0.705 -10.987 6.780
2SER CB 89 0.641 -11.123 6.912
2SER HB1 90 0.658 -11.142 7.008
2SER HB2 91 0.630 -11.209 6.862
2SER OG 92 0.525 -11.046 6.899
2SER HG 93 0.448 -11.096 6.937
2SER C 94 0.825 -10.971 6.965
2SER O 95 0.869 -11.032 7.062
3ASN N 96 0.864 -10.850 6.935
3ASN H 97 0.840 -10.812 6.846
3ASN CA 98 0.942 -10.772 7.024
3ASN HA 99 1.011 -10.834 7.060
3ASN CB 100 1.004 -10.658 6.953
3ASN HB1 101 1.067 -10.613 7.017
3ASN HB2 102 0.932 -10.594 6.927
3ASN CG 103 1.079 -10.697 6.832
3ASN OD1 104 1.165 -10.784 6.833
3ASN ND2 105 1.050 -10.635 6.721
3ASN HD21 106 0.980 -10.564 6.720
3ASN HD22 107 1.099 -10.658 6.636
3ASN C 108 0.850 -10.710 7.126
3ASN O 109 0.893 -10.687 7.239
4PHE N 110 0.725 -10.677 7.093
4PHE H 111 0.684 -10.714 7.010
4PHE CA 112 0.651 -10.587 7.176
4PHE HA 113 0.718 -10.522 7.212
4PHE CB 114 0.546 -10.522 7.086
4PHE HB1 115 0.471 -10.588 7.075
4PHE HB2 116 0.588 -10.504 6.998
4PHE CG 117 0.486 -10.392 7.137
4PHE CD1 118 0.545 -10.317 7.236
4PHE HD1 119 0.629 -10.350 7.279
4PHE CE1 120 0.489 -10.199 7.278
4PHE HE1 121 0.533 -10.145 7.351
4PHE CZ 122 0.373 -10.153 7.220
4PHE HZ 123 0.332 -10.067 7.250
4PHE CE2 124 0.314 -10.227 7.121
4PHE HE2 125 0.229 -10.195 7.079
4PHE CD2 126 0.371 -10.345 7.080
4PHE HD2 127 0.327 -10.398 7.008
4PHE C 128 0.591 -10.647 7.301
4PHE O 129 0.472 -10.675 7.311
5THR N 130 0.676 -10.664 7.401
5THR H 131 0.772 -10.640 7.387
5THR CA 132 0.635 -10.715 7.530
5THR HA 133 0.535 -10.718 7.523
5THR CB 134 0.696 -10.850 7.555
5THR HB 135 0.667 -10.885 7.644
5THR CG2 136 0.658 -10.947 7.446
5THR HG21 137 0.700 -11.036 7.465
5THR HG22 138 0.559 -10.958 7.443
5THR HG23 139 0.691 -10.913 7.358
5THR OG1 140 0.836 -10.835 7.555
5THR HG1 141 0.879 -10.924 7.571
5THR C 142 0.675 -10.627 7.645
5THR OC1 143 0.637 -10.647 7.735
5THR OC2 144 0.750 -10.532 7.634
1ALA N 145 -0.495 -10.949 6.857
1ALA H1 146 -0.399 -10.961 6.883
1ALA H2 147 -0.519 -10.852 6.863
1ALA H3 148 -0.553 -11.002 6.918
1ALA CA 149 -0.514 -10.995 6.720
1ALA HA 150 -0.490 -11.092 6.715
1ALA CB 151 -0.659 -10.980 6.673
1ALA HB1 152 -0.667 -11.013 6.578
1ALA HB2 153 -0.720 -11.033 6.732
1ALA HB3 154 -0.686 -10.883 6.676
1ALA C 155 -0.425 -10.900 6.638
1ALA O 156 -0.416 -10.783 6.681
2SER N 157 -0.355 -10.946 6.535
2SER H 158 -0.361 -11.044 6.512
2SER CA 159 -0.271 -10.861 6.452
2SER HA 160 -0.320 -10.776 6.434
2SER CB 161 -0.141 -10.830 6.521
2SER HB1 162 -0.078 -10.907 6.511
2SER HB2 163 -0.157 -10.811 6.617
2SER OG 164 -0.080 -10.716 6.462
2SER HG 165 0.007 -10.697 6.509
2SER C 166 -0.239 -10.941 6.330
2SER O 167 -0.232 -11.065 6.339
3ASN N 168 -0.236 -10.877 6.215
3ASN H 169 -0.266 -10.782 6.210
3ASN CA 170 -0.189 -10.945 6.094
3ASN HA 171 -0.173 -11.041 6.115
3ASN CB 172 -0.292 -10.943 5.981
3ASN HB1 173 -0.374 -10.990 6.013
3ASN HB2 174 -0.253 -10.994 5.904
3ASN CG 175 -0.334 -10.808 5.928
3ASN OD1 176 -0.279 -10.706 5.964
3ASN ND2 177 -0.433 -10.794 5.843
3ASN HD21 178 -0.484 -10.874 5.811
3ASN HD22 179 -0.458 -10.703 5.811
3ASN C 180 -0.065 -10.868 6.056
3ASN O 181 -0.003 -10.903 5.957
4PHE N 182 -0.020 -10.762 6.128
4PHE H 183 -0.075 -10.724 6.203
4PHE CA 184 0.109 -10.703 6.095
4PHE HA 185 0.118 -10.703 5.996
4PHE CB 186 0.114 -10.563 6.148
4PHE HB1 187 0.124 -10.568 6.248
4PHE HB2 188 0.027 -10.518 6.126
4PHE CG 189 0.229 -10.480 6.091
4PHE CD1 190 0.275 -10.498 5.961
4PHE HD1 191 0.239 -10.572 5.905
4PHE CE1 192 0.372 -10.413 5.911
4PHE HE1 193 0.405 -10.426 5.818
4PHE CZ 194 0.424 -10.311 5.988
4PHE HZ 195 0.494 -10.250 5.951
4PHE CE2 196 0.379 -10.295 6.117
4PHE HE2 197 0.417 -10.221 6.174
4PHE CD2 198 0.282 -10.379 6.169
4PHE HD2 199 0.251 -10.366 6.263
4PHE C 200 0.217 -10.785 6.164
4PHE O 201 0.271 -10.749 6.267
5THR N 202 0.254 -10.900 6.108
5THR H 203 0.208 -10.931 6.025
5THR CA 204 0.359 -10.981 6.165
5THR HA 205 0.404 -10.916 6.227
5THR CB 206 0.304 -11.100 6.236
5THR HB 207 0.365 -11.180 6.238
5THR CG2 208 0.280 -11.070 6.383
5THR HG21 209 0.243 -11.151 6.428
5THR HG22 210 0.366 -11.044 6.426
5THR HG23 211 0.214 -10.995 6.391
5THR OG1 212 0.194 -11.145 6.158
5THR HG1 213 0.153 -11.225 6.201
5THR C 214 0.457 -11.033 6.066
5THR OC1 215 0.530 -11.093 6.098
5THR OC2 216 0.451 -11.006 5.947
2.33514 1.38260 1.96702
with code:
import string
import time

import pymol
from pymol import cmd

pymol.finish_launching()
# pymol.finish_launching(['pymol', '-q', '-W', '1200', '-H', '500'])
time.sleep(5)  # give the PyMOL session time to start


def count_mols_in_sel(sel="sele"):
    """
    Assigns a chain ID to each distinct molecule in a given selection
    and returns the number of distinct molecules.
    https://pymolwiki.org/index.php/Count_molecules_in_selection
    """
    sel_copy = "__selcopy"
    cmd.select(sel_copy, sel)
    num_objs = 0
    atoms_in_sel = cmd.count_atoms(sel_copy)
    chains = string.ascii_uppercase  # 'A'..'Z', so up to 26 molecules
    while atoms_in_sel > 0:
        chainid = chains[num_objs]
        # Relabel everything still in the working selection. Molecules removed
        # in earlier passes keep the chain ID they were given before.
        cmd.alter(sel_copy, f"chain='{chainid}'")
        num_objs += 1
        # see bm. bymolecule https://pymol.org/dokuwiki/doku.php?id=selection:bymolecule
        # see first https://pymol.org/dokuwiki/doku.php?id=selection:first
        # Drop the molecule that contains the first remaining atom.
        cmd.select(sel_copy, "%s and not (bm. first %s)" % (sel_copy, sel_copy))
        atoms_in_sel = cmd.count_atoms(sel_copy)
        print(atoms_in_sel)
    cmd.delete(sel_copy)  # remove the temporary selection
    print("There are %d distinct molecules in the selection '%s'." % (num_objs, sel))
    return num_objs


cmd.load('2ms2_no.gro', '2ms2_no')
print(cmd.get_names_of_type("object:molecule"))
print(cmd.get_names("objects"))
print(cmd.get_object_list('(all)'))
print(cmd.get_object_list())
print(count_mols_in_sel('all'))
cmd.util.cbc(selection='(all)')  # color by chain
cmd.save('2ms2_no_out.pdb', '2ms2_no')
print('should be finished by now hopefully')
With this I can go from 2ms2_no_short.gro to 2ms2_no_out.pdb.
I'm still not sure why it works. All my attempts to get the three different molecular objects (see the print statements in the code), as per the PyMOL definitions:
Molecule Concept: A PyMOL Molecule is a set of atoms within a single molecular object joined by a connected graph of bonds. There are three molecules in the image below.
Molecular Object Concept: A PyMOL Molecular Object is a special type of object that can contain atoms, bonds, and coordinate sets organized by state, and that can be shown in a variety of representations.
were in vain.
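One possible reading of why the loop in count_mols_in_sel works: each pass relabels every atom still in the working selection and then removes the molecule containing the first remaining atom, so the k-th molecule keeps the label from pass k. Below is a toy model of that idea in plain Python, with lists standing in for PyMOL selections. It makes no PyMOL calls, and peel_and_label and its input format are hypothetical names used only for illustration.

```python
import string


def peel_and_label(molecule_of_atom):
    """Toy stand-in for the relabel-then-remove loop.

    molecule_of_atom[i] is an id for the molecule atom i belongs to
    (PyMOL derives this grouping from the bond graph; here it is given).
    Returns the chain label each atom ends up with.
    """
    labels = [None] * len(molecule_of_atom)
    remaining = list(range(len(molecule_of_atom)))  # plays the role of sel_copy
    passes = 0
    while remaining:
        chain = string.ascii_uppercase[passes]
        # Like cmd.alter(sel_copy, ...): relabel everything still selected.
        for i in remaining:
            labels[i] = chain
        # Like "sel_copy and not (bm. first sel_copy)": drop the first molecule.
        first_mol = molecule_of_atom[remaining[0]]
        remaining = [i for i in remaining if molecule_of_atom[i] != first_mol]
        passes += 1
    return labels


atoms = [0, 0, 0, 1, 1, 2, 2, 2]  # three molecules of 3, 2 and 3 atoms
print(peel_and_label(atoms))  # ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C']
```

The later passes overwrite only atoms that are still selected, which is why the first molecule stays 'A' even though every atom was labeled 'A' at the start.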
The only way I found to identify the three chains is to use a script adapted from Count molecules in selection (see the count_mols_in_sel function in the code)***.
It is worth noting (I'm not sure how GROMACS handles broken chains) that a broken chain would result in a different number of chains, depending on the number of gaps.
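If gaps are a concern, one way to sidestep the bond graph might be to assign chains from the .gro text itself. This is a minimal sketch, not a tested workflow. It assumes residue numbers restart at every new chain (as in the sample file above, where 5THR is followed by 1ALA) and that atom lines follow the standard fixed-width .gro layout, with the residue number in the first five columns. chains_from_gro_lines and gro_line are hypothetical helpers, not part of PyMOL or GROMACS. The sketch is limited to 26 chains and would miss a chain boundary where the numbering does not restart.

```python
import string


def gro_line(resnum, resname, atom, atomnum, x, y, z):
    """Format one atom line in the fixed-width .gro layout."""
    return "%5d%-5s%5s%5d%8.3f%8.3f%8.3f" % (resnum, resname, atom, atomnum, x, y, z)


def chains_from_gro_lines(atom_lines):
    """Return one chain letter per atom line.

    A new chain starts whenever the residue number drops,
    e.g. residue 5 followed by residue 1.
    """
    labels = []
    chain_index = 0
    previous_resnum = None
    for line in atom_lines:
        resnum = int(line[0:5])  # first five columns hold the residue number
        if previous_resnum is not None and resnum < previous_resnum:
            chain_index += 1
        labels.append(string.ascii_uppercase[chain_index])
        previous_resnum = resnum
    return labels


# A few atom lines taken from the sample file above (header and box lines excluded).
sample = [
    gro_line(1, "ALA", "N", 1, -0.114, -11.023, 7.562),
    gro_line(5, "THR", "OC2", 72, -1.141, -10.690, 6.940),
    gro_line(1, "ALA", "N", 73, 0.988, -11.380, 6.601),
    gro_line(5, "THR", "OC2", 144, 0.750, -10.532, 7.634),
    gro_line(1, "ALA", "N", 145, -0.495, -10.949, 6.857),
]
labels = chains_from_gro_lines(sample)
print(labels)  # ['A', 'A', 'B', 'B', 'C']
```

Because a gap inside a chain makes the residue number jump up rather than down, it does not start a new chain under this rule.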
Give it a try and let me know.
*** I think the same process is used in FilterByMol.