Hello,
I'm trying to write a for loop that reads a list of DNA sequences and gets the value for every pair of adjacent bases. The idea is to read the current and the next item, match that pair with its specific value, and then append the value to a final list.
This is an example:
AA= 5
AT=6
AC=13
AG=8
CA= 6
TG= 12
...[etc.]
DNA_seq= [A,A,C,A,T,G]
These 5 pairs (AA, AC, CA, AT, TG) should give me a value of 42.
So, this is what I'm trying. I first define a function to get the next item:
(I know there is a built-in next function, but it wasn't working either)
def nextbase():
    next_base= next(base)
    return next_base
And then:
AA=5
AT=4
AC=3
AG=2
TA=5
TT=4
TC=3
TG=2
CA=5
CT=4
CC=3
CG=2
GA=5
GT=4
GC=3
GG=2
stacking= []
for strand in dsDNA:
    for b in strand:
        base= iter(b)
        if base =='A':
            if nextbase() == 'A':
                append.stacking(AA)
            elif nextbase() == 'T':
                append.stacking(AT)
            elif nextbase() == 'C':
                append.stacking(AC)
            elif nextbase() == 'G':
                append.stacking(AG)
        elif base=='G':
            if nextbase() == 'A':
                append.stacking(GA)
            elif nextbase() == 'T':
                append.stacking(GT)
            elif nextbase() == 'C':
                append.stacking(GC)
            elif nextbase() == 'G':
                append.stacking(GG)
        elif base=='c':
            if nextbase() == 'A':
                append.stacking(CA)
            elif nextbase() == 'T':
                append.stacking(CT)
            elif nextbase() == 'C':
                print('yes')
                append.stacking(CC)
            elif nextbase() == 'G':
                append.stacking(CG)
        elif base=='T':
            if nextbase() == 'A':
                append.stacking(TA)
            elif nextbase() == 'T':
                append.stacking(TT)
            elif nextbase() == 'C':
                append.stacking(TC)
            elif nextbase() == 'G':
                append.stacking(TG)
        else:
            print('eror')
print(stacking)
But it's just not working: it only prints the error message because it doesn't recognise any of the bases. Does anyone know an efficient way to do this? Thanks!!
This is not too hard to do: first create a dictionary with the 'weight' of each pair. Then loop over the DNA sequence, look up each pair of adjacent bases in that dictionary, and sum the values:
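# Weight of each pair; named pair_weights so it does not shadow the built-in dict
pair_weights = {'AA': 5,
                'AT': 4,
                'AC': 3,
                'AG': 2,
                'TA': 5,
                'TT': 4,
                'TC': 3,
                'TG': 2,
                'CA': 5,
                'CT': 4,
                'CC': 3,
                'CG': 2,
                'GA': 5,
                'GT': 4,
                'GC': 3,
                'GG': 2}
DNA_seq = ['A', 'A', 'C', 'A', 'T', 'G']
# Join base i with base i+1 to form the dictionary key for each adjacent pair
total = sum(pair_weights[DNA_seq[i] + DNA_seq[i+1]] for i in range(len(DNA_seq) - 1))
print(total)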
>>> 19
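Since the question asks for the individual values appended to a list, and loops over several strands, here is a minimal sketch of that variant. It pairs each base with its neighbour using zip instead of indexing. The function name stacking_values, the second strand 'GATTACA', and the contents of dsDNA are made up for illustration; the weights are the ones from the answer above.

```python
# Pair weights copied from the answer above.
PAIR_WEIGHTS = {
    'AA': 5, 'AT': 4, 'AC': 3, 'AG': 2,
    'TA': 5, 'TT': 4, 'TC': 3, 'TG': 2,
    'CA': 5, 'CT': 4, 'CC': 3, 'CG': 2,
    'GA': 5, 'GT': 4, 'GC': 3, 'GG': 2,
}


def stacking_values(strand, weights=PAIR_WEIGHTS):
    """Return the weight of every adjacent base pair in one strand.

    Hypothetical helper; works for a list of one-letter strings or a plain string.
    """
    # zip(strand, strand[1:]) yields (current, next) for each position,
    # so no manual iterator or index arithmetic is needed.
    return [weights[cur + nxt] for cur, nxt in zip(strand, strand[1:])]


# Assumed input shape: a list of strands (the question does not show dsDNA).
dsDNA = [['A', 'A', 'C', 'A', 'T', 'G'], 'GATTACA']

for strand in dsDNA:
    values = stacking_values(strand)
    print(values, sum(values))
```

As for why the original loop only reached the else branch: iter(b) returns an iterator object, and an iterator never compares equal to a string such as 'A', so none of the base checks could match. Each nextbase() call in an elif chain would also consume a further item, so even with a working iterator the comparisons would not all be looking at the same base.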