I tried to solve another of Rosalind(http://rosalind.info/problems/lexv/) problems by myself, however unfortunately I have to ask you for help.
Here is my approach:
First of all; function which creates all possible substrings of input string with length of n:
def get_substrings(input_string, l):
res_list = []
sub = []
for i in range(len(input_string)):
if l+i <= len(input_string):
for j in range(i,l+i):
sub.append(input_string[j])
sub = ''.join(sub)
res_list.append(sub)
sub = []
res_list = filter(None, res_list)
return res_list
Then main function which creates all combinations of current string with varying length:
from itertools import product
def lexv():
dna = str(raw_input())
n = int(raw_input())
subs = get_substrings(dna, n)
result = []
for i in range(len(subs)):
for j in range(1,n+1):
result = result + list(product(dna, repeat=j))
for i in range(len(result)):
result[i] = "".join(result[i])
print result[i]
The result of my code for data from Rosalind's 'Sample input' is:
D
N
A
DD
DN
DA
ND
NN
NA
AD
AN
AA
DDD
DDN
DDA
DND
DNN
DNA
DAD
DAN
DAA
NDD
NDN
NDA
NND
NNN
NNA
NAD
NAN
NAA
ADD
ADN
ADA
AND
ANN
ANA
AAD
AAN
AAA
My questions:
a) Is it possible to order my result as it should be (Rosalind's result)?
b) Is my approach correct? If not, could you give me some clue (but not the solution for this problem - I'd like to beat it by myself).
Very thanks!
For the generation of possible substrings, I would look deeper into the itertools
module. With chain
and product
you can easily make a one-lines that returns all combinations
for the sorting, I would replace the letters themselves with int
s, keeping this translation table in a dict
. enumerate
and str.split
are your friends here.
Now you have a list of tuples of ints, which you can sort. From what I can tell, the standard sorting order of tuples is useful.
When you have a sorted list of tuples, you will just need to do the backwards translation to strings
Another option is to use standard string sorting, and as key=
argument to sorted
, passing a function (or lambda) which replaces each letter in the string to it's place in the alphabet (the input_string
)
This can be done with str.translate
, enumerate
and a dict
comprehension