python bioinformatics rosalind

Rosalind - Consensus and Profile - Issue with answer formatting


I am working on the Consensus and Profile problem on Rosalind and am close to finishing it. My answer is correct (I have the right consensus string and the correct matrix), but I am having trouble formatting the output. Rosalind expects the answer to look like:

ATGCAACT
A: 5 1 0 0 5 5 0 0
C: 0 0 1 4 2 0 6 1
G: 1 1 6 3 0 1 0 0
T: 1 5 0 0 0 1 1 6

My raw output looks like this:

{'A': [5, 3, 3, 3, 1, 4, 2, 1, 3, 5, 2, 2, 2, 3, 1, 3, 2, 2, 2, 4, 4, 4, 1, 2, 1, 3, 1, 2, 1, 2, 2, 3, 2, 1, 3, 5, 3, 4, 2, 2, 2, 3, 3, 2, 0, 0, 1, 2, 2, 4, 3, 5, 2, 4, 3, 1, 2, 2, 2, 3], 'C': [2, 1, 3, 2, 1, 2, 2, 1, 3, 2, 1, 2, 3, 2, 6, 3, 4, 1, 2, 0, 3, 2, 4, 2, 1, 3, 3, 3, 6, 2, 2, 1, 5, 5, 3, 0, 1, 1, 2, 3, 3, 5, 3, 2, 1, 2, 3, 5, 0, 2, 3, 2, 3, 2, 5, 3, 4, 3, 2, 4], 'G': [1, 3, 2, 4, 3, 2, 1, 3, 3, 0, 5, 3, 3, 2, 1, 2, 1, 5, 3, 2, 2, 2, 2, 4, 6, 3, 2, 3, 2, 3, 1, 3, 0, 2, 0, 3, 3, 3, 4, 2, 2, 2, 1, 3, 5, 2, 1, 0, 2, 1, 2, 1, 4, 2, 2, 3, 2, 0, 4, 2], 'T': [2, 3, 2, 1, 5, 2, 5, 5, 1, 3, 2, 3, 2, 3, 2, 2, 3, 2, 3, 4, 1, 2, 3, 2, 2, 1, 4, 2, 1, 3, 5, 3, 3, 2, 4, 2, 3, 2, 2, 3, 3, 0, 3, 3, 4, 6, 5, 3, 6, 3, 2, 2, 1, 2, 0, 3, 2, 5, 2, 1]}

And with some simple editing, I submit it as:

'A': [5, 3, 3, 3, 1, 4, 2, 1, 3, 5, 2, 2, 2, 3, 1, 3, 2, 2, 2, 4, 4, 4, 1, 2, 1, 3, 1, 2, 1, 2, 2, 3, 2, 1, 3, 5, 3, 4, 2, 2, 2, 3, 3, 2, 0, 0, 1, 2, 2, 4, 3, 5, 2, 4, 3, 1, 2, 2, 2, 3]
'C': [2, 1, 3, 2, 1, 2, 2, 1, 3, 2, 1, 2, 3, 2, 6, 3, 4, 1, 2, 0, 3, 2, 4, 2, 1, 3, 3, 3, 6, 2, 2, 1, 5, 5, 3, 0, 1, 1, 2, 3, 3, 5, 3, 2, 1, 2, 3, 5, 0, 2, 3, 2, 3, 2, 5, 3, 4, 3, 2, 4]
'G': [1, 3, 2, 4, 3, 2, 1, 3, 3, 0, 5, 3, 3, 2, 1, 2, 1, 5, 3, 2, 2, 2, 2, 4, 6, 3, 2, 3, 2, 3, 1, 3, 0, 2, 0, 3, 3, 3, 4, 2, 2, 2, 1, 3, 5, 2, 1, 0, 2, 1, 2, 1, 4, 2, 2, 3, 2, 0, 4, 2]
'T': [2, 3, 2, 1, 5, 2, 5, 5, 1, 3, 2, 3, 2, 3, 2, 2, 3, 2, 3, 4, 1, 2, 3, 2, 2, 1, 4, 2, 1, 3, 5, 3, 3, 2, 4, 2, 3, 2, 2, 3, 3, 0, 3, 3, 4, 6, 5, 3, 6, 3, 2, 2, 1, 2, 0, 3, 2, 5, 2, 1]

But that editing is not enough, because a huge number of commas and brackets would still have to be deleted by hand. Rosalind only gives you five minutes to submit an answer, and I have found it impossible to format the answer manually in that window.

Does anyone know of tips, tricks, or solutions that could help me get over this hurdle? I have seen some other solutions, but they essentially require a different approach to the logic, which is frustrating because I spent a lot of time thinking about this answer and wrote my own function from scratch to handle the FASTA file format.

Here is my source code:

data = open('/Users/danielpintard/Downloads/rosalind_cons (1).txt', 'r').read()

if '>' in data :
    data_array = data.split('>')
    for i in data_array:
        if i == '':
            data_array.remove(i)
    for i in data_array: data_array[data_array.index(i)] = i.split('\n', 2)
    
    
#create profile

prof_sequences = []

for i in data_array:
    data_array[data_array.index(i)] = i[1]
    prof_sequences.append(i[1])
    
n = len(prof_sequences[0])
 
profile_matrix = {
    'A': [0]*n,
    'C': [0]*n,
    'G': [0]*n,
    'T': [0]*n,
    }

for dna in prof_sequences:
    for position, nucleotide in enumerate(dna):
        profile_matrix[nucleotide][position] += 1

result = []
#still having a hard time understanding this block of code  
for position in range(n):
    max_count = 0
    max_nucleotide = None
    for nucleotide in profile_matrix:
        if profile_matrix[nucleotide][position] > max_count:
            max_count = profile_matrix[nucleotide][position]
            max_nucleotide = nucleotide
    result.append(max_nucleotide)
    
print(profile_matrix)
print(result)
    

And here is the data:

>Rosalind_7283
TATTCATTGATCATATGAAGCCTTGCGACCTGCCCGGTTCTGAAGTCAGCTAGCACATTA
GTGTCAAGGTATTAGTGTAGTTGCTGACTCGAACGTGTGTTAATATTCATGTAGGGGTCT
GGCGACCCAATAGGCGCGTGGTGTACCGAATTGTGCACACACACGTGTATTTCGAACGCA
AGATGCAGCCGAATCAGACCGTAGTAAACCGTTTGAGTGGCGTTTTGGCGTGAGAAGGCT
TAGGTGTTACAAGTGCAGCGCGGGTGCATTTTCTCCGGCTTGGAGCAATAGTCCCTATGC
ATCGGCCCCGTATATGAGGATCGCATTACGCAACATCGTAAGCCTTGCACATCTGGCAAA
TGCACGGCTCTCATTATAGTTGCCAAAAATCAGCCCTACCACACGTAATATTCAAGGCTG
TGCTTGTCCAACTAGTTGGCGAATGATCCTCCAAGATTGCGGCGGGGTATAATCCCGCAC
GTCCGAATACCAATGTTCGAGTGCGGCACTACCAAGATGCGAGTCGGCGTGATATCGAGG
TTCACATAGGGGACGTTTATGTCCTTTGGATGTCTCGCCAACTCCATTCTATCATTAGGT
TGGCGGTCAGCAGGATGGAGCATAGTCATGAGCTTGAGTACTGTGCGGCTCGGAAGAGGG
GCCGTATGGGTCTTCGGACAAACGTAGGTATTACAGGCCAAAAACGCTCAGAAAAAACGC
TATCTTAATGACCATTTGATAAACGTTCCCTTGCCGATTTAGAGTGACTTAGTGCAATGT
TGCGATTTCTACTACACTCAAGCTGTGTTAGGGATAATCCATAGCACAGGCCCGCTCGCC
CGTGCCCTGCCTTGCACGACAGGGCTAAGCGGCTCAAGAAGTTTCTACGCAACGTACCAC
GACCAGCTGGACCTACCGATAGACTTACCATATTCTAAGAATAAAACGGACCCTTATGTG
AGTGAGCGCAAGCAATATGGTTTGCCCGTTTGC
>Rosalind_6559
TGCGGCCTATGTGACGCTCAACCCGGGACGCACATGAGTACATCTTTCTTCAACGTCCGC
GAACACAGCCCAATTCGATTAATTGCCACGATTGTGTGGCACGCACTTACTGAAACCGGT
GGTGATAGGCATAGGTTTAGACCAGGGCTCGGGACAGCTGGTCTAGGTCGTGACTAATCA
ATGGTTTAAATGATGCACCCTTGTATCGGTATGCCTGTGTTTATCAGGAATGCCCATACA
TTTTGAGAAACGCTTATGGTTATTACACAAGCGAGGGAAGCGAGCTAGCGGCGTCCGAGA
ACTATAAAGGCAATCCTGACATGACGAGCGCAGAATCACCCCCTGAATCCCGGTTACGAT
ATGGGCCATTCGGGGAGCAAACGGCTGACTCTTCGGTAAAGTAATTTGCCAGGAACATTG
ATATATGCGCTGACCCTATTGATTATCCAACAAATACTACATTCAGCCCCAGGTCCCACG
TTAGGCGTAAGTTAAGAATTTATGTACGCAATCGCCAATATCCGCAAGACGTCCCCGCTG
ACAGTGAGGCTTAGTGGCGCCGATGGTATTCAGAAATGGAGCGCTCCTCTGTTGCACTCG
GCTCGTCAACCTTCTTGCTATTAACATATAAGAGTGAATGGGTGAGGTAGTAAACGTAAT
TGCCAAGATCGATAGAAGTGTTGGACGGAACATTGGAGCAAGGAACGCCGCTGAGCCAGG
GGACTAATGCCAGAGTGGAACCTGGTGGGAATTAAACACTTGTATACGTTGACAAGCTGA
GACATTCTAAAACACGTAATATAACATGCATCCACTAATGGATTCCCTTTCGCCTCTTGG
CTGGGATACATTTGCGCTTGGGAGCAGGAGATAGGGAGTCAAGGTGACATTGTGGGAATT
CACAAAGCTTCCTATCTAATGTTAGTACTTTAGCCACGGGTTAACCAGGACTGTTCTATA
GTTCCAACTCTCCATTATCCAAAACAAGCAGCA
>Rosalind_3098
GGCATAGGGACGCCGATGTAAGGAAATCCTCTAGTTTGAGCCCGGTGCTTACCGCAGTCC
TCGGCTTTCGTTCTGTTACAGACGTCCTAGGACTCAGTCGCCACCTACGCGGGGTGCATT
AGTCAGGTCCGAAGCCTCTATAGCGCTTTTTAGGAATGGAGCGTTTAAACGAGCCTGCGT
ATATTCGCTACCAAATCTCAGGGGCGGCTCAGATACAACGGGGTTCATCAGTTTGGATAT
CAGTGTTCGCTGGGTAAGTCTGACTCCGGCTCACAGATAGTTAGAAGGTCGCACATGATG
ATTACAACTTCTGCCGCTGACTTGGGAGTCTAGCGCTTGTCACAGACGCGCTAATGCGGC
ACATCTATTTCATAAAAGTACAAGCAATATGCCGCGAGGCCCCGTCGTATTTGATTCGAA
GGATTTAACTCATAACGCGGCCCTCAGCAGCTGCGGGCGTACGGAAGCCTCAACTTTGCG
ATTCTGTCGCACCTGCCTAGCTTAAGGAACCCCGATGCCGGTATCCACCGGACGTTTCGA
TTGCAAGATCTTGGCATGCCGCTACCTGTTGGAATTCAGTTATTAGTCCTACTCAGGAGG
GATAGCCGAACGGCACAAAGGCTTCGTTTGACAAGCACAGATGCATCTACTTAACTCGAT
AGCCTCAAAGAGTGTTTGCTCCGAGAGGGCCATCAGAGTAACTACCACGGCAAGAAGCGC
CTTCTTCCATGGCACACTCAAAAAGGTCATCTGAAGAGCCCATTTTTACCCACGGGATCC
CGCCACTAGACTCGTCACACTAAAACATAGAAGCAAGGCTGTAAGCGTACTCGGGTGTCC
CTAGCTACTTGACCCTGCGCTTTGATTTTCACCCAATCCAGCGCGTTAGCCAAACACCGG
CTCATGTGCGAGACACCTCTTGGACGGTACGAATACGCTTACTCCCACTCAGAACTGCTA
TCCGTGGGGTCCGTGGGGAGCCGGCGCAAAGAA
>Rosalind_2635
AACCTAAGCCACCTCGCGGTGTAACGCGCATCTGCAATCATCAGTTTCAGTCGGCGCAGC
GGAGCCCGGACAGCCTGTGCCGTACAAACCTGAAGCTGCTTACCTCGATTCATGCCAGGT
ATGAAGTATTCCGACGCTAATATCCTTTGGAATGGTTGCCAAGTCTCTACCAGCTACTCC
CATGACCGCATGACATATTCGACACGGTCTCTGAATGAGGTACGGTATTGCTTTCATTCT
AGTACGTTGCCCGACCTATGTACATCCGTCAACCACGGGGTGATCATACCTAAATTTGAA
TTAAAAAGTAGCGGAGCTACCGGACTGGTAGACTCCTCATCGCTCGGTTCAGTAGAAGGG
CTGGCCCTTTTCCTATCACTGTCCGTCCATTTCGTGTGTTTTAGGTGGTTTAGATATACC
TCTCATCGAAGAGTTGACCGTGTGATTAAATGAACGAACATTAAAGAGCGTGTGTTTAAA
TGCACGCAACACTAAAGGTGGAACATGGCGGTCGCCGTTATCGCATGGGTCTACTTGATC
GAAACTCAAGAGCATTGCAGACACAGGGACCCGTCAGGGTTTGTAAGCTGCGCGCTAATA
GTGCAACGTCCTAGGGTCGACTCCATGACGTAATGCAACTCTGGTTGACAATTCGTGAAG
TCGGAGTAAAGCTCCTGGCGCGCTGCACCCCCGGCTTCACCGTAGTTCCTACATTCTCGG
TCTAGTCGTGTGGGAATCACATCTGCTCCGAGGGTAAGGGGATTGGCATATAATGTGAGG
TAGCCGGCTAGGCGTATTAGCAACATCGTTGTCTATTGACTTGGAAGTTCTCTGTAGGAC
GTCGTCAGTCGGTAATCGCTGGTTTTAACTAAGGAGACACTGCTGGCACCGATGGCCGGG
GAGACCATTATGTATTCGGAGTGCCTCCGTTGTGGTGAATAACCAGGACTAATGAGGCCA
ACATAATACTAGACGTATACTATTTAGTGCGCT
>Rosalind_6087
ATTCGATGAATTTCCTCGATAGCGGCTCCGATTTAACACTACCTTGCCTTGACTCTCTAC
ACAGTAAGTACCCCCCGCAACTGGGGGACATTTTAGTGGCCCTTTGCGGAGTAGGGGTGT
TAGGTGTCGGCGTAAAGCGGATTCGATCAAACCCTGATCATCGGCTGAAATGGCCTCGAC
GGTGCTACTCTCAGTGACCTGCTGTTCCCGTAGCCTTTTAATACTCAATCCCTCGATCCG
CTATTCGACCAATCTCGAACTTGAATTCGGTGCGAATGAAACTCCAGTACGGTATGGCTT
GGACCGACGACGGAAGGAACTGCAACGTACCGACTTAATTTGGCTTCAATTCCTACCGAG
CATCATGCGGAAGCTACGCAATTGGATCTCAACAACCCCAAGAGACATTATAGTAGGACA
CACTTTATGGGATGCCGGGGACGGCATCTTCTGCAGGTTGGGAGGGCATCTTGCCTAGGT
GCCAACCTTCGGACGCTCAATGCTCTTACGGTCGGCAGGCTGTTCACGGAGGGCCTTATT
GGAAAAAGGTTATTTCACAAACGTTAAGTCCCTCAGATGACGTCTTGCGTCTCGCCAAGC
CTTTCTAGCTCCCGTCCAGGGCTTGAGCTTTCTTGACACGATAGCTTCCACGTTGACTCT
GAAAATCTCGAAAAACCGAAGGGGAGAGATGCGTCTTGGATCGTCCATAATGCTTCAGAC
GCTTCTAGCCTACCAGGTTGGTTAACAAGTTAATCCGCTAACTTATTGGCGCGTGAGCGA
CAGGACCGCGTCAGACTCATAGATACAGGGCTCATGGGGGCTATGTGTCTAATATGATCG
GCGACAAAGAGTTATGTAATGGCTTGGCTAGGAGACATAAAGGGGGACTTGATAGCGTTT
ACGAGCCTGTTCGGCCTCCCAAAGTTAACTAGATGAGACAGGATGTGCCCCGACACCCAC
GACTTCGTAAGGTAGAATAACGGACATAAGTCC
>Rosalind_4481
AAGGTGCTCAGAGACCTCGTTATGGATTGGTAACTATAGCAATTGCTTAAATCACGTTGT
TCAAATTTTGGGAACTGAATATGCTTCGGGCAATAGTATGAGTAGTCTAAATTGGGGAGT
GTAAGTGCGATTGGACACCACAAAGACAGGTAGTGAATGGGAGAGATTTGTTTGTAGCGC
GTTCGTGCGCGGGACGAGAAATGAATATCCTATTATCTGAAACCCGCCGCTGGGGCTGTA
GCGCCAAGAGCTTTCAGCGGGAGCTCCATGCGTGGAATCTTGCATCTACAATCACATATT
GGTAAGTAGCAACACTGACTGCAAGTACCACTCCCAGGAGAAGACTAGCCATTCAGTGTC
GCCGCTCACAAAGGGCGTAAAATGACATTCATGACGGCTAGCAGCGGACCACGATCCGTG
GCTCGCCGACACTCGGAACCATTCTTGTCTAATAGCTCAGCCCCAGGCTTTTCAACAGGG
GGCGACGCGACGAGCCTAATCGTTACGGATAAGGAGTGCGCACTAACTCGTCATCGGGGA
TAGACCAATTCTTGGAAAAGCAATCCTTAATATGATAGCTACTTGATGCATCTGTCGGCC
GGGGGACTGGACTGTCCTGAAATTGCTTAGGACTATATTTGAGCTTCCACTCCCACCCAG
GGGTGAGCAGATCCTGCCAAACGCGTATCCACTTAGATAAGCTCTTTAGCAAGGGGGCAG
CCTTTTTTCATCATGGTCTGCATTCGTGACTGAAATAATTCATCTCCACTGTACGTTACC
ATACCCTGACCACAATTTTTCCCAATGGGGTCATGCAAACGTACACACGTTTTGCGGCTG
GCTGAATTGCCGACTCATTTGTCCCGTATGCTAGCCCTGCTTGGATTCATAATTGTCTCG
CTCCGGACGTATTCGGGCCTGTGACAATCTTCCCACCTCATAGAACGCCCCAGAATACTC
GTTTTGCTGATGTCGCAGAACATTCTCCTCAGA
>Rosalind_0954
CTAATCTTGCGAATCAATCACAGGTGCGTTGATCCAGAGTCGTAGTTTTACAGTATGCAA
TGTATATTCTTTCTGATGGGACGAGTTTGCATGCAGTAGTTGGGTACTATGCCAGTGCGA
GACCGTCCCTCACCTAAATGCTATGCAGGGTTTCTCTACGATCAAATAGTCAAGTTGCTC
AGCCTCATCACATTGTGAATCACGGACAGACTGTAATTGTCAGCGTGTTCTCTAGGCAAA
TCGCCTTCCTTCTATCGACCTCCTTAGGTCCCCGTGAGGATCTCCTTATCCTGAAAAGTA
CAATCGGATACTTAGATTCTTCGCTCACTCTAATAGGTGGCTATACAGAAGTTTTATGGA
TAAGGGGTGTACGAAATCTTCGAGGGTGTATACCGCTGCTAGAACTCCATACATGATAAC
AACCAATCCTTAGCTAGTATACGAGGGATATGATAACGTTCCACCACCTCTTAAACTTTT
AAATTTGATCGCGGGTGGCCGTCGAAGTGTACGTATGAGATTGGGGCGGTTGTAGTTGCC
AGTGAAAGGCATATGCGGATGGCCTTTGGGTCCTGGTCATTCTTTCTCGCAGGTCGAGCC
AGTGCCTCAAATGAAATTTTCTCCTTAGCAACGACTCCTTAGTTAGAGAAACCAATCCCC
CCATGCCTGCGGATCGTGGTCAGCATGACGTCTGGTTGAACCCTTAGCTGAACAGATGGC
GTATTGCCGTACGAGGGGACCTTATAGGCGGCCTACCACACCAGACGAAGAGTCCGAAGG
TACGCCAAACGCATATTCAGGACGTAAGTGGGAGGACCCTGAGCCTCATTGCCGACTGAA
GGTGAATCGCTGGCCCACTGCTAGTTCCTCCCTTCGCTAATGGTCACGGGAATATCGCCA
CCTCGTCGATGACGCTCGATTAGACCTGTAGGAACACAACATACTAGGTGGACACGGGAC
ACCGATTTACCCACGCCGGACAGTCGTTCTTAT
>Rosalind_3750
ACAGTGTCATGGGATCTGGAGACGTATCCAAGCTAAACGCGCGTTCTATACAGACGTCGA
AACACGGGGGGCGAACTGCTTTAGCGACATGCTCTTACTGAAGTCTAGACGCTAAGGGCT
TTAGACAGCGAATAGTGGTTGATAGGTATTGAGCCATCCGTGTAGAGCGTTAGAAGGCCA
CGGCTTACTTGGTTAAAAGCTGATTTGGGCGGTTACATTCTGGGGTTTAAATACTATCGA
GTATCGATGCTTTTCTATGTATTGAAGACTGGTAAGCTTTCCCCGACCAGGTCGCGCCAT
CGTACCTTCTGGGGAAACTAATGCGGCTGAGTCGGCGACTTCAGGATGTCCCGATACACG
CAGCGTCACAGGTAAACTCGCCTTATAACGCGTCCCCGTCGATAAGGCCGACCCTTTCAG
ATGCGCGGTGCTCCTTCGATTGTTGACGACGCCATCCGAGGTCCAGACGTCTGAGGCCAC
GTGATCGGCCCCCTGTTACTGAGAAGCAGATTACCCCTAAGAATCGTCCGTCGCCTAGTA
GTTGCCGCAACCGACGATACTTCTCCAACATAATCTAGCGTATTTATCAAAGCGTCGTCG
TATCTAGCCTTACGGACGTAATACGAATACCCCCTGCTCAGTGGGCATGTAATACGCCAA
CCAAAAACACGCCAGTTACGAGGAGTGGCACTGCTATAAACCTAGATGAGATCGCTGATG
CCACGAGGAACCTTAGTTGAGTCCGCTGAACCCGCCAGTTGGCTTTGCAGGTCCGCGTTG
TTACTATGACTAAAATATATGATGGATACGCGGACCACTCCTACAGATGCTAAAAGTCAA
ACCGGCACCTATTAGATTTTTAACGGTGCACTTCTAACCGACATAGCCCGCGACCAGGGG
TGAAATTGCATTACATACGATATGATCGCTCCCAGGTCAATGACCACTTGACCTGTGAGT
TTGCTTATTAAGGTGGCTTTAGGCAGCGTAAGC
>Rosalind_9350
ATGAATTTTTAGCGCAAATGAACCGCCTGCTTCCATTAAGTCCCCGCTGCAGAAACCTCG
TTTGTATTCAGAAAGTTCACCTGACAACGGGGCATAGGGTAAATAGATGCTATGTAAATC
TTAGGGCTTACGCGGCGACTTTGACTTTTTCAGCGAACAGAGGCGAAGGCGACCAGCGTC
ATAGGTCTTCATACCGAAACAACAGGGGAGCATGGCCAATCACTGTCACTAACTCACGGG
ACTCCGCCTTGCTCGCCGGTGCCATATCGTACTGACGTAACTCATTGAATTCCATAGAAC
TTGGTTTAGGCCACCTCCGCCGAAACCCGTGGTGGTAAGTCAAGCGAGGACACCGGAAAT
TCCGACCCCGGTTCCCAACACAGGGCTATTCATCACATTTGGTGTACGTATTGATCCTTA
ATTGCCAGAGTCCTACTCGTTGATGTACGATCCACTTAAGTAAGGTCGGGCGTTCTACCG
CGCGGCGCATACCGGACATTATAGCTTAGGCCCCCCAGCTCTATTGTTATTACTATATCC
CTAATTCTAGAAGGGAAATTGTAAGATCAATTCCCGGCAGGTGGGCAGGAACAGACGTCG
AGCACCATTCGTAGTAAAGGTCTTTCTCGGTGTGTAGCGTTGACAAATCTGCAACCCAAC
CTTGTACTCTTCGCTGAACAATAGGTGCATTTCAAGACCGAGCTTGGCGCTGTTTCCTGA
CTGCAGCATGGGCAAAATTCTCGTAGGCAAGTGATCAATTAGCGGAACGCATTGGAAAAA
TTTGTTGGCACAATCCGGCACAGGTACTGATACCCCTCGATGTCGCAGTGCCGAGTCACC
CATCGCATGATCTGAGGTTGGTGCTGCCAGCGCTCTCCGAACAGGAGTCGTAGTTGCACT
CATGGCCGCTTTACGACGGGAGAAACTTACAGTAGCCTTGTAACAACTTTGTAAATCGTT
CATGGACTATCGTGAGGCAGACTTCTATTGTCC
>Rosalind_6074
CGAGGTAACAGTTGTCCGTTCTTTGTAGATTGCCTGGGGTGAAGGTACTAGTTAGCAATG
ATCAGAAGAAAATAGAGCCAGCCGGACTCTCGGGGCGGTACCAGGGTCGAGGAATCTGGG
TAAGTTTCCTATGTGATGAACAGGGTTTTCGATGGTAACGATGTGAACGACCCTGGGTCG
GGTTCAGCCCTCCTAACGAAACACGTGCTTCAGAAAAATAGTTGCAACCTGTTGTTGTCA
ACCTAGTCCTATAGAGTATGTTACTCGGCTATACTCAGGACCTATCCAGACCGCCACTCT
TTCTCTGTGTTAAAACCCCACCATATAAGATCCGTCCTCCCTTTTCACCGCCTTTACAGC
AGGGAGCCGTTGAGCAGGGCCAATGACGCCAAGACTTTACTAAAGTGACTGGTAGGTTCA
TTCTACCTATCCCTTTGCGTATTGATGTTTAGTCTGGTTTCAGGTACAGGTAAACCAGGT
GGCTGGTGCCATACTCGCTAAACAAATGTGGGGGCGCGAAAGATCTGGTGCAGGTTGACT
ACGATTTTATAGAGCAGTACACCGTGCTAGTCAGCATGAGTGGAGACACCTGAAATAAGT
GACGAGGTTGTCCAATGTATAGGACGACAGTTGCAGGGTGCACTGCAACAGAGTTATAAC
CATTACGTTGACTTAACACATGATTGTTAAAATGCTTCGACCCAAGACTCGGCGGGTCAA
AGTAAACCATTACGCGCGGGTGTCTGTAGCTACGGGTCAGCAGGGACCTAGCTATTACGA
GATAGGAAGGCCCACGTACCTAGGGGTCCCTTTTTCGGGTCTTTACCTGGTCAGCGAAGC
CCCGAAACGTGAACTCCAGTGATAACAGGTTAACGGCTTCTGGTGACGACTCTATCGAGT
TGTCAATGTAGCTTACAGGTACTATCGGGAATAATGTCGGGGGTGAACGTTGCGGTTTAA
AGTGGCTCAGCAAGCATATACACCTAGGTTGCG


Solution

  • Try combining format strings with str.join() and dict.items():

    f'{expression}'

    str.join()

    dict.items()

    Code:


    d = {'A': [5, 3, 3, 3, 1, 4, 2, 1, 2, 3], 'C': [2, 1, 3, 2, 1, 2, 2, 1, 3, 3], 'G': [1, 3, 2, 4, 3, 2, 1, 3, 3, 0], 'T': [2, 3, 2, 1, 5, 2, 5, 5, 1, 3]}
    
    for k, v in d.items():               # loop over each nucleotide and its counts
        g = " ".join(str(x) for x in v)  # join the counts with single spaces
        print(f'{k}: {g}')               # print one row, e.g. "A: 5 3 3 ..."
    

    Result:


    A: 5 3 3 3 1 4 2 1 2 3
    C: 2 1 3 2 1 2 2 1 3 3
    G: 1 3 2 4 3 2 1 3 3 0
    T: 2 3 2 1 5 2 5 5 1 3
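
To produce the whole answer in Rosalind's layout (consensus string on the first line, then one row per nucleotide), the same join idea can be wrapped in a few helpers. This is only a minimal sketch, not the asker's code: parse_fasta, build_profile, consensus and format_answer are hypothetical names, and the three short records are made-up sample data. One assumption is worth flagging: the parser joins every line of a record, because the dataset in the question wraps each sequence over several lines, while the 60-column output shown there matches only the first line of each record (split('\n', 2) leaves just that line in element 1).

```python
def parse_fasta(text):
    """Return the sequences of a FASTA string, joining wrapped lines."""
    sequences = []
    for record in text.split('>'):
        lines = record.strip().splitlines()
        # lines[0] is the header; everything after it is sequence data
        sequence = ''.join(line.strip() for line in lines[1:])
        if sequence:
            sequences.append(sequence)
    return sequences


def build_profile(sequences):
    """Count each nucleotide at each position (equal-length sequences assumed)."""
    n = len(sequences[0])
    profile = {base: [0] * n for base in 'ACGT'}
    for dna in sequences:
        for position, base in enumerate(dna):
            profile[base][position] += 1
    return profile


def consensus(profile):
    """Pick the most frequent nucleotide per column (ties go to A, C, G, T order)."""
    n = len(profile['A'])
    return ''.join(max('ACGT', key=lambda b: profile[b][i]) for i in range(n))


def format_answer(profile):
    """Build the text Rosalind expects: consensus line, then 'X: counts' rows."""
    rows = [consensus(profile)]
    for base in 'ACGT':
        rows.append(f"{base}: {' '.join(map(str, profile[base]))}")
    return '\n'.join(rows)


# Made-up sample input with wrapped sequence lines (not the Rosalind dataset)
sample = ">seq_1\nATC\nCAG\n>seq_2\nGGG\nCAA\n>seq_3\nATG\nGAT\n"
print(format_answer(build_profile(parse_fasta(sample))))
```

Writing the string returned by format_answer to a file, or redirecting the script's output into one, gives text that can be pasted or uploaded without hand editing, which should fit within the five-minute window. Ties are broken in A, C, G, T order here; the problem statement allows any valid consensus string when several exist.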