
Using Bio.SeqIO to write single-line FASTA


QIIME requires the following of the FASTA files it receives as input:

The file is a FASTA file, with sequences in the single line format. That is, sequences are not broken up into multiple lines of a particular length, but instead the entire sequence occupies a single line.

Bio.SeqIO.write follows the usual FASTA formatting convention and wraps each sequence at a fixed line width (60 characters by default in Biopython). I could write my own writer to produce those "single-line" FASTA files, but my question is whether there is a way I missed to make SeqIO do that.


Solution

  • Biopython's SeqIO module uses the FastaIO submodule to read and write FASTA files.

    The FastaIO.FastaWriter class takes a wrap argument that sets the number of characters per sequence line (wrap=None disables wrapping), but this part of the interface is not exposed via SeqIO.write. You would need to use FastaIO directly.

    So instead of using:

    from Bio import SeqIO
    SeqIO.write(data, handle, format)
    

    use:

    from Bio.SeqIO import FastaIO
    fasta_out = FastaIO.FastaWriter(handle, wrap=None)
    fasta_out.write_file(data)
    

    or, to write the records one at a time with the same fasta_out writer:

    for record in data:
        fasta_out.write_record(record)
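
For comparison, the "write my own writer" route mentioned in the question can be sketched in a few lines without Biopython. This is only a minimal illustration, not Biopython's implementation: the `Record` namedtuple is a hypothetical stand-in for `SeqRecord` (it only mimics the `id`, `description` and `seq` attributes), and `write_single_line_fasta` is a made-up helper name.

```python
import io
from collections import namedtuple

# Hypothetical stand-in for Bio.SeqRecord.SeqRecord so the sketch runs
# without Biopython; real SeqRecord objects expose the same attributes.
Record = namedtuple("Record", ["id", "description", "seq"])


def write_single_line_fasta(records, handle):
    """Write each record as a '>' title line plus one unwrapped sequence line.

    Returns the number of records written.
    """
    count = 0
    for record in records:
        desc = record.description
        if desc and desc.split(None, 1)[0] == record.id:
            # The description already starts with the id: use it as the title.
            title = desc
        elif desc:
            title = "%s %s" % (record.id, desc)
        else:
            title = record.id
        # str() also works for Biopython Seq objects; no line wrapping is applied.
        handle.write(">%s\n%s\n" % (title, str(record.seq)))
        count += 1
    return count


# Small demonstration with an in-memory handle.
records = [
    Record("seq1", "seq1 example", "ACGT" * 30),  # 120 characters, stays on one line
    Record("seq2", "", "GGCC"),
]
buf = io.StringIO()
n_written = write_single_line_fasta(records, buf)
output = buf.getvalue()
```

Because the function only reads the `id`, `description` and `seq` attributes, it should also accept real `SeqRecord` objects. Newer Biopython releases additionally accept the format name "fasta-2line" in `SeqIO.write`, which writes each sequence on a single line; if your installed version supports it, that is likely the simplest option.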