Does anyone know how I can get the scientific name (or all the features) of a GenBank record using only its accession number and Biopython? For example:
>>> from Bio import Entrez
>>> Entrez.email = "someuser@mail.com"
>>> Input = Entrez.someFunction(db="nucleotide", term="AY851612")
>>> output = Entrez.read(Input)
>>> print(output)
"Austrocylindropuntia subulata"
Or alternatively:
>>> print(output)
"LOCUS AY851612 892 bp DNA linear PLN 10-APR-2007
DEFINITION Opuntia subulata rpl16 gene, intron; chloroplast.
ACCESSION AY851612
VERSION AY851612.1 GI:57240072
KEYWORDS .
SOURCE chloroplast Austrocylindropuntia subulata
ORGANISM Austrocylindropuntia subulata
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons;
Caryophyllales; Cactaceae; Opuntioideae; Austrocylindropuntia.
REFERENCE 1 (bases 1 to 892)
AUTHORS Butterworth,C.A. and Wallace,R.S.
..."
Thanks to all! =)
Note that output is a dictionary, so you can access whichever fields you need. Also, you want to use efetch rather than esearch: esearch only returns the IDs of matching records, while efetch retrieves the record itself.
In [1]: from Bio import Entrez, SeqIO
In [3]: Entrez.email = '##############'
In [28]: handle = Entrez.efetch(db="nucleotide", id="AY851612", rettype="gb", retmode="text")
In [29]: x = SeqIO.read(handle, 'genbank')
In [30]: print(x)
ID: AY851612.1
Name: AY851612
Description: Opuntia subulata rpl16 gene, intron; chloroplast.
Number of features: 3
/date=10-APR-2007
/sequence_version=1
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta', 'Tracheophyta', 'Spermatophyta', 'Magnoliophyta', 'eudicotyledons', 'Gunneridae', 'Pentapetalae', 'Caryophyllales', 'Cactineae', 'Cactaceae', 'Opuntioideae', 'Austrocylindropuntia']
/data_file_division=PLN
/references=[Reference(title='Molecular Phylogenetics of the Leafy Cactus Genus Pereskia (Cactaceae)', ...), Reference(title='Direct Submission', ...)]
/keywords=['']
/accessions=['AY851612']
/gi=57240072
/organism=Austrocylindropuntia subulata
/source=chloroplast Austrocylindropuntia subulata
Seq('CATTAAAGAAGGGGGATGCGGATAAATGGAAAGGCGAAAGAAAGAAAAAAATGA...AGA', IUPACAmbiguousDNA())
In [31]: x.description
Out[31]: 'Opuntia subulata rpl16 gene, intron; chloroplast.'
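To get the scientific name specifically, the parsed record keeps it in its annotations, as the /organism line in the printed output above suggests. Here is a minimal sketch of how that could be wrapped up. The function names (fetch_genbank_record, scientific_name, lineage, feature_summary) are made up for illustration and are not part of Biopython; only Entrez.efetch, SeqIO.read, and the record's annotations and features attributes come from the library.

```python
def fetch_genbank_record(accession, email):
    """Download one GenBank record by accession and parse it into a SeqRecord.

    Hypothetical helper: needs Biopython installed and network access to NCBI.
    """
    # Imported inside the function so the helpers below can also be used on
    # records obtained some other way (e.g. parsed from a local .gb file).
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.efetch(db="nucleotide", id=accession,
                           rettype="gb", retmode="text")
    try:
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()


def scientific_name(record):
    """Return the ORGANISM annotation, or None if the record lacks one."""
    return record.annotations.get("organism")


def lineage(record):
    """Return the taxonomy annotation as a plain list (empty if missing)."""
    return list(record.annotations.get("taxonomy", []))


def feature_summary(record):
    """Return (feature type, location as text) for every feature in the record."""
    return [(feature.type, str(feature.location)) for feature in record.features]
```

For example, scientific_name(fetch_genbank_record("AY851612", "someuser@mail.com")) should return the same organism string that appears under /organism above, and feature_summary lists the three features reported for this record.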