python bioinformatics taxonomy ncbi ete3

ete3: How to get taxonomic rank names from taxonomy id?


I want to use ete3 to convert a batch of NCBI taxonomy identifiers, but I need to know exactly which taxonomic rank is assigned to each taxonomy ID. The example below shows a conversion that makes sense, but I don't know how to label some of the levels in the lineage. The basic taxonomic ranks are domain, kingdom, phylum, class, order, family, genus, and species (https://en.wikipedia.org/wiki/Taxonomic_rank).

In most cases this is easy, but it gets confusing for bacteria that have subspecies and strains.

How do I get ete3 to report which taxonomic rank each ID in the lineage corresponds to?

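import ete3
import pandas as pd

# Uses the local NCBI taxonomy database (downloaded on first use)
ncbi = ete3.NCBITaxa()
taxon_id = 505
# Taxonomy IDs from the root down to the query taxon
lineage = ncbi.get_lineage(taxon_id)
# {taxid: scientific name} for every ID in the lineage
Se_lineage = pd.Series(ncbi.get_taxid_translator(lineage), name=taxon_id)
# Index by the lineage list to keep root-to-leaf order
Se_lineage[lineage]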


1                       root
131567    cellular organisms
2                   Bacteria
1224          Proteobacteria
28216     Betaproteobacteria
206351          Neisseriales
481            Neisseriaceae
32257               Kingella
505          Kingella oralis
Name: 505, dtype: object

Solution

  • Use ncbi.get_rank() to get a dictionary of {taxid: rank}, then combine it with the {taxid: name} dictionary from ncbi.get_taxid_translator() to get {name: rank}.
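
A minimal sketch of that approach, assuming the three NCBITaxa methods behave as described above (get_lineage returns a list of IDs, get_taxid_translator returns {taxid: name}, get_rank returns {taxid: rank}). The helper names lineage_with_ranks, name_to_rank and demo are made up for illustration and are not part of ete3. The example leaves pandas out to keep it short.

```python
def lineage_with_ranks(ncbi, taxon_id):
    """Return [(taxid, name, rank), ...] ordered from the root to taxon_id.

    `ncbi` is expected to be an ete3.NCBITaxa instance (or any object
    exposing the same three methods).
    """
    lineage = ncbi.get_lineage(taxon_id)        # list of taxids, root first
    names = ncbi.get_taxid_translator(lineage)  # {taxid: scientific name}
    ranks = ncbi.get_rank(lineage)              # {taxid: rank}
    return [(taxid, names[taxid], ranks[taxid]) for taxid in lineage]


def name_to_rank(rows):
    """Collapse the rows from lineage_with_ranks into {name: rank}."""
    return {name: rank for _, name, rank in rows}


def demo(taxon_id=505):
    """Hypothetical usage; needs ete3 and its local taxonomy database."""
    import ete3  # imported here so the helpers above work without ete3

    ncbi = ete3.NCBITaxa()
    for taxid, name, rank in lineage_with_ranks(ncbi, taxon_id):
        print(taxid, name, rank, sep="\t")
```

Calling demo() prints one line per level of the lineage. Levels that NCBI does not assign to a named rank are typically reported with a placeholder such as "no rank", so several levels can share the same rank string. That is why the mapping here is keyed by name rather than by rank.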