Tags: r, bioinformatics, biomart

Can't convert dog Ensembl IDs into gene names


I usually use biomaRt to convert gene IDs to symbols. However, this time the Ensembl IDs I have (for a dog) do not match the Ensembl IDs in the BioMart dataset "clfamiliaris_gene_ensembl".

I also tried the Ensembl web portal, where the dog dataset is called ROS_Cfam_1.0. It looks like my genes do not match the genes in that dataset either. My genes look like this:

"ENSCAFG00000045440" "ENSCAFG00000000001" "ENSCAFG00000000002" "ENSCAFG00000041462" "ENSCAFG00000000005"

Here is my biomaRt code:

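library(biomaRt)

# Connect to the Ensembl mart and select the dog gene dataset
ensembl <- useMart("ensembl")
ensembl <- useDataset("clfamiliaris_gene_ensembl", mart = ensembl)

# Look up gene symbols for the Ensembl gene IDs stored in the row names
gene_id <- getBM(attributes = c("ensembl_gene_id", "external_gene_name"),
                 filters = "ensembl_gene_id",
                 values = rownames(mydata),
                 mart = ensembl)
gene_id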
[1] ensembl_gene_id    external_gene_name
<0 rows> (or 0-length row.names)

It doesn't find my values. Should I use a different dataset for dogs?


Solution

  • These IDs are from the Boxer dog genome assembly: https://www.ensembl.org/Canis_lupus_familiaris/Info/Strains?db=core

    BioMart, however, is not available for dog breeds (the same applies to some other species and strains): https://www.ensembl.info/2021/01/20/important-changes-of-data-availability-in-ensembl-gene-trees-and-biomart/

    Instead, you can use the POST lookup/id REST API endpoint to retrieve the gene symbols for a list of gene IDs from any species: http://rest.ensembl.org/documentation/info/lookup_post
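
As a rough illustration of the REST approach, here is a minimal R sketch using the httr and jsonlite packages. The helper names (build_lookup_body, extract_symbols, lookup_symbols) are made up for this example and are not part of biomaRt or any Ensembl package. It assumes the endpoint returns a JSON object keyed by gene ID in which the symbol sits in the "display_name" field, and that a single request accepts up to 1000 IDs; please check both against the endpoint documentation before relying on it.

```r
# Sketch only: the helper names below are hypothetical, not from any package.
# Requires the httr and jsonlite packages for the actual request.

# Build the JSON request body for POST lookup/id: {"ids": ["...", "..."]}
build_lookup_body <- function(ids) {
  paste0('{"ids": [', paste0('"', ids, '"', collapse = ", "), "]}")
}

# Pull the gene symbol (assumed to be the "display_name" field) out of a
# parsed response. IDs that are unknown or have no symbol become NA.
extract_symbols <- function(ids, parsed) {
  symbols <- vapply(ids, function(id) {
    entry <- parsed[[id]]
    if (is.null(entry) || is.null(entry$display_name)) {
      NA_character_
    } else {
      entry$display_name
    }
  }, character(1))
  data.frame(ensembl_gene_id = ids,
             external_gene_name = unname(symbols),
             stringsAsFactors = FALSE)
}

# Query the REST API in chunks (chunk_size = 1000 is an assumed per-request
# limit) and return one data frame with an ID column and a symbol column.
lookup_symbols <- function(ids, server = "https://rest.ensembl.org",
                           chunk_size = 1000) {
  chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))
  results <- lapply(chunks, function(chunk) {
    resp <- httr::POST(paste0(server, "/lookup/id"),
                       httr::content_type("application/json"),
                       httr::accept("application/json"),
                       body = build_lookup_body(chunk))
    httr::stop_for_status(resp)
    parsed <- jsonlite::fromJSON(
      httr::content(resp, as = "text", encoding = "UTF-8"),
      simplifyVector = FALSE
    )
    extract_symbols(chunk, parsed)
  })
  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Usage (performs network requests, so it is left commented out):
# gene_id <- lookup_symbols(rownames(mydata))
```

The result has the same two columns as the getBM() call in the question, so it should slot into the rest of the workflow. Genes that have no assigned symbol show up as NA rather than being dropped.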