rblast

How to run blastp in rblast?


I'm trying to use rBlast for protein sequences but somehow it doesn't work. It works fine for nucleotide sequences but for proteins it just doesn't return any match (I used a sequence from the searched dataset, so there can't be no match). In the description it stands "This includes interfaces to blastn, blastp, blastx..." but in the help file in R studio it says "Description Execute blastn from blast+". Did anybody run rBlast for proteins?

Here's what I ran:

listF<-list.files("Trich_prot_fasta/")

fa<-paste0("Trich_prot_fasta/",listF[i])

makeblastdb(fa, dbtype = "prot", args="")

bl <- blast("Trich_prot_fasta/Tri5640_1_GeneModels_FilteredModels1_aa.fasta", type="blastp")

seq <- readAAStringSet("NDRkinase/testSeq.txt")

cl <- predict(bl, seq)

Result:


> cl <- predict(bl, seq)

Warning message: In predict.BLAST(bl, seq) : BLAST did not return a match!


Solution

  • Tried to reproduce the error but everything worked as expected on my system (macOS BigSur 11.6 / R v4.1.1 / Rstudio v1.4.1717).

    Given your blastn was successful, perhaps you are combining multiple fasta files for your protein fasta reference database? If that's the case, try concatenating them together and use the path to the file instead of an R object ("fa") when making your blastdb. Or perhaps:

    makeblastdb(file = "Trich_prot_fasta/Tri5640_1_GeneModels_FilteredModels1_aa.fasta", type = "prot)
    

    Instead of:

    makeblastdb(fa, dbtype = "prot", args="")
    

    Also, please edit your question to include the output from sessionInfo() (might help narrow things down).

    library(tidyverse)
    #BiocManager::install("Biostrings")
    #devtools::install_github("mhahsler/rBLAST")
    library(rBLAST)
    
    # Download an example fasta file:
    # https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/reference_proteomes/Eukaryota/UP000001542/UP000001542_5722.fasta.gz
    # Grab the first fasta sequence as "example_sequence.fasta"
    
    listF <- list.files("~/Downloads/Trich_example", full.names = TRUE)
    listF
    #> [1] "~/Downloads/Trich_example/UP000001542_5722.fasta"        
    #> [2] "~/Downloads/Trich_example/example_sequence.fasta"
    
    makeblastdb(file = "~/Downloads/Trich_example/UP000001542_5722.fasta", dbtype = "prot")
    
    bl <- blast("~/Downloads/Trich_example/UP000001542_5722.fasta", type = "blastp")
    seq <- readAAStringSet("~/Downloads/Trich_example/example_sequence.fasta")
    cl <- predict(bl, seq)
    
    cl
    #>                 QueryID              SubjectID Perc.Ident Alignment.Length
    #> 1    Example_sequence_1 tr|A2D8A1|A2D8A1_TRIVA    100.000              694
    #> 2    Example_sequence_1 tr|A2E4L0|A2E4L0_TRIVA     64.553              694
    #> 3    Example_sequence_1 tr|A2E4L0|A2E4L0_TRIVA     32.436              669
    #> 4    Example_sequence_1 tr|A2D899|A2D899_TRIVA     64.344              488
    #> 5    Example_sequence_1 tr|A2D899|A2D899_TRIVA     31.004              458
    #> 6    Example_sequence_1 tr|A2D899|A2D899_TRIVA     27.070              314
    #> 7    Example_sequence_1 tr|A2D898|A2D898_TRIVA     54.915              468
    #> 8    Example_sequence_1 tr|A2D898|A2D898_TRIVA     33.691              653
    #> 9    Example_sequence_1 tr|A2D898|A2D898_TRIVA     32.936              671
    #> 10   Example_sequence_1 tr|A2D898|A2D898_TRIVA     29.969              654
    #> 11   Example_sequence_1 tr|A2D898|A2D898_TRIVA     26.694              487
    #> 12   Example_sequence_1 tr|A2D898|A2D898_TRIVA     25.000              464
    #> 13   Example_sequence_1 tr|A2F4I3|A2F4I3_TRIVA     39.106              716
    #> 14   Example_sequence_1 tr|A2F4I3|A2F4I3_TRIVA     30.724              677
    #> 15   Example_sequence_1 tr|A2F4I3|A2F4I3_TRIVA     29.257              417
    #> 16   Example_sequence_1 tr|A2F4I3|A2F4I3_TRIVA     23.438              640
    #> 17   Example_sequence_1 tr|A2F4I3|A2F4I3_TRIVA     22.981              718
    #> 18   Example_sequence_1 tr|A2F4I3|A2F4I3_TRIVA     24.107              112
    #> 19   Example_sequence_1 tr|A2FI39|A2FI39_TRIVA     33.378              740
    #> 20   Example_sequence_1 tr|A2FI39|A2FI39_TRIVA     31.440              722
    #>      Mismatches Gap.Openings Q.start Q.end S.start S.end         E   Bits
    #> 1             0            0       1   694       1   694  0.00e+00 1402.0
    #> 2           243            2       1   692     163   855  0.00e+00  920.0
    #> 3           410           15      22   671       1   646  3.02e-94  312.0
    #> 4           173            1     205   692       1   487  0.00e+00  644.0
    #> 5           308            7      22   476       1   453  3.55e-55  198.0
    #> 6           196            5      13   294     173   485  4.12e-25  110.0
    #> 7           211            0       1   468     683  1150 8.48e-169  514.0
    #> 8           420           11       2   647     501  1147  1.61e-91  309.0
    #> 9           396           10       2   666     363   985  5.78e-89  301.0
    #> 10          406           11      16   664     195   801  1.01e-66  238.0
    #> 11          297           10     208   662      21   479  1.60e-36  147.0
    #> 12          316            7      11   469      29   465  3.04e-36  147.0
    #> 13          386            4       2   667     248   963 1.72e-149  461.0
    #> 14          411           10       2   625      66   737  8.34e-83  283.0
    #> 15          286            5     129   542      14   424  2.66e-52  196.0
    #> 16          421           15       5   607     365   972  3.07e-38  152.0
    #> 17          407           21      77   662      27   730  1.25e-33  138.0
    #> 18           81            3     552   661       3   112  2.10e-01   35.4
    #> 19          421            9       3   675     394  1128 1.12e-115  375.0
    #> 20          409           15       2   647     163   874  1.21e-82  285.0
    ...
    

    Created on 2021-09-30 by the reprex package (v2.0.1)