I am trying to run stitchr in R. For programs that run in Python, I use reticulate. I create a conda environment named r-reticulate, where I want to install stitchr and run it.
I try the following:
if (!('r-reticulate' %in% reticulate::conda_list()[,1])){
reticulate::conda_create(envname = 'r-reticulate', packages = 'python=3.10')
}
reticulate::use_condaenv('r-reticulate')
reticulate::py_install("stitchr", pip = TRUE)
system("stitchr -h") # this does not work
But obviously enough, the system() call does not work, with the message "error in running command".
What would be the right way to do this?
I had success in the past with anndata, for example. But anndata has an R package wrapper, so I can just do:
reticulate::use_condaenv('r-reticulate')
reticulate::py_install("anndata", pip = TRUE)
data_h5ad <- anndata::read_h5ad("file.h5ad")
How can I approach the stitchr case?
EDIT:
So I retrieved the location of stitchr.py during the package installation: /usr/local/Caskroom/miniconda/base/envs/r-reticulate/lib/python3.10/site-packages/Stitchr/stitchr.py
I tried all the following but nothing works (see error messages):
pyloc="/usr/local/Caskroom/miniconda/base/envs/r-reticulate/lib/python3.10/site-packages/Stitchr/stitchr.py"
reticulate::source_python(pyloc)
Error in py_run_file_impl(file, local, convert) :
  ImportError: attempted relative import with no known parent package
Run reticulate::py_last_error() for details.
reticulate::py_run_file(pyloc)
Error in py_run_file_impl(file, local, convert) :
  ImportError: attempted relative import with no known parent package
Run reticulate::py_last_error() for details.
reticulate::py_run_string(paste(pyloc, "-h"))
Error in py_run_string_impl(code, local, convert) :
  File "<string>", line 1
    /usr/local/Caskroom/miniconda/base/envs/r-reticulate/lib/python3.10/site-packages/Stitchr/stitchr.py -h
SyntaxError: invalid syntax
Run reticulate::py_last_error() for details.
I am absolutely clueless on how to proceed here.
Here is what I think you are looking for.
shell:
conda create --name=testenv python
# or conda create --name=testenv python==3.10.13 if you want a specific version for jupyter for example
conda activate testenv
# check which pip will be used:
whereis pip
~/anaconda3/envs/testenv/bin/pip
shell (the stitchr part, taken from the stitchr documentation):
pip install stitchr IMGTgeneDL
stitchrdl
stitchr -v TRBV7-3*01 -j TRBJ1-1*01 -cdr3 CASSYLQAQYTEAFF
This works from the command line.
shell
cd ~
cp /home/extraits/anaconda3/envs/testenv/bin/stitchr ~/teststitchr.py
./teststitchr.py -v TRBV7-3*01 -j TRBJ1-1*01 -cdr3 CASSYLQAQYTEAFF
This also works from the command line.
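Since the stitchr executable lives in the environment's bin directory, a plain system("stitchr -h") probably fails because that directory is not on the PATH of the calling process (this is an assumption, not something I verified). A minimal sketch of building the full command instead: stitchr_command is a hypothetical helper name, the env location ~/anaconda3/envs/testenv is the one from my machine, and the bin/ layout assumes Linux or macOS.

```python
import os
import shlex


def stitchr_command(env_prefix, v, j, cdr3):
    """Build the argument list for the stitchr executable of a conda env.

    env_prefix is the root folder of the environment (assumed Unix layout,
    where console scripts are installed under <env>/bin).
    """
    exe = os.path.join(os.path.expanduser(env_prefix), "bin", "stitchr")
    # Same flags as in the command-line call above: -v, -j and -cdr3
    return [exe, "-v", v, "-j", j, "-cdr3", cdr3]


cmd = stitchr_command(
    "~/anaconda3/envs/testenv", "TRBV7-3*01", "TRBJ1-1*01", "CASSYLQAQYTEAFF"
)
# Show the command as it would be typed in a shell (asterisks get quoted)
print(shlex.join(cmd))
# To really run it, pass the list to subprocess.run(cmd, capture_output=True, text=True)
```

Passing the arguments as a list avoids the shell expanding the * characters in the gene names.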
Create ~/teststitchr2.py with the content from https://jamieheather.github.io/stitchr/importing.html
~/teststitchr2.py:
# import stitchr
from Stitchr import stitchrfunctions as fxn
from Stitchr import stitchr as st
# specify details about the locus to be stitched
chain = 'TRB'
species = 'HUMAN'
# initialise the necessary data
tcr_dat, functionality, partial = fxn.get_imgt_data(chain, st.gene_types, species)
codons = fxn.get_optimal_codons('', species)
# provide details of the rearrangement to be stitched
tcr_bits = {'v': 'TRBV7-3*01', 'j': 'TRBJ1-1*01', 'cdr3': 'CASSYLQAQYTEAFF',
'l': 'TRBV7-3*01', 'c': 'TRBC1*01',
'skip_c_checks': False, 'species': species, 'seamless': False,
'5_prime_seq': '', '3_prime_seq': '', 'name': 'TCR'}
# then run stitchr on that rearrangement
stitched = st.stitch(tcr_bits, tcr_dat, functionality, partial, codons, 3, '')
print(stitched)
# Which produces
(['TCR', 'TRBV7-3*01', 'TRBJ1-1*01', 'TRBC1*01', 'CASSYLQAQYTEAFF', 'TRBV7-3*01(L)'],
'ATGGG snip snip snip snip snip snip TTC',
0)
python in the shell
python ./teststitchr2.py
(['TCR', 'TRBV7-3*01', 'TRBJ1-1*01', 'TRBC1*01', 'CASSYLQAQYTEAFF', 'TRBV7-3*01(L)'], 'ATG snip snip snip snip TTC', 0)
In R:
library(reticulate)
reticulate::use_condaenv('testenv')
py_run_file(file.path(path.expand('~'),'teststitchr2.py'))
names(py)
reticulate::py_run_file() populates the py object: https://rstudio.github.io/reticulate/articles/calling_python.html#executing-code
names(py) lists the objects now available in the Python main module; each of them is reachable from R with the py$ prefix:
c("chain", "codons", "functionality", "fxn", "partial", "r", "species", "st", "stitched", "tcr_bits", "tcr_dat")
In R:
print(py$stitched )
It works :)
[[1]]
[1] "TCR" "TRBV7-3*01" "TRBJ1-1*01" "TRBC1*01"
[5] "CASSYLQAQYTEAFF" "TRBV7-3*01(L)"
[[2]]
[1] "ATGGGCAC snip snip snip snip "
[[3]]
[1] 0
You can type myvar = py$stitched to keep the result in an R variable and use it later.
You can also call the Python function directly from R. In R:
tcr_bits2= list(v = "TRBV7-3*01", j = "TRBJ1-1*01", cdr3 = "CASSYLQAQYTEAFF",
l = "TRBV7-3*01", c = "TRBC1*01", skip_c_checks = FALSE,
species = "HUMAN", seamless = FALSE, `5_prime_seq` = "",
`3_prime_seq` = "", name = "TCR")
py$st$stitch(tcr_bits2, py$tcr_dat,py$functionality, py$partial, py$codons, 3, '')
- 'TCR' 'TRBV7-3*01' 'TRBJ1-1*01' 'TRBC1*01' 'CASSYLQAQYTEAFF' 'TRBV7-3*01(L)'
- 'ATG snip snip snip snip ATTTC'
- 0
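Be careful: this mixes an R variable (tcr_bits2) with objects from the reticulate Python environment (py$...). reticulate converts the named R list to a Python dict when the function is called. You can type myvar2 = py$st$stitch(...) with the same arguments to keep the result in a variable and use it later.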
It works again :)
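If you stitch many rearrangements, it may be simpler to build the dictionary on the Python side. A minimal sketch, assuming the key layout shown in teststitchr2.py above: make_tcr_bits is a hypothetical helper name (it is not part of stitchr), and the defaults, including reusing the V gene as the leader, are my assumptions taken from that single example.

```python
def make_tcr_bits(v, j, cdr3, c, species="HUMAN", name="TCR", leader=None):
    """Return a rearrangement dict with the same keys as the example above."""
    return {
        "v": v,
        "j": j,
        "cdr3": cdr3,
        # Assumption: when no leader is given, reuse the V gene (as in the example)
        "l": leader if leader is not None else v,
        "c": c,
        "skip_c_checks": False,
        "species": species,
        "seamless": False,
        "5_prime_seq": "",
        "3_prime_seq": "",
        "name": name,
    }


tcr_bits = make_tcr_bits("TRBV7-3*01", "TRBJ1-1*01", "CASSYLQAQYTEAFF", c="TRBC1*01")
print(tcr_bits)
```

If this function is appended to ~/teststitchr2.py, it should be reachable from R as py$make_tcr_bits(...) after py_run_file(), like the other names listed by names(py).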
Edit:
And a dirty trick on the Python side, if you run into an import problem: add this before the from Stitchr import lines:
import os
os.chdir(os.path.join(os.path.expanduser('~'), 'anaconda3/envs/testenv/lib/python3.12/site-packages'))
But also have a look at "How can I import a module dynamically given the full path?"
This trick (os.chdir()) is only for testing; try not to use it.
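A less intrusive alternative to os.chdir() is to put the site-packages folder on sys.path and import the package by name, so that relative imports inside the package still resolve (running stitchr.py as a standalone file is what triggers "attempted relative import with no known parent package"). A minimal sketch: import_from_dir and the throwaway demo_pkg package are hypothetical names used only to show the mechanism; for stitchr you would pass your own site-packages path and "Stitchr.stitchr".

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_dir(package_dir, module_name):
    """Import module_name after making package_dir visible on sys.path."""
    package_dir = str(Path(package_dir).expanduser())
    if package_dir not in sys.path:
        sys.path.insert(0, package_dir)
    # Make sure the import system notices files created after startup
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Throwaway package whose core module uses a relative import, like Stitchr does
demo_root = Path(tempfile.mkdtemp())
pkg = demo_root / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "helpers.py").write_text("def double(x):\n    return 2 * x\n")
(pkg / "core.py").write_text(
    "from . import helpers\n\ndef run(x):\n    return helpers.double(x)\n"
)

core = import_from_dir(demo_root, "demo_pkg.core")
print(core.run(21))
```

Because the module is imported as part of its package, the working directory of the R session does not need to change.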