I have a cluster that uses Torque to distribute jobs. I want to run a job containing TensorFlow code, but TensorFlow is not being recognized.
I installed tensorflow for my LDAP user using anaconda, so I can enter the tensorflow environment on any node and run my code manually. My problem is that the torque job doesn't activate the conda environment when it runs, so I get "ImportError: No module named tensorflow" and my code fails. So the Python code does run, but it can't find the tensorflow module on import because it isn't running inside the tensorflow conda environment.
How can I tell the nodes to run my python file in a tensorflow conda environment?
This is how my torque job file looks
Note: Here I tried running the command that opens the environment, in other versions I didn't.
Thanks in advance for any help available.
I ended up just needing to add an export to my PBS file that puts my conda bin folder at the front of PATH, so the job runs python using my conda environment's python binary.
Also, not related to this but possibly still relevant to people doing the same thing: I ended up needing to export my CUDA bin directory as well.
What I added:
export PATH="/home/my_user/anaconda3/bin:$PATH"
export PATH=$PATH:/usr/local/maui/bin:/usr/local/maui/sbin
export PATH=$PATH:/usr/local/cuda-8.0/bin
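
For anyone who wants to see where those exports go, here is a minimal sketch of a full job file. The job name, the resource request and the script name my_script.py are placeholders I made up for illustration, not my actual file; the paths are the ones from above and will differ on your cluster.

```shell
#!/bin/bash
#PBS -N tf_job              # hypothetical job name
#PBS -l nodes=1:ppn=1       # hypothetical resource request
#PBS -j oe                  # merge stdout and stderr into one output file

# Put the conda bin folder first so "python" resolves to the conda binary,
# and append the CUDA bin directory (paths taken from the exports above).
CONDA_BIN="/home/my_user/anaconda3/bin"
CUDA_BIN="/usr/local/cuda-8.0/bin"
export PATH="$CONDA_BIN:$PATH:$CUDA_BIN"

# Torque starts jobs in the home directory; move to the submission directory.
# Falls back to the current directory when run outside of Torque.
cd "${PBS_O_WORKDIR:-$PWD}"

# Log which interpreter the job will use, handy when debugging ImportError.
echo "python resolves to: $(command -v python || echo 'not found')"

# Run the script if it exists (my_script.py is a placeholder name).
SCRIPT="my_script.py"
if [ -f "$SCRIPT" ]; then
    python "$SCRIPT"
else
    echo "script $SCRIPT not found in $(pwd)" >&2
fi
```

Submit it with qsub as usual. If the import still fails, the "python resolves to" line in the job output shows whether the node picked up the conda python or the system one.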