I'm new to HTCondor and I'm trying to run a Python script on the condor system. I want to use cv2 and numpy in my code while being able to read my prints and my pickled data after completion.
Currently the code runs and completes (log file: return value 0), but condor_bin.out, where my prints should appear, is empty, and no random_dat.pickle file is transferred back.
Am I doing something wrong?
Python script:
import numpy as np
import pickle
import cv2 as cv

print('test')

# setup cv2
sift = cv.SIFT_create()
img = cv.imread("0.jpg", cv.IMREAD_GRAYSCALE)

for i in range(25):
    # calc cv2
    kp, des = sift.detectAndCompute(img, None)
    # calc np
    norms = np.linalg.norm(des, axis=1)
    # calc normal? python
    index = []
    for p in kp:
        temp = (p.pt, p.size, p.angle, p.response, p.octave, p.class_id)
        index.append(temp)

with open('./random_dat.pickle', 'wb') as handle:
    pickle.dump((123456, index, des, norms), handle)
print("finished")
Condor setup file (test.info)
#Normal execution
Universe = vanilla
#I need just one CPU (which is the default)
RequestCpus = 1
#No GPU
RequestGPUs = 0
#I need 150 MB of disk space
RequestDisk = 150MB
#I need 150 MB of RAM (resident memory)
RequestMemory = 150MB
#It will not run longer than 1 day
+RequestWalltime = 100
#retrieve data
#should_transfer_files = YES
#when_to_transfer_output = ON_EXIT
#I'm a nice person, I think...
NiceUser = true
#Mail me only if something is wrong
Notification = Always
# The job will 'cd' to this directory before starting, be sure you can _write_ here.
initialdir = /users/students/r0xxxxxx/Documents/testing_condor/
# This is the executable or script I want to run
executable = /users/students/r0xxxxxx/Documents/testing_condor/main.py
#Output of condors handling of the jobs, will be in 'initialdir'
Log = condor_bin.log
#Standard output of the 'executable', in 'initialdir'
Output = condor_bin.out
#Standard error of the 'executable', in 'initialdir'
Error = condor_bin.err
# Start just 1 instance of the job
Queue 1
I submitted it using condor_submit test.info, which resulted in the following log in condor_bin.log:
...
000 (356.000.000) 2021-07-15 18:23:28 Job submitted from host: <10.xx.xx.xxx:xxxx?addrs=10.xx.xx.xxx-xxxx&alias=abcdefg.abcd.abcdefg.be&noUDP&sock=schedd_2422_de78>
...
000 (357.000.000) 2021-07-15 18:24:19 Job submitted from host: <10.xx.xx.xxx:xxxx?addrs=10.xx.xx.xxx-xxxx&alias=abcdefg.abcd.abcdefg.be&noUDP&sock=schedd_2422_de78>
...
040 (356.000.000) 2021-07-15 18:24:21 Started transferring input files
Transferring to host: <10.xx.xx.xx:xxxx?addrs=10.xx.xx.xx-xxxx&alias=other.abcd.abcdefg.be&noUDP&sock=slot1_1_123445_eb75_5374>
...
040 (356.000.000) 2021-07-15 18:24:21 Finished transferring input files
...
001 (356.000.000) 2021-07-15 18:24:22 Job executing on host: <10.xx.xx.xx:xxxx?addrs=10.xx.xx.xx-xxxx&alias=other.abcd.abcdefg.be&noUDP&sock=startd_2178_815c>
...
006 (356.000.000) 2021-07-15 18:24:22 Image size of job updated: 1
0 - MemoryUsage of job (MB)
0 - ResidentSetSize of job (KB)
...
040 (356.000.000) 2021-07-15 18:24:22 Started transferring output files
...
040 (356.000.000) 2021-07-15 18:24:22 Finished transferring output files
...
005 (356.000.000) 2021-07-15 18:24:22 Job terminated.
(1) Normal termination (return value 0)
Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
0 - Run Bytes Sent By Job
803 - Run Bytes Received By Job
0 - Total Bytes Sent By Job
803 - Total Bytes Received By Job
Partitionable Resources :    Usage  Request Allocated
   Cpus                 :                 1         1
   Disk (KB)            :       13   153600    782129
   Gpus (Average)       :                 0         0
   Memory (MB)          :        0      150       256
Job terminated of its own accord at 2021-07-15T16:24:22Z.
...
As you can see in the test.info, I've tried to use
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
But that didn't work.
How can I see my print statements and how can I see my pickled data after completion?
Thanks a lot for your help!
Adding #!/usr/bin/python as @Greg suggested resulted in the following error:
Executable file 'my_file/path' is a script with CRLF (DOS/Windows) line endings.
This generally doesn't work, and you should probably run 'dos2unix myfile/path' -- or a similar tool -- before you resubmit.
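If dos2unix is not available, the same conversion can be sketched in a few lines of Python. This is only a minimal illustration; the helper name convert_crlf_to_lf is made up for this example and is not part of HTCondor or any library.

```python
from pathlib import Path


def convert_crlf_to_lf(path):
    """Rewrite the file at `path` with LF line endings.

    Hypothetical helper (not an HTCondor tool). Returns True if the
    file contained CRLF sequences and was rewritten, False otherwise.
    """
    p = Path(path)
    data = p.read_bytes()
    # Work on bytes so that no newline translation happens implicitly.
    fixed = data.replace(b"\r\n", b"\n")
    if fixed != data:
        p.write_bytes(fixed)
        return True
    return False
```

Calling convert_crlf_to_lf("main.py") before condor_submit should have the same effect on the script as running dos2unix on it.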
I generated a new Python file on my Linux system, which added the following lines as a prefix:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
This runs successfully on condor with the should_transfer_files = YES and when_to_transfer_output = ON_EXIT settings enabled in the test.info condor file.
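Once random_dat.pickle has been transferred back to the initialdir, it can be read again along these lines. This is a small sketch that assumes the file holds the 4-tuple written by the script above; load_results is a name chosen for this example, and numpy must be installed wherever the file is loaded, because des and norms are numpy arrays.

```python
import pickle


def load_results(path="random_dat.pickle"):
    """Load the tuple the job script wrote with pickle.dump.

    Assumes the layout (tag, index, des, norms) used in the script above.
    """
    with open(path, "rb") as handle:
        tag, index, des, norms = pickle.load(handle)
    return tag, index, des, norms
```

Each entry of index is the plain tuple (pt, size, angle, response, octave, class_id) built in the loop, so cv2 itself is not needed to inspect the results.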
TL;DR: Python code written on Windows has CRLF line endings, which can produce errors on a condor system running on Linux. Fix: write or copy your code into a Python file created on Linux, or convert the line endings with dos2unix or a similar tool.
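A quick check before submitting can catch both problems that came up here (no shebang line, CRLF line endings). The function below is a hypothetical pre-submit helper, not something HTCondor provides, and it only covers these two cases.

```python
def check_script(path):
    """Return a list of problems found in a script meant to be a job executable.

    Hypothetical pre-submit check; it only looks for the two issues
    discussed above.
    """
    problems = []
    with open(path, "rb") as f:
        data = f.read()
    # Without a '#!' line the execute node does not know which interpreter to use.
    if not data.startswith(b"#!"):
        problems.append("missing shebang line")
    # CRLF endings are what triggered the condor_submit error quoted above.
    if b"\r\n" in data:
        problems.append("CRLF line endings")
    return problems
```

An empty list means neither problem was found; it does not guarantee that the job will run.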