I'm trying to use spark-submit
to run my Python code on a Spark cluster.
Generally we run spark-submit
with Python code like below.
# Run a Python application on a cluster
./bin/spark-submit \
--master spark://207.184.161.138:7077 \
my_python_code.py \
1000
But I want to run my_python_code.py
with several arguments. Is there a smart way to pass them?
Yes: Put this in a file called args.py
import sys
print(sys.argv)
If you run
spark-submit args.py a b c d e
You will see:
['/spark/args.py', 'a', 'b', 'c', 'd', 'e']
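Since everything after the script name ends up in sys.argv, you can also parse it with argparse from the standard library once the list of arguments grows. Here is a minimal sketch; the argument names (count, --input, --partitions) and their defaults are made up for illustration and are not something Spark requires:

```python
import argparse
import sys


def parse_args(argv):
    """Parse the arguments that follow the script name on the command line."""
    parser = argparse.ArgumentParser(description="Hypothetical job arguments")
    # Optional positional value, like the 1000 in the question (default is an assumption).
    parser.add_argument("count", type=int, nargs="?", default=1000)
    # Named options; these names are illustrative only.
    parser.add_argument("--input", default=None)
    parser.add_argument("--partitions", type=int, default=1)
    return parser.parse_args(argv)


if __name__ == "__main__":
    # sys.argv[0] is the script path, so skip it.
    args = parse_args(sys.argv[1:])
    print(args.count, args.input, args.partitions)
```

With this in my_python_code.py, you would run something like
spark-submit my_python_code.py 1000 --input data.txt --partitions 4
Options meant for spark-submit itself (such as --master) have to come before the script name; whatever comes after it is handed to your script.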