I'm calling this method from the Python boto2 library:
boto.emr.step.StreamingStep(name, mapper, reducer=None, combiner=None, action_on_failure='TERMINATE_JOB_FLOW', cache_files=None, cache_archives=None, step_args=None, input=None, output=None, jar='/home/hadoop/contrib/streaming/hadoop-streaming.jar')
I know that arguments after mapper
have default values so I don't have to specify their values. But what if I want to pass a value for just one argument at the very end? For example, I want to provide values for the name
, mapper
, and combiner
parameters, and just use the default value for reducer
.
Should I do this:
boto.emr.step.StreamingStep('a name', 'mapper name', None, 'combiner name')
Or should I expressly pass all arguments before it?
If there are 100 arguments, and I just want to pass a value for the very last argument, then I have to pass many default values to it. Is there a easier way to do this?
There are two ways to do it. The first, most straightforward, is to pass a named argument:
boto.emr.step.StreamingStep(name='a name', mapper='mapper name', combiner='combiner name')
(Note, because name
and mapper
were in order, specifying the argument name wasn't required)
Additionally, you can pass a dictionary with **
argument unpacking:
kwargs = {'name': 'a name', 'mapper': 'mapper name', 'combiner': 'combiner name'}
boto.emr.step.StreamingStep(**kwargs)