python argparse

Make argparse treat dashes and underscores identically


argparse replaces dashes with underscores in the names of optional arguments to determine their destination attribute:

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--use-unicorns', action='store_true')
args = parser.parse_args(['--use-unicorns'])
print(args)  # prints: Namespace(use_unicorns=True)

However, the user has to remember whether the option is --use-unicorns or --use_unicorns; using the wrong variant raises an error.

This can be frustrating, because the variable args.use_unicorns in the code does not reveal which variant was defined.
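
To make the failure concrete, here is a minimal reproduction using the same parser as above. argparse reports the unrecognised spelling on stderr and exits by raising SystemExit; the try/except is only there so the exit status can be inspected.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--use-unicorns', action='store_true')

# The spelling that was declared parses fine.
assert parser.parse_args(['--use-unicorns']).use_unicorns is True

# The underscore spelling is not recognised: argparse prints a usage
# message to stderr and raises SystemExit.
try:
    parser.parse_args(['--use_unicorns'])
except SystemExit as exc:
    exit_code = exc.code
else:
    exit_code = None

print(exit_code)  # prints: 2
```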

How can I make argparse accept both --use-unicorns and --use_unicorns as valid ways to define this optional argument?


Solution

  • parser.add_argument accepts more than one flag for an argument (see "name or flags" in the argparse documentation). One easy way to make the parser accept both variants is to declare the argument as

    parser.add_argument('--use-unicorns', '--use_unicorns', action='store_true')
    

    However, both options will show up in the help output, and it is not very elegant, as it forces one to write out the variants manually.
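
The manual repetition, at least, can be avoided with a small wrapper. The sketch below assumes the default '-' prefix character; add_dash_underscore_argument is a hypothetical helper name, not part of argparse. For each long flag it registers the all-dash and all-underscore spellings (mixed spellings such as --a-b_c are not expanded into every combination). Both spellings still appear in the help output.

```python
import argparse


def add_dash_underscore_argument(parser, *flags, **kwargs):
    """Register each long flag together with its dash/underscore twin.

    Hypothetical helper for illustration; not part of argparse.
    """
    variants = []
    for flag in flags:
        variants.append(flag)
        if flag.startswith('--'):
            name = flag[2:]
            twins = ('--' + name.replace('-', '_'),
                     '--' + name.replace('_', '-'))
            for twin in twins:
                if twin not in variants:
                    variants.append(twin)
    # argparse derives dest from the first long flag, so the attribute
    # name is the same as with a plain add_argument call.
    return parser.add_argument(*variants, **kwargs)


parser = argparse.ArgumentParser()
add_dash_underscore_argument(parser, '--use-unicorns', action='store_true')

args_dash = parser.parse_args(['--use-unicorns'])
args_underscore = parser.parse_args(['--use_unicorns'])
print(args_dash.use_unicorns, args_underscore.use_unicorns)  # prints: True True
```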

    An alternative is to subclass argparse.ArgumentParser so that matching is invariant to swapping dashes and underscores. This requires a little fiddling, as both ArgumentParser._parse_optional and ArgumentParser._get_option_tuples have to be overridden to handle this matching as well as abbreviations, e.g. --use_unic.

    I ended up with the following subclass, where the matching of abbreviations is delegated from _parse_optional to _get_option_tuples:

    from gettext import gettext as _
    import argparse
    
    
    class ArgumentParser(argparse.ArgumentParser):
    
        def _parse_optional(self, arg_string):
            # if it's an empty string, it was meant to be a positional
            if not arg_string:
                return None
    
            # if it doesn't start with a prefix, it was meant to be positional
            if arg_string[0] not in self.prefix_chars:
                return None
    
            # if it's just a single character, it was meant to be positional
            if len(arg_string) == 1:
                return None
    
            option_tuples = self._get_option_tuples(arg_string)
    
            # if multiple actions match, the option string was ambiguous
            if len(option_tuples) > 1:
                options = ', '.join([option_string
                    for action, option_string, explicit_arg in option_tuples])
                args = {'option': arg_string, 'matches': options}
                msg = _('ambiguous option: %(option)s could match %(matches)s')
                self.error(msg % args)
    
            # if exactly one action matched, this segmentation is good,
            # so return the parsed action
            elif len(option_tuples) == 1:
                option_tuple, = option_tuples
                return option_tuple
    
            # if it was not found as an option, but it looks like a negative
            # number, it was meant to be positional
            # unless there are negative-number-like options
            if self._negative_number_matcher.match(arg_string):
                if not self._has_negative_number_optionals:
                    return None
    
            # if it contains a space, it was meant to be a positional
            if ' ' in arg_string:
                return None
    
            # it was meant to be an optional but there is no such option
            # in this parser (though it might be a valid option in a subparser)
            return None, arg_string, None
    
        def _get_option_tuples(self, option_string):
            result = []
    
            if '=' in option_string:
                option_prefix, explicit_arg = option_string.split('=', 1)
            else:
                option_prefix = option_string
                explicit_arg = None
            if option_prefix in self._option_string_actions:
                action = self._option_string_actions[option_prefix]
                tup = action, option_prefix, explicit_arg
                result.append(tup)
            else:  # imperfect match
                chars = self.prefix_chars
                if option_string[0] in chars and option_string[1] not in chars:
                    # short option: if single character, can be concatenated with arguments
                    short_option_prefix = option_string[:2]
                    short_explicit_arg = option_string[2:]
                    if short_option_prefix in self._option_string_actions:
                        action = self._option_string_actions[short_option_prefix]
                        tup = action, short_option_prefix, short_explicit_arg
                        result.append(tup)
    
                underscored = {k.replace('-', '_'): k for k in self._option_string_actions}
                option_prefix = option_prefix.replace('-', '_')
                if option_prefix in underscored:
                    action = self._option_string_actions[underscored[option_prefix]]
                    tup = action, underscored[option_prefix], explicit_arg
                    result.append(tup)
                elif self.allow_abbrev:
                    # abbreviation: collect every option the prefix could expand to
                    for candidate in underscored:
                        if candidate.startswith(option_prefix):
                            action = self._option_string_actions[underscored[candidate]]
                            tup = action, underscored[candidate], explicit_arg
                            result.append(tup)
    
            # return the collected option tuples
            return result
    

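    A lot of this code is derived directly from the corresponding methods in argparse (from the CPython implementation). Using this subclass should make the matching of optional arguments invariant to using dashes (-) or underscores (_). Both overridden methods are private, so the subclass may need adjusting if their signatures or return values differ in another Python version.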
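
For comparison, here is a lighter-weight sketch that uses a different technique from the subclass above: it rewrites the argument list before argparse sees it and touches no private methods. It assumes every long option is declared with dashes and that '-' is the only prefix character; normalize_long_options and --stable-name are hypothetical names used for illustration. Only the option-name part of a token is rewritten, so values keep their underscores.

```python
import argparse


def normalize_long_options(argv):
    """Rewrite '--some_flag' as '--some-flag', leaving values untouched."""
    normalized = []
    for i, token in enumerate(argv):
        if token == '--':
            # Everything after a bare '--' is positional; copy it verbatim.
            normalized.extend(argv[i:])
            break
        if token.startswith('--'):
            # Only the name before an optional '=' is rewritten.
            name, sep, value = token.partition('=')
            token = name.replace('_', '-') + sep + value
        normalized.append(token)
    return normalized


parser = argparse.ArgumentParser()
parser.add_argument('--use-unicorns', action='store_true')
parser.add_argument('--stable-name')

args = parser.parse_args(normalize_long_options(
    ['--use_unicorns', '--stable_name=my_stable']))
print(args.use_unicorns, args.stable_name)  # prints: True my_stable
```

Because the rewrite happens before parsing, argparse's own abbreviation handling still applies afterwards. One caveat of this sketch: a value passed as a separate token that itself begins with -- would also be rewritten.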