python, python-2to3

How to handle strings transparently across Python 2 and 3 without external modules?


What would be the simplest approach to universal string handling that works in both Python 2 and Python 3 without having to use third-party modules like six?

I am OK with using if sys.version_info > (3, 0)..., but I can't come up with a way to cleanly overlay the string methods so that encoding/decoding to and from bytes is transparent.

The goal is to find the minimum possible code that would allow writing self-contained version-agnostic scripts (without dependencies).


Solution

  • The six source code is not too complicated, so why not just copy its string-handling parts into your own code base? That way you get a well-established approach to uniform string handling without the dependency. The following code should do:

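    import sys
    
    # Detect the major interpreter version once, at import time.
    PY2 = sys.version_info[0] == 2
    PY3 = sys.version_info[0] == 3
    
    if PY3:
        text_type = str        # Unicode text
        binary_type = bytes    # raw bytes
    else:
        text_type = unicode    # Unicode text (this name exists only on Python 2)
        binary_type = str      # raw bytes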
    
    
    # Always return bytes: text is encoded, bytes pass through unchanged.
    def ensure_binary(s, encoding='utf-8', errors='strict'):
        if isinstance(s, text_type):
            return s.encode(encoding, errors)
        elif isinstance(s, binary_type):
            return s
        else:
            raise TypeError("not expecting type '%s'" % type(s))
    
    
    # Always return the native str type of the running interpreter:
    # bytes on Python 2, Unicode text on Python 3.
    def ensure_str(s, encoding='utf-8', errors='strict'):
        if not isinstance(s, (text_type, binary_type)):
            raise TypeError("not expecting type '%s'" % type(s))
        if PY2 and isinstance(s, text_type):
            s = s.encode(encoding, errors)
        elif PY3 and isinstance(s, binary_type):
            s = s.decode(encoding, errors)
        return s
    
    
    # Always return Unicode text: bytes are decoded, text passes through unchanged.
    def ensure_text(s, encoding='utf-8', errors='strict'):
        if isinstance(s, binary_type):
            return s.decode(encoding, errors)
        elif isinstance(s, text_type):
            return s
        else:
            raise TypeError("not expecting type '%s'" % type(s))
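
As a rough illustration of how such helpers might be used, here is a minimal sketch that decodes at the input boundary, works on text internally, and encodes again on the way out. The helper definitions are repeated in compact form so the snippet runs on its own; shout is a hypothetical function invented for this example, and UTF-8 is assumed as the encoding throughout.

```python
# -*- coding: utf-8 -*-
import sys

PY3 = sys.version_info[0] == 3
# The conditional expression never evaluates `unicode` on Python 3.
text_type = str if PY3 else unicode  # noqa: F821
binary_type = bytes if PY3 else str


def ensure_text(s, encoding='utf-8', errors='strict'):
    """Return Unicode text, decoding bytes if necessary."""
    if isinstance(s, binary_type):
        return s.decode(encoding, errors)
    if isinstance(s, text_type):
        return s
    raise TypeError("not expecting type '%s'" % type(s))


def ensure_binary(s, encoding='utf-8', errors='strict'):
    """Return bytes, encoding Unicode text if necessary."""
    if isinstance(s, text_type):
        return s.encode(encoding, errors)
    if isinstance(s, binary_type):
        return s
    raise TypeError("not expecting type '%s'" % type(s))


def shout(value):
    """Hypothetical example: accept bytes or text, always return UTF-8 bytes."""
    text = ensure_text(value)            # normalize to text at the boundary
    return ensure_binary(text.upper() + u"!")  # encode once, on the way out


# The same call works whether the caller passes bytes or text.
from_bytes = shout(b"caf\xc3\xa9")   # UTF-8 encoded input
from_text = shout(u"caf\xe9")        # Unicode text input
assert from_bytes == from_text
```

The idea is to keep the version check in one place and have the rest of the script call only ensure_text and ensure_binary, so no other code needs to know which interpreter it runs on. The u"" prefix is accepted on Python 2 and on Python 3.3 or later.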