What would be the simplest approach to universal string handling that works in both Python 2 and Python 3 without third-party modules like six?
I am OK with using if sys.version_info > (3, 0)...
but I can't come up with a way to cleanly overlay the string methods so that encoding and decoding to and from bytes is transparent.
The goal is to find the minimum amount of code that allows writing self-contained, version-agnostic scripts (without dependencies).
The six source code is not too complicated, so why not just copy the string-handling parts into your code base? That way you have a well-established approach to uniform string handling. The following code should do:
import sys

PY2 = sys.version_info[0] == 2
PY3 = sys.version_info[0] == 3

# Map the text and binary string types of the running interpreter.
if PY3:
    text_type = str
    binary_type = bytes
else:
    text_type = unicode
    binary_type = str


def ensure_binary(s, encoding='utf-8', errors='strict'):
    # Return s as binary_type, encoding text if necessary.
    if isinstance(s, text_type):
        return s.encode(encoding, errors)
    elif isinstance(s, binary_type):
        return s
    else:
        raise TypeError("not expecting type '%s'" % type(s))


def ensure_str(s, encoding='utf-8', errors='strict'):
    # Return s as the native str type: bytes on Python 2, text on Python 3.
    if not isinstance(s, (text_type, binary_type)):
        raise TypeError("not expecting type '%s'" % type(s))
    if PY2 and isinstance(s, text_type):
        s = s.encode(encoding, errors)
    elif PY3 and isinstance(s, binary_type):
        s = s.decode(encoding, errors)
    return s


def ensure_text(s, encoding='utf-8', errors='strict'):
    # Return s as text_type, decoding bytes if necessary.
    if isinstance(s, binary_type):
        return s.decode(encoding, errors)
    elif isinstance(s, text_type):
        return s
    else:
        raise TypeError("not expecting type '%s'" % type(s))
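
As a rough illustration of how these helpers might be used, the sketch below normalizes input to text at the boundary, works on text internally, and encodes back to bytes on the way out. The shout function is a hypothetical example and not part of six; ensure_text and ensure_binary are repeated in condensed form only so that the snippet runs on its own.

```python
import sys

PY3 = sys.version_info[0] == 3
# The conditional expression never evaluates `unicode` on Python 3.
text_type = str if PY3 else unicode  # noqa: F821
binary_type = bytes if PY3 else str


def ensure_text(s, encoding='utf-8', errors='strict'):
    # Condensed copy of the helper above, repeated so this snippet is standalone.
    if isinstance(s, binary_type):
        return s.decode(encoding, errors)
    if isinstance(s, text_type):
        return s
    raise TypeError("not expecting type '%s'" % type(s))


def ensure_binary(s, encoding='utf-8', errors='strict'):
    # Condensed copy of the helper above, repeated so this snippet is standalone.
    if isinstance(s, text_type):
        return s.encode(encoding, errors)
    if isinstance(s, binary_type):
        return s
    raise TypeError("not expecting type '%s'" % type(s))


def shout(raw):
    # Hypothetical function: accepts bytes or text, always returns UTF-8 bytes.
    text = ensure_text(raw)                      # decode at the input boundary
    return ensure_binary(text.upper() + u'!')    # encode at the output boundary


# The same call works whether the caller passes bytes or text.
result_from_bytes = shout(b'caf\xc3\xa9')   # UTF-8 encoded bytes
result_from_text = shout(u'caf\xe9')        # the same word as text
```

Both calls should yield the same bytes object on either interpreter, which is the point of converting once at the edges instead of scattering version checks through the script.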