pythonstringtime-complexity

Python string comparisons time complexity


I'm curious how Python performs string comparisons under the hood. For example if

if s1 == s2:
   print(True)
else:
   print(False)

is the same as

condition= True
for x,y in zip(s1, s2):
    if x != y:
        condition = False 
print(condition)

Perhaps under the hood python is able to use ord values more efficiently than O(n) traversals?


Solution

  • Python's string compare is implemented in unicodeobject.c. After a few checks such as string length and "kind" (Python may use 1, 2 or 4 bytes per character depending on unicode USC character size), it's just a call to the C lib memcmp.

    With a quick change to your Python code

    condition = True
    if len(s1) == len(s2):
        for x,y in zip(s1, s2):
            if x != y:
                condition = False
                break
    else:
        condition = False
    

    The Python code has the same O(n) time complexity as memcmp, it's just that Python has a much bigger O. Time complexity doesn't say anything about how long an operation takes, just how an operation scales with a larger input set n.

    memcmp is much faster than the python version because of inherent language overhead. But it scales the same. And when you think about it, each of the if x != y: compares in the second example runs the exact same code as the single s1 == s2 compare in the first.