pythonalgorithmperformance

Palindromes and string slicing. Performance


There are a lot of ways to check if string is a palindrome. Plenty of them listed here

This question is not about "how" but rather about performance.

I was assuing that is_palindrome should be twice faster than is_palindrome0 because it does len/2 iterations in the worst case.

However, in reality, is_palindrome takes more than 30 seconds to check all the strings while is_palindrome0 gets the job done in less than half of a second!

Test-case

def is_palindrome(s):
    for i in range(len(s) // 2):
        if s[i] != s[len(s)-i-1]:
            return 0
    return 1

def is_palindrome0(s):
    if s == s[::-1]:
        return 1
    else:
        return 0

N = 500
L = 99999

sss = '101' * L

import time

start = time.time()
print(sum([1 for i in range(N) if is_palindrome0(sss+sss[i:])]))
end = time.time()
print(f'{(end - start):.2f}')

start = time.time()
print(sum([1 for i in range(N) if is_palindrome(sss+sss[i:])]))
end = time.time()
print(f'{(end - start):.2f}')

Output

168
0.40
168
34.20

Any ideas why string slicing is so crazy fast? How to debug further?

Apologies if I missed answered question with in-depth performance comparison.

UPDATE. Taking into account Frank's comment.

def is_palindrome(s):
    l = len(s)
    for i in range(l // 2):
        if s[i] != s[l-i-1]:
            return 0
    return 1

def is_palindrome0(s):
    if s == s[::-1]:
        return 1
    else:
        return 0

N = 500
L = 99999

sss = '101' * L

import time

start = time.time()
print(sum([1 for i in range(N) if is_palindrome0(sss+sss[i:])]))
end = time.time()
print(f'{(end - start):.2f}')

start = time.time()
print(sum([1 for i in range(N) if is_palindrome(sss+sss[i:])]))
end = time.time()
print(f'{(end - start):.2f}')

A bit faster but still around 50x times slower.

168
0.41
168
25.11

Solution

  • is_palindrome0 (the fast one)

    s[::-1] uses C-level optimizations under the hood (it's implemented in highly efficient C code in CPython). String comparisons (==) are also optimized for short-circuiting; they stop early if a mismatch is found. The entire operation is happening in compiled code with no explicit Python loops.

    is_palindrome (the slow one):

    This is pure Python, interpreted line by line. Each index access like s[i] and s[len(s) - i - 1] is an individual bytecode operation. Python for loops and indexing are orders of magnitude slower than equivalent operations in C.

    The main performance killer here is actually in the line is_palindrome(sss+sss[i:]). For each iteration, in ur code: