There are a lot of ways to check if string is a palindrome. Plenty of them listed here
This question is not about "how" but rather about performance.
I was assuing that is_palindrome
should be twice faster than is_palindrome0
because it does len/2
iterations in the worst case.
However, in reality, is_palindrome
takes more than 30 seconds to check all the strings while is_palindrome0
gets the job done in less than half of a second!
Test-case
def is_palindrome(s):
for i in range(len(s) // 2):
if s[i] != s[len(s)-i-1]:
return 0
return 1
def is_palindrome0(s):
if s == s[::-1]:
return 1
else:
return 0
N = 500
L = 99999
sss = '101' * L
import time
start = time.time()
print(sum([1 for i in range(N) if is_palindrome0(sss+sss[i:])]))
end = time.time()
print(f'{(end - start):.2f}')
start = time.time()
print(sum([1 for i in range(N) if is_palindrome(sss+sss[i:])]))
end = time.time()
print(f'{(end - start):.2f}')
Output
168
0.40
168
34.20
Any ideas why string slicing is so crazy fast? How to debug further?
Apologies if I missed answered question with in-depth performance comparison.
UPDATE. Taking into account Frank's comment.
def is_palindrome(s):
l = len(s)
for i in range(l // 2):
if s[i] != s[l-i-1]:
return 0
return 1
def is_palindrome0(s):
if s == s[::-1]:
return 1
else:
return 0
N = 500
L = 99999
sss = '101' * L
import time
start = time.time()
print(sum([1 for i in range(N) if is_palindrome0(sss+sss[i:])]))
end = time.time()
print(f'{(end - start):.2f}')
start = time.time()
print(sum([1 for i in range(N) if is_palindrome(sss+sss[i:])]))
end = time.time()
print(f'{(end - start):.2f}')
A bit faster but still around 50x times slower.
168
0.41
168
25.11
is_palindrome0 (the fast one)
s[::-1]
uses C-level optimizations under the hood (it's implemented in highly efficient C code in CPython). String comparisons (==
) are also optimized for short-circuiting; they stop early if a mismatch is found. The entire operation is happening in compiled code with no explicit Python loops.
is_palindrome (the slow one):
This is pure Python, interpreted line by line. Each index access like s[i]
and s[len(s) - i - 1]
is an individual bytecode operation. Python for
loops and indexing are orders of magnitude slower than equivalent operations in C.
The main performance killer here is actually in the line is_palindrome(sss+sss[i:]). For each iteration, in ur code:
U create a new string via concatenation (sss+sss[i:])
This new string is ~200,000 characters long
Then ur function iterates through half of it