I tried to implement quick select to find the m'th smallest number in the list. When I run the program it returns the correct values sometime and incorrect values other times on the same array. What am I doing wrong her is the code
def select_mth_smallest(A, m):
pivot = np.random.choice(A)
# split into 3 partitions
A1 = []
A2 = []
A3 = []
for item in A:
if item < pivot:
A1.append(item)
if item > pivot:
A3.append(item)
else:
A2.append(item)
#find where m'th largest element is and recurse accordingly
if len(A1) >= m:
return select_mth_smallest(A1, m)
elif (len(A1) + len(A2)) >= m:
return pivot
else:
return select_mth_smallest(A3,m - (len(A1)+len(A2)))
Here is an input where the algorithm fails.
A = [1,2,3,4,5]
select_mth_smallest(A,5)
When I repeatedly execute this above statement I get, 5(correct) and 4(incorrect) alternatingly.
One thing that particularly baffles me (I am new to python) is that why I get different return values for the function when repeated with the same input. Looks rather sporadic.. BTW I am using Jupyter
You are adding some items to multiple partitions.
if item < pivot:
A1.append(item)
if item > pivot:
A3.append(item)
else:
A2.append(item)
A1
is the set of items less than the pivot. A3
is the set of items greater than the pivot. A2
, however, is the set of items less than or equal to the pivot, because the 2nd if
statement executes for all items, and one or the other branch will execute.
You want one, single if
statement with an elif
and else
clause here.
if item < pivot:
A1.append(item)
elif item > pivot:
A3.append(item)
else:
A2.append(item)