Interview hash map question on code fights, need help optimizing my brute force solution. Here is the problem:
Given a string str and array of pairs that indicates which indices in the string can be swapped, return the lexicographically largest string that results from doing the allowed swaps. You can swap indices any number of times.
Example
For str = "abdc" and pairs = [[1, 4], [3, 4]], the output should be
swapLexOrder(str, pairs) = "dbca".
By swapping the given indices, you get the strings: "cbda", "cbad", "dbac", "dbca". The lexicographically largest string in this list is "dbca".
My current solution
Brute force by continually adding all possibilities until the there are no new solutions. This is too slow for swapLexOrder('dznsxamwoj',[[1,2],[3,4],[6,5],[8,10]])
, doesn't finish on my machine. Any hints for optimizing? An easier test case that passes is swapLexOrder('abdc,[[1,4],[3,4]])= dbca
def swapLexOrder(str, pairs):
d = {}
d[str]=True
while True:
oldlen=len(d)
for x,y in pairs:
for s in d.keys():
d[swp(s,x,y)]=True
if len(d) == oldlen:
#no more new combinations.
return sorted(d)[-1]
def swp(str,x,y):
x=x-1
y=y-1
a=str[x]
b=str[y]
return str[0:x]+b+str[x+1:y]+a+str[y+1:]
My proposed solution would be to first try to 'link' as many pairs as possible to form sets of indices which can be interchanged - eg in your first example, [[1, 4], [3, 4]]
can become [[1, 3, 4]]
. Each of these subsets of indices can then be lexicographically sorted to form the output. The implementation comes to this:
def build_permitted_subs(pairs):
perm = []
for a, b in pairs:
merged = False
for ind, sub_perm in enumerate(perm):
if a in sub_perm or b in sub_perm:
sub_perm.add(a)
sub_perm.add(b)
merged = True
break
else:
perm.append(set([a, b]))
if merged:
for merge_perm_ind in reversed(range(ind + 1, len(perm))):
if perm[merge_perm_ind] & sub_perm:
sub_perm.update(perm[merge_perm_ind])
perm.pop(merge_perm_ind)
return list(map(sorted, perm))
def swap_lex_order(swap_str, _pairs):
pairs = [[a - 1, b - 1] for a, b in _pairs]
out = list(swap_str)
perm = build_permitted_subs(pairs)
for sub_perm in perm:
sorted_subset = sorted(sub_perm, key=lambda ind: swap_str[ind], reverse=True)
for sort, targ in zip(sorted_subset, sub_perm):
out[targ] = swap_str[sort]
return "".join(out)
print(swap_lex_order("dznsxamwoj", [[1, 2], [3, 4], [6, 5], [8, 10]]))
print(swap_lex_order("abdc", [[1, 4], [3, 4]]))
print(swap_lex_order("acxrabdz",[[1,3], [6,8], [3,8], [2,7]]))
with output:
zdsnxamwoj
dbca
zdxrabca
I've also renamed your parameters not to use str
, which is already a pretty fundamental Python builtin. Note that my code may not be as Pythonic as possible, but I think it works well enough to illustrate the algorithm, and it's not suffering from any major performance hits. I suspect this approach has a pretty low complexity - it's generally 'intelligent' in that it doesn't brute force anything, and uses O(n log n)
sorts. The first example seems to be right. Note that this transforms each pair to be 0-based as this is much easier for Python.
This relies a little on being able to form any permutation (sorting the linked pairs) from adjacent permutations (swapping pairs). This may not be entirely intuitive, but it might help to realise you can effectively perform insertion using only adjacent swaps in a list (by continually swapping an element in the direction for it to go). An example of permuting a list using adjacent swaps is bubble sort, and you might realise that if any permutation can be bubblesorted, that means all permutations can be reached by bubblesort.
If you have any questions, or anything doesn't work, let me know and I'll start elaborating/debugging. (As of 19:28 GMT I've already noticed one bug and edited it out : ). Bug #2 (with the duplicated z at test case 3) should also be fixed.
A little more on bug #1:
I hadn't sorted the indices returned by build_permitted_subs
, so it couldn't sort them properly with reference to swap_str
.
More on bug #2:
The build_permitted_subs
function wasn't working properly - specifically, if it met a pair that could go into two groups, meaning those groups should also join together, this didn't happen, and there would now be two groups that shouldn't be separate. This leads to z duplication as both groups can draw from the z. I've sloppily fixed this with a flag and a retroactive for loop.