pythonalgorithmprefix-tree

Remove all string items from a list, that are prefix of other string items in the list


I've got a list fo paths, and I'd like to keep only the items that are not prefix of any other item.

For example, In the following list:

private
private/etc
private/etc/pam.d
usr
usr/local
usr/local/lib
usr/local/lib/security

I want to keep only:

private/etc/pam.d
usr/local/lib/security

I prefer not to "invent the wheel" and implement prefix tree, but using a python package that already do this.

thanks!


Solution

  • If your list is already ordered, each item is a prefix of the following OR is not a prefix of any of the following.

    Therefore, you can write:

    ls.sort()
    [ls[i] for i in range(len(ls))[:-1] if ls[i] != ls[i+1][:len(ls[i])]] + [ls[-1]]
    

    Another implementation, using zip:

    [x for x, y in zip(ls[:-1], ls[1:]) if x != y[:len(x)]] + [ls[-1]]