python-3.xalphabetical-sort

How to get longest alphabetically ordered substring in python


I am trying to write a function that returns the longest substring of s in which the letters occur in alphabetical order. For example, if s = 'azcbobobegghakl', the function should return 'beggh'

Here is my function, which is still not complete but it does not return the list of sub; the return error is:

"IndexError: string index out of range"

def longest_substring(s):
    sub=[]
    for i in range (len(s)-1):
        subs=s[i]
        counter=i+1
        while ord(s[i])<ord(s[counter]):
            subs+=s[counter] 
            counter+=1
        sub.append(subs)
    return sub

Solution

  • It is not optimal (works in linear time O(n)) but i made some modification to your code (in Python 3):

    def longest_substring(s):
        length = len(s)
        if length == 0 :         # Empty string
            return s
        final = s[0]
        for i in range (length-1):
            current = s[i]
            counter = i+1
            while counter < length and ord(s[i]) <= ord(s[counter]):  
                current += s[counter] 
                counter +=1
                i+=1
            if len(final) < len(current):
                final = current
        return final
    s = 'azcbobobegghakl'
    print(longest_substring(s))
    

    Output:

    beggh
    

    Modifications:

    • You are comparing character with fixed position i.e. in while loop you are incrementing only counter not i so I incremented the ith position also.(So we avoid checking the characters which are already checked, So it does this in linear time O(n) I think..)
    • Also you are only checking less than for condition while ord(s[i])<ord(s[counter]): But you also have to check for equals too.
    • You created one list where you append every sequence which is unnecessary unless you want do any other calculations on the sequence, So I take string and if previous sequence's length is small then I updated it with new sequence.

    Note : If two sequence's length is same then 1st occurring sequence is shown as output.

    Another Input:

    s = 'acdb'
    

    Output:

    acd
    

    I hope this will help you.