python, search, treetagger

Is 'search' causing 'string index out of range'? (Python)


I'm trying to identify all instances of a specific syntactic pattern found in a text: RB + NN|NNS|NP|PP. That is to say, I'm looking for adverbs that are immediately followed by nouns. I've tagged my text using TreeTagger. The tagged text is stored in a list called 'tags' that looks like this:

    how  WRB
    hard JJ
    it   PP
    was  VBD

This is the relevant part of my code:

    adverb = re.compile(r'RB$')
    noun = re.compile(r'NN')
    for n in range(len(tags)):
        w = tags[n]
        if adverb.search(w) != None and noun.search(w[n+1]) != None:
            print(' '.join(tags[n-2 : n+3]))

My problem is that the fifth line produces the following error:

     if adverb.search(w) != None and noun.search(w[n+1]) != None:
     IndexError: string index out of range

If the fifth line of code is this...

     if adverb.search(w) != None:

...then a list of adverbs is returned.

I'm really lost as to 1) why I am getting this error and 2) how I can fix it. Any guidance you can offer would be much appreciated.


Solution

  • Your problem is this:

    w[n+1]
    

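    You are confusing the list tags with w, which is a single string taken from that list. w[n+1] indexes a character of the string w, so it raises IndexError as soon as n+1 reaches the length of w. To look at the next item in the list, use tags[n+1], not w[n+1]. Also make sure the index stays inside the list: on the last iteration n+1 equals len(tags), so tags[n+1] would be out of range as well.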
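
A minimal sketch of the corrected loop, assuming each item in tags is a single "word<TAB>tag" string (the sample data and the function name find_adverb_noun are made up for illustration). It indexes tags rather than w, stops one item before the end so that tags[n + 1] always exists, and clamps the start of the context slice, since a negative start such as tags[-2:3] would silently pick the wrong items:

```python
import re

# Hypothetical sample data: one "word<TAB>tag" string per token.
tags = ["it\tPP", "was\tVBD", "really\tRB", "fun\tNN",
        "and\tCC", "very\tRB", "hard\tJJ", "too\tRB"]

adverb = re.compile(r'RB$')   # same patterns as in the question
noun = re.compile(r'NN')

def find_adverb_noun(tags):
    """Return the context around every adverb directly followed by a noun."""
    matches = []
    # Stop at len(tags) - 1 so that tags[n + 1] is always a valid index.
    for n in range(len(tags) - 1):
        if adverb.search(tags[n]) is not None and noun.search(tags[n + 1]) is not None:
            # Clamp the start so the slice never begins at a negative index.
            start = max(n - 2, 0)
            matches.append(' '.join(tags[start:n + 3]))
    return matches

for context in find_adverb_noun(tags):
    print(context)
```

The patterns are left exactly as in the question, so r'RB$' still matches any tag that ends in RB and r'NN' only covers the NN-style tags; widening them to NP and PP is a separate change.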