python python-2.7 tamil

How to search for all occurrences of a Tamil character in a line?


I am trying to write a program that counts the occurrences of a Tamil character in a line/sentence. My code checks whether the character is present in a line, but once it finds it, it moves on to the next line and does not check for a second or third occurrence. Here is the code (I have split the words into characters, so the character I am checking for is stored in "word"):

    count=0
    word="ஆ"
    f=open('input','r')
    for line in f.readlines():
        if word in line:
            count=count+1
    print count
    f.close()

The input file "input" contains:

   ஆ ன் டை ன்  
   ஆ ன் டை னி ன் 
   ஆ ன் டொ வி ன் 
   ஆ ன் ட் டா ல ஜி 
   எ ன் றி ஆ ன் 
   ஆ ன் ட் ட ன் ஆ

The current output is:

count:6

But the output should be:

count:7

In the last line, the character is found at the beginning and the check stops there. I want it to scan the whole line and count all occurrences. How should I modify it?


Solution

  • Currently you are only checking whether the character is in a line at all, not counting its occurrences. The count method does what you want: https://docs.python.org/2/library/stdtypes.html#str.count

    >>> 'hello world'.count('l')
    3
    

    Also, as Wooble already pointed out in his comment, you must take special care when using non-ASCII characters in Python 2 (his comment has the information you need).
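
Applied to the question's loop, the idea could look roughly like the sketch below. It is not the asker's code: the helper name count_occurrences is made up for illustration, and it assumes the input file is saved as UTF-8. In Python 2, open() returns undecoded byte strings, so the sketch uses io.open to decode each line to text first. With the future imports it should run the same way on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import io


def count_occurrences(path, char, encoding='utf-8'):
    """Return how many times `char` appears in the file at `path`.

    Hypothetical helper; the UTF-8 default is an assumption about the file.
    """
    total = 0
    # io.open decodes each line to text on both Python 2 and Python 3.
    with io.open(path, 'r', encoding=encoding) as f:
        for line in f:
            # count() tallies every occurrence in the line, whereas
            # `char in line` only says whether there is at least one.
            total += line.count(char)
    return total
```

Calling print(count_occurrences('input', 'ஆ')) on the file from the question should then count both occurrences in the last line instead of only the first.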