python csv thai

How can I use Python to find a specific Thai word in multiple CSV files and return a list of the file names that contain the word?


I have 100+ files in a directory, and I need to find out which of them contain the Thai word I am looking for.

Thank you

I tried this, but it doesn't work: `

import pandas as pd
import re
import os

FOLDER_PATH = r'C:\Users\project'

list = os.listdir(FOLDER_PATH)

 def is_name_in_csv(word,csv_file):
  with open(csv_file,"r") as f:
    data = f.read()
  return bool(re.search(word, data))

word = "บัญชีรายรับ"
for csv_file in list:
  if is_name_in_csv(word,csv_file):
    print(f"find the {word} in {csv_file}")

`


Solution

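  • You don't need regex. You can simply check whether the word is in the file contents with the in operator. Also, I renamed list to paths, because list is the name of a built-in Python type (not a keyword), and reusing it as a variable name shadows the built-in.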

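    import os
    from typing import Iterator
    
    FOLDER_PATH = r'C:\Users\project'
    
    # os.listdir returns bare file names, so join each one with the folder;
    # otherwise open() only works when the script runs from inside that folder
    paths = [
        os.path.join(FOLDER_PATH, name)
        for name in os.listdir(FOLDER_PATH)
        if name.lower().endswith(".csv")
    ]
    
    def files_with_word(word: str, paths: list) -> Iterator[str]:
        for path in paths:
            # utf-8 is an assumption; on Windows the default encoding may not
            # decode Thai text, so change this if your files use e.g. cp874
            with open(path, "r", encoding="utf-8") as f:
                if word in f.read():
                    yield path
    
    #if you need to do something after each found file
    for filepath in files_with_word("บัญชีรายรับ", paths):
        print(filepath)
    
    #or if you just need a list
    filepaths = list(files_with_word("บัญชีรายรับ", paths))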
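
If you are not sure how the files are encoded, one option is to try a few candidate encodings per file and use the first one that decodes cleanly. The following is only a sketch under assumptions: the function name find_files_with_word and the candidate list ("utf-8-sig", then "cp874", a common Thai Windows code page) are illustrative choices, not something taken from the question, so adjust them to match your data.

```python
from pathlib import Path


def find_files_with_word(folder, word, encodings=("utf-8-sig", "cp874")):
    """Return the names of the .csv files in `folder` whose text contains `word`.

    `encodings` is an assumed list of candidates, tried in order; the first
    one that decodes the file without errors is used for the search.
    """
    matches = []
    for path in sorted(Path(folder).glob("*.csv")):
        for encoding in encodings:
            try:
                text = path.read_text(encoding=encoding)
            except UnicodeDecodeError:
                # Wrong guess for this file, try the next candidate encoding
                continue
            if word in text:
                matches.append(path.name)
            # The file decoded cleanly, so do not try further encodings
            break
    return matches


if __name__ == "__main__":
    # Folder path taken from the question; change it to your own directory
    for name in find_files_with_word(r"C:\Users\project", "บัญชีรายรับ"):
        print(name)
```

Files that cannot be decoded with any of the candidates are skipped silently here; if that matters, you may want to collect and report them instead.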