Tags: python, xls, cjk, xlrd, hindi

Using xlrd to read Excel xls file containing Chinese and/or Hindi characters


http://scienceoss.com/read-excel-files-from-python/comment-page-1/#comment-1051

I used the utility from the link above to read an XLS file. If the file contains characters from other languages, such as Chinese or Hindi, they are not output correctly. Is there a workaround for this?

After Googling, I found this:

import xlrd

def upload_xls(dir, file, request):
    try:
        global msg
        global row_num
        row_num = []
        header_arr = []
        global file_path
        file_path = dir
        # reader = csv.reader(open(file), delimiter='#', quotechar='"')
        # Intended to set the encoding (cp1252, which is not UTF-8)
        book = xlrd.open_workbook('dodgy.xls', encoding='cp1252')
        book.sheet_names()
        sh = book.sheet_by_index(0)
        valid_xl_format = 0
        invalid_xl_format = 0
    except Exception:
        print("Error")

But there is an error in the line book = open_workbook('dodgy.xls',encoding='cp1252'):

TypeError: open_workbook() got an unexpected keyword argument 'encoding'


Solution

  • According to the xlrd module documentation, the keyword argument is encoding_override="cp1252", not encoding="cp1252".

    Also, because you import the module with import xlrd, you must call the function as xlrd.open_workbook. The line you quote with the error calls open_workbook directly, which only works if you had used from xlrd import open_workbook (or from xlrd import *).
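A minimal sketch of the corrected call, assuming you only need the rows of the first sheet. The helper name read_first_sheet is made up for illustration; open_workbook, encoding_override, sheet_by_index, nrows and row_values are the xlrd names. As far as the xlrd documentation describes it, encoding_override only helps with older files whose codepage information is missing or wrong, since Excel 97 and later store text as Unicode. Also note that cp1252 is a Western European codepage, so it may not be the right value for Chinese or Hindi text.

```python
def read_first_sheet(path, encoding_override=None):
    """Return the rows of the first sheet as lists of cell values."""
    # xlrd is a third-party package; it is imported here so the dependency
    # is only needed when a workbook is actually read.
    import xlrd

    # encoding_override is the keyword xlrd accepts. Passing encoding=...
    # raises the TypeError shown in the question.
    book = xlrd.open_workbook(path, encoding_override=encoding_override)
    sheet = book.sheet_by_index(0)

    # Text cells come back as Unicode strings, one list per row.
    return [sheet.row_values(row) for row in range(sheet.nrows)]
```

With the file from the question, this would be called as read_first_sheet('dodgy.xls', encoding_override='cp1252'). If the rows look right as Unicode strings but print incorrectly, the problem is more likely in how the console or output file encodes them than in xlrd.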