I am trying to build an Excel translator using threads, queues, and a semaphore. It was working perfectly, and I was using it to translate very large XLSX files in a very short time.
Today I tried to translate one of my documents, but the program no longer worked. After hours of debugging, I found that when a value of type datetime.datetime appears in the Excel document, it blocks everything.
Here is my code and the output to give you a better understanding:
print("feed raw aquired")
for row_index in range(sheet.shape[0]):
    for col_index in range(sheet.shape[1]):
        print("START --------------------")
        value = sheet.iat[row_index, col_index]
        print(type(value))
        print(value)
        if type(value) == str:
            print("str")
            raw_data.put({"row": row_index, "col": col_index, "value": value})
        elif math.isnan(value):
            print("nan")
            translated_data.put({"row": row_index, "col": col_index, "value": ""})
        elif type(value) == float:
            print("float")
            translated_data.put({"row": row_index, "col": col_index, "value": value})
        elif type(value) == numpy.int64 or type(value) == int:
            print("int")
            translated_data.put({"row": row_index, "col": col_index, "value": value})
        else:
            print("else")
            translated_data.put({"row": row_index, "col": col_index, "value": value})
        print("END --------------------")
sem.release()
print("feed raw released")
The output is:
START --------------------
<class 'float'>
nan
nan
END --------------------
START --------------------
<class 'numpy.float64'>
nan
nan
END --------------------
START --------------------
<class 'str'>
기저귀
str
END --------------------
START --------------------
<class 'str'>
M900009355
str
END --------------------
START --------------------
<class 'str'>
네이쳐러브메레 소프트핏 밴드 특대형 4팩
str
END --------------------
START --------------------
<class 'str'>
제조일자
str
END --------------------
START --------------------
<class 'str'>
2021-01-20
str
END --------------------
START --------------------
<class 'float'>
2.0
float
END --------------------
START --------------------
<class 'float'>
9110.0
float
END --------------------
START --------------------
<class 'int'>
18220
int
END --------------------
START --------------------
<class 'float'>
34900.0
float
END --------------------
START --------------------
<class 'str'>
리퍼상품
str
END --------------------
START --------------------
<class 'float'>
nan
nan
END --------------------
START --------------------
<class 'float'>
nan
nan
END --------------------
START --------------------
<class 'numpy.float64'>
nan
nan
END --------------------
START --------------------
<class 'str'>
기저귀
False 0
str
END --------------------
START --------------------
<class 'str'>
M900009357
str
END --------------------
START --------------------
<class 'str'>
네이쳐러브메레 소프트핏 팬티 특대형 4팩
str
END --------------------
START --------------------
<class 'str'>
유통기한
str
END --------------------
START --------------------
<class 'datetime.datetime'>
2024-01-12 00:00:00
When the function encounters a datetime.datetime, it blocks everything: it does not even get far enough to print END --------------------.
This is very weird behaviour that I would like to understand; I don't think I have seen anything like it before. If you could help me, that would be awesome :D
If you want to see the full code of this translator, here it is: https://github.com/mathias-vandaele/xlsx-translator
Thank you all
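math.isnan(value) seems to break silently when value is a datetime.datetime. math.isnan only accepts real numbers, so it raises a TypeError for a datetime; presumably that exception ends the thread before it reaches sem.release(), which would explain why everything blocks.
Using pd.isna(value) from pandas instead seems to have resolved the issue.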
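
To make the difference concrete, here is a minimal sketch of the two checks side by side. The `feed` worker at the end is a hypothetical, simplified stand-in for the loop in the question (it is not the code from the repository); it only illustrates that a try/finally around the loop would still release the semaphore if some cell check raised.

```python
import datetime
import math
import threading

import pandas as pd

cell = datetime.datetime(2024, 1, 12)

# math.isnan only accepts real numbers, so a datetime raises TypeError.
try:
    math.isnan(cell)
    isnan_error = None
except TypeError as exc:
    isnan_error = exc
print("math.isnan raised:", type(isnan_error).__name__)

# pd.isna accepts arbitrary scalars and simply answers False for a datetime.
datetime_is_missing = bool(pd.isna(cell))
nan_is_missing = bool(pd.isna(float("nan")))
print(datetime_is_missing)  # False
print(nan_is_missing)       # True


def feed(values, sem, out):
    """Hypothetical worker: copy values, replacing missing cells with ""."""
    try:
        for value in values:
            out.append("" if pd.isna(value) else value)
    finally:
        # Released even if a check above raises, so waiting threads never hang.
        sem.release()


sem = threading.Semaphore(0)
out = []
worker = threading.Thread(target=feed, args=([cell, float("nan"), "abc"], sem, out))
worker.start()
worker.join()
released = sem.acquire(timeout=1)
print(released, out)
```

With this arrangement an unexpected cell type would still show up as a traceback from the worker thread, but the other threads waiting on the semaphore would not block forever.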