python file operating-system tensorflow2.0 hidden-files

Hidden file in my training dataset makes TensorFlow return "Unknown image file format. One of JPEG, PNG, GIF, BMP required."


I have a TensorFlow model. During the first epoch, training runs until it is about a third of the way through (735/2201 [=========>....................]) and then fails with the error in the title.

First, I wrote a script to remove every file in that directory that doesn't end with .jpg, but nothing changed.

import os
for file in os.listdir(path):
    if not file.endswith('.jpg'):
        os.remove(os.path.join(path,file))

Then I opened bash on my Mac and listed all the files in the directory, including hidden ones, but they were all just JPEGs.
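The same check can be done from Python, since os.listdir returns dot-files too (it only leaves out the special entries . and ..). A minimal sketch; the helper name list_hidden_files is made up for illustration:

```python
import os

def list_hidden_files(directory):
    """Return the names of entries in `directory` that start with a dot."""
    # os.listdir includes dot-files such as .DS_Store; it only omits '.' and '..'
    return sorted(name for name in os.listdir(directory) if name.startswith('.'))
```

An empty result means there are no dot-files in that directory.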

EDIT:

Nessuno's answer is correct, but you have to pass the full path to imghdr.what and not just the file name. Something like this should work:

import os
import imghdr

# define your path
path = ''

for file in os.listdir(path):
    # imghdr.what needs the full path, not just the file name
    file_path = os.path.join(path, file)
    if imghdr.what(file_path) != 'jpeg':
        os.remove(file_path)

I ended up removing 5 files that were not JPEGs.


Solution

  • Some file in your path has the .jpg extension but contains a different file format.

    You can use the imghdr library (which ships with Python: https://docs.python.org/3/library/imghdr.html) to check whether each file's header is jpeg, and remove the file if it is not.

    In short, you can change your script to something like:

    import os
    import imghdr

    for file in os.listdir(path):
        # imghdr.what needs the full path, not just the file name
        file_path = os.path.join(path, file)
        if imghdr.what(file_path) != 'jpeg':
            os.remove(file_path)
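Note that imghdr was deprecated in Python 3.11 and removed in 3.13, so on newer interpreters the same idea can be sketched by reading the first bytes of each file yourself. The helper names (sniff_image_format, find_non_jpegs) and the signature table below are illustrative assumptions, not part of any library, and the sketch only reports suspect files rather than deleting them:

```python
import os

# Leading bytes ("magic numbers") of the formats named in the TensorFlow error.
SIGNATURES = {
    'jpeg': (b'\xff\xd8\xff',),
    'png': (b'\x89PNG\r\n\x1a\n',),
    'gif': (b'GIF87a', b'GIF89a'),
    'bmp': (b'BM',),
}

def sniff_image_format(file_path):
    """Guess the image format from the file header; None if unrecognised."""
    with open(file_path, 'rb') as fh:
        header = fh.read(8)
    for name, prefixes in SIGNATURES.items():
        # bytes.startswith accepts a tuple of candidate prefixes
        if header.startswith(prefixes):
            return name
    return None

def find_non_jpegs(directory):
    """Return full paths of regular files whose header is not a JPEG header."""
    suspects = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path) and sniff_image_format(full_path) != 'jpeg':
            suspects.append(full_path)
    return suspects
```

Reviewing the list returned by find_non_jpegs(path) before calling os.remove on each entry avoids deleting anything by mistake. This only inspects the header, so a truncated or otherwise corrupt file that still starts with a JPEG signature would not be flagged.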