I am training a CNN model on the COCO dataset, and I get the error below after some number of iterations. It is not consistent: it has occurred at iteration 1100, at iteration 4500, and at iteration 8900 (each time within 1 epoch).
I thought it might be a bug in the new version of imageio, so I switched to version 2.3.0, but I still get the error after 8900 iterations in 1 epoch.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-46-4b33bec4a89e> in <module>()
52
53 # train for one epoch
---> 54 train_loss = train(train_loader, model, [criterion1, criterion2], optimizer)
55 print('train_loss: ',train_loss)
56
4 frames
/usr/local/lib/python3.7/dist-packages/torch/_utils.py in reraise(self)
432 # instantiate since we don't know how to
433 raise RuntimeError(msg) from None
--> 434 raise exception
435
436
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "<ipython-input-34-4c8722b5b16b>", line 143, in __getitem__
image = imageio.imread(img_path, pilmode='RGB')
File "/usr/local/lib/python3.7/dist-packages/imageio/core/functions.py", line 206, in imread
reader = read(uri, format, 'i', **kwargs)
File "/usr/local/lib/python3.7/dist-packages/imageio/core/functions.py", line 129, in get_reader
return format.get_reader(request)
File "/usr/local/lib/python3.7/dist-packages/imageio/core/format.py", line 168, in get_reader
return self.Reader(self, request)
File "/usr/local/lib/python3.7/dist-packages/imageio/core/format.py", line 217, in __init__
self._open(**self.request.kwargs.copy())
TypeError: _open() got an unexpected keyword argument 'pilmode'
I've had this error before. The TL;DR is that you can't assume all of your data is clean and parseable. As far as I can tell, you aren't loading the data in order either, and you may even have data shuffling enabled. With all of that in mind, you should not expect it to fail deterministically at iteration 100 or 102 or any other fixed iteration.
The issue comes down to one (or more) of the files in the COCO dataset being either corrupted or in a different format. You can process the images in order with a batch size of 1 and print out each file name to see which one it is.
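A minimal sketch of that in-order scan, in case it helps. It skips the DataLoader entirely and just tries to read every file once, collecting the ones that fail. `find_unreadable` and `image_paths` are hypothetical names, not something from your code; the reader is passed in so you can use the exact `imageio.imread(..., pilmode='RGB')` call from your `__getitem__`.

```python
from typing import Callable, Iterable, List, Tuple


def find_unreadable(
    paths: Iterable[str],
    read_fn: Callable[[str], object],
) -> List[Tuple[str, str]]:
    """Try to read every path in order and return (path, error) for each failure."""
    failures = []
    for path in paths:
        try:
            read_fn(path)
        except Exception as exc:  # broad on purpose: we want every failing file
            failures.append((path, f"{type(exc).__name__}: {exc}"))
    return failures


# Usage with the loader from the question (image_paths is an assumed list of
# file paths built the same way your dataset builds img_path):
#
#   import imageio
#   bad = find_unreadable(image_paths,
#                         lambda p: imageio.imread(p, pilmode='RGB'))
#   for path, err in bad:
#       print(path, err)
```

Any file that shows up with the same `unexpected keyword argument 'pilmode'` message is a likely culprit. My guess is that imageio is picking a different reader for that file than for the rest, but I have not verified that against your data.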
To "fix" this issue you can do one of several things:
See here as an example failure scenario when loading in images with imageio.
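One possible workaround, sketched under assumptions: if you would rather keep training than clean the dataset first, you can wrap the dataset so that a failed read falls through to the next index. `SkipUnreadable` is a hypothetical helper, not part of PyTorch or imageio, and it assumes a map-style dataset (one with `__len__` and `__getitem__`).

```python
class SkipUnreadable:
    """Wrap a map-style dataset; if reading a sample fails, try the next index."""

    def __init__(self, dataset, max_tries=10):
        self.dataset = dataset
        self.max_tries = max_tries  # give up after this many consecutive failures

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        last_error = None
        for offset in range(self.max_tries):
            try:
                # Wrap around at the end so the last index also has a fallback.
                return self.dataset[(idx + offset) % len(self.dataset)]
            except Exception as exc:
                last_error = exc
        raise RuntimeError(
            f"no readable sample within {self.max_tries} tries of index {idx}"
        ) from last_error
```

You would then pass `SkipUnreadable(your_dataset)` to the `DataLoader` in place of the original dataset. Note the trade-off: a bad sample is silently replaced by its neighbour, so that neighbour is seen slightly more often. Removing or re-encoding the offending files is the cleaner fix once you know which ones they are.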