I have a torch.utils.data.DataLoader, which I created with the following code.
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

trainset = CIFAR100WithIdx(root='.',
                           train=True,
                           download=True,
                           transform=transform_train,
                           rand_fraction=args.rand_fraction)

train_loader = torch.utils.data.DataLoader(trainset,
                                           batch_size=args.batch_size,
                                           shuffle=True,
                                           num_workers=args.workers)
But when I run the following code, I get an error.
train_loader_2 = []
for i, (inputs, target, index_dataset) in enumerate(train_loader):
    train_loader_2.append((inputs, target, index_dataset))
The error is
Traceback (most recent call last):
  File "main_superloss.py", line 460, in <module>
    main()
  File "main_superloss.py", line 456, in main
    main_worker(args)
  File "main_superloss.py", line 374, in main_worker
    train_loader, val_loader = get_train_and_val_loader(args)
  File "main_superloss.py", line 120, in get_train_and_val_loader
    for i, (inputs, target, index_dataset) in enumerate(train_loader):
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 804, in __next__
    idx, data = self._get_data()
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 771, in _get_data
    success, data = self._try_get_data()
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 724, in _try_get_data
    data = self.data_queue.get(timeout=timeout)
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 284, in rebuild_storage_fd
    fd = df.detach()
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/reduction.py", line 185, in recv_handle
    return recvfds(s, 1)[0]
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
RuntimeError: received 0 items of ancdata
The reason I want to get the data into a list is that I want to reorder the samples, not randomly but in a particular way. How can I do that?
I was facing a similar issue with my code. Based on some discussions (check #1, #2, #3), I used ulimit -n 2048 to increase the maximum number of file descriptors a process can have. You can read more about ulimit here.
About the issue - the discussions suggest it has something to do with PyTorch's forked multiprocessing code, which passes tensor storages between worker processes via file descriptors; with many workers and batches in flight, the per-process descriptor limit can be exhausted.
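Besides raising the ulimit, a workaround often suggested in those discussions (an assumption on my part, not something from your setup) is to switch PyTorch's sharing strategy so worker processes exchange data through the file system instead of file descriptors, or to sidestep multiprocessing entirely with num_workers=0:

import torch
import torch.multiprocessing

# Share tensors via the file system rather than file descriptors,
# so many in-flight batches no longer exhaust the fd limit.
torch.multiprocessing.set_sharing_strategy('file_system')

# Alternatively, load in the main process: with num_workers=0 no
# descriptors are passed between processes at all (slower, but simple).
# train_loader = torch.utils.data.DataLoader(trainset,
#                                            batch_size=args.batch_size,
#                                            shuffle=True,
#                                            num_workers=0)

Note that the file_system strategy leaves temporary files in shared memory until the process exits, so it trades descriptor pressure for disk/shm usage.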
On the second part of your question - how to reorder a DataLoader - you can refer to this answer.
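In short, you don't need to materialize the loader into a list at all: the DataLoader's sampler argument accepts any iterable of indices, so you can hand it your particular order directly. A minimal sketch (the toy dataset and the my_order permutation are illustrative, not from your code):

import torch

# Toy dataset standing in for CIFAR100WithIdx: element i is the value i.
dataset = torch.utils.data.TensorDataset(torch.arange(10).float())

# Any permutation you choose; the loader will visit indices in exactly
# this order. shuffle must stay False when a sampler is supplied.
my_order = [9, 3, 0, 7, 1, 5, 2, 8, 4, 6]
loader = torch.utils.data.DataLoader(dataset,
                                     batch_size=5,
                                     sampler=my_order,
                                     shuffle=False)

first_batch = next(iter(loader))[0]
print(first_batch.tolist())  # values at indices 9, 3, 0, 7, 1, in that order

For an order that changes every epoch, wrap the index logic in a torch.utils.data.Sampler subclass whose __iter__ recomputes the permutation each time it is called.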