I have a torch.utils.data.DataLoader, which I created with the following code.
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

trainset = CIFAR100WithIdx(root='.',
                           train=True,
                           download=True,
                           transform=transform_train,
                           rand_fraction=args.rand_fraction)

train_loader = torch.utils.data.DataLoader(trainset,
                                           batch_size=args.batch_size,
                                           shuffle=True,
                                           num_workers=args.workers)
But when I run the following code, I get an error.
train_loader_2 = []
for i, (inputs, target, index_dataset) in enumerate(train_loader):
    train_loader_2.append((inputs, target, index_dataset))
The error is
Traceback (most recent call last):
  File "main_superloss.py", line 460, in <module>
    main()
  File "main_superloss.py", line 456, in main
    main_worker(args)
  File "main_superloss.py", line 374, in main_worker
    train_loader, val_loader = get_train_and_val_loader(args)
  File "main_superloss.py", line 120, in get_train_and_val_loader
    for i, (inputs, target, index_dataset) in enumerate(train_loader):
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 804, in __next__
    idx, data = self._get_data()
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 771, in _get_data
    success, data = self._try_get_data()
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 724, in _try_get_data
    data = self.data_queue.get(timeout=timeout)
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 284, in rebuild_storage_fd
    fd = df.detach()
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/reduction.py", line 185, in recv_handle
    return recvfds(s, 1)[0]
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
RuntimeError: received 0 items of ancdata
The reason I want to get the data into a list is that I want to reorder the samples, not randomly but in a particular way. How can I do that?
I was facing a similar issue with my code. Based on some discussions (check #1, #2, #3), I used ulimit -n 2048 to increase the maximum number of file descriptors a process can have. You can read more about ulimit here.
About the issue - the discussions suggest it has something to do with PyTorch's forked multiprocessing code, which passes tensor storages between worker processes via file descriptors; with many workers and batches in flight, the per-process descriptor limit can be exhausted.
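Besides raising the ulimit, a workaround often suggested in those discussions (an assumption on my part, not something from your setup) is to switch PyTorch's sharing strategy so worker processes exchange data through the file system instead of file descriptors, or to sidestep multiprocessing entirely with num_workers=0:

import torch
import torch.multiprocessing

# Share tensors via the file system rather than file descriptors,
# so many in-flight batches no longer exhaust the fd limit.
torch.multiprocessing.set_sharing_strategy('file_system')

# Alternatively, load in the main process: with num_workers=0 no
# descriptors are passed between processes at all (slower, but simple).
# train_loader = torch.utils.data.DataLoader(trainset,
#                                            batch_size=args.batch_size,
#                                            shuffle=True,
#                                            num_workers=0)

Note that the file_system strategy leaves temporary files in shared memory until the process exits, so it trades descriptor pressure for disk/shm usage.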
On the second part of your question - how to reorder a DataLoader - you can refer to this answer.
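In short, you don't need to materialize the loader into a list at all: the DataLoader's sampler argument accepts any iterable of indices, so you can hand it your particular order directly. A minimal sketch (the toy dataset and the my_order permutation are illustrative, not from your code):

import torch

# Toy dataset standing in for CIFAR100WithIdx: element i is the value i.
dataset = torch.utils.data.TensorDataset(torch.arange(10).float())

# Any permutation you choose; the loader will visit indices in exactly
# this order. shuffle must stay False when a sampler is supplied.
my_order = [9, 3, 0, 7, 1, 5, 2, 8, 4, 6]
loader = torch.utils.data.DataLoader(dataset,
                                     batch_size=5,
                                     sampler=my_order,
                                     shuffle=False)

first_batch = next(iter(loader))[0]
print(first_batch.tolist())  # values at indices 9, 3, 0, 7, 1, in that order

For an order that changes every epoch, wrap the index logic in a torch.utils.data.Sampler subclass whose __iter__ recomputes the permutation each time it is called.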