I am working on a LIDAR problem where I am trying to map a set of 3D histogram measurements (size [64, 64, 128]) to the original depth maps (size [64, 64]). I have a folder full of .mat files, each with its own measurement of the recorded histogram and the ground-truth depth.
The example picture shows the contents of one such .mat file; all that matters for now is SPAD and depth.
Example .mat file:
I would like to create a PyTorch Dataset class that returns a SPAD measurement (and its depth map), but I am stuck on how to build it from the folder.
I know you can read a .mat file using scipy:
import scipy.io

mat = scipy.io.loadmat('drive/My Drive/Colab Notebooks/LIDAR/SPAD_NYU/SPAD_Counts/spad_measurements_dining_room_0001a_0051.mat')
spad = mat['spad']
depth = mat['depth']
And I know you can create a custom Dataset using:
from torch.utils.data import Dataset, IterableDataset, DataLoader

class CTSet(Dataset):
    def __init__(self):
        ...

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        return self.dataset[index], self.labels[index]
Followed by using a DataLoader to train. I have the folder, SPAD_Counts, mounted from Drive in Google Colab, and I would like to build the Dataset from all of the .mat files in it.
I have looked at other implementations of custom datasets, but they all use CSV files, which makes the process easier. Since I can't make a CSV file for this kind of data (a 3D array mapped to a 2D array), what do I do?
Thank you!
Indeed, you should write your own Dataset for these .mat files.
You can use os.listdir to list all the files in a subfolder. The torchvision library has several very useful transformations/augmentations; specifically, torchvision.transforms.ToTensor converts np.arrays into torch.Tensors.
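For instance (a minimal sketch, assuming the SPAD histogram is stored as a dense 64 x 64 x 128 float array), ToTensor moves the channel axis to the front:

import numpy as np
from torchvision import transforms

to_tensor = transforms.ToTensor()

# dummy array standing in for one SPAD histogram (H x W x bins)
spad = np.random.rand(64, 64, 128).astype(np.float32)
print(to_tensor(spad).shape)  # torch.Size([128, 64, 64])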
Overall, your custom Dataset would look something like:
import os
import scipy.io
from torch.utils.data import Dataset, DataLoader

class CTSet(Dataset):
    def __init__(self, base_dir, transforms):
        super(CTSet, self).__init__()
        self.transforms = transforms  # make sure transforms has at least ToTensor()
        self.data = []  # store all data here
        # go over all files in base_dir
        for file in os.listdir(base_dir):
            if file.endswith('.mat'):
                mat = scipy.io.loadmat(os.path.join(base_dir, file))
                self.data.append((mat['spad'], mat['depth']))

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        spad, depth = self.data[index]
        return self.transforms(spad), self.transforms(depth)
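A minimal usage sketch (assuming the SPAD_Counts path from your question and ToTensor as the only transform) would then be:

from torch.utils.data import DataLoader
from torchvision import transforms

dataset = CTSet('drive/My Drive/Colab Notebooks/LIDAR/SPAD_NYU/SPAD_Counts',
                transforms.ToTensor())
loader = DataLoader(dataset, batch_size=4, shuffle=True)

for spad_batch, depth_batch in loader:
    # with dense H x W x C arrays this gives e.g. spad_batch [4, 128, 64, 64]
    # and depth_batch [4, 1, 64, 64]
    pass  # forward pass / loss / optimizer step would go here

Note that this Dataset reads every .mat file into memory in __init__; if the folder is large, you may prefer to store only the file paths there and call scipy.io.loadmat inside __getitem__ instead.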