pytorch, fast-ai

Error received when retrieving dataset in fast.ai: TypeError: '<' not supported between instances of 'L' and 'int'


I am following this article on medium for this contest.

Everything seems to be fine up to the point where I am retrieving the dataset where I am getting a: TypeError: '<' not supported between instances of 'L' and 'int'

My code is:

img_pipe = Pipeline([get_filenames, open_ms_tif])
mask_pipe = Pipeline([label_func, partial(open_tif, cls=TensorMask)])

db = DataBlock(blocks=(TransformBlock(img_pipe), 
                       TransformBlock(mask_pipe)),
               splitter=RandomSplitter(valid_pct=0.2, seed=42)
              )

ds = db.datasets(source=train_files)
dl = db.dataloaders(source=train_files, bs=4)

train_files is a list of Paths. Here are the first five:

[Path('nasa_rwanda_field_boundary_competition/nasa_rwanda_field_boundary_competition_source_train/nasa_rwanda_field_boundary_competition_source_train_09_2021_08/B01.tif'),
 Path('nasa_rwanda_field_boundary_competition/nasa_rwanda_field_boundary_competition_source_train/nasa_rwanda_field_boundary_competition_source_train_39_2021_04/B01.tif'),
 Path('nasa_rwanda_field_boundary_competition/nasa_rwanda_field_boundary_competition_source_train/nasa_rwanda_field_boundary_competition_source_train_12_2021_11/B01.tif'),
 Path('nasa_rwanda_field_boundary_competition/nasa_rwanda_field_boundary_competition_source_train/nasa_rwanda_field_boundary_competition_source_train_06_2021_10/B01.tif'),
 Path('nasa_rwanda_field_boundary_competition/nasa_rwanda_field_boundary_competition_source_train/nasa_rwanda_field_boundary_competition_source_train_08_2021_08/B01.tif')]

the full stack trace is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [66], in <cell line: 10>()
      2 mask_pipe = Pipeline([label_func, partial(open_tif, cls=TensorMask)])
      4 db = DataBlock(blocks=(TransformBlock(img_pipe), 
      5                        TransformBlock(mask_pipe)),
      6                splitter=RandomSplitter(valid_pct=0.2, seed=42)
      7               )
---> 10 ds = db.datasets(source=train_files)
     11 dl = db.dataloaders(source=train_files, bs=4)

File /usr/local/lib/python3.9/dist-packages/fastai/data/block.py:147, in DataBlock.datasets(self, source, verbose)
    145 splits = (self.splitter or RandomSplitter())(items)
    146 pv(f"{len(splits)} datasets of sizes {','.join([str(len(s)) for s in splits])}", verbose)
--> 147 return Datasets(items, tfms=self._combine_type_tfms(), splits=splits, dl_type=self.dl_type, n_inp=self.n_inp, verbose=verbose)

File /usr/local/lib/python3.9/dist-packages/fastai/data/core.py:451, in Datasets.__init__(self, items, tfms, tls, n_inp, dl_type, **kwargs)
    442 def __init__(self, 
    443     items:list=None, # List of items to create `Datasets`
    444     tfms:list|Pipeline=None, # List of `Transform`(s) or `Pipeline` to apply
   (...)
    448     **kwargs
    449 ):
    450     super().__init__(dl_type=dl_type)
--> 451     self.tls = L(tls if tls else [TfmdLists(items, t, **kwargs) for t in L(ifnone(tfms,[None]))])
    452     self.n_inp = ifnone(n_inp, max(1, len(self.tls)-1))

File /usr/local/lib/python3.9/dist-packages/fastai/data/core.py:451, in <listcomp>(.0)
    442 def __init__(self, 
    443     items:list=None, # List of items to create `Datasets`
    444     tfms:list|Pipeline=None, # List of `Transform`(s) or `Pipeline` to apply
   (...)
    448     **kwargs
    449 ):
    450     super().__init__(dl_type=dl_type)
--> 451     self.tls = L(tls if tls else [TfmdLists(items, t, **kwargs) for t in L(ifnone(tfms,[None]))])
    452     self.n_inp = ifnone(n_inp, max(1, len(self.tls)-1))

File /usr/local/lib/python3.9/dist-packages/fastcore/foundation.py:98, in _L_Meta.__call__(cls, x, *args, **kwargs)
     96 def __call__(cls, x=None, *args, **kwargs):
     97     if not args and not kwargs and x is not None and isinstance(x,cls): return x
---> 98     return super().__call__(x, *args, **kwargs)

File /usr/local/lib/python3.9/dist-packages/fastai/data/core.py:361, in TfmdLists.__init__(self, items, tfms, use_list, do_setup, split_idx, train_setup, splits, types, verbose, dl_type)
    359 if isinstance(tfms,TfmdLists): tfms = tfms.tfms
    360 if isinstance(tfms,Pipeline): do_setup=False
--> 361 self.tfms = Pipeline(tfms, split_idx=split_idx)
    362 store_attr('types,split_idx')
    363 if do_setup:

File /usr/local/lib/python3.9/dist-packages/fastcore/transform.py:190, in Pipeline.__init__(self, funcs, split_idx)
    188 else:
    189     if isinstance(funcs, Transform): funcs = [funcs]
--> 190     self.fs = L(ifnone(funcs,[noop])).map(mk_transform).sorted(key='order')
    191 for f in self.fs:
    192     name = camel2snake(type(f).__name__)

File /usr/local/lib/python3.9/dist-packages/fastcore/foundation.py:136, in L.sorted(self, key, reverse)
--> 136 def sorted(self, key=None, reverse=False): return self._new(sorted_ex(self, key=key, reverse=reverse))

File /usr/local/lib/python3.9/dist-packages/fastcore/basics.py:619, in sorted_ex(iterable, key, reverse)
    617 elif isinstance(key,int): k=itemgetter(key)
    618 else: k=key
--> 619 return sorted(iterable, key=k, reverse=reverse)

TypeError: '<' not supported between instances of 'L' and 'int'

I'm not sure what is causing the issue. Let me know if you need more of the code.

I expected the data loader to create itself successfully.


Solution

  • I figured it out. It seems TransformBlock does not accept an already-built Pipeline: when one is passed, it gets wrapped as a single transform again, and the subsequent `.sorted(key='order')` call fails because it ends up comparing an `L` against an `int`. I changed the

    TransformBlock(img_pipe), TransformBlock(mask_pipe)
    

    to

    TransformBlock([get_filenames, open_ms_tif]), TransformBlock([label_func, partial(open_tif, cls=TensorMask)]) 
    

    which removed the Pipeline wrapper.