
Python with regex


Please help me.

I have a list of file paths, as shown below:

[PosixPath('/home/angelo/Documentos/IA/Fast.ai-v3/nbs/dl1/papagaio/Papagaio_verdadeiro.jpg'),
 PosixPath('/home/angelo/Documentos/IA/Fast.ai-v3/nbs/dl1/papagaio/papagaio_amarelo.jpg'),
 PosixPath('/home/angelo/Documentos/IA/Fast.ai-v3/nbs/dl1/papagaio/zoom_RACAO_ALIMENTO_NUTROPICA_PAPAGAIO_AVES_PASSAROS1.jpg'),
 PosixPath('/home/angelo/Documentos/IA/Fast.ai-v3/nbs/dl1/papagaio/papagaio_ok.jpg'),
 PosixPath('/home/angelo/Documentos/IA/Fast.ai-v3/nbs/dl1/papagaio/alx_papagaio_20070327_01_original.jpeg')]

This list was created using get_image_files.

These are images of parrots (papagaio is the Portuguese word for parrot).

In order to use the filename for classification in machine learning, I tried to use the following regex:

pat = r'.[^\/.]+.jpg$'

However, after using it in an ImageDataBunch...

data_papagaio = ImageDataBunch.from_name_re(papagaio_path, papagaio_files, pat, ds_tfms=get_transforms(), size=224, bs=bs
                                  ).normalize(imagenet_stats)

I received the following error message in return:

IndexError: no such group

And I do not know how to solve it. Can someone help me?

Just for clarification, I'm trying to reproduce lesson 1 from the fast.ai course using some files on my hard drive.


Solution

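  • The method is looking for the value of a capturing group, so you need to put a pair of unescaped parentheses around the part of the pattern that matches the file name.

    Also, it seems you have both jpg and jpeg files, so you need jpe?g, not just jpg. The dot before the extension should be escaped (\.) so that it matches a literal dot rather than any character.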

    Use

    pat = r'([^/.]+)\.jpe?g$'
    

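To see the difference without loading any images, here is a minimal sketch using only the standard library. It assumes the labelling step searches each path with the pattern and then reads group 1, which is consistent with the error message but is not taken from the fastai source. The helper label_from_path is hypothetical, and the directories are shortened for brevity.

```python
import re
from pathlib import PurePosixPath

# File names from the question's list (parent directories shortened).
files = [
    PurePosixPath('papagaio/Papagaio_verdadeiro.jpg'),
    PurePosixPath('papagaio/papagaio_amarelo.jpg'),
    PurePosixPath('papagaio/zoom_RACAO_ALIMENTO_NUTROPICA_PAPAGAIO_AVES_PASSAROS1.jpg'),
    PurePosixPath('papagaio/papagaio_ok.jpg'),
    PurePosixPath('papagaio/alx_papagaio_20070327_01_original.jpeg'),
]

old_pat = re.compile(r'.[^\/.]+.jpg$')     # original pattern: no capturing group
new_pat = re.compile(r'([^/.]+)\.jpe?g$')  # group 1 = file name without extension


def label_from_path(path, pattern):
    """Hypothetical helper: search the path and return capturing group 1."""
    match = pattern.search(str(path))
    if match is None:
        raise ValueError(f'pattern did not match {path}')
    return match.group(1)


# The corrected pattern yields one label per file, including the .jpeg one.
labels = [label_from_path(f, new_pat) for f in files]

# The original pattern matches, but asking for group 1 fails.
try:
    label_from_path(files[0], old_pat)
    old_error = None
except IndexError as exc:
    old_error = str(exc)

print(labels)
print(old_error)
```

With the original pattern the match itself succeeds, and the failure only happens when group 1 is requested, which is why the message is about a missing group rather than a failed match.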