python machine-learning pytorch

The algorithm cannot find the images


I'm developing an algorithm using YOLO-NAS, with a dataset I prepared in labelImg. I'm using Python 3.10.11 together with the super-gradients library. The problem is the following: the data loads, but when I plot the images I get an error saying the image cannot be found in the directory. I ran some tests with other algorithms and they can find the path to the directory, so I suspect the super-gradients version (3.7.1).

The error occurs when I try to plot my training data:

FileNotFoundError: dataset\\images\\train\\img1.png was not found.
Please make sure that the dataset was downloaded and that the path is correct

Note: the images in the dataset were originally PDFs; I converted them to PNG so I could open them in labelImg and label the object classes.

import torch
torch.__version__

from tqdm.notebook import tqdm
from super_gradients.training import dataloaders
from super_gradients.training.dataloaders.dataloaders import coco_detection_yolo_format_train, coco_detection_yolo_format_val
from super_gradients.training import models
from super_gradients.training.losses import PPYoloELoss
from super_gradients.training.metrics import DetectionMetrics_050
from super_gradients.training.models.detection_models.pp_yolo_e import PPYoloEPostPredictionCallback

dataset_params = {
    'data_dir': "nf/dataset", 
    'train_images_dir': "dataset/images/train",
    'train_labels_dir': "dataset/labels/train",
    'val_images_dir': "dataset/images/val",
    'val_labels_dir': "dataset/labels/val",
    'classes': ['cabecalho', 'assinatura', 'rodape']
}

MODEL_ARCH = 'yolo_nas_l'
DEVICE = 'cuda' if torch.cuda.is_available() else "cpu"
BATCH_SIZE = 10 
MAX_EPOCHS = 12
CHECKPOINT_DIR = r'\checkpoint'
EXPERIMENT_NAME = "nf"


dados_treino = coco_detection_yolo_format_train(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['train_images_dir'],
        'labels_dir': dataset_params['train_labels_dir'],
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 1
    }
)

val_dados = coco_detection_yolo_format_val(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['val_images_dir'],
        'labels_dir': dataset_params['val_labels_dir'],  
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 1
    }
)

dados_treino.dataset.transforms

dados_treino.dataset.plot()

Solution

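  • Here is how the dataset paths should be passed to super_gradients: the image and label directories are given relative to data_dir, so they should not repeat the data_dir prefix. I tested this folder structure on macOS. On Windows, replace each "/" with "\" (written as "\\" or inside a raw string in Python).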

    # this is an example for macOS/Linux
    # tested with coco8 dataset
    # https://github.com/ultralytics/assets/releases/download/v0.0.0/coco8.zip
    
    dataset_params = {
        'data_dir': "dataset", 
        'train_images_dir': "images/train",
        'train_labels_dir': "labels/train",
        'val_images_dir': "images/val",
        'val_labels_dir': "labels/val",
        'classes': ['cabecalho', 'assinatura', 'rodape']
    }
    

    The folder structure is as follows:

    - dataset
    - dataset/images
    - dataset/images/val
    - dataset/images/val/000000000049.jpg
    - ...
    - dataset/images/train
    - dataset/images/train/000000000034.jpg
    - ...
    - dataset/labels/val
    - dataset/labels/val/000000000049.txt 
    - ...
    - dataset/labels/train
    - dataset/labels/train/000000000034.txt 
    - ...
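
Before building the dataloaders, it can help to check that the paths resolve the way the layout above expects. The following is a minimal sketch, not part of super_gradients: the helper name check_yolo_split and the list of image suffixes are assumptions made for illustration. It assumes, as described above, that the image and label directories are interpreted relative to data_dir, and it reports missing directories and images that have no matching .txt label file.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def check_yolo_split(data_dir, images_dir, labels_dir):
    """Return a list of problems for one split (an empty list means none found).

    Hypothetical helper: images_dir and labels_dir are joined onto data_dir,
    mirroring the folder structure shown above.
    """
    root = Path(data_dir)
    images_path = root / images_dir
    labels_path = root / labels_dir

    # Report directories that do not exist after joining with data_dir.
    problems = [
        f"missing directory: {path}"
        for path in (images_path, labels_path)
        if not path.is_dir()
    ]
    if problems:
        return problems

    # Every image is expected to have a label file with the same stem.
    for image in sorted(images_path.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        label = labels_path / (image.stem + ".txt")
        if not label.is_file():
            problems.append(f"missing label: {label}")
    return problems


if __name__ == "__main__":
    for split in ("train", "val"):
        issues = check_yolo_split("dataset", f"images/{split}", f"labels/{split}")
        print(split, "OK" if not issues else issues)
```

With the configuration from the question (data_dir "nf/dataset" and images_dir "dataset/images/train"), the joined path becomes nf/dataset/dataset/images/train, which this check would report as a missing directory unless that nested folder really exists. Because pathlib builds the paths, the same call works on Windows without rewriting separators by hand.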