python yolov8 roboflow

Detecting small objects in a game with a YOLOv8 model


I've just started learning about image detection with artificial intelligence. I learned that I needed a dataset, so I took a few screenshots, labeled the objects with Roboflow, and trained a model with YOLOv8. But no matter what I did, I could not detect the fish (a small, moving, shadowy object) correctly. I think the Preprocessing and Augmentation sections in Roboflow are very important. I applied tiling, but I didn't understand what I should do on the code side. I'm so confused :D I need your help.

Roboflow dataset:

https://universe.roboflow.com/test-uifst/test55/dataset/3

The model was trained with this notebook: https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/train-yolov8-object-detection-on-custom-dataset.ipynb#

Python code:

import cv2
from ultralytics import YOLO
import supervision as sv
import mss
import numpy as np
import pyautogui
import time

model = YOLO('best.pt')
sct = mss.mss()
monitor = sct.monitors[1]

while True:
    screenshot = sct.grab(monitor)
    img = np.array(screenshot)

    # RGB (3 CHANNEL)
    img = cv2.cvtColor(img, cv2.COLOR_BGRA2RGB)

    results = model(img)

    for result in results:
        for bbox in result.boxes:
            
            if bbox.cls == 0:  # "fish" 
                x1, y1, x2, y2 = bbox.xyxy[0]


Solution

  • As discussed in the comments, it seems your screen resolution is high enough that small objects become difficult for the model to detect.

    One solution is to take a screenshot of a smaller area and run inference on that instead of the whole screen:

    import cv2
    from ultralytics import YOLO
    import supervision as sv
    import mss
    import numpy as np
    import pyautogui
    import time
    
    model = YOLO('best.pt')
    sct = mss.mss()
    monitor = sct.monitors[1]
    
    while True:
        screenshot = sct.grab((100, 100, 600, 600))
        img = np.array(screenshot)
    
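        # mss returns BGRA; drop the alpha channel and keep BGR,
        # which is the channel order Ultralytics expects for numpy input
        img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)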
    
        results = model(img)
    
        for result in results:
            for bbox in result.boxes:
                
                if bbox.cls == 0:  # "fish" 
                    x1, y1, x2, y2 = bbox.xyxy[0]
    

    However, it can be a challenge to ensure that your objects stay inside that area.
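
    Also keep in mind that with a cropped screenshot the boxes returned by the model are relative to the crop, not to the screen. A minimal sketch of shifting them back, assuming the same (left, top, right, bottom) tuple that is passed to sct.grab; the helper name to_screen_coords and the sample box are made up for illustration:

```python
def to_screen_coords(box_xyxy, region):
    """Shift a box from crop coordinates to full-screen coordinates.

    box_xyxy: (x1, y1, x2, y2) relative to the cropped screenshot.
    region:   (left, top, right, bottom) tuple that was passed to sct.grab.
    """
    left, top, _right, _bottom = region
    x1, y1, x2, y2 = box_xyxy
    # Only the top-left corner of the region matters for the offset.
    return (x1 + left, y1 + top, x2 + left, y2 + top)


region = (100, 100, 600, 600)   # same region as in the loop above
fish_box = (20, 30, 60, 50)     # hypothetical detection inside the crop
print(to_screen_coords(fish_box, region))
```

    This only matters if you later need screen positions, for example to move the mouse to the detected fish.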

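    You can also use the Supervision InferenceSlicer, which splits the image into smaller overlapping slices, runs the model on each slice, and merges the detections.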

    There is a nice guide from Supervision on detecting small objects: https://supervision.roboflow.com/develop/how_to/detect_small_objects/#inference-slicer
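
    To give a rough idea of what slicing means on the code side, here is a simplified sketch that only computes the tile rectangles for a screen. It is not how Supervision implements the slicer internally; the function name tile_offsets, the 1920x1080 screen size, and the 640x640 slice size with 20% overlap are assumptions chosen for illustration:

```python
def tile_offsets(image_wh, slice_wh, overlap_ratio_wh):
    """Return (x1, y1, x2, y2) tiles that cover an image with some overlap."""
    img_w, img_h = image_wh
    slice_w, slice_h = slice_wh
    # Distance between tile origins; a larger overlap means a shorter step.
    step_x = max(1, round(slice_w * (1 - overlap_ratio_wh[0])))
    step_y = max(1, round(slice_h * (1 - overlap_ratio_wh[1])))
    tiles = []
    for y in range(0, img_h, step_y):
        for x in range(0, img_w, step_x):
            # Clamp tiles that would run past the right or bottom border.
            tiles.append((x, y, min(x + slice_w, img_w), min(y + slice_h, img_h)))
    return tiles


tiles = tile_offsets((1920, 1080), (640, 640), (0.2, 0.2))
print(len(tiles), tiles[0], tiles[-1])
```

    Each tile is then passed to the model on its own, so a small fish takes up a much larger share of the input than it does in the full screenshot. The overlap is there so that a fish sitting on a tile border still appears whole in at least one tile.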

    Below is your example modified to use the slicer. I wrote it quickly without checking that it runs, so expect some errors; my intention was to give you an idea and a starting point:

    import cv2
    from ultralytics import YOLO
    import supervision as sv
    import mss
    import numpy as np
    import pyautogui
    import time
    
    
    model = YOLO('best.pt')
    sct = mss.mss()
    monitor = sct.monitors[1]
    
    def slicer_callback(image_slice: np.ndarray) -> sv.Detections:
        # Run the model on a single slice; the call returns a list with one
        # Results object per input image, so take the first one.
        result = model(image_slice)[0]
        # Convert the Ultralytics result into supervision detections.
        return sv.Detections.from_ultralytics(result)
    
    
    # The values below are only example settings; tune them to your
    # screen resolution and to the size of the fish.
    slicer = sv.InferenceSlicer(
        callback=slicer_callback,
        slice_wh=(640, 640),
        overlap_ratio_wh=(0.2, 0.2),
        iou_threshold=0.5,
        thread_workers=1
    )
    
    
    while True:
        screenshot = sct.grab(monitor)
        img = np.array(screenshot)
    
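        # mss returns BGRA; drop the alpha channel and keep BGR, which is the
        # channel order Ultralytics expects for numpy input. To double-check
        # the colors, compare cv2.imshow("", img) before and after cvtColor.
        img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
    
        result: sv.Detections = slicer(img)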
    
        fishes = result[result.class_id == 0]
        print(fishes.xyxy)