I am working on a computer vision project that detects vehicle license plates, captures them, and stores them in a folder. I have coded it so that when the license plate is completely inside the rectangle, the script captures and stores the image. However, the script stores multiple images for a single object.
import cv2
import os
from ultralytics import YOLO
# Path to the source video file
video_path = 'sample.mp4'
# Directory to save captured images
output_directory = 'captured_images'
os.makedirs(output_directory, exist_ok=True)
# Create a YOLO object with your trained model
yolo = YOLO('./models/license_plate_detector.pt')  # Replace with the actual path to your trained weights
# Create a VideoCapture object
cap = cv2.VideoCapture(video_path)
# Check if the video capture is successful
if not cap.isOpened():
    print("Error: Could not open video.")
    exit()
# Get video properties
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
aspect_ratio = frame_width / frame_height
# Set the window size to match the aspect ratio
window_height = 768
window_width = int(window_height * aspect_ratio)
cv2.namedWindow("Video with Rectangle", cv2.WINDOW_NORMAL)
cv2.resizeWindow("Video with Rectangle", window_width, window_height)
# Coordinates and dimensions of the rectangle
x, y, width, height = 2200, 1600, 700, 100
# Define a flag to keep track of whether the object is inside the rectangle
object_inside = False
image_counter = 1
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Detect objects using your trained YOLO model
    results = yolo(frame)
    # Access the detected boxes
    detected_boxes = results[0].boxes.xyxy  # Assuming the boxes are stored in results[0].boxes.xyxy
    # Check if any detected box is inside the rectangle
    object_inside = any(
        x < (box[0] + box[2]) / 2 < x + width and y < (box[1] + box[3]) / 2 < y + height
        for box in detected_boxes
    )
    for box in detected_boxes:
        x_min, y_min, x_max, y_max = box[:4]
        center_x = (x_min + x_max) / 2
        center_y = (y_min + y_max) / 2
        if x < center_x < x + width and y < center_y < y + height:
            # Capture the frame portion corresponding to the detected object
            object_frame = frame[int(y_min):int(y_max), int(x_min):int(x_max)]
            # Save the object frame as an image
            image_filename = os.path.join(output_directory, f"captured_object_{image_counter}.jpg")
            cv2.imwrite(image_filename, object_frame)
            image_counter += 1
    # Draw the rectangle on the frame
    cv2.rectangle(frame, (x, y), (x + width, y + height), (0, 255, 0), 2)
    # Display the frame
    cv2.imshow("Video with Rectangle", frame)
    # Check for the 'q' key to exit the loop
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# Release the video capture object
cap.release()
# Close any open windows
cv2.destroyAllWindows()
I only need one image per object.
Well, that is what your code does: it splits the video into frames and runs the detector on every frame. Since the same license plate appears in many consecutive frames, a crop is saved each time the plate is detected inside the rectangle.
What you probably want is a tracker. A tracker gives each object a persistent ID across frames, so you can save an image only the first time a given ID shows up inside the rectangle. You don't even have to train anything again; you can use your existing model. This link contains everything you need.
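As a rough illustration of the tracker idea, here is a minimal sketch, assuming the Ultralytics tracking mode (`model.track(..., persist=True)`) works with your custom weights and that `results[0].boxes.id` holds the track IDs (it is `None` when nothing is tracked in a frame). The helper names `center_in_roi`, `new_plates_to_save`, and `capture_once_per_plate` are made up for this example; they are not part of any library.

```python
import os


def center_in_roi(box, roi):
    """True if the center of an (x_min, y_min, x_max, y_max) box lies inside roi=(x, y, w, h)."""
    x_min, y_min, x_max, y_max = box
    rx, ry, rw, rh = roi
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2
    return rx < cx < rx + rw and ry < cy < ry + rh


def new_plates_to_save(boxes, track_ids, roi, saved_ids):
    """Return (track_id, box) pairs inside the ROI whose ID was not saved before.

    Mutates saved_ids, so each track ID is returned at most once.
    """
    to_save = []
    for box, track_id in zip(boxes, track_ids):
        if track_id in saved_ids or not center_in_roi(box, roi):
            continue
        saved_ids.add(track_id)
        to_save.append((track_id, box))
    return to_save


def capture_once_per_plate(video_path, weights_path, output_directory, roi):
    """Hypothetical driver: save one crop per track ID seen inside the ROI."""
    # Imported lazily so the helpers above work without these packages.
    import cv2
    from ultralytics import YOLO

    os.makedirs(output_directory, exist_ok=True)
    model = YOLO(weights_path)
    cap = cv2.VideoCapture(video_path)
    saved_ids = set()

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # persist=True keeps the tracker state between consecutive calls.
        results = model.track(frame, persist=True, verbose=False)
        boxes = results[0].boxes
        if boxes.id is None:  # no tracked objects in this frame
            continue
        xyxy = boxes.xyxy.tolist()
        ids = boxes.id.int().tolist()
        for track_id, box in new_plates_to_save(xyxy, ids, roi, saved_ids):
            x_min, y_min, x_max, y_max = box
            crop = frame[int(y_min):int(y_max), int(x_min):int(x_max)]
            cv2.imwrite(os.path.join(output_directory, f"plate_{track_id}.jpg"), crop)

    cap.release()
```

With the values from the question, this would be called as `capture_once_per_plate('sample.mp4', './models/license_plate_detector.pt', 'captured_images', (2200, 1600, 700, 100))`. Note that a tracker can occasionally lose an object and assign it a new ID, in which case a plate may still be saved more than once.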