YOLOv8 can detect keypoints, but how can I count the keypoints that cross a specified line?

def extract_and_process_tracks(self, tracks):
    boxes = tracks[0].boxes.xyxy.cpu()
    clss = tracks[0].boxes.cls.cpu().tolist()
    track_ids = tracks[0].boxes.id.int().cpu().tolist()

    self.annotator = Annotator(self.im0, self.tf, self.names)
    self.annotator.draw_region(reg_pts=self.reg_pts, color=(0, 255, 0))

    for box, track_id, cls in zip(boxes, track_ids, clss):
        self.annotator.box_label(box, label=self.names[cls], color=colors(int(cls), True))  

        # Draw Tracks
        track_line = self.track_history[track_id]
        track_line.append((float((box[0] + box[2]) / 2), float((box[1] + box[3]) / 2)))
        track_line.pop(0) if len(track_line) > 30 else None

        if self.draw_tracks:
            self.annotator.draw_centroid_and_tracks(track_line,
                                                    color=(0, 255, 0),
                                                    track_thickness=self.track_thickness)

The object_counter.py provided by Ultralytics handles the counting. Its track_line.append() call stores the center of the box, (float((box[0] + box[2]) / 2), float((box[1] + box[3]) / 2)). How can I replace that center with a keypoint coordinate? For example, I want to count how many times the head keypoint of an animal crosses a specified line.

How do I get the keypoint coordinates in yolo_pose?

Solution

How do I get the keypoint coordinates in yolo_pose?

When a yolov8*-pose.pt model is used for the object tracking task, the keypoint coordinates can be read from the results like this:

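from ultralytics import YOLO

model = YOLO("yolov8n-pose.pt")  # example weights; any yolov8*-pose.pt model works
results = model.track(frame)  # frame: one image or video frame, e.g. a BGR NumPy array
keypoints_in_pixel = results[0].keypoints.xy  # pixel coordinates, shape (num_objects, num_keypoints, 2)
keypoints_normalized = results[0].keypoints.xyn  # the same points scaled to [0, 1] by image size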

Inside the extract_and_process_tracks() method the same attribute is available on tracks[0]:

def extract_and_process_tracks(self, tracks):
    boxes = tracks[0].boxes.xyxy.cpu()
    clss = tracks[0].boxes.cls.cpu().tolist()
    track_ids = tracks[0].boxes.id.int().cpu().tolist()
    keypoints_in_pixel = tracks[0].keypoints.xy.cpu().tolist()
    
    # the rest of method logic

len(keypoints_in_pixel) now equals the number of detected objects in the frame. Indexing with keypoints_in_pixel[0] returns the list of keypoint [x, y] pairs for the first detected object:

[[411.5274353027344, 182.98947143554688],
 [0.0, 0.0],
 [394.4708251953125, 170.76046752929688],
 [0.0, 0.0],
 [339.785400390625, 193.0745086669922],
 [0.0, 0.0],
 [333.6002197265625, 280.1023864746094],
 [0.0, 0.0],
 [432.16461181640625, 288.9073181152344],
 [0.0, 0.0],
 [468.3426513671875, 189.08206176757812],
 [0.0, 0.0],
 [0.0, 0.0],
 [0.0, 0.0],
 [0.0, 0.0],
 [0.0, 0.0],
 [0.0, 0.0]]
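
Many entries in the list above are [0.0, 0.0]. As far as I know, Ultralytics zeroes the coordinates of keypoints whose confidence is too low, so the sketch below treats [0.0, 0.0] as "not detected". That is an assumption worth checking against results[0].keypoints.conf, and the helper name detected_keypoints is hypothetical, not part of the library.

```python
def detected_keypoints(object_keypoints):
    """Return {index: (x, y)} for keypoints that are not the (0.0, 0.0) placeholder."""
    return {
        i: (x, y)
        for i, (x, y) in enumerate(object_keypoints)
        if (x, y) != (0.0, 0.0)  # assumption: (0, 0) marks a missing keypoint
    }


# Shortened, rounded version of the list shown above.
sample = [[411.5, 183.0], [0.0, 0.0], [394.5, 170.8], [0.0, 0.0]]
print(detected_keypoints(sample))  # {0: (411.5, 183.0), 2: (394.5, 170.8)}
```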

To get the [x, y] pair of one particular keypoint, index one level further:

xy_0 = keypoints_in_pixel[0][0]
# [411.5274353027344, 182.98947143554688]
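
To track a keypoint instead of the box center, the value appended to track_line can be swapped for that keypoint. The following is a minimal sketch in plain Python, assuming keypoints_in_pixel is in the same order as track_ids and that [0.0, 0.0] means the keypoint was not detected in this frame. update_track_history and KEYPOINT_INDEX are hypothetical names, and which index corresponds to the head depends on the keypoint layout your model was trained with.

```python
from collections import defaultdict

KEYPOINT_INDEX = 0  # assumption: index of the "head" keypoint in your model's layout
MAX_TRACK_LEN = 30  # same history length as in the original method


def update_track_history(track_history, track_ids, keypoints_in_pixel,
                         kpt_index=KEYPOINT_INDEX, max_len=MAX_TRACK_LEN):
    """Append the chosen keypoint of each tracked object to its track line."""
    for track_id, object_keypoints in zip(track_ids, keypoints_in_pixel):
        x, y = object_keypoints[kpt_index]
        if x == 0.0 and y == 0.0:
            continue  # keypoint missing in this frame; leave the history unchanged
        track_line = track_history[track_id]
        track_line.append((float(x), float(y)))
        if len(track_line) > max_len:
            track_line.pop(0)
    return track_history


history = defaultdict(list)
update_track_history(history, [7], [[[411.5, 183.0], [0.0, 0.0]]])
update_track_history(history, [7], [[[0.0, 0.0], [0.0, 0.0]]])  # skipped: no keypoint
print(history[7])  # [(411.5, 183.0)]
```

Inside extract_and_process_tracks() the same idea amounts to adding the per-object keypoints to the zip(boxes, track_ids, clss) loop and appending the keypoint where the box center is appended now.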

More information on working with pose estimation keypoint results and on the object tracking task is available in the Ultralytics YOLOv8 documentation.
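
For the counting step itself, once each track line holds keypoint positions, a crossing can be detected by testing whether the latest step of a track intersects the counting line. The sketch below is plain Python and is not the logic object_counter.py uses; all function names are hypothetical, and each track is counted at most once regardless of direction.

```python
def _side(a, b, p):
    """Sign of the cross product: which side of the line a->b the point p lies on."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return (cross > 0) - (cross < 0)


def segments_cross(p1, p2, a, b):
    """True if the move p1->p2 strictly crosses the line segment a->b."""
    return (_side(a, b, p1) * _side(a, b, p2) < 0
            and _side(p1, p2, a) * _side(p1, p2, b) < 0)


def count_crossings(track_history, line_start, line_end, counted_ids):
    """Add every track whose latest step crosses the line to counted_ids."""
    for track_id, track_line in track_history.items():
        if track_id in counted_ids or len(track_line) < 2:
            continue
        if segments_cross(track_line[-2], track_line[-1], line_start, line_end):
            counted_ids.add(track_id)
    return len(counted_ids)


line = ((0.0, 200.0), (640.0, 200.0))  # horizontal counting line at y = 200
history = {
    1: [(100.0, 180.0), (102.0, 210.0)],  # keypoint moves across the line
    2: [(300.0, 150.0), (305.0, 160.0)],  # keypoint stays above the line
}
counted = set()
print(count_crossings(history, *line, counted))  # 1
```

Calling count_crossings once per frame with the same counted_ids set keeps a running total across the video.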