python azure computer-vision yolov8 microsoft-custom-vision

How to convert Azure Custom Vision annotation format to YOLOv8 format in Python


I have a script that sends an image to an Azure Custom Vision model through the following API. The API returns a probability score for each area where a license plate is detected, together with the corresponding bounding box. There can be multiple predictions, since an image may contain more than one car.

# Get the bounding boxes and their probabilities for a single image
import requests

url = "url to your custom vision model"
headers = {'content-type': 'header value'}
with open(r"path to image.jpg", "rb") as f:
    r = requests.post(url, data=f, headers=headers)
print(r.content)
{"id":"someID","project":"project ID","iteration":"iteration ID","created":"2023-08-03T13:03:12.831Z","predictions":[{"probability":0.7684734,"tagId":"tage_ID","tagName":"License_plate","boundingBox":{"left":0.4307156,"top":0.5326757,"width":0.15810284,"height":0.129749}},{"probability":0.026557693,"tagId":"tag_ID","tagName":"License_plate","boundingBox":{"left":0.47290865,"top":0.5626349,"width":0.07031235,"height":0.066358685}}

Above is the output. I am trying to visualize the bounding box on the image with the following code.

path = r"path to image.jpg"
image = cv2.imread(path)
image_height, image_width,channel = image.shape
pred = json.loads(r.content)
for i in pred["predictions"]:
    if i["probability"] > 0.5:
        print(i['boundingBox'])
        left = i['boundingBox']['left']* image_width
        top = i['boundingBox']['top'] * image_height
        width = i['boundingBox']['width'] * image_width
        height = i['boundingBox']['height'] * image_height
color = (255, 0, 0)
#drawing bounding box
cv2.rectangle(image,(int(left), int(top)), (int(left + width), int(top + height)) ,(0, 0, 255), 5)
license_plate_crop = image[int(left):int(top), int(left + width):int(top + height), :]
cv2.imshow('image', license_plate_crop)
cv2.waitKey(0)

When I display the cropped image, I get an area other than the license plate.

When I test the same image in the UI provided by Custom Vision, the license plate is shown correctly.
I think I am missing something when converting the bounding boxes to YOLO format.

In YOLO format we get x1, y1, x2, y2, and we can draw the bounding box and crop only that area with the following code.

cv2.rectangle(image, (x1, y1), (x2, y2), (0, 0, 255), 5)
license_plate_crop = image[int(y1):int(y2), int(x1):int(x2), :]

So I am trying to get a cropped image of only the license plate, but the Custom Vision format differs from the YOLO format. The Custom Vision format is

{"left":0.4307156,"top":0.5326757,"width":0.15810284,"height":0.129749}

and the YOLO format is x1, y1, x2, y2.
Kindly assist in converting the Azure Custom Vision format to the YOLO format.
Also, if there are any APIs or methods for this, please mention them.


Solution

  • To crop a cv2 image, you need the coordinates of the top-left and bottom-right corners of the bounding box. The YOLO results box already provides these as absolute (pixel) xyxy values.

    Azure Custom Vision uses slightly different coordinates: left, top, width, and height in normalized form (divided by the image width and height). 'left' and 'top' are the coordinates of the top-left corner of the bounding box, but 'width' and 'height' are the lengths of the box sides, not the coordinates of the bottom-right corner. Note also that a cv2 image is a NumPy array indexed rows first, so the slice must be image[y1:y2, x1:x2]; the slice in your script, image[left:top, left + width:top + height], mixes x and y values, which is why it crops a different area. To get the corner coordinates and crop the image, do the following:

    # Using the left, top, width, and height values from your script:
    '''
    left = i['boundingBox']['left'] * image_width
    top = i['boundingBox']['top'] * image_height
    width = i['boundingBox']['width'] * image_width
    height = i['boundingBox']['height'] * image_height
    '''
    # Top-left corner
    x1 = left
    y1 = top
    # Bottom-right corner: add the box size to the top-left corner
    x2 = left + width
    y2 = top + height

    # NumPy slicing takes rows (y) first, then columns (x)
    license_plate_crop = image[int(y1):int(y2), int(x1):int(x2), :]
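
The conversion above can also be wrapped in small helpers. The following is a minimal sketch, not an official Azure or Ultralytics API: the function names `azure_box_to_xyxy` and `azure_box_to_yolo_label` are hypothetical, and `class_id=0` is an assumption for a single-class license plate model. The first helper returns absolute pixel corners for drawing and cropping; the second returns a line in the YOLO label-file layout (class, x_center, y_center, width, height, all normalized), in case the goal is to build training labels rather than to crop.

```python
def azure_box_to_xyxy(box, image_width, image_height):
    """Convert a normalized Custom Vision box to absolute pixel (x1, y1, x2, y2)."""
    x1 = box["left"] * image_width
    y1 = box["top"] * image_height
    x2 = (box["left"] + box["width"]) * image_width
    y2 = (box["top"] + box["height"]) * image_height

    # Round to whole pixels and clamp to the image bounds.
    def clamp(value, upper):
        return max(0, min(upper, int(round(value))))

    return (clamp(x1, image_width), clamp(y1, image_height),
            clamp(x2, image_width), clamp(y2, image_height))


def azure_box_to_yolo_label(box, class_id=0):
    """Build a YOLO label line: class x_center y_center width height (normalized)."""
    # class_id=0 is an assumption for a single-class model.
    x_center = box["left"] + box["width"] / 2
    y_center = box["top"] + box["height"] / 2
    return (f"{class_id} {x_center:.6f} {y_center:.6f} "
            f"{box['width']:.6f} {box['height']:.6f}")


# Example with a made-up box on a 200x100 image
example_box = {"left": 0.25, "top": 0.5, "width": 0.5, "height": 0.25}
corners = azure_box_to_xyxy(example_box, image_width=200, image_height=100)
label = azure_box_to_yolo_label(example_box)
```

With an image loaded by cv2.imread, the crop would then be image[y1:y2, x1:x2] using the four values returned by the first helper, since rows (y) come before columns (x).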