I have a question about Azure Custom Vision. I have a Custom Vision object detection project, which I created with the Python SDK (following https://learn.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/python-tutorial-od). However, something goes wrong when I upload tagged images. For example, one picture contains 3 people, so I tag 3 regions of the same class "person" in it. After uploading, the Custom Vision website shows only 1 "person" region on that picture. Regions of different classes are fine: the same picture can carry "person", "car", and "scooter" together. It looks as if a picture can hold only one region per class.
I used the Python SDK, following the tutorial linked above, to upload my pictures and tag information:
A0_tag = trainer.create_tag(project.id, "A0")
A1_tag = trainer.create_tag(project.id, "A1")
A2_tag = trainer.create_tag(project.id, "A2")
A0_image_regions = {
    "0001.jpg": [0.432291667, 0.28125, 0.080729167, 0.09765625],
    "0001.jpg": [0.34765625, 0.385742188, 0.131510417, 0.135742188],
    "0001.jpg": [0.479166667, 0.385742188, 0.130208333, 0.135742188],
    "0003.jpg": [0.19921875, 0.158203125, 0.083333333, 0.099609375]
}
As the code above shows, I uploaded three "A0" regions for 0001.jpg. But in the GUI on the website, only one "A0" region ends up on 0001.jpg. Is there a way to solve this problem?
Based on cthrash's code, I made some changes to get it working. Here is the modified code:
A0_tag = trainer.create_tag(project.id, "TestA")
A1_tag = trainer.create_tag(project.id, "TestB")
A2_tag = trainer.create_tag(project.id, "TestC")
A0_image_regions = {
    A0_tag.id: [
        ("2300.png", [0.787109375, 0.079681275, 0.068359375, 0.876494024]),
        ("0920.png", [0.2109375, 0.065737052, 0.059570313, 0.892430279]),
        ("0920.png", [0.291015625, 0.061752988, 0.05859375, 0.894422311]),
    ]
}
A1_image_regions = {
    A1_tag.id: [
        ("2000.png", [0.067382813, 0.073705179, 0.030273438, 0.878486056]),
        ("2000.png", [0.126953125, 0.075697211, 0.030273438, 0.878486056]),
        ("2000.png", [0.184570313, 0.079681275, 0.030273438, 0.878486056]),
        ("2000.png", [0.232421875, 0.079681275, 0.030273438, 0.878486056]),
    ],
}
A2_image_regions = {
    A2_tag.id: [
        ("1400.png", [0.649414063, 0.065737052, 0.104492188, 0.894422311]),
        ("2300.png", [0.602539063, 0.061752988, 0.106445313, 0.892430279]),
        ("0920.png", [0.634765625, 0.067729084, 0.124023438, 0.88247012]),
        ("0800.png", [0.579101563, 0.06374502, 0.04296875, 0.888446215]),
    ],
}
# Collect every region per filename, across all three tags.
regions_map = {}
for tag_id in A0_image_regions:
    for filename, [x, y, w, h] in A0_image_regions[tag_id]:
        regions = regions_map.get(filename, [])
        regions.append(Region(tag_id=A0_tag.id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions
for tag_id in A1_image_regions:
    for filename, [x, y, w, h] in A1_image_regions[tag_id]:
        regions = regions_map.get(filename, [])
        regions.append(Region(tag_id=A1_tag.id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions
for tag_id in A2_image_regions:
    for filename, [x, y, w, h] in A2_image_regions[tag_id]:
        regions = regions_map.get(filename, [])
        regions.append(Region(tag_id=A2_tag.id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions
# Build one upload entry per image, carrying all of that image's regions.
tagged_images_with_regions = []
for filename in regions_map:
    regions = regions_map[filename]
    with open("<your path>" + filename, mode="rb") as image_contents:
        tagged_images_with_regions.append(ImageFileCreateEntry(name=filename, contents=image_contents.read(), regions=regions))
upload_result = trainer.create_images_from_files(project.id, images=tagged_images_with_regions)
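One thing to watch once every image is a single entry: to my knowledge the service caps how many images a single create_images_from_files call accepts (64 per batch in the API reference, but please verify this against the current documentation). Below is a minimal sketch of splitting the entries into batches. The helper name batched and the constant MAX_BATCH_SIZE are my own, not part of the SDK, and the upload call itself is shown only in comments because it needs a live project.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Assumption: per-call image limit of the upload API; check the current docs.
MAX_BATCH_SIZE = 64


def batched(entries: Sequence[T], size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive slices of `entries`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(entries), size):
        yield list(entries[start:start + size])


# Stand-in data: 150 integers in place of 150 ImageFileCreateEntry objects.
batches = list(batched(list(range(150))))
batch_sizes = [len(batch) for batch in batches]

# With the real entries the loop would look roughly like this:
# for batch in batched(tagged_images_with_regions):
#     upload_result = trainer.create_images_from_files(project.id, images=batch)
#     if not upload_result.is_batch_successful:
#         for image in upload_result.images:
#             print("Image status:", image.status)
```

Because each entry already carries all regions of its image, splitting the list this way never separates regions that belong to the same picture.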
You've created A0_image_regions as a dictionary keyed by filename, but you are overwriting the key whenever you have more than one bounding box for any given image. A Python dict keeps only the last value assigned to a key, so two of the three 0001.jpg boxes are silently dropped. So that's not going to work.
But perhaps more importantly, you need to call the trainer with the images as the primary objects, with all the associated image regions lumped together. In other words, in your example 0001.jpg has three instances of A0, but it may also have instances of A1 and/or A2, and all of these would need to go into a single ImageFileCreateEntry. So I'd modify the sample along the lines of the following:
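from azure.cognitiveservices.vision.customvision.training.models import ImageFileCreateEntry, Region

A0_tag = trainer.create_tag(project.id, "A0")
A1_tag = trainer.create_tag(project.id, "A1")
A2_tag = trainer.create_tag(project.id, "A2")
# Map each tag id to a list of (filename, [left, top, width, height]) tuples.
# A list of tuples lets the same filename appear any number of times.
image_regions = {
    A0_tag.id: [
        ("0001.jpg", [0.432291667, 0.28125, 0.080729167, 0.09765625]),
        ("0001.jpg", [0.34765625, 0.385742188, 0.131510417, 0.135742188]),
        ("0001.jpg", [0.479166667, 0.385742188, 0.130208333, 0.135742188]),
        ("0003.jpg", [0.19921875, 0.158203125, 0.083333333, 0.099609375])
    ],
    A1_tag.id: [],  # add images/bounding boxes for A1
    A2_tag.id: [],  # add images/bounding boxes for A2
}
# Regroup by filename so that each image collects all of its regions, whatever the tag.
regions_map = {}
for tag_id in image_regions:
    for filename, [x, y, w, h] in image_regions[tag_id]:
        regions = regions_map.get(filename, [])
        # Region takes keyword arguments only, so tag_id must be passed by name.
        regions.append(Region(tag_id=tag_id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions
# One ImageFileCreateEntry per image, holding every region for that image.
tagged_images_with_regions = []
for filename in regions_map:
    regions = regions_map[filename]
    # base_image_url is the image directory prefix, assumed to be defined earlier as in the tutorial sample.
    with open(base_image_url + filename, mode="rb") as image_contents:
        tagged_images_with_regions.append(ImageFileCreateEntry(name=filename, contents=image_contents.read(), regions=regions))
upload_result = trainer.create_images_from_files(project.id, images=tagged_images_with_regions)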
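To see the overwriting in isolation, here is a small sketch that runs without the SDK or an Azure project. It uses the bounding boxes from the question; the flat boxes list, the tag name strings, and the grouped variable are illustrative stand-ins of mine, with plain tuples where the real code would build Region objects.

```python
from collections import defaultdict

# The original shape: a dict literal with a repeated key.
# Python keeps only the last value for "0001.jpg" and raises no error.
regions_by_file = {
    "0001.jpg": [0.432291667, 0.28125, 0.080729167, 0.09765625],
    "0001.jpg": [0.34765625, 0.385742188, 0.131510417, 0.135742188],
    "0001.jpg": [0.479166667, 0.385742188, 0.130208333, 0.135742188],
    "0003.jpg": [0.19921875, 0.158203125, 0.083333333, 0.099609375],
}
print(len(regions_by_file))  # 2 entries survive, not 4

# A list of (tag, filename, box) tuples keeps every box.
boxes = [
    ("A0", "0001.jpg", [0.432291667, 0.28125, 0.080729167, 0.09765625]),
    ("A0", "0001.jpg", [0.34765625, 0.385742188, 0.131510417, 0.135742188]),
    ("A0", "0001.jpg", [0.479166667, 0.385742188, 0.130208333, 0.135742188]),
    ("A0", "0003.jpg", [0.19921875, 0.158203125, 0.083333333, 0.099609375]),
]

# Group by filename: each image ends up with the full list of its regions.
grouped = defaultdict(list)
for tag_name, filename, box in boxes:
    grouped[filename].append((tag_name, box))

for filename, regions in grouped.items():
    print(filename, len(regions))
```

The grouping step does the same job as regions_map above: every image becomes one key whose value holds all of its boxes, which is the shape a single upload entry per image needs.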