python azure microsoft-custom-vision

How do I upload several regions with the same tag on one picture in Azure Custom Vision?


I have a question about Azure Custom Vision. I have an object detection project, and I use the Python SDK to create it (see https://learn.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/python-tutorial-od). However, something goes wrong during upload. For example, one picture contains 3 people, so I add 3 regions with the same class "person" to it. After uploading, the Custom Vision website shows only 1 "person" tagged in that picture. Different classes work fine: the same picture can have "person", "car", and "scooter" together. It looks as if a picture can hold only one region per class.

This is how I upload my pictures and tag information with the Python SDK (same tutorial: https://learn.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/python-tutorial-od):

A0_tag = trainer.create_tag(project.id, "A0")
A1_tag = trainer.create_tag(project.id, "A1")
A2_tag = trainer.create_tag(project.id, "A2")

A0_image_regions={
"0001.jpg":[0.432291667,0.28125,0.080729167,0.09765625],
"0001.jpg":[0.34765625,0.385742188,0.131510417,0.135742188],
"0001.jpg":[0.479166667,0.385742188,0.130208333,0.135742188],
"0003.jpg":[0.19921875,0.158203125,0.083333333,0.099609375]
}

The code above is meant to upload three "A0" regions for 0001.jpg, but the web portal shows only one "A0" region on 0001.jpg. Is there any way to solve this problem?


Based on cthrash's code, I made some changes to get it working. Here is the modified code:

A0_tag = trainer.create_tag(project.id, "TestA")
A1_tag = trainer.create_tag(project.id, "TestB")
A2_tag = trainer.create_tag(project.id, "TestC")

A0_image_regions = {
    A0_tag.id : [
        ("2300.png",[0.787109375,0.079681275,0.068359375,0.876494024]),
        ("0920.png",[0.2109375,0.065737052,0.059570313,0.892430279]),
        ("0920.png",[0.291015625,0.061752988,0.05859375,0.894422311]),
    ]
}

A1_image_regions = {
    A1_tag.id : [
        ("2000.png",[0.067382813,0.073705179,0.030273438,0.878486056]),
        ("2000.png",[0.126953125,0.075697211,0.030273438,0.878486056]),
        ("2000.png",[0.184570313,0.079681275,0.030273438,0.878486056]),
        ("2000.png",[0.232421875,0.079681275,0.030273438,0.878486056]),
    ],
}

A2_image_regions = {
    A2_tag.id : [
        ("1400.png",[0.649414063,0.065737052,0.104492188,0.894422311]),
        ("2300.png",[0.602539063,0.061752988,0.106445313,0.892430279]),
        ("0920.png",[0.634765625,0.067729084,0.124023438,0.88247012]),
        ("0800.png",[0.579101563,0.06374502,0.04296875,0.888446215]),
    ],
}



regions_map = {}
for tag_id in A0_image_regions:
    for filename,[x,y,w,h] in A0_image_regions[tag_id]:
        regions = regions_map.get(filename,[])
        regions.append(Region(tag_id=A0_tag.id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions

for tag_id in A1_image_regions:
    for filename,[x,y,w,h] in A1_image_regions[tag_id]:
        regions = regions_map.get(filename,[])
        regions.append(Region(tag_id=A1_tag.id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions

for tag_id in A2_image_regions:
    for filename,[x,y,w,h] in A2_image_regions[tag_id]:
        regions = regions_map.get(filename,[])
        regions.append(Region(tag_id=A2_tag.id, left=x, top=y, width=w, height=h))
        regions_map[filename] = regions

tagged_images_with_regions = []
for filename in regions_map:
    regions = regions_map[filename]
    with open("<your path>" + filename, mode="rb") as image_contents:
        tagged_images_with_regions.append(ImageFileCreateEntry(name=filename, contents=image_contents.read(), regions=regions))
upload_result = trainer.create_images_from_files(project.id, images=tagged_images_with_regions)
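
The three loops above do the same thing for each tag, so they can probably be collapsed into one. Below is a minimal sketch of that grouping step only. To keep it runnable without the Azure SDK, `Region` here is a namedtuple stand-in (not the SDK class), the tag IDs `"tag-a0"` and `"tag-a2"` are hypothetical placeholders for `A0_tag.id` and `A2_tag.id`, and `group_regions_by_file` is a helper name invented for this example.

```python
from collections import defaultdict, namedtuple

# Stand-in for the SDK's Region model, used only so this sketch runs on its own.
Region = namedtuple("Region", ["tag_id", "left", "top", "width", "height"])

# Hypothetical tag IDs; in the real script these would be A0_tag.id, A2_tag.id, ...
image_regions = {
    "tag-a0": [
        ("2300.png", [0.787109375, 0.079681275, 0.068359375, 0.876494024]),
        ("0920.png", [0.2109375, 0.065737052, 0.059570313, 0.892430279]),
        ("0920.png", [0.291015625, 0.061752988, 0.05859375, 0.894422311]),
    ],
    "tag-a2": [
        ("0920.png", [0.634765625, 0.067729084, 0.124023438, 0.88247012]),
    ],
}


def group_regions_by_file(image_regions):
    """Collect every region, across all tags, under the file it belongs to."""
    regions_map = defaultdict(list)
    for tag_id, entries in image_regions.items():
        for filename, (x, y, w, h) in entries:
            regions_map[filename].append(
                Region(tag_id=tag_id, left=x, top=y, width=w, height=h)
            )
    return dict(regions_map)


regions_map = group_regions_by_file(image_regions)
for filename, regions in regions_map.items():
    print(filename, len(regions))
```

With the real SDK, the same loop would build `Region` objects from the training models module, and each `regions_map[filename]` list would go into one `ImageFileCreateEntry`.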

Solution

  • You've created A0_image_regions as a dictionary keyed by filename, so each repeated key overwrites the previous one whenever an image has more than one bounding box. Only the last box per image survives, so that's not going to work.

    But perhaps more importantly, you need to call the trainer with the image as the primary object, with all the associated image regions lumped together. In other words, in your example 0001.jpg has three instances of A0, but it may also have instances of A1 and/or A2, and all of these would need to go into a single ImageFileCreateEntry. So I'd modify the sample along the lines of the following:

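    # Model classes used below (same import as in the linked tutorial).
    from azure.cognitiveservices.vision.customvision.training.models import ImageFileCreateEntry, Region

    A0_tag = trainer.create_tag(project.id, "A0")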
    A1_tag = trainer.create_tag(project.id, "A1")
    A2_tag = trainer.create_tag(project.id, "A2")
    
    image_regions = {
        A0_tag.id : [
            ("0001.jpg", [0.432291667,0.28125,0.080729167,0.09765625]),
            ("0001.jpg", [0.34765625,0.385742188,0.131510417,0.135742188]),
            ("0001.jpg", [0.479166667,0.385742188,0.130208333,0.135742188]),
            ("0003.jpg", [0.19921875,0.158203125,0.083333333,0.099609375])
        ],
        A1_tag.id : [], # add images/bounding boxes for A1
        A2_tag.id : []  # add images/bounding boxes for A2
    }
    
    regions_map = {}
    for tag_id in image_regions:
        for filename,[x,y,w,h] in image_regions[tag_id]:
            regions = regions_map.get(filename,[])
            regions.append(Region(tag_id=tag_id, left=x, top=y, width=w, height=h))
            regions_map[filename] = regions
    
    tagged_images_with_regions = []
    for filename in regions_map:
        regions = regions_map[filename]
        with open(base_image_url + filename, mode="rb") as image_contents:
            tagged_images_with_regions.append(ImageFileCreateEntry(name=filename, contents=image_contents.read(), regions=regions))
    
    upload_result = trainer.create_images_from_files(project.id, images=tagged_images_with_regions)
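
To see why the original dictionary lost two of the three boxes, here is a small standalone check that needs no Azure SDK. It reuses the coordinates from the question; the variable names `boxes_by_file` and `boxes_as_pairs` are made up for this illustration.

```python
# A dict literal with a repeated key keeps only the last value for that key.
boxes_by_file = {
    "0001.jpg": [0.432291667, 0.28125, 0.080729167, 0.09765625],
    "0001.jpg": [0.34765625, 0.385742188, 0.131510417, 0.135742188],
    "0001.jpg": [0.479166667, 0.385742188, 0.130208333, 0.135742188],
    "0003.jpg": [0.19921875, 0.158203125, 0.083333333, 0.099609375],
}
print(len(boxes_by_file))         # 2 entries, not 4
print(boxes_by_file["0001.jpg"])  # only the third box is left

# A list of (filename, box) pairs keeps every box, including repeats.
boxes_as_pairs = [
    ("0001.jpg", [0.432291667, 0.28125, 0.080729167, 0.09765625]),
    ("0001.jpg", [0.34765625, 0.385742188, 0.131510417, 0.135742188]),
    ("0001.jpg", [0.479166667, 0.385742188, 0.130208333, 0.135742188]),
    ("0003.jpg", [0.19921875, 0.158203125, 0.083333333, 0.099609375]),
]
print(len(boxes_as_pairs))        # 4 entries
```

Python raises no error for the repeated key, which is why the upload appeared to succeed while silently carrying a single box for 0001.jpg.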