After labeling my images I have the annotations in a ".json" file, but I do not know how to combine the images with these annotations. I'm a newbie in computer vision.
I am trying to detect human activities in video and draw bounding boxes around them, without using YOLO.
I assume you are using Python and that your JSON file looks like this:
{
"image1.jpg": "cat",
"image2.jpg": "dog",
...
}
If so, I suggest loading it with Python's built-in json module:
import json
# Read the annotation file into a dict of {filename: label}
with open('annotations.json', 'r') as file:
    annotations = json.load(file)
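Most training code expects integer class ids rather than label strings, so it can help to build a lookup table right after loading. The following is only a minimal sketch, assuming the flat filename-to-label layout shown above; the helper name build_label_index is hypothetical, and the inline JSON string stands in for the contents of annotations.json.

```python
import json


def build_label_index(annotations):
    """Map each distinct label string to a stable integer id."""
    class_names = sorted(set(annotations.values()))
    return {name: idx for idx, name in enumerate(class_names)}


# Inline sample standing in for the contents of annotations.json (assumption).
annotations = json.loads('{"image1.jpg": "cat", "image2.jpg": "dog"}')

label_to_id = build_label_index(annotations)
# Integer ids in the same order as annotations.keys()
label_ids = [label_to_id[label] for label in annotations.values()]
```

Sorting the class names keeps the ids stable between runs, which matters if you save a model and reload it later.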
Then you can pair the image paths with their labels using tf.data.Dataset:
import tensorflow as tf
# Build parallel lists of image paths and labels from the loaded annotations
image_paths = list(annotations.keys())
labels = list(annotations.values())
dataset = tf.data.Dataset.from_tensor_slices((image_paths, labels))
If you want to build a full input pipeline, I suggest exploring tf.data.Dataset and the guides available on the TensorFlow website. ImageDataGenerator is another option, but it is deprecated in recent TensorFlow releases.
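As a rough illustration of such a pipeline, here is a sketch that turns parallel lists of image paths and integer labels into batches of decoded images. It assumes the images are JPEG files and that the labels have already been converted to integers; the names load_image, make_dataset, and IMAGE_SIZE are made up for this example, and the 224x224 size is only a placeholder.

```python
import tensorflow as tf

IMAGE_SIZE = (224, 224)  # assumed input size; use whatever your model expects


def load_image(path, label):
    """Read one JPEG file, decode it, resize it, and scale pixels to [0, 1]."""
    image_bytes = tf.io.read_file(path)
    image = tf.io.decode_jpeg(image_bytes, channels=3)
    image = tf.image.resize(image, IMAGE_SIZE)  # result is float32
    return image / 255.0, label


def make_dataset(image_paths, label_ids, batch_size=32):
    """Build a batched (image, label) pipeline from two parallel lists."""
    dataset = tf.data.Dataset.from_tensor_slices((image_paths, label_ids))
    dataset = dataset.map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
    return dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

You would call it as make_dataset(image_paths, label_ids) and pass the result to model.fit. If your filenames are relative, join them with the image directory first so that tf.io.read_file can find them.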