I was trying to make an image-captioning model in a similar fashion as in here. I used ResNet50 instead of VGG16 and also had to use progressive loading via the model.fit_generator() method. I used ResNet50 from here, and when I imported it with include_top = False, it gave me features of a photo in the shape of {'key': [[[[value1, value2, .... value2048]]]]}, where "key" is the image id. Here's the code of my captionGenerator function:-
import numpy as np
from keras.preprocessing.sequence import pad_sequences

def createCaptions(tokenizer, photoData, MaxLength, model):
    for key, feature in photoData.items():
        inSeq = "START"
        for i in range(MaxLength):
            # Encode the caption generated so far and pad it to MaxLength
            sequence = tokenizer.texts_to_sequences([inSeq])[0]
            sequence = pad_sequences([sequence], maxlen = MaxLength)
            # Predict the next word from the photo feature and the partial caption
            ID = model.predict([np.array(feature[0][0][0]), sequence])
            ID = np.argmax(ID)
            ID = word_for_id(ID, tokenizer)
            if ID is None:
                break
            inSeq += " " + ID
            if ID == "END":
                break
        print(inSeq)
The word_for_id function is:-
def word_for_id(integer, tokenizer):
    # Reverse lookup: find the word whose index matches the predicted integer
    for word, index in tokenizer.word_index.items():
        if index == integer:
            return word
    return None
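As a side note, scanning word_index on every prediction is linear in the vocabulary size; building a reverse dictionary once gives the same result in constant time. A minimal sketch (the index_to_word name and the _fast helper are mine, not from the question):

index_to_word = {index: word for word, index in tokenizer.word_index.items()}

def word_for_id_fast(integer):
    # Returns None when the integer is not in the vocabulary,
    # matching the behaviour of word_for_id above.
    return index_to_word.get(integer)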
I had generated photoData via:-
import os
import numpy as np
from keras.preprocessing.image import load_img, img_to_array

features = {}
for images in os.listdir(args["image"]):
    filename = args["image"] + '/' + images
    # Load and preprocess the image the way ResNet50 expects
    image = load_img(filename, target_size = inputShape)
    image = img_to_array(image)
    image = np.expand_dims(image, axis = 0)
    image = preprocess(image)
    # Extract the photo features
    pred = resnet.predict(image)
    image_id = images.split('.')[0]
    features[image_id] = pred
    print('>{}'.format(images))
features is my photoData dictionary.
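The question does not show how resnet, preprocess and inputShape were created; here is a minimal sketch of a setup consistent with the feature shape described above (these exact names and settings are assumptions):

from keras.applications.resnet50 import ResNet50, preprocess_input

inputShape = (224, 224)          # ResNet50's default input size
resnet = ResNet50(weights='imagenet', include_top=False)
preprocess = preprocess_input

# Depending on the Keras version, include_top=False either keeps the final
# 7x7 average-pooling layer (features of shape (1, 1, 1, 2048), matching the
# dictionary values described above) or drops it (shape (1, 7, 7, 2048); pass
# pooling='avg' in that case to get one 2048-d vector per image).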
The problem is with the descriptions of the training photos, which I generate through:-
import pickle

def train_test_data(filename):
    DataFile = open(filename, 'r')
    Data = DataFile.read()
    DataFile.close()

    # Collect the image ids listed in the split file
    ImageID = []
    textDataFile = pickle.load(open('descriptions.pkl', 'rb'))
    for line in Data.split('\n'):
        if len(line) < 1:
            continue
        ImageID.append(line.split('.')[0])

    # Keep only the descriptions belonging to this split
    Data = {}
    for key in textDataFile:
        if key in ImageID:
            Data[key] = textDataFile[key]

    # Wrap every description with START / END tokens
    for ID in Data:
        for i in range(len(Data[ID])):
            l = Data[ID][i]
            l = "START " + " ".join(l) + " END"
            Data[ID][i] = l

    return Data
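For one raw description (stored in descriptions.pkl as a list of tokens), the wrapping step above produces a plain string; for example (the token list here is made up, not from the actual dataset):

l = ['a', 'dog', 'runs', 'on', 'the', 'beach']
print("START " + " ".join(l) + " END")
# START a dog runs on the beach END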
Here, I added "START" and "END" at the beginning and end of each sentence of the descriptions, respectively. But in tokenizer.word_index, "START" and "END" are not found as keys. That is:-
k = pickle.load(open('word_index.pkl', 'rb'))
print("START" in k)
This gives False. Please explain why this is happening. If I do:-
k = pickle.load(open('word_index.pkl', 'rb'))
print("start" in k)
The answer comes out as True.
That is because, by default, the Tokenizer lowercases the words when fitting, controlled by the lower=True parameter. You can either use the lowercase tokens ("start" and "end") or pass lower=False when creating the Tokenizer; see the documentation.
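A minimal demonstration of that behaviour and of the lower=False fix (the sample caption is made up):

from keras.preprocessing.text import Tokenizer

captions = ["START a dog runs on the beach END"]

t = Tokenizer()                 # lower=True by default
t.fit_on_texts(captions)
print("START" in t.word_index)  # False
print("start" in t.word_index)  # True

t = Tokenizer(lower=False)      # keep the original casing
t.fit_on_texts(captions)
print("START" in t.word_index)  # True

If you stay with the default lowercasing, remember to seed inSeq with "start" and compare against "end" in createCaptions, otherwise the END check will never fire.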