python tensorflow keras ctc

How to implement word beam search CTC decoding in Keras?


I am building a handwriting recognition model which currently has 88% validation accuracy. I came across this GitHub page, which can help the model make more accurate predictions by using a dictionary.

The problem is that I don't know how to implement this in my current model. This is my current CTC layer, which is copied from a Keras tutorial. How can I modify it to add a dictionary?

import tensorflow as tf
from tensorflow import keras


class CTCLayer(keras.layers.Layer):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.loss_fn = keras.backend.ctc_batch_cost

    def call(self, y_true, y_pred):
        batch_len = tf.cast(tf.shape(y_true)[0], dtype="int64")
        input_length = tf.cast(tf.shape(y_pred)[1], dtype="int64")
        label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")

        input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        loss = self.loss_fn(y_true, y_pred, input_length, label_length)
        self.add_loss(loss)

        # At test time, just return the computed predictions.
        return y_pred

This is how word beam search is set up on the original GitHub page. To be specific, my main problem is getting the loss from this function and then returning that loss to the CTC layer.


chars = ''.join(self.char_list)
word_chars = open('../model/wordCharList.txt').read().splitlines()[0]
corpus = open('../data/corpus.txt').read()

# decode using the "Words" mode of word beam search
from word_beam_search import WordBeamSearch
self.decoder = WordBeamSearch(50, 'Words', 0.0, corpus.encode('utf8'), chars.encode('utf8'), word_chars.encode('utf8'))

I tried looking at GitHub projects that use this decoder, but they seem to be written for TensorFlow v1, which is confusing for me since I am a beginner in this field. Any response would be appreciated. Thank you.


Solution

  • Word beam search is only a decoder, not a loss function. For the loss, you still use the standard CTC loss that ships with Keras, so your training code, including the CTCLayer, does not need to change or know about word beam search.

    Word beam search is applied at inference time only. All you have to do is convert the model's output tensors to NumPy arrays and pass them to the decoder; see the word beam search documentation for details.
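
As a rough illustration of the inference-only step, here is a minimal sketch of the tensor-to-NumPy conversion. It makes several assumptions that are not stated in the question: that the softmax output of the model has shape (batch, time, classes), that the decoder wants a time-major array of shape (time, batch, classes) with the CTC blank as the last class, and that the class order of the model matches the order of `chars`. The helpers `to_wbs_input` and `labels_to_text` are hypothetical names introduced here, and the random array is only a stand-in for the real output of something like `prediction_model.predict(images)`. Check the shape and blank conventions against the word beam search documentation before relying on this.

```python
import numpy as np


def to_wbs_input(y_pred):
    """Convert a (batch, time, classes) softmax output to a time-major NumPy array.

    Accepts either a TensorFlow tensor (anything with a .numpy() method)
    or something that is already array-like.
    """
    mat = y_pred.numpy() if hasattr(y_pred, "numpy") else np.asarray(y_pred)
    # Assumption: the decoder expects (time, batch, classes).
    return np.transpose(mat, (1, 0, 2))


def labels_to_text(label_seqs, chars):
    """Map lists of character indices back to strings."""
    return ["".join(chars[i] for i in seq) for seq in label_seqs]


# Stand-in for the model output: 2 samples, 5 time steps, 3 characters + blank.
rng = np.random.default_rng(0)
y_pred = rng.random((2, 5, 4)).astype(np.float32)
y_pred /= y_pred.sum(axis=-1, keepdims=True)  # normalise like a softmax

mat = to_wbs_input(y_pred)

# With the word_beam_search package installed, decoding would then look
# roughly like this (constructor arguments as in the question):
#
#   from word_beam_search import WordBeamSearch
#   wbs = WordBeamSearch(50, 'Words', 0.0, corpus.encode('utf8'),
#                        chars.encode('utf8'), word_chars.encode('utf8'))
#   label_seqs = wbs.compute(mat)
#   print(labels_to_text(label_seqs, chars))
```

Nothing in the CTCLayer changes: the loss is still computed by `keras.backend.ctc_batch_cost` during training, and the decoder only sees the predictions afterwards.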