python tensorflow speech-recognition keras beam-search

CTC Beam Search using Tensorflow Backend


Keras, with the TensorFlow backend, provides a function ctc_decode that performs CTC beam search decoding on the output of the network. The documentation does not give a usage example for the decoder. https://github.com/igormq/ctc_tensorflow_example/blob/master/ctc_tensorflow_example.py shows one, but I am not able to retrieve the decoded text transcript.

There are questions on Stack Overflow about printing the value of an output tensor, but I am not getting any useful output, because my output tensor has shape (?, ?).

>>> pred.shape
(1, 489, 29)
>>> dec, logp = K.ctc_decode(K.variable(pred, dtype='float32'),
...                          K.variable([489], dtype='int32'), greedy=False)
>>> dec
[<tf.Tensor 'SparseToDense:0' shape=(?, ?) dtype=int64>]
>>> dec[0]
<tf.Tensor 'SparseToDense:0' shape=(?, ?) dtype=int64>
>>> s = tf.Session()
>>> s.run(tf.global_variables_initializer())
>>> print dec[0].eval(session=s)
[[0]]

pred is the output of the neural network. Please help me understand what is going wrong: I expected to get the numeric indices of the characters decoded from the prediction, but all I get is [[0]].


Solution

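  • If you are using Keras, try K.get_value(dec[0]). It evaluates the tensor in the session managed by the Keras backend and returns the result as a NumPy array, so you do not need to create a separate tf.Session.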
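
Once the decoded indices are available as an array, they still have to be mapped to characters to get a text transcript. The following is a minimal sketch of that step, not part of Keras: the ALPHABET string and the indices_to_text helper are hypothetical, chosen only so that 28 characters plus a CTC blank match the 29 classes in pred. It assumes the dense result of ctc_decode is padded with -1, which the helper skips; substitute the label set your model was actually trained on.

```python
# Hypothetical label set: 28 characters, with the CTC blank assumed to be
# the last (29th) class. Replace it with the alphabet used for training.
ALPHABET = "abcdefghijklmnopqrstuvwxyz' "


def indices_to_text(decoded_row, alphabet=ALPHABET):
    """Map one row of decoded label indices to a string.

    Entries outside the alphabet, including the -1 padding, are skipped.
    """
    chars = []
    for idx in decoded_row:
        idx = int(idx)
        if 0 <= idx < len(alphabet):
            chars.append(alphabet[idx])
    return "".join(chars)


# With Keras, the row would come from the decoded tensor, for example:
#   decoded = K.get_value(dec[0])   # array of shape (batch, max_length)
#   print(indices_to_text(decoded[0]))
example_row = [7, 4, 11, 11, 14, -1, -1]  # stand-in for decoded[0]
print(indices_to_text(example_row))
```

With the stand-in row above this prints hello. If the real decoded row still maps to a single character, the issue is more likely in the prediction or the inputs passed to the decoder than in how the tensor is evaluated.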