recurrent-neural-networkbeam-search

What's the difference between a greedy decoder RNN and a beam decoder with k=1?


Given a state vector we can recursively decode a sequence in a greedy manner by generating each output successively, where each prediction is conditioned on the previous output. I read a paper recently that described using beam search during decoding with a beam size of 1 (k=1). If we are only retaining the best output at each step, isn't this just the same thing as greedy decoding, and offers none of the benefits typically provided by beam search?


Solution

  • Finally found an answer: beam size of 1 is the same as greedy search.

    From "Abstractive Sentence Summarization with Attentive Recurrent Neural Networks":

    "k refers to the size of the beam for generation; k = 1 implies greedy generation."