Over the past few months, I've been working on an RNN chatbot. However, I soon ran into a weird issue: the network repeatedly output the same tokens (often <EOS> or <GO>). The longer version is on Stack Overflow.
After months of digging around, I've finally found the issue. When training an RNN (with TrainingHelper and BasicDecoder), TensorFlow expects the decoder's ground-truth inputs to start with a <GO> token, but the target outputs to drop it. Basically,
Encoder input: <GO> foo foo foo <EOS>
Decoder input/ground truth: <GO> bar bar bar <EOS>
Decoder output: bar bar bar <EOS> <EOS/PAD>
Since I used <GO> in both the decoder inputs and targets, the model simply learned to copy each input token to the output (<GO> -> <GO>, bar -> bar), so at inference time it just repeated itself.
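To make the off-by-one shift concrete, here is a minimal sketch in plain Python (the helper name and token strings are my own for illustration, not TensorFlow API) of how the decoder input and target sequences should line up:

```python
GO, EOS, PAD = "<GO>", "<EOS>", "<PAD>"

def make_decoder_pair(response_tokens):
    """Build (decoder_input, target) for teacher-forced training.

    The decoder input starts with <GO>; the target is the same
    sequence shifted left by one, so it drops <GO>, ends with <EOS>,
    and is padded to the same length as the input.
    """
    decoder_input = [GO] + response_tokens + [EOS]
    target = response_tokens + [EOS, PAD]
    return decoder_input, target

inp, tgt = make_decoder_pair(["bar", "bar", "bar"])
# inp == ["<GO>", "bar", "bar", "bar", "<EOS>"]
# tgt == ["bar", "bar", "bar", "<EOS>", "<PAD>"]
```

At each step the decoder sees the previous ground-truth token from `inp` and is trained to predict the corresponding token in `tgt`; if both sequences start with <GO>, the prediction task degenerates into copying.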
After fixing this and a few other small issues, the chatbot started to produce acceptable results. I will be posting an update on the chatbot soon; this is only a reminder to myself and a tip for anyone running into the same issue.