Fixed
- Transformer models no longer silently ignore `--num-embed` as they did before.
  As a result, an error is now thrown if `--num-embed` != `--transformer-model-size`
  (see the sketch below).
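  A minimal sketch of the new behavior, assuming a hypothetical helper named
  `check_transformer_embed_size` (the actual validation code in Sockeye may differ):

  ```python
  def check_transformer_embed_size(num_embed: int, model_size: int) -> None:
      """Raise instead of silently ignoring a mismatched embedding size."""
      if num_embed != model_size:
          raise ValueError(
              "Transformer models require --num-embed (%d) to match "
              "--transformer-model-size (%d)." % (num_embed, model_size))

  check_transformer_embed_size(num_embed=512, model_size=512)  # passes
  check_transformer_embed_size(num_embed=256, model_size=512)  # raises ValueError
  ```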
- Fixed attention in upper layers (`--rnn-attention-in-upper-layers`), which was
  previously not passed correctly to the decoder (illustrated below).
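  A conceptual sketch of what the flag enables, with illustrative names only
  (Sockeye's decoder is MXNet-based and structured differently): the attention
  context vector is fed to every decoder layer, not just the lowest one.

  ```python
  import numpy as np

  def decoder_step(layers, target_embed, context, attention_in_upper_layers=True):
      """One decoder time step; each layer is a callable on a 1-D vector."""
      hidden = layers[0](np.concatenate([target_embed, context]))
      for layer in layers[1:]:
          if attention_in_upper_layers:
              # the fix: the attention context now actually reaches upper layers
              hidden = layer(np.concatenate([hidden, context]))
          else:
              hidden = layer(hidden)
      return hidden
  ```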
Removed
- Removed RNN parameter (un-)packing and support for FusedRNNCells (removed the
  `--use-fused-rnns` flag). These were not used, were not correctly initialized,
  and performed worse than regular RNN cells. Moreover, they made the code much
  more complex. RNN models trained with previous versions are no longer
  compatible (see the note below).
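  For context, this is roughly the API boundary the removed code had to bridge,
  sketched with MXNet 1.x's `mx.rnn` module (not Sockeye's removed code itself):

  ```python
  import mxnet as mx

  # A fused cell runs all layers in a single cuDNN kernel, with all layer
  # parameters packed into one flat array ...
  fused = mx.rnn.FusedRNNCell(num_hidden=512, num_layers=2, mode='lstm',
                              prefix='enc_')

  # ... while unfuse() converts it into a stack of regular LSTM cells with
  # separate per-layer weights. The removed Sockeye code (un-)packed
  # parameters between these two layouts.
  regular = fused.unfuse()
  ```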
- Removed the lexical biasing functionality of Arthur et al. (2016), including
  the arguments `--lexical-bias` and `--learn-lexical-bias` (see the formula
  sketch below).
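  For reference, this implemented the "bias" method of Arthur et al. (2016),
  which adds the attention-weighted lexicon probability to the decoder's
  pre-softmax scores; a sketch of the formula using the paper's notation
  (not Sockeye's removed code):

  ```latex
  % Attention-weighted lexicon probability for target word e_t
  p_{\text{lex}}(e_t \mid F) = \sum_{j} a_{t,j}\, p_{\text{lex}}(e_t \mid f_j)
  % "bias" method: add its log to the pre-softmax scores
  p(e_t) = \operatorname{softmax}\!\bigl(W s_t + b + \log(p_{\text{lex}}(e_t \mid F) + \epsilon)\bigr)
  ```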