- Switched to a `fit_generator` implementation for generating training sequences, instead of loading all sequences into memory. This allows training on large text files (10MB+) without requiring ridiculous amounts of RAM (see the sketch below).
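
For reference, a minimal sketch of the generator pattern this change describes, assuming a character-level model; the `sequence_generator` name, window size, and `vocab` dict are hypothetical illustrations, not the library's actual internals:

```python
import numpy as np

def sequence_generator(texts, batch_size, max_length, vocab):
    """Yield (X, y) batches lazily instead of materializing every
    training sequence in memory up front."""
    while True:  # Keras generators are expected to loop forever
        X, y = [], []
        for text in texts:
            for i in range(max_length, len(text)):
                # Encode a sliding window and the next token to predict
                X.append([vocab.get(c, 0) for c in text[i - max_length:i]])
                y.append(vocab.get(text[i], 0))
                if len(X) == batch_size:
                    yield np.array(X), np.array(y)
                    X, y = [], []

# Hypothetical usage with the classic Keras API:
# model.fit_generator(sequence_generator(texts, 128, 40, vocab),
#                     steps_per_epoch=num_sequences // 128, epochs=10)
```

Because batches are built on the fly, memory use stays roughly constant with file size.
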
- Better `word_level` support:
- The model will only keep `max_words` words and discard the rest.
- The model will not train to predict words outside the vocabulary.
- All punctuation marks (including smart quotes) are treated as their own tokens.
- When generating, newlines/tabs have their surrounding whitespace stripped, as shown in the sketch after this list. (This is not done for other punctuation, since there are too many spacing rules around it.)
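
A rough sketch of the word-level rules above; the helper names are hypothetical and the regexes are simplified approximations of the actual tokenizer:

```python
import re
from collections import Counter

def word_tokenize(text):
    """Split into word tokens, with every punctuation mark (smart
    quotes included) and each newline/tab as its own token."""
    return re.findall(r"[\w']+|[^\w\s]|[\n\t]", text)

def build_vocab(tokens, max_words):
    """Keep only the max_words most frequent tokens; the rest are
    discarded and never used as prediction targets."""
    return {w for w, _ in Counter(tokens).most_common(max_words)}

def detokenize(tokens):
    """Join tokens with spaces, then strip the whitespace that would
    otherwise surround newlines and tabs."""
    return re.sub(r" ?([\n\t]) ?", r"\1", " ".join(tokens))
```
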
- Training on a single text no longer uses meta tokens to indicate the start/end of the text, and no longer emits them when generating, which results in slightly better output.
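
To illustrate the meta-token change, a hedged sketch; the token strings and function name are hypothetical stand-ins, not the library's actual markers:

```python
# Hypothetical meta tokens marking document boundaries.
META_START, META_END = "<s>", "</s>"

def prepare_texts(texts, single_text=False):
    """Bracket each text with start/end meta tokens when training on a
    list of texts; for one continuous text, skip the markers entirely,
    so generation never has to strip them from the output."""
    if single_text:
        return texts
    return [f"{META_START} {t} {META_END}" for t in texts]
```
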