------------------
* warning: breaking changes were introduced in this release:
* the order of the parameters in CTMDataset was changed (the contextual embeddings now come first)
* CTM now takes bow_size and contextual_size as input instead of input_size and bert_size
* changed the names of the parameters in the dataset
* introduced early stopping
* introduced visualization with pyLDAvis