The package has been rewritten in an object-oriented way for better readability, extensibility and flexibility.
1. All models are grouped together in the module `AugmentedSocialScientist.models`.

To use a model (see the README for the list of available models):

```python
from AugmentedSocialScientist.models import Bert

bert = Bert()  # instantiation
```
Everything else in the training workflow remains unchanged for users: `bert.encode()` to preprocess data, `bert.run_training()` to train, validate and save a model, and `bert.predict_with_model()` to make predictions.
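For reference, a minimal end-to-end sketch of this workflow. The toy data, the keyword arguments (`n_epochs`, `lr`, `save_model_as`, `model_path`) and the `./models/` save location are illustrative assumptions; check the README and docstrings for the exact signatures.

```python
from AugmentedSocialScientist.models import Bert

bert = Bert()

# Toy data for illustration only
train_texts = ["good movie", "bad movie", "great plot", "boring plot"]
train_labels = ["positive", "negative", "positive", "negative"]
test_texts = ["nice film", "awful film"]
test_labels = ["positive", "negative"]

# Preprocess texts and labels into PyTorch dataloaders
train_loader = bert.encode(train_texts, train_labels)
test_loader = bert.encode(test_texts, test_labels)

# Train, validate and save the model
scores = bert.run_training(train_loader, test_loader,
                           n_epochs=3, lr=5e-5,
                           save_model_as="my_model")

# Predict with the saved model on new, unlabeled texts
pred_loader = bert.encode(["a new sentence to classify"])
preds = bert.predict_with_model(pred_loader, model_path="./models/my_model")
```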
2. Flexibility
- To use a custom model from Hugging Face, set the `model_name` argument when instantiating the model. For example, to use the Danish BERT model [DJSammy/bert-base-danish-uncased_BotXO-ai](https://huggingface.co/DJSammy/bert-base-danish-uncased_BotXO-ai) from Hugging Face:
```python
from AugmentedSocialScientist.models import Bert

bert = Bert(model_name="DJSammy/bert-base-danish-uncased_BotXO-ai")
```
- Users can now set the device to use by providing a `torch.device` object to the `device` argument when instantiating the model.
```python
from AugmentedSocialScientist.models import Bert

bert = Bert(device=...)  # your own torch.device
```
3. The input classification labels can now also be textual. The method `encode()` automatically converts them to the corresponding label ids (integers starting from 0). The dictionary of labels `{label_name: label_id}` is printed during preprocessing and saved to the attribute `self.dict_labels`, as illustrated below.
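A minimal sketch of textual labels in use. The toy data are made up, and the exact mapping printed may differ (the example output in the comment is only indicative):

```python
from AugmentedSocialScientist.models import Bert

bert = Bert()

texts = ["I loved it", "I hated it", "Absolutely great", "Really terrible"]
labels = ["positive", "negative", "positive", "negative"]  # textual labels

# encode() maps the textual labels to integer ids starting from 0
loader = bert.encode(texts, labels)

# The mapping is stored on the model instance
print(bert.dict_labels)  # e.g. {'negative': 0, 'positive': 1}
```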
4. All dependencies on TensorFlow and Keras have been removed.