Major Updates
- Rewrote the encoders to better support generic encoders such as ``LabelEncoder``. Furthermore, added broad support for ``batch_encode``, ``batch_decode``, and ``enforce_reversible``. For example:
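A minimal sketch (the constructor arguments and return values here are assumed from the 0.4 ``LabelEncoder`` API):

```python3
from torchnlp.encoders import LabelEncoder

# Build the label vocabulary from a sample of labels.
encoder = LabelEncoder(['label_a', 'label_b'])

index = encoder.encode('label_a')  # a scalar index tensor
encoder.decode(index)  # 'label_a'

# Encode and decode a whole batch at once.
batch = encoder.batch_encode(['label_a', 'label_b'])
encoder.batch_decode(batch)  # ['label_a', 'label_b']
```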
- Rearchitected the default reserved tokens so that they are configurable while still providing the convenience of sensible defaults. For example:
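A hypothetical sketch of overriding the defaults (the ``reserved_tokens`` keyword and the token names here are assumptions, not a confirmed signature):

```python3
from torchnlp.encoders.text import WhitespaceEncoder

# Assumed API: pass a custom list of reserved tokens (e.g. padding and
# unknown tokens) instead of the defaults; the keyword name is an assumption.
encoder = WhitespaceEncoder(['hello world'], reserved_tokens=['<pad>', '<unk>'])
```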
- Added support for collating sequences with ``torch.utils.data.dataloader.DataLoader``. For example:

```python3
import torch

from functools import partial
from torchnlp.utils import collate_tensors
from torchnlp.encoders.text import stack_and_pad_tensors

# Pad the variable-length sequences in each batch to the same length.
collate_fn = partial(collate_tensors, stack_tensors=stack_and_pad_tensors)
torch.utils.data.dataloader.DataLoader(*args, collate_fn=collate_fn, **kwargs)
```
- Added doctest support, ensuring that the documented examples are tested.
- Removed SRU support; it's too heavy a module to maintain. Please use https://github.com/taolei87/sru instead. Happy to accept a PR with a better-tested and better-documented SRU module!
- Updated version requirements to support Python 3.6 and 3.7, dropping support for Python 3.5.
- Updated version requirements to support PyTorch 1.0+.
- Merged https://github.com/PetrochukM/PyTorch-NLP/pull/66, reducing the memory requirements for pre-trained word vectors by 2x.
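The pre-trained vector API itself is unchanged; a brief sketch (assuming the ``GloVe`` wrapper and its string indexing behave as in prior releases):

```python3
from torchnlp.word_to_vector import GloVe

vectors = GloVe()  # downloads and caches the pre-trained GloVe vectors
vectors['hello']   # the embedding tensor for the token 'hello'
```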
Minor Updates
- Formatted the code base with YAPF.
- Fixed ``pandas`` and ``collections`` warnings.
- Added an invariant assertion to ``Encoder`` via ``enforce_reversible``. For example:

```python3
from torchnlp.encoders import Encoder

encoder = Encoder().enforce_reversible()
```

Ensuring ``Encoder.decode(Encoder.encode(object)) == object``.
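Concretely, a sketch with a text encoder (assuming out-of-vocabulary tokens decode to an unknown placeholder, and that a failed round trip raises ``ValueError``):

```python3
from torchnlp.encoders.text import WhitespaceEncoder

encoder = WhitespaceEncoder(['a b c']).enforce_reversible()

encoder.decode(encoder.encode('a b c'))  # 'a b c', the round trip holds

# 'd' is out-of-vocabulary; its encoding would decode to an unknown token,
# so the reversibility check is assumed to raise a ValueError here.
encoder.encode('a b d')
```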
- Fixed the accuracy metric for PyTorch 1.0.
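For reference, a sketch of computing the metric (the ``get_accuracy`` helper and its return values are assumed from the documented metrics API):

```python3
import torch
from torchnlp.metrics import get_accuracy

targets = torch.tensor([1, 2, 3])
predictions = torch.tensor([1, 2, 4])

# Assumed to return the accuracy along with the correct and total counts.
accuracy, n_correct, n_total = get_accuracy(targets, predictions)
```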