New:
- Added `export.py` module for exporting textacy/spaCy objects into third-party
formats; so far, just gensim and CoNLL-U
- Added `compat.py` module for Py2/3 compatibility hacks
- Added `TextDoc.merge()` and `spacy_utils.merge_spans()` for merging spans
into single tokens within a `spacy.Doc`, using spaCy's recent implementation
Changed:
- Renamed `extract.noun_phrases()` to `extract.noun_chunks()` to match spaCy's API
- Changed `extract` functions to return generators rather than lists
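
Because the extract functions are now generators, calling code that indexed the result or took its `len()` will likely need to wrap the call in `list()`. A minimal sketch of the difference, using a hypothetical stand-in `words()` function over plain strings (an assumption for illustration, not textacy's actual `extract.words()` signature):

```python
import types


def words(tokens):
    """Hypothetical stand-in for an extract-style function.

    Yields non-whitespace tokens lazily instead of building a list.
    """
    for tok in tokens:
        if not tok.isspace():
            yield tok


tokens = ["Hello", " ", "world", "\n", "!"]

# The call itself returns a generator object, not a list.
result = words(tokens)
is_generator = isinstance(result, types.GeneratorType)

# Materialize the results when indexing or len() is needed.
as_list = list(words(tokens))

# Or pull items one at a time without building the full list.
first = next(words(tokens))
```

A generator can only be consumed once, so keep the `list()` copy if the results are needed more than once.
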
Fixed:
- Whitespace tokens are now always filtered out of `extract.words()` output
- Some Py2/3 str/unicode issues fixed
- Previously broken tests in `test_extract.py` now pass