-------------------
- Added `scripts/comms2csv.py` and `scripts/csv2comms.py` scripts for
converting Communication archives to/from a CSV file of
TJSONProtocol-encoded Communications
- Updated Thrift dependency from 0.10.0 to 0.11.0
- Added `--uri` flag for THttp/TJSONProtocol support to
`scripts/annotate-communication-client.py` and
`scripts/fetch-client.py`
- Added Concrete Service wrappers using Thrift THttp/TJSONProtocol
- `CommunicationReader.__next__()` now raises EOFError when reading a
truncated (invalid) Communication file
- Improvements to `scripts/search-client.py`
  - Command-line argument for hostname changed from positional
    argument to optional `--host` flag
  - Command-line argument for port changed from positional
    argument to optional `--port` flag
  - Added batch mode for command-line queries
  - Search terms with non-ASCII characters are now properly supported
    on both Python 2 and Python 3
  - Added `--with-scores` flag to print search result scores
- Improved newline handling for `concrete.util.simple_comm.create_comm()`
  - Leading and trailing document newlines no longer generate empty
    sentences
  - Multiple lines of whitespace are now treated as section breaks;
    previously, only '\n\n' was treated as a section break
- Twitter sometimes uses the incorrect ISO-639-1 code 'in' for
Indonesian instead of the correct code 'id'.
`concrete.util.twitter.twitter_lid_to_iso639_3()` now converts this
incorrect ISO-639-1 code to the correct ISO-639-3 code for
Indonesian ('ind').
- Fixed import errors in `examples/annotate-communication-service.py`
caused by schema changes
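
The newline handling described above for `concrete.util.simple_comm.create_comm()` can be illustrated with a minimal standalone sketch. This is not the library's implementation: `split_sections` and `split_sentences` are hypothetical helpers, and the sketch assumes one sentence per line and that any run of blank or whitespace-only lines separates sections.

```python
import re


def split_sections(text):
    """Hypothetical helper: split text into sections on blank or whitespace-only lines."""
    # Drop leading and trailing whitespace so that newlines at the
    # edges of the document do not produce empty sentences.
    stripped = text.strip()
    if not stripped:
        return []
    # A section break is a newline, any amount of further whitespace
    # (including more newlines), and a closing newline.
    return re.split(r'\n\s*\n', stripped)


def split_sentences(section):
    """Hypothetical helper: assume one sentence per non-empty line."""
    return [line for line in section.split('\n') if line.strip()]


if __name__ == '__main__':
    text = "\n\nOne.\nTwo.\n \n\t\nThree.\n"
    for section in split_sections(text):
        print(split_sentences(section))
```

A plain `text.split('\n\n')` would correspond to the older behavior, where only an exact double newline counted as a section break.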