------------------
([#108](https://github.com/pndurette/gTTS/issues/108))
Features
- The `gtts` module
- New logger ("gtts") replaces all occurrences of `print()`
- Languages list is now obtained automatically (`gtts.lang`) ([#91](https://github.com/pndurette/gTTS/issues/91), [#94](https://github.com/pndurette/gTTS/issues/94), [#106](https://github.com/pndurette/gTTS/issues/106))
- Added a curated list of language sub-tags that have been
observed to provide different dialects or accents (e.g.
"en-gb", "fr-ca")
- New `gTTS()` parameter `lang_check` to disable language
checking.
- `gTTS()` now delegates the `text` tokenizing to the API request
methods (i.e. `write_to_fp()`, `save()`), allowing `gTTS`
instances to be modified/reused
- Rewrote tokenizing and added pre-processing (see below)
- New `gTTS()` parameters `pre_processor_funcs` and
`tokenizer_func` to configure pre-processing and tokenizing (or
to use a third-party tokenizer)
- Error handling:
- Added new exception `gTTSError` raised on API request
errors. It attempts to guess what went wrong based on known
information and observed behaviour ([#60](https://github.com/pndurette/gTTS/issues/60), [#106](https://github.com/pndurette/gTTS/issues/106))
- `gTTS.write_to_fp()` and `gTTS.save()` also raise
`gTTSError` on `gtts_token` error
- `gTTS.write_to_fp()` raises `TypeError` when `fp` is not a
file-like object or one that doesn't take bytes
- `gTTS()` raises `ValueError` on unsupported languages (when
`lang_check` is `True`)
- More fine-grained error handling throughout (e.g. *request
failed* vs. *request successful with a bad
response*)
- Tokenizer (and new pre-processors):
- Rewrote and greatly expanded tokenizer (`gtts.tokenizer`)
- Smarter token 'cleaning' that will remove tokens that only
contain characters that can't be spoken (i.e. punctuation and
whitespace)
- Decoupled token minimizing from tokenizing, making the latter
usable in other contexts
- New flexible speech-centric text pre-processing
- New flexible full-featured regex-based tokenizer
(`gtts.tokenizer.core.Tokenizer`)
- New `RegexBuilder`, `PreProcessorRegex` and `PreProcessorSub`
classes to make writing regex-powered text
*pre-processors* and *tokenizer cases*
easier
- Pre-processors:
- Re-form words cut by end-of-line hyphens
- Remove periods after a (customizable) list of known
abbreviations (e.g. "jr", "sr", "dr") that can be
spoken the same without a period
- Perform speech corrections by doing word-for-word
replacements from a (customizable) list of tuples
- Tokenizing:
- Keep punctuation that modifies the inflection of speech (e.g.
"?", "!")
- Don't split in the middle of numbers (e.g. "10.5",
"20,000,000") ([#101](https://github.com/pndurette/gTTS/issues/101))
- Don't split on "dotted" abbreviations and acronyms (e.g.
"U.S.A")
- Added Chinese comma ("，"), ellipsis ("...") to the
list of punctuation to tokenize on ([#86](https://github.com/pndurette/gTTS/issues/86))
- The `gtts-cli` command-line tool
- Rewrote the CLI as a first-class citizen module (`gtts.cli`), powered
by [Click](http://click.pocoo.org)
- Windows support using `setuptools`'s
`entry_points`
- Better support for Unicode I/O in Python 2
- All arguments are now pre-validated
- New `--nocheck` flag to skip language pre-checking
- New `--all` flag to list all available languages
- Either the `--file` option or the `<text>` argument can be set
to "-" to read from `stdin`
- The `--debug` flag uses logging and doesn't pollute `stdout`
anymore
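
To illustrate a few of the pre-processing and tokenizing rules listed above, here is a minimal sketch using only the standard library. It does not use or reproduce `gtts.tokenizer`: the names `ABBREVIATIONS`, `rejoin_hyphenated`, `strip_abbreviation_periods` and `tokenize` are hypothetical, and the sketch makes no attempt to handle "dotted" abbreviations such as "U.S.A".

```python
import re

# Hypothetical names for illustration only; not part of the gTTS API.
ABBREVIATIONS = ("dr", "jr", "sr")

# Split after "?" or "!" (keeping them on the token), and on "." or ","
# unless the mark sits between two digits (e.g. "10.5", "20,000,000").
SPLIT_RE = re.compile(r"(?<=[?!])\s+|[.,](?!\d)|(?<!\d)[.,]")


def rejoin_hyphenated(text):
    """Re-form words cut by an end-of-line hyphen."""
    return re.sub(r"-\n", "", text)


def strip_abbreviation_periods(text, abbreviations=ABBREVIATIONS):
    """Remove the period that follows a known abbreviation."""
    pattern = r"\b(" + "|".join(map(re.escape, abbreviations)) + r")\."
    return re.sub(pattern, r"\1", text, flags=re.IGNORECASE)


def tokenize(text):
    """Split text into tokens, dropping tokens with nothing speakable."""
    tokens = (tok.strip() for tok in SPLIT_RE.split(text))
    # 'Cleaning': keep only tokens with at least one word character.
    return [tok for tok in tokens if re.search(r"\w", tok)]


text = "It costs 10.5 dollars, really? Yes! About 20,000,000 people."
print(tokenize(text))
# ['It costs 10.5 dollars', 'really?', 'Yes!', 'About 20,000,000 people']
```

Keeping the pre-processors as plain text-to-text functions, separate from the tokenizer, mirrors the decoupling described above: each step can be swapped or reused on its own.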
Bugfixes
- `_minimize()`: Fixed an infinite recursion loop that would occur
when a token started with the minimizing delimiter (i.e. a space) ([#86](https://github.com/pndurette/gTTS/issues/86))
- `_minimize()`: Handle the case where a token of more than 100
characters did not contain a space (e.g. in Chinese).
- Fixed an issue that fused multiline text together if the total
number of characters was less than 100
- Fixed `gtts-cli` Unicode errors in Python 2.7 (famous last words) ([#78](https://github.com/pndurette/gTTS/issues/78), [#93](https://github.com/pndurette/gTTS/issues/93), [#96](https://github.com/pndurette/gTTS/issues/96))
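
The two `_minimize()` cases above can be pictured with a small sketch. This is not the library's code: `minimize` is a hypothetical stand-in that splits text into chunks of at most `max_size` characters, assuming a space delimiter and a limit of 100 by default.

```python
def minimize(text, delim=" ", max_size=100):
    """Split text into chunks of at most max_size characters.

    Hypothetical illustration; not the gTTS implementation.
    """
    chunks = []
    while text:
        # Drop a leading delimiter first. Otherwise the split index below
        # could be 0, producing an empty chunk and never making progress.
        if text.startswith(delim):
            text = text[len(delim):]
            continue
        if len(text) <= max_size:
            chunks.append(text)
            break
        # Prefer the last delimiter within the first max_size characters.
        idx = text.rfind(delim, 0, max_size)
        if idx <= 0:
            # No delimiter found (e.g. Chinese text): cut at max_size.
            idx = max_size
        chunks.append(text[:idx])
        text = text[idx:]
    return chunks
```

For example, `minimize(" hello world", max_size=7)` gives `['hello', 'world']`, and `minimize("abcdefghij", max_size=4)` falls back to hard cuts: `['abcd', 'efgh', 'ij']`.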
Deprecations and Removals
- Dropped Python 3.3 support
- Removed `debug` parameter of `gTTS` (in favour of logger)
- `gtts-cli`: Changed long option name of `-o` to `--output` instead
of `--destination`
- `gTTS()` will raise a `ValueError` rather than an `AssertionError`
on unsupported language
Improved Documentation
- Rewrote all documentation files as reStructuredText
- Comprehensive documentation written for
[Sphinx](http://www.sphinx-doc.org), published to <http://gtts.readthedocs.io>
- Changelog built with [towncrier](https://github.com/hawkowl/towncrier)
Misc
- Major test re-work
- Language tests can read a `TEST_LANGS` environment variable so not
all language tests are run every time.
- Added [AppVeyor](https://www.appveyor.com) CI for Windows
- [PEP 8](https://www.python.org/dev/peps/pep-0008/) compliance
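
As a rough sketch of how a test suite might honour `TEST_LANGS`: the comma-separated format, the `ALL_LANGS` placeholder list and the `langs_to_test` helper are all assumptions made for illustration, not the project's actual test code.

```python
import os

# Placeholder list; a real suite would obtain the supported languages itself.
ALL_LANGS = ["en", "fr", "zh-cn"]


def langs_to_test(environ=os.environ, all_langs=ALL_LANGS):
    """Return the languages to test.

    Assumes TEST_LANGS holds comma-separated language codes; when it is
    unset or empty, every language is tested.
    """
    raw = environ.get("TEST_LANGS", "")
    selected = [code.strip() for code in raw.split(",") if code.strip()]
    return selected or list(all_langs)
```

Under these assumptions, running the tests with `TEST_LANGS=en,fr` set would restrict them to those two languages.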