------------------
(`108 <https://github.com/pndurette/gTTS/issues/108>`_)
Features
~~~~~~~~
- The ``gtts`` module
- New logger ("gtts") replaces all occurrences of ``print()``
- Languages list is now obtained automatically (``gtts.lang``)
(`91 <https://github.com/pndurette/gTTS/issues/91>`_,
`94 <https://github.com/pndurette/gTTS/issues/94>`_,
`106 <https://github.com/pndurette/gTTS/issues/106>`_)
- Added a curated list of language sub-tags that
have been observed to provide different dialects or accents
(e.g. "en-gb", "fr-ca")
- New ``gTTS()`` parameter ``lang_check`` to disable language
checking.
- ``gTTS()`` now delegates the ``text`` tokenizing to the
API request methods (i.e. ``write_to_fp()``, ``save()``),
allowing ``gTTS`` instances to be modified/reused
- Rewrote tokenizing and added pre-processing (see below)
- New ``gTTS()`` parameters ``pre_processor_funcs`` and
``tokenizer_func`` to configure pre-processing and tokenizing
(or use a 3rd party tokenizer)
- Error handling:
- Added new exception ``gTTSError`` raised on API request errors.
It attempts to guess what went wrong based on known information
and observed behaviour
(`60 <https://github.com/pndurette/gTTS/issues/60>`_,
`106 <https://github.com/pndurette/gTTS/issues/106>`_)
- ``gTTS.write_to_fp()`` and ``gTTS.save()`` also raise ``gTTSError``
on `gtts_token` error
- ``gTTS.write_to_fp()`` raises ``TypeError`` when ``fp`` is not a
file-like object or one that doesn't take bytes
- ``gTTS()`` raises ``ValueError`` on unsupported languages
(when ``lang_check`` is ``True``)
- More fine-grained error handling throughout (e.g.
`request failed` vs. `request successful with a bad response`)
- Tokenizer (and new pre-processors):
- Rewrote and greatly expanded tokenizer (``gtts.tokenizer``)
- Smarter token 'cleaning' that will remove tokens that only contain
characters that can't be spoken (i.e. punctuation and whitespace)
- Decoupled token minimizing from tokenizing, making the latter usable
in other contexts
- New flexible speech-centric text pre-processing
- New flexible full-featured regex-based tokenizer
(``gtts.tokenizer.core.Tokenizer``)
- New ``RegexBuilder``, ``PreProcessorRegex`` and ``PreProcessorSub`` classes
to make writing regex-powered text `pre-processors` and `tokenizer cases`
easier
- Pre-processors:
- Re-form words cut by end-of-line hyphens
- Remove periods after a (customizable) list of known abbreviations
(e.g. "jr", "sr", "dr") that can be spoken the same without a period
- Perform speech corrections by doing word-for-word replacements
from a (customizable) list of tuples
- Tokenizing:
- Keep punctuation that modifies the inflection of speech (e.g. "?", "!")
- Don't split in the middle of numbers (e.g. "10.5", "20,000,000")
(`101 <https://github.com/pndurette/gTTS/issues/101>`_)
- Don't split on "dotted" abbreviations and acronyms (e.g. "U.S.A")
- Added Chinese comma ("，"), ellipsis ("…") to punctuation list
to tokenize on (`86 <https://github.com/pndurette/gTTS/issues/86>`_)
- The ``gtts-cli`` command-line tool
- Rewrote cli as first-class citizen module (``gtts.cli``),
powered by `Click <http://click.pocoo.org>`_
- Windows support using `setuptools`' `entry_points`
- Better support for Unicode I/O in Python 2
- All arguments are now pre-validated
- New ``--nocheck`` flag to skip language pre-checking
- New ``--all`` flag to list all available languages
- Either the ``--file`` option or the ``<text>`` argument can be set to
"-" to read from ``stdin``
- The ``--debug`` flag uses logging and doesn't pollute ``stdout``
anymore
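The tokenizing rules listed above can be illustrated with a small standard-library sketch. This is not the ``gtts.tokenizer`` implementation: the names ``SPLIT_PATTERN`` and ``tokenize`` are hypothetical, and the single regex below is only an assumption about how such rules could be expressed. It keeps "?" and "!", avoids splitting inside numbers and dotted abbreviations, splits on the ellipsis and the Chinese comma, and drops tokens that contain nothing speakable.

```python
import re

# Hypothetical sketch; gTTS's real rules live in ``gtts.tokenizer``.
SPLIT_PATTERN = re.compile(
    r"(?<=[?!])(?![?!])\s*"   # split after "?" / "!", keeping the mark
    r"|[.,](?=\s|$)\s*"       # "." / "," only when followed by space or end
    r"|[…，]\s*"               # ellipsis and Chinese comma always split
)


def tokenize(text):
    """Split text into speakable chunks (needs Python 3.7+ for empty-match splits)."""
    pieces = SPLIT_PATTERN.split(text)
    # Token 'cleaning': drop pieces that contain no letters or digits.
    return [p.strip() for p in pieces if any(ch.isalnum() for ch in p)]


if __name__ == "__main__":
    print(tokenize("Is pi 3.14? The U.S.A has 20,000,000 of them, truly. Yes!"))
```

Because "." and "," only split when followed by whitespace or the end of the text, "10.5", "20,000,000" and "U.S.A" stay in one piece without needing a list of known abbreviations.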
Bugfixes
~~~~~~~~
- ``_minimize()``: Fixed an infinite recursion loop that would occur
when a token started with the minimizing delimiter (i.e. a space)
(`86 <https://github.com/pndurette/gTTS/issues/86>`_)
- ``_minimize()``: Handle the case where a token of more than 100
characters did not contain a space (e.g. in Chinese).
- Fixed an issue that fused multiline text together if the total number of
characters was less than 100
- Fixed ``gtts-cli`` Unicode errors in Python 2.7 (famous last words)
(`78 <https://github.com/pndurette/gTTS/issues/78>`_,
`93 <https://github.com/pndurette/gTTS/issues/93>`_,
`96 <https://github.com/pndurette/gTTS/issues/96>`_)
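The two ``_minimize()`` fixes can be sketched as follows. This is a minimal illustration under assumptions, not necessarily the library's code: the function name ``minimize`` and its parameters are hypothetical. A leading delimiter is stripped before searching (otherwise the split index could be 0 and the same string would be recursed on forever), and a token with no delimiter inside the size limit is cut at the limit.

```python
def minimize(text, delim, max_size):
    """Recursively split text into chunks of at most max_size characters."""
    # Strip a leading delimiter so the split point can never be index 0
    # on the same string twice in a row (the infinite recursion case).
    if text.startswith(delim):
        text = text[len(delim):]

    if len(text) <= max_size:
        return [text]

    try:
        # Prefer the last delimiter found within the size limit.
        idx = text.rindex(delim, 0, max_size)
    except ValueError:
        # No delimiter at all (e.g. unspaced Chinese text): hard cut.
        idx = max_size

    return [text[:idx]] + minimize(text[idx:], delim, max_size)
```

For example, ``minimize(" bbbbbbbb", " ", 5)`` terminates and yields two chunks instead of recursing on the leading space.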
Deprecations and Removals
~~~~~~~~~~~~~~~~~~~~~~~~~
- Dropped Python 3.3 support
- Removed ``debug`` parameter of ``gTTS`` (in favour of logger)
- ``gtts-cli``: Changed long option name of ``-o`` to ``--output``
instead of ``--destination``
- ``gTTS()`` will raise a ``ValueError`` rather than an ``AssertionError``
on unsupported language
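With the ``debug`` parameter removed, debug output would presumably be enabled through the standard ``logging`` module. The sketch below assumes only what the Features list states, namely that the logger is named "gtts"; the format string and the variable name ``gtts_logger`` are illustrative choices.

```python
import logging

# Send log records to stderr with a simple format.
logging.basicConfig(format="%(name)s %(levelname)s: %(message)s")

# "gtts" is the logger name given in this changelog.
gtts_logger = logging.getLogger("gtts")
gtts_logger.setLevel(logging.DEBUG)
```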
Improved Documentation
~~~~~~~~~~~~~~~~~~~~~~
- Rewrote all documentation files as reStructuredText
- Comprehensive documentation written for `Sphinx <http://www.sphinx-doc.org>`_, published to http://gtts.readthedocs.io
- Changelog built with `towncrier <https://github.com/hawkowl/towncrier>`_
Misc
~~~~
- Major test re-work
- Language tests can read a ``TEST_LANGS`` environment variable so
not all language tests are run every time.
- Added `AppVeyor <https://www.appveyor.com>`_ CI for Windows
- `PEP 8 <https://www.python.org/dev/peps/pep-0008/>`_ compliance
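One way a test suite could honour ``TEST_LANGS`` is sketched below. The changelog does not say how the variable is formatted, so the comma-separated list, the helper name ``langs_to_test`` and the fallback to all languages when the variable is unset are all assumptions made for illustration.

```python
import os


def langs_to_test(all_langs, environ=os.environ):
    """Return the subset of all_langs selected by TEST_LANGS (assumed comma-separated)."""
    raw = environ.get("TEST_LANGS", "")
    wanted = [code.strip() for code in raw.split(",") if code.strip()]
    if not wanted:
        # Variable unset or empty: test every language.
        return list(all_langs)
    return [code for code in all_langs if code in wanted]
```

Passing the environment as a parameter keeps the helper easy to exercise without modifying ``os.environ``.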