========================================
This release contains many bug fixes, improvements and new features in preparation for the upcoming release candidate.
We recommend that all developers update to this version.
Highlights:
- Raised the minimum supported Python 3 version from 3.3 to 3.4
- Replaced the deprecated package ``nose-parameterized`` with the up-to-date package ``parameterized`` in Theano's requirements
- Theano now internally uses ``sha256`` instead of ``md5``, so it works on systems that forbid ``md5`` for security reasons
- Removed the old GPU backend ``theano.sandbox.cuda``. The new backend ``theano.gpuarray`` is now the official GPU backend (see the configuration sketch after this list)
- Support more debuggers for ``PdbBreakpoint``
- Scan improvements:
  - Speed up Theano scan compilation and gradient computation
  - Added a meaningful error message when inputs to scan are missing
- Speed up graph toposort algorithm
- Faster C compilation through extensive use of a new interface for op params
- Faster optimization step
- Documentation updated and more complete
- Many bug fixes, crash fixes and warning improvements
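For the backend switch above, here is a minimal configuration sketch; it assumes a working CUDA setup and uses the documented ``device=cuda`` flag that replaces the old ``device=gpu``:

.. code-block:: python

    # Minimal sketch: selecting the new theano.gpuarray backend.
    # Flags must be set before Theano is first imported; "cuda"
    # replaces the old "gpu" device of theano.sandbox.cuda.
    import os
    os.environ["THEANO_FLAGS"] = "device=cuda,floatX=float32"

    import theano
    import theano.tensor as tt

    x = tt.matrix("x")
    f = theano.function([x], tt.exp(x))  # compiled for the GPU if available
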
A total of 65 people contributed to this release since 0.9.0; see the list below.
Interface changes:
- Merged duplicated diagonal functions into two ops: ``ExtractDiag`` (extract a diagonal to a vector)
  and ``AllocDiag`` (set a vector as the diagonal of an empty array); see the sketch after this list
- Renamed ``MultinomialWOReplacementFromUniform`` to ``ChoiceFromUniform``
- Removed or deprecated Theano flags:
  - ``cublas.lib``
  - ``cuda.enabled``
  - ``enable_initial_driver_test``
  - ``gpuarray.sync``
  - ``home``
  - ``lib.cnmem``
  - ``nvcc.*`` flags
  - ``pycuda.init``
- Changed the ``grad()`` method to ``L_op()`` in ops that need their outputs to compute the gradient
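To illustrate the merged diagonal ops, the sketch below round-trips through ``theano.tensor.diag``, which builds ``AllocDiag`` for vector inputs and ``ExtractDiag`` for matrix inputs (an illustrative sketch, not taken verbatim from the docs):

.. code-block:: python

    import numpy as np
    import theano
    import theano.tensor as tt

    v = tt.vector("v")
    m = tt.diag(v)   # AllocDiag: build a matrix with v on its diagonal
    d = tt.diag(m)   # ExtractDiag: pull the diagonal back out as a vector

    f = theano.function([v], [m, d])
    mat, diag = f(np.arange(3).astype(theano.config.floatX))
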
Convolution updates:
- Extended the Theano flag ``dnn.enabled`` with a new option ``no_check`` to help speed up cuDNN import
- Implemented separable convolutions
- Implemented grouped convolutions (see the sketch after this list)
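A grouped convolution might look like the following sketch; the ``num_groups`` keyword is the new parameter, and the filter layout shown (output channels first, per-group input channels second) is our assumption, so check the convolution documentation for the exact contract:

.. code-block:: python

    import theano.tensor as tt
    from theano.tensor.nnet import conv2d

    x = tt.tensor4("x")  # e.g. (batch, 8, H, W): 8 input channels
    w = tt.tensor4("w")  # e.g. (16, 4, 3, 3): 2 groups of 4 channels each
    y = conv2d(x, w, num_groups=2)  # assumed new keyword for grouped conv
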
GPU:
- Prevent GPU initialization when not required
- Added disk caching option for kernels
- Added the method ``my_theano_function.sync_shared()`` to help synchronize GPU Theano functions (see the sketch after this list)
- Added useful stats for GPU in profile mode
- Added a Cholesky op based on the ``cusolver`` backend
- Added GPU ops based on the `magma library <http://icl.cs.utk.edu/magma/software/>`_:
  SVD, matrix inverse, QR, Cholesky and eigh
- Added ``GpuCublasTriangularSolve``
- Added atomic addition and exchange for ``long long`` values in ``GpuAdvancedIncSubtensor1_dev20``
- Support log gamma function for all non-complex types
- Support GPU SoftMax in both OpenCL and CUDA
- Support offset parameter ``k`` for ``GpuEye``
- ``CrossentropyCategorical1Hot`` and its gradient are now lifted to GPU
- Better cuDNN support:
  - Official support for ``v5.*`` and ``v6.*``
  - Better support and loading on Windows and Mac
  - Support cuDNN v6 dilated convolutions
  - Support cuDNN v6 reductions
  - Added new Theano flags ``cuda.include_path``, ``dnn.base_path`` and ``dnn.bin_path``
    to help configure Theano when CUDA and cuDNN cannot be found automatically
- Updated ``float16`` support:
  - Added documentation for GPU ``float16`` ops
  - Support ``float16`` for ``GpuGemmBatch``
  - Started to use ``float32`` precision for computations that don't support ``float16`` on GPU
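For the new ``sync_shared()`` helper, a minimal usage sketch; we assume it simply blocks until pending GPU transfers of the function's shared variables have completed:

.. code-block:: python

    import numpy as np
    import theano

    s = theano.shared(np.zeros(4, dtype="float32"))
    step = theano.function([], updates=[(s, s + 1)])

    step()
    step.sync_shared()  # assumed: wait for GPU-side updates before reading
    print(s.get_value())
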
New features:
- Added a wrapper for `Baidu's CTC <https://github.com/baidu-research/warp-ctc>`_ cost and gradient functions
- Added scalar and elemwise CPU ops for the modified Bessel functions of orders 0 and 1 from ``scipy.special``
- Added the Scaled Exponential Linear Unit (SELU) activation
- Added the ``sigmoid_binary_crossentropy`` function (both shown in the sketch after this list)
- Added the trigamma function
- Added modes ``half`` and ``full`` for ``Images2Neibs`` ops
- Implemented gradient for ``AbstractBatchNormTrainGrad``
- Implemented gradient for matrix pseudoinverse op
- Added new prop ``replace`` for ``ChoiceFromUniform`` op
- Added new prop ``on_error`` for CPU ``Cholesky`` op
- Added a new Theano flag ``deterministic`` to help control how Theano optimizes certain ops that have deterministic versions.
  Currently used for subtensor ops only.
- Added a new Theano flag ``cycle_detection`` to speed up the optimization step by reducing the time spent in inplace optimizations
- Added a new Theano flag ``check_stack_trace`` to help check stack traces during the optimization process
- Added new Theano flag ``cmodule.debug`` to allow a debug mode for Theano C code. Currently used for cuDNN convolutions only.
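A small sketch combining two of the additions above; the module path ``theano.tensor.nnet`` for ``selu`` and ``sigmoid_binary_crossentropy`` is our assumption, so check the library reference:

.. code-block:: python

    import theano
    import theano.tensor as tt
    from theano.tensor import nnet  # assumed home of both helpers

    pre = tt.matrix("pre")  # pre-sigmoid activations
    tgt = tt.matrix("tgt")  # binary targets in {0, 1}

    h = nnet.selu(pre)                                 # new SELU activation
    loss = nnet.sigmoid_binary_crossentropy(pre, tgt)  # fused sigmoid + crossentropy
    f = theano.function([pre, tgt], [h, loss.mean()])
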
Others:
- Added deprecation warning for the softmax and logsoftmax vector case
- Added a warning announcing that a C++ compiler will become mandatory in the next Theano release, ``0.11``
Other more detailed changes:
- Removed useless warning when profile is manually disabled
- Added tests for abstract conv
- Added a ``disconnected_outputs`` option to ``Rop`` (see the sketch after this list)
- Removed ``theano/compat/six.py``
- Removed ``COp.get_op_params()``
- Support lists of strings in ``Op.c_support_code()``, to help avoid duplicating support code
- Macro names provided for array properties are now standardized in both CPU and GPU C codes
- Started to move C code files into separate folder ``c_code`` in every Theano module
- Many improvements for Travis CI tests (with better splitting for faster testing)
- Many improvements for Jenkins CI tests: daily testing on Mac and Windows in addition to Linux
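For the new ``Rop`` option, a short sketch; we assume ``disconnected_outputs`` accepts the same ``'ignore'``/``'warn'``/``'raise'`` values as ``grad``'s ``disconnected_inputs``:

.. code-block:: python

    import theano.tensor as tt
    from theano.gradient import Rop

    x = tt.vector("x")
    v = tt.vector("v")
    y = tt.sum(x ** 2)

    # New keyword (assumed values mirror grad's disconnected_inputs).
    jv = Rop(y, x, v, disconnected_outputs="warn")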