Major Features and Improvements



Auto Parallel

- Add distributed operators: Conv2D/Conv2DTranspose/Conv2DBackpropInput/MaxPool/AvgPool/BatchNorm/GatherD
- Support configuring shard strategy for dataset



- Add SlicePatchesOperation for Remote Sensing feature([!18179](https://e.gitee.com/mind_spore/repos/mindspore/mindspore/pulls/18179))


Running Data Recorder

GraphKernel Fusion


- [STABLE] Support MS_DIAGNOSTIC_DATA_PATH for profiler feature.(Ascend/GPU)


- [STABLE] Support MS_DIAGNOSTIC_DATA_PATH for dump feature.(Ascend/GPU/CPU)

API Change

Backwards Incompatible Change

Python API

Command Line Interface

Dump Config

Previously, the dump path had to be set in the dump config file. To make the dump feature easier to use on the cloud, a new environment variable `MS_DIAGNOSTIC_DATA_PATH` is supported.
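A minimal sketch of how a launcher script might provide the variable before the training process starts. Only the name `MS_DIAGNOSTIC_DATA_PATH` comes from the note above; the temporary directory and the helper `diagnostic_root` are hypothetical and not part of MindSpore.

```python
import os
import tempfile


def diagnostic_root(default=None):
    """Return the directory named by MS_DIAGNOSTIC_DATA_PATH, or `default` if unset.

    Hypothetical helper for illustration only; it is not a MindSpore API.
    """
    return os.environ.get("MS_DIAGNOSTIC_DATA_PATH", default)


# Assumption: any writable directory is acceptable; a temporary directory
# keeps this sketch self-contained.
os.environ["MS_DIAGNOSTIC_DATA_PATH"] = tempfile.mkdtemp(prefix="ms_diag_")
root = diagnostic_root()
```

In a shell-based launcher the same effect would come from exporting the variable before invoking the training script.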


Major Features and Improvements


- [STABLE] Add CV models on Ascend: CPM, FCN8s, SSD-ResNet50-FPN, EAST, AdvancedEast.
- [STABLE] Add NLP models on Ascend: DGU, TextCNN, SentimentNet(LSTM).
- [STABLE] Add CV models on GPU: Faster-RCNN, FCN8s, CycleGAN, AdvancedEast.
- [BETA] Add CV models on Ascend: CycleGAN, PoseNet, SimCLR.
- [BETA] Add NLP models on Ascend: DGU, EmoTect, Senta, KT-Net.
- [BETA] Add NLP models on GPU: DGU, EmoTect.
- [BETA] Add EPP-MVSNet: a novel deep learning network for 3D reconstruction from multi-view stereo, which ranked first on the Tanks & Temples leaderboard (as of April 1, 2021). (GPU)


- [STABLE] The default running mode of MindSpore is changed to Graph mode.
- [STABLE] Support interface `run_check` to check whether MindSpore is working properly or not.
- [STABLE] Support saving custom information in the checkpoint file.
- [STABLE] Normal class adds mean parameter.
- [STABLE] Support export YOLOv3-DarkNet53 and YOLOv4 ONNX model.
- [STABLE] Support 40+ operator export ONNX model.
- [STABLE] The Metric module supports `set_indexes` to select the inputs of `update` in the specified order.
- [STABLE] Switch `_Loss` to an external API `LossBase` as the base class of losses.
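The `set_indexes` item above can be pictured with a toy metric written in plain Python. This is a sketch of the idea only: `ToyAccuracy` is a hypothetical stand-in, not `mindspore.nn.Metric`, and the three-output network is an assumption made for the example.

```python
class ToyAccuracy:
    """Toy stand-in for a metric class; NOT the MindSpore implementation."""

    def __init__(self):
        self._indexes = None
        self.correct = 0
        self.total = 0

    def set_indexes(self, indexes):
        # Remember which positional inputs `update` should receive, and in what order.
        self._indexes = list(indexes)
        return self

    def update(self, *inputs):
        # Select and reorder the inputs if indexes were configured.
        if self._indexes is not None:
            inputs = [inputs[i] for i in self._indexes]
        predictions, labels = inputs
        self.correct += sum(p == l for p, l in zip(predictions, labels))
        self.total += len(labels)

    def eval(self):
        return self.correct / self.total


# Assumption: the network emits (predictions, auxiliary_output, labels).
metric = ToyAccuracy().set_indexes([0, 2])
metric.update([1, 0, 1, 1], ["ignored"], [1, 0, 0, 1])
accuracy = metric.eval()
```

Selecting inputs by index lets the same metric be reused with networks that return extra outputs, without wrapping the network.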

Auto Parallel

- [STABLE] Add distributed operators: Select/GatherNd/ScatterUpdate/TopK.
- [STABLE] Support basic pipeline parallelism.
- [STABLE] Optimize sharding strategy setting of `Gather`.
- [STABLE] Optimize mix precision and shared parameter scenarios.
- [STABLE] Optimize distributed prediction scenarios.


- [STABLE] Support unified runtime in GPU and CPU backend.
- [STABLE] MindSpore GPU supports CUDA11 with cuDNN8.
- [STABLE] MindSpore GPU inference performance optimization by integrating TensorRT.
- [STABLE] MindSpore built on one Linux distribution can now be used on multiple Linux distributions with the same CPU architecture (e.g. EulerOS, Ubuntu, CentOS).
- [STABLE] MindSpore now supports Ascend310 and Ascend910 environments with one single wheel package and provides an alternate binary package for Ascend310 specifically.
- [STABLE] MindSpore Ascend supports group convolution.


- [STABLE] Support caching over MindRecord dataset.
- [STABLE] Support new shuffle mode for MindRecord dataset.
- [STABLE] Support a cropper tool for MindSpore Lite to allow the user to customize MindData binary file according to their script.
- [STABLE] Support share memory mechanism to optimize the multi-processing efficiency of GeneratorDataset/Map/Batch.
- [STABLE] Add features for the GNN dataset to support molecular dynamics simulation scenarios.


- [STABLE] Support Cross-device federated learning framework.
- [STABLE] Support FL-Server distributed networking including TCP and HTTP communication.
- [STABLE] Support FL-Server distributed federated aggregation, with autoscaling and fault tolerance.
- [STABLE] Develop FL-Client framework.
- [STABLE] Support local differential privacy algorithms.
- [STABLE] MPC-based security aggregation algorithm.
- [STABLE] MindSpore Lite Device-side Inference & Training Interconnection with FL-Client.

Running Data Recorder

- [STABLE] Provide records of multi-stage computational graphs, memory allocation information and graph execution order when a "Launch kernel failed" occurs. (CPU)

GraphKernel Fusion

- [STABLE] Add options to control the optimization level.
- [STABLE] Enhance the generalization ability on GPU. GraphKernel is enabled by default in 40+ networks covering NLP, CV, Recommender, NAS and Audio. Their throughput improves significantly, and enabling GraphKernel in your network is recommended.


- [STABLE] Unified dump function.

API Change

Backwards Incompatible Change

Python API

`mindspore.dataset.Dataset.device_que` interface removes unused parameter `prefetch_size`([!18973](https://gitee.com/mindspore/mindspore/pulls/18973))

Previously, `device_que` had a parameter `prefetch_size` that was meant to define how many records to prefetch ahead of the user's request. The parameter was never actually used, so it is removed in 1.3.0. Users can set this configuration with [mindspore.dataset.config.set_prefetch_size](https://www.mindspore.cn/docs/api/en/r1.3/api_python/mindspore.dataset.config.html#mindspore.dataset.config.set_prefetch_size).

<td style="text-align:center"> 1.2.1 </td> <td style="text-align:center"> 1.3.0 </td>

device_que(prefetch_size=None, send_epoch_end=True, create_data_info_queue=False)


device_que(send_epoch_end=True, create_data_info_queue=False)
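A sketch of one way to migrate call sites written for 1.2.1. The helper `split_device_que_kwargs` is hypothetical and uses no MindSpore code; it only separates the removed argument so that it can be forwarded to `set_prefetch_size` instead.

```python
def split_device_que_kwargs(kwargs):
    """Separate the removed `prefetch_size` from keyword arguments written for 1.2.1.

    Hypothetical migration helper, not a MindSpore function.
    """
    remaining = dict(kwargs)
    prefetch_size = remaining.pop("prefetch_size", None)
    return remaining, prefetch_size


old_call = {"prefetch_size": 16, "send_epoch_end": True, "create_data_info_queue": False}
new_kwargs, prefetch_size = split_device_que_kwargs(old_call)

# With MindSpore installed, the two results would be used roughly as:
#   mindspore.dataset.config.set_prefetch_size(prefetch_size)
#   dataset.device_que(**new_kwargs)
```

Note that `set_prefetch_size` is a global configuration, so the migrated value applies beyond the single `device_que` call.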


`mindspore.nn.optim.thor` interface changes to lowercase `thor` and adds two parameters `enable_clip_grad` and `frequency`([!17212](https://gitee.com/mindspore/mindspore/pulls/17212))

The parameter `enable_clip_grad` is used for gradient clipping and another parameter `frequency` is used to control the update interval of second order information matrix.

<td style="text-align:center"> 1.2.1 </td> <td style="text-align:center"> 1.3.0 </td>

THOR(net, learning_rate, damping, momentum, weight_decay=0.0, loss_scale=1.0, batch_size=32,
use_nesterov=False, decay_filter=lambda x: x.name not in [], split_indices=None)


thor(net, learning_rate, damping, momentum, weight_decay=0.0, loss_scale=1.0, batch_size=32,
use_nesterov=False, decay_filter=lambda x: x.name not in [], split_indices=None, enable_clip_grad=False,


Dump Config

Previously, we could only dump tensor data for one or all steps. To make the dump feature easier to use, we changed the dump configuration format and dump structure. View the [New Dump Tutorial](https://www.mindspore.cn/tutorials/experts/en/master/debug/dump.html#dump-introduction).


Major Features and Improvements


- [STABLE] Add MaskedSelect aicpu operation.(Ascend)

Auto Parallel

- [STABLE] Support distributed checkpoint loading.(Ascend/GPU)


Major Features and Improvements


- [STABLE] Add CV models on Ascend: 3D Unet, Unet++, SSD-Resnet50-fpn, SSD-VGG16, crnn_seq2seq_ocr for BSI, CTPN, resnet18, DPN
- [STABLE] Add CV models on GPU: Faster-RCNN
- [STABLE] Add NLP models on Ascend: NAML, Fasttext, GRU, LSTM
- [BETA] Add TPRR: Thinking Path Re-Ranker, an original ranking-based framework for Multi-Hop Question Answering, which won first place on the HotpotQA leaderboard. (Ascend)


- [STABLE] Support side effects expression to ensure that the execution order of user's semantics is correct.(Ascend/GPU/CPU)
- [STABLE] Support calculating the gradient for networks that contain non-Tensor input parameters (int, float, bool, mstype.int, mstype.float, mstype.uint, mstype.bool_, tuple, list, dict).(Ascend/GPU/CPU)
- [STABLE] Support the inverse of a bool Tensor.(Ascend/GPU/CPU)
- [STABLE] Unify the interface `isinstance`.(Ascend/GPU/CPU)
- [STABLE] Support negative indexes.(Ascend/GPU/CPU)
- [STABLE] Support 110+ Numpy-like interfaces in mindspore.numpy.(Ascend/GPU/CPU)
- [STABLE] Support export/load mindir model with a size greater than 2 GB.
- [STABLE] The optimizer supports gradient centralization.(Ascend)
- [STABLE] Support AUC metric, ROC metric, BLEU score metric, confusion matrix metric, cosine similarity metric, dice metric, Hausdorff distance metric, occlusion sensitivity metric, perplexity metric, mean surface distance metric, root mean surface distance metric.
- [STABLE] Support use EmbeddingLookup with cache.(Ascend)
- [STABLE] Add MaskedSelect aicpu operation.(Ascend)

Auto Parallel

- [STABLE] Support AllGather and ReduceScatter fusion.(Ascend)
- [STABLE] Support gradient accumulation feature in auto parallel mode.(Ascend/GPU)
- [STABLE] Support running parallel optimizer with gradient accumulation.(Ascend)
- [STABLE] Add the configuration of communication operators' fusion.(Ascend)
- [STABLE] Support distributed checkpoint loading.(Ascend/GPU)


- [STABLE] Support inference with Nvidia GPU.
- [STABLE] Support data parallelism in PyNative mode.(Ascend/GPU)
- [STABLE] Optimize LSTM inference memory consumption in Graph mode with CPU.


- [STABLE] Add SPONGE modules for molecular dynamics simulation, including Bond, Angle, Dihedral, Non Bond 14, NeighborList, Particle Mesh Ewald, Langevin MD and LIUJIAN MD.(GPU)


- [STABLE] If the libnuma library is installed in the environment, you can run `export DATASET_ENABLE_NUMA=True` or `export MS_ENABLE_NUMA=True` to configure NUMA binding. In multi-card training scenarios, the training data processing speed can be improved, thereby improving the network training efficiency.
- [STABLE] Unify API Tensor structure of Training/Inference interfaces in C++ SDK.
- [STABLE] Optimize duplicated Decode in data preprocess using cache, improve preprocess efficiency.
- [STABLE] Support eager mode to run data augmentation in Python & C++.
- [STABLE] Support more data augmentation operators(e.g. Affine, Perspective) in MindSpore-Lite.
- [STABLE] Support light pipeline to process MindData in MindSpore-Lite training.
- [STABLE] Support more data preprocessing operators based on the DVPP hardware module, which can be used on the Ascend310 platform.
- [STABLE] Support copy-free property for data in Ascend310 inference process scenarios.

Running Data Recorder

- [STABLE] Support running data recorder (RDR) for exception demarcation.
- [STABLE] Provide records of multi-stage computational graphs, memory allocation information, graph execution order, stream execution order and task debug information when a "run task error" or "distribute task failed" occurs. (Ascend)
- [STABLE] Provide records of multi-stage computational graphs, memory allocation information and graph execution order when a "SyncStream error" occurs. (GPU)

3D Feature

- [STABLE] Support 3D ops: Conv3D, Conv3DBackpropInput, Conv3DBackpropFilter, Conv3DTranspose, BiasAdd, BiasAddGrad, PReLU, Transpose, Reshape, transdata, StrideSlice, MaxPool3D, MaxPool3DGrad, BinaryCrossEntropy, SigmoidCrossEntropyWithLogits, SigmoidCrossEntropyWithLogitsGrad, SoftmaxCrossEntropyWithLogits, SigmoidCrossEntropyWithLogits, SigmoidCrossEntropyWithLogitsGrad, BatchNorm3d, BatchNorm3dGrad, Dropout3d.
- [STABLE] Support RMSELoss loss function, MAELoss loss function, FocalLoss loss function, DiceLoss binary loss function, and MultiClassDiceLoss multi-type loss function for 2D/3D network.
- [STABLE] Add optimizer: AdamApplyOne(3D), ApplyMomentum(3D), SGD(3D).

API Change

Backwards Incompatible Change

Python API

`mindspore.numpy.array()`, `mindspore.numpy.asarray()`, `mindspore.numpy.asfarray()`, `mindspore.numpy.copy()` now support GRAPH mode, but cannot accept `numpy.ndarray` as input arguments anymore([!12726](https://gitee.com/mindspore/mindspore/pulls/12726))

Previously, these interfaces could accept numpy.ndarray as arguments and convert numpy.ndarray to Tensor, but could not be used in GRAPH mode.
However, the MindSpore Parser currently cannot parse numpy.ndarray in JIT-graph. To support these interfaces in graph mode, we have to remove `numpy.ndarray` support. Users can still use `Tensor` to convert `numpy.ndarray` to tensors.

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>

>>> import mindspore.numpy as mnp
>>> import numpy
>>> nd_array = numpy.array([1,2,3])
>>> tensor = mnp.asarray(nd_array)  # this line cannot be parsed in GRAPH mode


>>> import mindspore.numpy as mnp
>>> import numpy
>>> tensor = mnp.asarray([1,2,3])  # this line can be parsed in GRAPH mode


mindspore.numpy interfaces remove support for keyword arguments `out` and `where`([!12726](https://gitee.com/mindspore/mindspore/pulls/12726))

Previously, the keyword arguments `out` and `where` were only partially supported in mindspore.numpy interfaces: the `out` argument was functional only when the `where` argument was also provided, and `out` could not be used to pass a reference to numpy functions. Therefore, these two arguments have been removed to avoid confusion. Their original functionality can be found in [np.where](https://www.mindspore.cn/docs/en/master/api_python/numpy/mindspore.numpy.where.html#mindspore.numpy.where).

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>

>>> import mindspore.numpy as np
>>> a = np.ones((3,3))
>>> b = np.ones((3,3))
>>> out = np.zeros((3,3))
>>> where = np.asarray([[True, False, True],[False, False, True],[True, True, True]])
>>> res = np.add(a, b, out=out, where=where)  # `out` cannot be used as a reference, therefore it is misleading


>>> import mindspore.numpy as np
>>> a = np.ones((3,3))
>>> b = np.ones((3,3))
>>> out = np.zeros((3,3))
>>> where = np.asarray([[True, False, True],[False, False, True],[True, True, True]])
>>> res = np.add(a, b)
>>> out = np.where(where, x=res, y=out)  # instead of np.add(a, b, out=out, where=where)
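The replacement pattern above (add first, then select with `where`) can be checked with a plain-Python emulation on nested lists. This is a sketch of the element-wise semantics only; the helpers `add` and `where` below are local stand-ins, not the `mindspore.numpy` functions.

```python
def add(a, b):
    """Element-wise sum of two equally shaped 2-D nested lists."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def where(condition, x, y):
    """Pick x[i][j] where condition[i][j] is true, otherwise y[i][j]."""
    return [
        [xv if c else yv for c, xv, yv in zip(row_c, row_x, row_y)]
        for row_c, row_x, row_y in zip(condition, x, y)
    ]


a = [[1.0] * 3 for _ in range(3)]
b = [[1.0] * 3 for _ in range(3)]
out = [[0.0] * 3 for _ in range(3)]
mask = [[True, False, True], [False, False, True], [True, True, True]]

# Equivalent of the removed `np.add(a, b, out=out, where=mask)` form.
result = where(mask, add(a, b), out)
```

Positions where the mask is false keep the value from `out`, which is the behavior the removed keyword arguments used to express.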


Turn `ops.MakeRefKey` into an internal interface ([!12010](https://gitee.com/mindspore/mindspore/pulls/12010))

Previously, `MakeRefKey` was an external interface that was not used; it is now an internal interface with the same usage. We do not recommend using this interface, and its introduction will be removed from the official website.

`ops.ApplyFtrl`, `ops.ApplyMomentum`, `ops.ApplyRMSProp`, `ops.ApplyCenteredRMSProp` change the output on Ascend backend from multiple to a single. ([!11895](https://gitee.com/mindspore/mindspore/pulls/11895))

Previously, the number of outputs of these operators differed across backends. To unify their definition, their output on the Ascend backend is changed from multiple outputs to a single one.

`P.FusedBatchNorm`, `P.FusedBatchNormEx` deleted ([!12115](https://gitee.com/mindspore/mindspore/pulls/12115))

The FusedBatchNorm and FusedBatchNormEx interface has been deleted. Please use the batchnorm operator to replace it.

`MetaTensor` deleted ([!10325](https://gitee.com/mindspore/mindspore/pulls/10325))

The MetaTensor interface has been deleted. The function of MetaTensor has been integrated into tensor.

`ControlDepend` is deleted, use `Depend` instead. The decorator `C.add_flags(has_effect=True)` does not work. ([!13793](https://gitee.com/mindspore/mindspore/pulls/13793))

Previously, we used ControlDepend to control the execution order of multiple operators. In version 1.2.0, MindSpore introduces the auto-monad side effects expression to ensure that the execution order of user's semantics is correct. Therefore, ControlDepend is deleted and Depend is recommended.

In most scenarios, if operators have IO side effects (such as print) or memory side effects (such as assign), they will be executed according to the user's semantics. In some scenarios, if the two operators A and B have no order dependency, and A must be executed before B, we recommend using Depend to specify their execution order. See the API documentation of the Depend operator for specific usage.

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>

In some side-effect scenarios, we need to ensure the execution order of operators.
In order to ensure that operator A is executed before operator B, it is recommended
to insert the Depend operator between operators A and B.

Previously, the ControlDepend operator was used to control the execution order.
Since the ControlDepend operator is deprecated from version 1.1, it is recommended
to use the Depend operator instead. The replacement method is as follows::

a = A(x) ---> a = A(x)
b = B(y) ---> y = Depend(y, a)
ControlDepend(a, b) ---> b = B(y)


In most scenarios, if operators have IO side effects or memory side effects,
they will be executed according to the user's semantics. In some scenarios,
if the two operators A and B have no order dependency, and A must be executed
before B, we recommend using Depend to specify their execution order. The
usage method is as follows::

a = A(x) ---> a = A(x)
b = B(y) ---> y = Depend(y, a)
---> b = B(y)
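The effect of inserting `Depend` can be pictured with a toy dependency graph in plain Python. This is only an illustration of the ordering idea, under the assumption that a scheduler respects data dependencies; it does not reflect how MindSpore implements `Depend` or auto-monad, and the node names are hypothetical.

```python
from graphlib import TopologicalSorter


def must_run_before(graph, first, second):
    """Return True if `second` depends, directly or transitively, on `first`."""
    pending = list(graph.get(second, ()))
    seen = set()
    while pending:
        node = pending.pop()
        if node == first:
            return True
        if node not in seen:
            seen.add(node)
            pending.extend(graph.get(node, ()))
    return False


# Each node maps to the set of nodes whose results it consumes.
independent = {"A": {"x"}, "B": {"y"}}
# y = Depend(y, a) makes B's input depend on A's output.
ordered = {"A": {"x"}, "Depend": {"y", "A"}, "B": {"Depend"}}

schedule = list(TopologicalSorter(ordered).static_order())
```

In the first graph nothing forces A to run before B; in the second, any valid schedule places A first because B consumes the output of the `Depend` node.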


After the introduction of the auto-monad side effect expression feature, the decorator `C.add_flags(has_effect=True)` does not work. If the decorator is used in the script, please modify. Take the overflow identification operator (without side effects) as an example, the modification method is as follows:

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>

def construct(self, *inputs):
    loss = self.network(*inputs)
    init = self.allo_status()


def construct(self, *inputs):
    loss = self.network(*inputs)
    init = self.allo_status()
    init = F.depend(init, loss)
    clear_status = self.clear_status(init)



C++ API support dual ABI now.([!12432](https://gitee.com/mindspore/mindspore/pulls/12432))

1.1.1 supports only the old ABI. Currently, both the new and the old are supported.

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>



add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0)  # the old ABI is supported
add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=1)  # the new ABI is supported, too
# write nothing to use the new ABI by default


Context refactor.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515))

The `Context` class is refactored. For details, see the API docs.

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>

GlobalContext::SetGlobalDeviceTarget(kDeviceTypeAscend310); // set device target to ascend310
GlobalContext::SetGlobalDeviceID(0); // set device id to 0
auto model_context = std::make_shared<ModelContext>(); // create a model context
ModelContext::SetInsertOpConfigPath(model_context, "./aipp.cfg"); // set aipp config file to ./aipp.cfg


auto model_context = std::make_shared<Context>(); // create a model context
auto ascend310_info = std::make_shared<Ascend310DeviceInfo>();
model_context->MutableDeviceInfo().push_back(ascend310_info); // set device target to ascend310
ascend310_info->SetDeviceID(0); // set device id to 0
ascend310_info->SetInsertOpConfigPath("./aipp.cfg"); // set aipp config file to ./aipp.cfg


LoadModel interface changes.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515))

`LoadModel` is renamed `Load`. No exception is thrown now, but the return status should be checked.

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>

try {
auto graph = Serialization::LoadModel(model_file_path, kMindIR);
} catch (...) { ... }


Graph graph;
auto ret = Serialization::Load(model_file_path, kMindIR, &graph);
if (ret != kSuccess) { ... }


Model ctor changes.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515))

`Model` uses a non-parameter ctor now, and arguments are passed in through `Build`.

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>

Model net(net_cell, model_context);
auto ret = net.Build();
if (ret != kSuccess) { ... }


Model net;
auto ret = net.Build(net_cell, model_context);
if (ret != kSuccess) { ... }


MSTensor::CreateTensor returns a native pointer now.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515))

`MSTensor::CreateTensor` and `MSTensor::CreateRefTensor` now return a native pointer, which needs to be destroyed by `DestroyTensorPtr`.

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>

auto tensor = MSTensor::CreateTensor(xxx, xxx, ...);
auto name = tensor.Name();


auto tensor = MSTensor::CreateTensor(xxx, xxx, ...);
auto name = tensor->Name();


New features

Python API

- Add SPONGE functions: `mindspore.ops.operations.BondForceWithAtomEnergy`, `mindspore.ops.operations.AngleForceWithAtomEnergy`, `mindspore.ops.operations.DihedralForceWithAtomEnergy`, `mindspore.ops.operations.Dihedral14LJCFForceWithAtomEnergy`, `mindspore.ops.operations.LJForceWithPMEDirectForce`, `mindspore.ops.operations.PMEExcludedForce`, `mindspore.ops.operations.PMEReciprocalForce`,`mindspore.ops.operations.BondEnergy`, `mindspore.ops.operations.AngleEnergy`,`mindspore.ops.operations.DihedralEnergy`, `mindspore.ops.operations.Dihedral14LJEnergy`, `mindspore.ops.operations.Dihedral14CFEnergy`,`mindspore.ops.operations.LJEnergy`, `mindspore.ops.operations.PMEEnergy`. All operators are supported in `GPU`.


Python API

`nn.MatMul` is now deprecated in favor of `ops.matmul` ([!12817](https://gitee.com/mindspore/mindspore/pulls/12817))

[ops.matmul](https://www.mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.matmul.html#mindspore.ops.matmul) follows the API of [numpy.matmul](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html) as closely as possible. As a function interface, [ops.matmul](https://www.mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.matmul.html#mindspore.ops.matmul) is applied without instantiation, as opposed to `nn.MatMul`, which should only be used as a class instance.

<td style="text-align:center"> 1.1.1 </td> <td style="text-align:center"> 1.2.0 </td>

>>> import numpy as np
>>> from mindspore import Tensor, nn
>>> x = Tensor(np.ones((2, 3)).astype(np.float32))
>>> y = Tensor(np.ones((3, 4)).astype(np.float32))
>>> nn.MatMul()(x, y)


>>> import numpy as np
>>> from mindspore import Tensor, ops
>>> x = Tensor(np.ones((2, 3)).astype(np.float32))
>>> y = Tensor(np.ones((3, 4)).astype(np.float32))
>>> ops.matmul(x, y)
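For the inputs used above, the expected result is a 2x4 matrix whose entries all equal 3. The sketch below checks this with a plain-Python stand-in for the 2-D case only; `matmul_2d` is a hypothetical helper, and the broadcasting rules that `numpy.matmul` applies to higher-dimensional inputs are not covered.

```python
def matmul_2d(x, y):
    """Multiply two 2-D matrices given as nested lists (2-D case only)."""
    if len(x[0]) != len(y):
        raise ValueError("inner dimensions must match")
    columns = list(zip(*y))  # transpose y so each column can be iterated directly
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in x]


x = [[1.0] * 3 for _ in range(2)]  # shape (2, 3), all ones
y = [[1.0] * 4 for _ in range(3)]  # shape (3, 4), all ones
product = matmul_2d(x, y)
```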


Bug fixes


- fix the null pointer problem of evaluator in control flow.([!13312](https://gitee.com/mindspore/mindspore/pulls/13312))
- fix parameter naming conflict bug for CellList and SequentialCell. ([!13260](https://gitee.com/mindspore/mindspore/pulls/13260))


- fix the problem that pending tasks of the executor are not executed in some heterogeneous cases.([!13465](https://gitee.com/mindspore/mindspore/pulls/13465))
- add passes to support frontend IR unification, including following operations: SliceGrad([!11783](https://gitee.com/mindspore/mindspore/pulls/11783)), ApplyFtrl, ApplyMomentum, ApplyRMSProp, CenteredRMSProp([!11895](https://gitee.com/mindspore/mindspore/pulls/11895)), AvgPoolGrad([!12813](https://gitee.com/mindspore/mindspore/pulls/12813)), BatchNorm([!12115](https://gitee.com/mindspore/mindspore/pulls/12115))


- Fix getter functions(e.g. GetDatasetSize) terminated abnormally when use python multi-processing. ([!13571](https://gitee.com/mindspore/mindspore/pulls/13571), [!13823](https://gitee.com/mindspore/mindspore/pulls/13823))
- Fix unclear error log of data augmentation operators. ([!12398](https://gitee.com/mindspore/mindspore/pulls/12398), [!12883](https://gitee.com/mindspore/mindspore/pulls/12883), [!13176](https://gitee.com/mindspore/mindspore/pulls/13176))
- Fix profiling performs abnormally when sink_size = False, as saving data is later than profiling analysis. ([!13944](https://gitee.com/mindspore/mindspore/pulls/13944))

MindSpore Lite

Major Features and Improvements

Converter and runtime

1. Support TensorFlow model in Converter except aware-training model.
2. Add fusion pattern for same horizontal operators in Converter.
3. Support Jar in x86_64 system for integrating into server with Java backend conveniently.
4. Provide unified runtime API for developer reusing their code between cloud side and end side.[BETA]
5. Improve control-flow capabilities continually: Support GRU fusion in Converter; Support weight-quant for control-flow model; Support control-flow model inference with half precision; Support nested control-flow model.[BETA]

ARM backend optimization

1. Add NLP dependent float16 operators(like lstm) to enhance inference performance.
2. Optimize operators: lstm, gru, depthwise.
3. Add 6 NPU operators(like FullConnection), and fix some bugs about buildIR failed.

OpenCL backend

1. Add new ops: 10+ ops added, 72 ops in total.
2. Performance optimization: memory layout optimization and block tiling improve performance by 30% compared to version 1.1 on Adreno GPU.
3. Initialization time optimization: initialization time is improved by 100% compared to MSLITE version 1.1 by storing the kernel cache as binary.
4. Support Java call on Mali or Adreno GPU.

Post quantization

1. Support quantization of gather and lstm ops.
2. Support quantizing TF Lite models with sub-graph node.
3. Add quantization strategy to decide whether to quantize ops or not, giving less accuracy loss and a higher compression rate.

Training on Device

1. Virtual batching: use mini-batches to mimic a large batch in theory with little RAM consumption.
2. Unify the converter: the tod and iod converters are no longer compiled separately.
3. Performance optimization to BWD ops.
4. TrainLoop with Off-The-Shelf Functionality blocks, like LR scheduler, Loss Monitor, Ckpt Saver, Accuracy Monitor.
5. Integration of code with Minddata lite.
6. Support more networks (googlenet, densenet, shufflenetv2, nin, vgg) and operators.
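The virtual batching item in the list above rests on the general gradient-accumulation idea: gradients of several mini-batches are summed and normalized once, which matches the gradient of one large batch. The sketch below shows this for a one-parameter model `y = w * x` with squared error. It is an assumption about the underlying principle, not MindSpore Lite's implementation, and all names are hypothetical.

```python
def grad_sum(w, samples):
    """Sum of d/dw (w*x - t)^2 over the given (x, t) samples."""
    return sum(2.0 * (w * x - t) * x for x, t in samples)


def full_batch_grad(w, samples):
    """Mean gradient computed with the whole batch in memory at once."""
    return grad_sum(w, samples) / len(samples)


def virtual_batch_grad(w, samples, mini_batch_size):
    """Mean gradient computed by accumulating over small slices of the batch."""
    accumulated = 0.0
    for start in range(0, len(samples), mini_batch_size):
        accumulated += grad_sum(w, samples[start:start + mini_batch_size])
    return accumulated / len(samples)


samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 0.5
large = full_batch_grad(w, samples)
virtual = virtual_batch_grad(w, samples, mini_batch_size=2)
```

Only one mini-batch needs to be resident at a time, which is where the memory saving comes from; the weight update happens once per virtual batch.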


1. Support 79 ops for the ARM platform and all CMSIS ops for Arm Cortex-M Series.
2. Multiplatform support, including Android, IoT Devices.
3. Support offline model weight preprocessing while compiling.
4. Support offline memory reuse computing for minimum runtime buffer size.
5. Support kernel register for custom op. Third-party hardware like NNIE can be accessed through it.

API Change

API Incompatible Change


Add header file named lite_types.h for some common data structs. ([!12262](https://gitee.com/mindspore/mindspore/pulls/12262))

Previously, some common data structs such as `CpuBindMode` and `DeviceType` were in context.h, which may cause cross-dependency between headers. So we created a new header named lite_types.h for common data structs and moved `CpuBindMode` and `DeviceType` from context.h into lite_types.h.

<td style="text-align:center"> lite_types.h </td>

namespace mindspore::lite {
/// \brief CpuBindMode defined for holding bind cpu strategy argument.
typedef enum {
NO_BIND, /**< no bind */
HIGHER_CPU, /**< bind higher cpu first */
MID_CPU /**< bind middle cpu first */
} CpuBindMode;

/// \brief DeviceType defined for holding user's preferred backend.
typedef enum {
DT_CPU, /**< CPU device type */
DT_GPU, /**< GPU device type */
DT_NPU /**< NPU device type */
} DeviceType;
} // namespace mindspore::lite


Add some new interfaces in ms_tensor.h for unified runtime API.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515))

Previously, users could not create or modify `MSTensor`; all `MSTensor` objects were created and managed by the framework. However, users sometimes need to create or modify `MSTensor`, for example when pre-processing input data. So we provide two new interfaces in ms_tensor.h: the `CreateTensor` interface for creating an `MSTensor` and the `set_shape` interface for modifying the shape of an `MSTensor`.

<td style="text-align:center"> CreateTensor </td>

/// \brief Create a MSTensor.
/// \return Pointer to an instance of MindSpore Lite MSTensor.
static MSTensor *CreateTensor(const std::string &name, TypeId type, const std::vector<int> &shape, const void *data,
size_t data_len);


<td style="text-align:center"> set_shape </td>

/// \brief Set the shape of MSTensor.
virtual void set_shape(const std::vector<int> &shape) = 0;


Previously, users could access the data of `MSTensor` through the interface named `MutableData`. However, `MutableData` not only returns the data of the tensor but also allocates data for the tensor if its data is nullptr. So we provide a new interface in ms_tensor.h named `data` that returns the data of the tensor without allocating automatically.

<td style="text-align:center"> data </td>

/// \brief Get the pointer of data in MSTensor.
/// \note The data pointer can be used to both write and read data in MSTensor. No memory buffer will be
/// allocated.
/// \return the pointer points to data in MSTensor.
virtual void *data() = 0;


Delete `DimensionSize()` in ms_tensor.h.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515))

The interface named `DimensionSize` functionally overlaps with the interface named `shape`. For simplicity, we delete `DimensionSize` and recommend using the interface named `shape` instead.

<td style="text-align:center"> DimensionSize() </td>

/// \brief Get size of the dimension of the MindSpore Lite MSTensor index by the parameter index.
/// \param[in] index Define index of dimension returned.
/// \return Size of dimension of the MindSpore Lite MSTensor.
virtual int DimensionSize(size_t index) const = 0;


Move allocator from namespace mindspore::lite to namespace mindspore for unified runtime API.([!13515](https://gitee.com/mindspore/mindspore/pulls/13515))

Previously, class `Allocator` is in namespace mindspore::lite. Considering unified allocator interface for unified runtime API, we move `Allocator` to namespace mindspore.

<td style="text-align:center"> 1.1.0 </td> <td style="text-align:center"> 1.2.0 </td>

namespace mindspore::lite {
/// \brief Allocator defined a memory pool for malloc memory and free memory dynamically.
/// \note List public class and interface for reference.
class Allocator;
} // namespace mindspore::lite


namespace mindspore {
/// \brief Allocator defined a memory pool for malloc memory and free memory dynamically.
/// \note List public class and interface for reference.
class Allocator;
} // namespace mindspore


Bug fixes

1. Fix the bug that the array in kernel registrar is not initialized.
2. Fix the segmentation fault caused by mistakenly releasing OpParameter in the Crop kernel.
3. Fix the bug that the MINDIR aware-training model is finally interpreted as weight-quant model.


Thanks goes to these wonderful people:

Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, dong-li001, eric, Eric, fary86, fuzhiye, Gaoxiong, GAO_HYP_XYJ, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Islam Amin, Jesse, Jiabin Liu, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, Lin Xh, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luopengting, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, Ming_blue, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, qianjiahong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wudenggang, wukesong, wuweikang, wuxuejian, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhanghui_china, zhangxinfeng3, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhiqwang, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, zymaa.

Contributions of any kind are welcome!



Major Features and Improvements


- [STABLE] BGCF: a Bayesian Graph Collaborative Filtering(BGCF) framework used to model the uncertainty in the user-item interaction graph and thus recommend accurate and diverse items on Amazon recommendation dataset.(Ascend)
- [STABLE] GRU: a recurrent neural network architecture like the LSTM(Long-Short Term Memory) on Multi30K dataset.(Ascend)
- [STABLE] FastText: a simple and efficient text classification algorithm on AG's news topic classification dataset, DBPedia Ontology classification dataset and Yelp Review Polarity dataset.(Ascend)
- [STABLE] LSTM: a recurrent neural network architecture used to learn word vectors for sentiment analysis on aclImdb_v1 dataset.(Ascend)
- [STABLE] SimplePoseNet: a convolution-based neural network for the task of human pose estimation and tracking on COCO2017 dataset.(Ascend)


- [BETA] Support Tensor Fancy Index Getitem with tuple and list. (Ascend/GPU/CPU)

Backwards Incompatible Change

Python API

`ops.AvgPool`, `ops.MaxPool`, `ops.MaxPoolWithArgmax` change attr name from 'ksize', 'padding' to 'kernel_size', 'pad_mode' ([!11350](https://gitee.com/mindspore/mindspore/pulls/11350))

Previously, the kernel size and pad mode attributes of the pooling ops were named "ksize" and "padding", which was confusing and inconsistent with the convolution ops. They have been renamed to "kernel_size" and "pad_mode".

1.1.0:

>>> import mindspore.ops as ops
>>> avg_pool = ops.AvgPool(ksize=2, padding='same')
>>> max_pool = ops.MaxPool(ksize=2, padding='same')
>>> max_pool_with_argmax = ops.MaxPoolWithArgmax(ksize=2, padding='same')


1.1.1:

>>> import mindspore.ops as ops
>>> avg_pool = ops.AvgPool(kernel_size=2, pad_mode='same')
>>> max_pool = ops.MaxPool(kernel_size=2, pad_mode='same')
>>> max_pool_with_argmax = ops.MaxPoolWithArgmax(kernel_size=2, pad_mode='same')
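
For code that still passes the old keyword names, one possible migration aid is a small adapter that rewrites the keyword arguments before they reach the operator constructor. The sketch below is plain Python and does not import MindSpore; `rename_pool_kwargs` and `_RENAMED` are hypothetical names introduced only for illustration.

```python
# Hypothetical helper: map the 1.1.0 pooling keyword names to the 1.1.1 names.
_RENAMED = {"ksize": "kernel_size", "padding": "pad_mode"}


def rename_pool_kwargs(kwargs):
    """Return a copy of kwargs with the old pooling attr names replaced."""
    updated = {}
    for name, value in kwargs.items():
        new_name = _RENAMED.get(name, name)
        if new_name in updated:
            # Both the old and the new spelling were supplied for one attr.
            raise TypeError(f"'{new_name}' was given twice")
        updated[new_name] = value
    return updated


old_style = {"ksize": 2, "padding": "same"}
print(rename_pool_kwargs(old_style))
```

The result could then be forwarded to a constructor, for example `ops.AvgPool(**rename_pool_kwargs(old_style))`.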


`ops.TensorAdd`, change API name to `ops.Add` ([!11568](https://gitee.com/mindspore/mindspore/pulls/11568))

The operator name TensorAdd was not standardized, so it has been renamed to Add. The old interface still works but will be removed in a later version; switching to the new interface is recommended.

1.1.0:

>>> import mindspore.ops as ops
>>> add = ops.TensorAdd()


1.1.1:

>>> import mindspore.ops as ops
>>> add = ops.Add()


`ops.Gelu`, `ops.GeluGrad`, `ops.FastGelu`, `ops.FastGeluGrad`, change API name to `ops.GeLU`, `ops.GeLUGrad`, `ops.FastGeLU`, `ops.FastGeLUGrad` ([!11603](https://gitee.com/mindspore/mindspore/pulls/11603))

The names Gelu, GeluGrad, FastGelu, and FastGeluGrad now follow the ReLU naming rule: "lu" is changed to the uppercase "LU". The old interfaces still work but will be removed in a later version; switching to the new interfaces is recommended.

1.1.0:

>>> import mindspore.ops as ops
>>> gelu = ops.Gelu()
>>> gelu_grad = ops.GeluGrad()
>>> fast_gelu = ops.FastGelu()
>>> fast_gelu_grad = ops.FastGeluGrad()


1.1.1:

>>> import mindspore.ops as ops
>>> gelu = ops.GeLU()
>>> gelu_grad = ops.GeLUGrad()
>>> fast_gelu = ops.FastGeLU()
>>> fast_gelu_grad = ops.FastGeLUGrad()


`ops.GatherV2`, change API name to `ops.Gather` ([!11713](https://gitee.com/mindspore/mindspore/pulls/11713))

GatherV2 has been renamed to Gather. The old interface still works but will be removed in a later version; switching to the new interface is recommended.

1.1.0:

>>> import mindspore.ops as ops
>>> gather = ops.GatherV2()


1.1.1:

>>> import mindspore.ops as ops
>>> gather = ops.Gather()


`ops.Pack`, `ops.Unpack`, change API name to `ops.Stack`, `ops.Unstack` ([!11828](https://gitee.com/mindspore/mindspore/pulls/11828))

Pack has been renamed to Stack, and Unpack to Unstack. The old interfaces still work but will be removed in a later version; switching to the new interfaces is recommended.

1.1.0:

>>> import mindspore.ops as ops
>>> pack = ops.Pack()
>>> unpack = ops.Unpack()


1.1.1:

>>> import mindspore.ops as ops
>>> stack = ops.Stack()
>>> unstack = ops.Unstack()
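
Because the renames above are purely mechanical, they can in principle be applied to existing scripts with a text substitution. The following is a minimal sketch under that assumption: it only handles references written as `ops.<Name>`, and `OP_RENAMES` and `migrate_ops` are hypothetical names, not part of MindSpore.

```python
import re

# Old -> new operator names, taken from the renames listed in these notes.
OP_RENAMES = {
    "TensorAdd": "Add",
    "Gelu": "GeLU",
    "GeluGrad": "GeLUGrad",
    "FastGelu": "FastGeLU",
    "FastGeluGrad": "FastGeLUGrad",
    "GatherV2": "Gather",
    "Pack": "Stack",
    "Unpack": "Unstack",
}

# Match `ops.<OldName>` only as a whole identifier; longer names come first.
_PATTERN = re.compile(
    r"\bops\.(" + "|".join(sorted(OP_RENAMES, key=len, reverse=True)) + r")\b"
)


def migrate_ops(source):
    """Rewrite deprecated `ops.X` references in a source string."""
    return _PATTERN.sub(lambda m: "ops." + OP_RENAMES[m.group(1)], source)


print(migrate_ops("add = ops.TensorAdd()\ngelu_grad = ops.GeluGrad()"))
```

Names that are already in the new form, such as `ops.GeLU`, are left untouched because the match is case-sensitive.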


`ops.ControlDepend`, add deprecated to ControlDepend ([!11844](https://gitee.com/mindspore/mindspore/pulls/11844))

ControlDepend is deprecated and will be removed in a future version; use Depend instead.

1.1.0:

This operation does not work in `PYNATIVE_MODE`.


1.1.1:

This operation does not work in `PYNATIVE_MODE`.
`ControlDepend` is deprecated from version 1.1 and will be removed in a future version; use `Depend` instead.


`ops.Depend`, add operator description and use case ([!11815](https://gitee.com/mindspore/mindspore/pulls/11815)), ([!11879](https://gitee.com/mindspore/mindspore/pulls/11879))

Since the ControlDepend operator will be deprecated from version 1.2, it is recommended to use the Depend operator instead.

1.1.0:

Depend is used for processing side-effect operations.

- **value** (Tensor) - the real value to return for depend operator.
- **expr** (Expression) - the expression to execute with no outputs.

Tensor, the value passed by last operator.

Supported Platforms:
``Ascend`` ``GPU`` ``CPU``


1.1.1:

Depend is used for processing dependency operations.

In some side-effect scenarios, we need to ensure the execution order of operators.
In order to ensure that operator A is executed before operator B, it is recommended
to insert the Depend operator between operators A and B.

Previously, the ControlDepend operator was used to control the execution order.
Since the ControlDepend operator will be deprecated from version 1.2, it is
recommended to use the Depend operator instead. The replacement method is as follows::

    a = A(x)                --->        a = A(x)
    b = B(y)                --->        y = Depend(y, a)
    ControlDepend(a, b)     --->        b = B(y)

- **value** (Tensor) - the real value to return for depend operator.
- **expr** (Expression) - the expression to execute with no outputs.

Tensor, the value passed by last operator.

Supported Platforms:
``Ascend`` ``GPU`` ``CPU``

>>> import numpy as np
>>> import mindspore
>>> import mindspore.nn as nn
>>> import mindspore.ops.operations as P
>>> from mindspore import Tensor
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.softmax = P.Softmax()
...         self.depend = P.Depend()
...     def construct(self, x, y):
...         mul = x - y
...         y = self.depend(y, mul)
...         ret = self.softmax(y)
...         return ret
>>> x = Tensor(np.ones([4, 5]), dtype=mindspore.float32)
>>> y = Tensor(np.ones([4, 5]), dtype=mindspore.float32)
>>> net = Net()
>>> output = net(x, y)
>>> print(output)
[[0.2 0.2 0.2 0.2 0.2]
 [0.2 0.2 0.2 0.2 0.2]
 [0.2 0.2 0.2 0.2 0.2]
 [0.2 0.2 0.2 0.2 0.2]]



Change namespace from `mindspore::api` to `mindspore` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574))

1.1.0:

namespace ms = mindspore::api;


1.1.1:

namespace ms = mindspore;


`Context` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574))

<td style="text-align:center"> 1.1.0 </td> <td style="text-align:center"> 1.1.1 </td>





Rename `Tensor` to `MSTensor` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574))

1.1.0:

ms::Tensor a;


1.1.1:

ms::MSTensor a;


`Model`: move the setting of model options from `Build` to the `Model` constructor ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574))

1.1.0:

ms::Model model(graph_cell);


1.1.1:

ms::Model model(graph_cell, model_context);


`Model`: change `GetInputsInfo`, `GetOutputsInfo` to `GetInputs`, `GetOutputs` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574))

1.1.0:

std::vector<std::string> names;
std::vector<ms::DataType> types;
std::vector<std::vector<int64_t>> shapes;
std::vector<size_t> mem_sizes;
model.GetInputsInfo(&names, &types, &shapes, &mem_sizes);
std::cout << "Input 0 name: " << names[0] << std::endl;


1.1.1:

auto inputs = model.GetInputs();
std::cout << "Input 0 name: " << inputs[0].Name() << std::endl;


`Model`: change the parameter type of `Predict` from `Buffer` to `MSTensor` ([!11574](https://gitee.com/mindspore/mindspore/pulls/11574))

1.1.0:

std::vector<ms::Buffer> inputs;
std::vector<ms::Buffer> outputs;
model.Predict(inputs, &outputs);


1.1.1:

std::vector<ms::MSTensor> inputs;
std::vector<ms::MSTensor> outputs;
model.Predict(inputs, &outputs);



Python API

`ops.SpaceToBatch`, `ops.BatchToSpace` are deprecated in favor of `ops.SpaceToBatchND`, `ops.BatchToSpaceND` ([!11527](https://gitee.com/mindspore/mindspore/pulls/11527))

`ops.SpaceToBatchND` and `ops.BatchToSpaceND` are more general and behave the same as `ops.SpaceToBatch` and `ops.BatchToSpace` when `block_shape` is an int.

`ops.DepthwiseConv2dNative` is deprecated in favor of `nn.Conv2d` ([!11702](https://gitee.com/mindspore/mindspore/pulls/11702))

`ops.DepthwiseConv2dNative` is only supported on Ascend, so using `nn.Conv2d` directly is recommended. If `group` is equal to both `in_channels` and `out_channels`, the 2D convolution layer is also a 2D depthwise convolution layer.
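
The equivalence between a grouped convolution and a depthwise convolution can be checked with simple shape arithmetic. The sketch below uses the standard grouped-convolution weight layout `(out_channels, in_channels // group, kh, kw)`; it is plain Python, the helper names are hypothetical, and it makes no claim about MindSpore's internal weight layout.

```python
def grouped_conv_weight_shape(in_channels, out_channels, kernel_size, group=1):
    """Weight shape (out, in_per_group, kh, kw) of a grouped 2D convolution."""
    if in_channels % group or out_channels % group:
        raise ValueError("in_channels and out_channels must be divisible by group")
    kh, kw = kernel_size
    return (out_channels, in_channels // group, kh, kw)


def is_depthwise(in_channels, out_channels, group):
    """True when every output channel reads exactly one input channel."""
    return group == in_channels == out_channels


# With group == in_channels == out_channels, each filter sees one channel.
print(grouped_conv_weight_shape(32, 32, (3, 3), group=32))
print(is_depthwise(32, 32, 32))
```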


Thanks go to these wonderful people:

Adel, AGroupofProbiotocs, anthonyaje, anzhengqi, askmiao, baihuawei, baiyangfan, bai-yangfan, bingyaweng, BowenK, buxue, caifubi, CaoJian, caojian05, caozhou, Cathy, changzherui, chenbo116, chenfei, chengxianbin, chenhaozhe, chenjianping, chenzomi, chenzupeng, chujinjin, cj, cjh9368, Corleone, damon0626, danish, Danish, davidmc, dayschan, doitH, eric, Eric, fary86, fuzhiye, Gaoxiong, gengdongjie, Gogery, gongdaguo, gray0v0, gukecai, guoqi, gzhcv, hangq, hanhuifeng2020, Harshvardhan, He, heleiwang, hexia, Hoai, HuangBingjian, huangdongrun, huanghui, huangxinjing, huqi, huzhifeng, hwjiaorui, Jesse, jianghui58, jiangzhiwen, Jiaqi, jin-xiulang, jinyaohui, jjfeing, John, Jonathan, jonyguo, JulyAi, jzg, kai00, kingfo, kingxian, kpy, kswang, laiyongqiang, leonwanghui, Li, liangchenghui, liangzelang, lichen_101010, lichenever, lihongkang, lilei, limingqi107, ling, linqingke, liubuyu, liuwenhao4, liuxiao78, liuxiao93, liuyang_655, liuzhongkai, Lixia, lixian, liyanliu, liyong, lizhenyu, luoyang, lvchangquan, lvliang, lz, mahdi, Mahdi, maning202007, Margaret_wangrui, mayang, mengyuanli, nhussain, ougongchang, panfengfeng, panyifeng, Payne, Peilin, peixu_ren, Pengyongrong, qianlong, r1chardf1d0, riemann_penn, rmdyh, Sheng, shenwei41, simson, Simson, Su, sunsuodong, tao_yunhao, tinazhang, VectorSL, Wan, wandongdong, wangdongxu, wangmin, wangnan39@huawei.com, wangyue01, wangzhe, wanyiming, Wei, wenchunjiang, wilfChen, WilliamLian, wsc, wukesong, wuweikang, wuxuejian, Xiaoda, xiefangqi, xinyunfan, xuanyue, xulei2020, Xun, xuyongfei, yanghaitao, yanghaitao1, yanghaoran, YangLuo, yangruoqi713, yankai, yanzhenxiang2020, yao_yf, yepei6, yeyunpeng, Yi, yoni, yoonlee666, yuchaojie, yujianfeng, yuximiao, zengzitao, Zhang, zhanghaibo5@huawei.com, zhanghuiyao, zhangyihui, zhangz0911gm, zhanke, zhanyuan, zhaodezan, zhaojichen, zhaoting, zhaozhenlong, zhengjun10, zhoufeng, zhousiyi, zhouyaqiang, zhouyifengCode, Zichun, Zirui, Ziyan, zjun, ZPaC, zymaa

Contributions of any kind are welcome!



Major Features and Improvements


- [STABLE] GNMT v2: similar to the model described in Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, which is mainly used for corpus translation, on WMT English-German dataset.(Ascend)
- [STABLE] MaskRCNN: a conceptually simple, flexible, and general framework for object instance segmentation on COCO2017 dataset.(Ascend)
- [STABLE] YOLOv4: a state-of-the-art detector which is faster and more accurate than all available alternative detectors on MS COCO dataset.(Ascend)
- [STABLE] Openpose: proposes a bottom-up human pose estimation algorithm using Part Affinity Fields on COCO2017 dataset.(Ascend)
- [STABLE] CNN-CTC: proposes three major contributions to address scene text recognition (STR) on MJSynth and SynthText dataset.(Ascend)
- [STABLE] CenterFace: a practical anchor-free face detection and alignment method for edge devices on WiderFace dataset.(Ascend)
- [STABLE] ShuffleNetV2: a much faster and more accurate network than the previous networks on ImageNet 2012 dataset.(GPU)
- [STABLE] EfficientNet-B0: a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient on ImageNet 2012 dataset.(GPU)
- [BETA] SSD-GhostNet: based on a Ghost module structure which generates more features from cheap operations on Oxford-IIIT Pet dataset.(Ascend)
- [BETA] DS-CNN: Depthwise separable convolutional neural network on Speech commands dataset.(Ascend)
- [BETA] DeepPotentialH2O: A neural network model for molecular dynamics simulations. (Ascend)
- [BETA] GOMO: A classical numerical method called GOMO for ocean simulation. (GPU)


- [STABLE] Refactor the MINDIR to support 310 inference(Ascend).
- [STABLE] The execution backend of sparse operations in optimizer can be set through 'target'. (Ascend/GPU/CPU)
- [STABLE] Support saving specified network to checkpoint and filtering parameters according to prefix when load checkpoint. (Ascend/GPU/CPU)
- [STABLE] Allow users to choose whether to load parameters into the network strictly.(Ascend/GPU/CPU)
- [STABLE] Before training in graph mode, broadcast the parameters on device 0 to the other devices so that all devices start from the same network initialization parameter values. (Ascend/GPU)
- [STABLE] Support if by if of control flow subgraph. (Ascend/GPU)
- [STABLE] Support checking whether a tensor is in a list. (Ascend/GPU/CPU)
- [STABLE] Support to get a value by using the corresponding key in a dictionary in the network; Support to get keys and values of a dictionary in the network. (Ascend/GPU/CPU)
- [STABLE] Support Tensor in enumerate. (Ascend/GPU/CPU)
- [STABLE] Support multilevel index assignment. (Ascend/GPU/CPU)
- [STABLE] Support the 'expand_as', 'view', 'abs', 'mean' methods of Tensor. (Ascend/GPU/CPU)
- [STABLE] Support ResizeBilinear operation transfer ratio. (Ascend)
- [STABLE] nn.Matmul supports matrix-vector product and batched matrix multiply. (Ascend/GPU)
- [STABLE] nn.Dense supports input tensor whose dimension can be greater than 2. (Ascend/GPU)
- [BETA] Support higher order differentiation for partial operators.(CPU/GPU/Ascend)
- [STABLE] Support Tensor Augassign.(Ascend/GPU)
- [BETA] Support 22 numpy native interfaces.

Auto Parallel

- [STABLE] Support parallel optimizer with weight shard. (Ascend/GPU)
- [STABLE] Support distributed operators: element-wise series, UnsortedSegmentSum, UnsortedSegmentMin, Split, BroadcastTo and Unique etc. (Ascend/GPU)
- [STABLE] Support distributed model prediction. (Ascend/GPU)
- [STABLE] Support auto mixed precision level "O2" in auto and semi auto parallel mode. (Ascend/GPU)
- [STABLE] Add MultiFieldEmbeddingLookup high-level interface. (Ascend/GPU)


- [STABLE] Optimize ResNet50 performance. (GPU)
- [STABLE] Support modelzoo net in PyNative mode(Ascend 29, GPU 23, CPU 2).(Ascend/GPU/CPU)
- [STABLE] Support PyNative mode on CPU.(CPU)
- [STABLE] Optimize performance in PyNative mode.(Ascend/GPU/CPU)
- [STABLE] Support Safe Optimized Memory Allocation Solver (SOMAS) on Ascend to improve memory reuse; the batch size of the Bert large model (128 sequence length) is increased from 160 to 208.(Ascend)
- [BETA] Support second order differentiation in PyNative mode.(Ascend/GPU)
- [DEMO] Add distributed training in PyNative mode.(Ascend/GPU)


- [STABLE] Add new operators for Ascend and GPU: IGamma, LGamma, DiGamma;
- [STABLE] Add new distributions for Ascend and GPU: LogNormal, and Logistic;
- [BETA] Add new distributions for Ascend only: Gumbel, Cauchy, Gamma, Beta, and Poisson; Add Categorical distribution for GPU;
- [STABLE] Add new bijectors for Ascend and GPU: GumbelCDF, Invert;
- [STABLE] Add Bayesian layer realized by local reparameterization method for Ascend and GPU;
- [STABLE] Add Anomaly Detection Toolbox based on VAE for Ascend and GPU.


- [STABLE] Support single node multi-p distributed cache data sharing
- [STABLE] Support GPU profiling with data processing
- [STABLE] Support YOLOV3 dynamic shape in sink mode with dataset
- [STABLE] Support unique processing in the data processing pipeline
- [STABLE] Unify the error messages for Python-layer parameter verification

API Change

Backwards Incompatible Change

Python API

Delete shape and dtype of class Initializer ([!7373](https://gitee.com/mindspore/mindspore/pulls/7373/files))

Delete shape and dtype attributes of Initializer class.

Modify the return type of initializer ([!7373](https://gitee.com/mindspore/mindspore/pulls/7373/files))

Previously, the initializer function could return a string, a number, an instance of class Tensor, or a subclass of class Initializer.

After the modification, the initializer function returns an instance of class MetaTensor, class Tensor, or a subclass of class Initializer.

Note that a MetaTensor cannot be used to initialize parameters, so we recommend using a str, a number, or a subclass of Initializer for parameter initialization rather than the initializer function.

1.0.1:

>>> import mindspore.nn as nn
>>> from mindspore.common import initializer
>>> from mindspore import dtype as mstype
>>> def conv3x3(in_channels, out_channels):
...     weight = initializer('XavierUniform', shape=(3, 2, 32, 32), dtype=mstype.float32)
...     return nn.Conv2d(in_channels, out_channels, weight_init=weight, has_bias=False, pad_mode="same")


1.1.0:

>>> import mindspore.nn as nn
>>> from mindspore.common.initializer import XavierUniform
>>> # 1) using string
>>> def conv3x3(in_channels, out_channels):
...     return nn.Conv2d(in_channels, out_channels, weight_init='XavierUniform', has_bias=False, pad_mode="same")
>>> # 2) using subclass of class Initializer
>>> def conv3x3(in_channels, out_channels):
...     return nn.Conv2d(in_channels, out_channels, weight_init=XavierUniform(), has_bias=False, pad_mode="same")


After modification, we can use the same instance of Initializer to initialize parameters of different shapes, which was not allowed before.

1.0.1:

>>> import mindspore.nn as nn
>>> from mindspore.common import initializer
>>> from mindspore.common.initializer import XavierUniform
>>> weight_init_1 = XavierUniform(gain=1.1)
>>> conv1 = nn.Conv2d(3, 6, weight_init=weight_init_1)
>>> weight_init_2 = XavierUniform(gain=1.1)
>>> conv2 = nn.Conv2d(6, 10, weight_init=weight_init_2)


1.1.0:

>>> import mindspore.nn as nn
>>> from mindspore.common import initializer
>>> from mindspore.common.initializer import XavierUniform
>>> weight_init = XavierUniform(gain=1.1)
>>> conv1 = nn.Conv2d(3, 6, weight_init=weight_init)
>>> conv2 = nn.Conv2d(6, 10, weight_init=weight_init)
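
One way to see why dropping `shape` and `dtype` from the initializer makes reuse possible: if the object stores only its hyperparameters, the shape can be supplied at the moment a parameter is filled. The sketch below illustrates this with the standard Xavier (Glorot) uniform bound, `gain * sqrt(6 / (fan_in + fan_out))`. `XavierUniformSketch` is a hypothetical pure-Python stand-in, not MindSpore's class, and the 3x3 kernel size is assumed for illustration.

```python
import math
import random


class XavierUniformSketch:
    """Hypothetical stand-in: holds only `gain`, never a shape or dtype."""

    def __init__(self, gain=1.0):
        self.gain = gain

    def bound(self, shape):
        # For a conv weight (out, in, kh, kw) the receptive field is kh * kw.
        receptive = math.prod(shape[2:])
        fan_in = shape[1] * receptive
        fan_out = shape[0] * receptive
        return self.gain * math.sqrt(6.0 / (fan_in + fan_out))

    def sample(self, shape, rng):
        # The shape arrives here, so one instance serves many parameters.
        limit = self.bound(shape)
        return [rng.uniform(-limit, limit) for _ in range(math.prod(shape))]


init = XavierUniformSketch(gain=1.1)
rng = random.Random(0)
w1 = init.sample((6, 3, 3, 3), rng)    # weights shaped like conv1
w2 = init.sample((10, 6, 3, 3), rng)   # same instance, different shape
print(len(w1), len(w2))
```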


Modify get_seed function ([!7429](https://gitee.com/mindspore/mindspore/pulls/7429/files))

The implementation of the get_seed function has changed.

Previously, if the seed was not set, a default seed value was used, so parameters initialized by the normal function were the same every time.

After the modification, if the seed is not set, the seed value is generated randomly, so the initialized parameters change with the random seed.

To fix the initial values of parameters, we suggest setting the seed.

>>> from mindspore.common import set_seed
>>> set_seed(1)
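
The effect of setting or not setting a seed can be illustrated with Python's standard `random` module. This is only an analogy for the behavior described above, not MindSpore's implementation; `init_weights` is a hypothetical helper.

```python
import random


def init_weights(count, seed=None):
    """Draw `count` normal samples; seed=None lets the generator seed itself."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(count)]


fixed_a = init_weights(4, seed=1)
fixed_b = init_weights(4, seed=1)
print(fixed_a == fixed_b)  # identical seeds reproduce the same values

# Without a seed, two runs draw from differently seeded generators.
unseeded_a = init_weights(4)
unseeded_b = init_weights(4)
```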

`nn.LinSpace` ([!9494](https://gitee.com/mindspore/mindspore/pulls/9494)) has been removed and `ops.LinSpace` ([!8920](https://gitee.com/mindspore/mindspore/pulls/8920)) has been modified

Previously, the `nn.LinSpace` interface only supported passing the values as constructor arguments. For convenience, the enhanced `ops.LinSpace` interface now supports passing the values as inputs, so `nn.LinSpace` is no longer needed.

1.0.1:

>>> from mindspore import nn
>>> start = 1
>>> stop = 10
>>> num = 5
>>> linspace = nn.LinSpace(start, stop, num)
>>> output = linspace()


1.1.0:

>>> import mindspore
>>> from mindspore import Tensor
>>> from mindspore import ops
>>> linspace = ops.LinSpace()
>>> start = Tensor(1, mindspore.float32)
>>> stop = Tensor(10, mindspore.float32)
>>> num = 5
>>> output = linspace(start, stop, num)
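
For readers unfamiliar with the operation, the values a linear-space function produces for `start=1`, `stop=10`, `num=5` can be worked out in plain Python. This is a minimal sketch of the arithmetic only; the `num == 1` branch is a choice made for this sketch and is not taken from the MindSpore documentation.

```python
def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop, ends included."""
    if num < 1:
        raise ValueError("num must be at least 1")
    if num == 1:
        # Choice made for this sketch: a single point is just `start`.
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


print(linspace(1, 10, 5))  # [1.0, 3.25, 5.5, 7.75, 10.0]
```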


Parts of `Optimizer` add target interface ([!6760](https://gitee.com/mindspore/mindspore/pulls/6760/files))

The usage of the sparse optimizer is changed.

The target interface is used to set the execution backend of the sparse operator.

The add_primitive_attr interface is no longer allowed.

The following optimizers add the target interface: Adam, FTRL, LazyAdam, ProximalAdagrad

1.0.1:

>>> from mindspore.nn import Adam
>>> net = LeNet5()
>>> optimizer = Adam(filter(lambda x: x.requires_grad, net.get_parameters()))
>>> optimizer.sparse_opt.set_device("CPU")


1.1.0:

>>> from mindspore.nn import Adam
>>> net = LeNet5()
>>> optimizer = Adam(filter(lambda x: x.requires_grad, net.get_parameters()))
>>> optimizer.target = 'CPU'


`export` Modify the input parameters and export's file name ([!7385](https://gitee.com/mindspore/mindspore/pulls/7385), [!9057](https://gitee.com/mindspore/mindspore/pulls/9057/files))

Export the MindSpore prediction model to a file in the specified format.

The parameters include: `net`, `*inputs`, `file_name`, `file_format`, `**kwargs`.

Parameters can be supplied according to the specific export requirements.

The file name extension is now added automatically based on the format.

1.0.1:

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor
>>> from mindspore.train.quant import quant
>>> network = LeNetQuant()
>>> inputs = Tensor(np.ones([1, 1, 32, 32]), mindspore.float32)
>>> quant.export(network, inputs, file_name="lenet_quant.mindir", file_format='MINDIR')


1.1.0:

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import Tensor
>>> network = LeNetQuant()
>>> inputs = Tensor(np.ones([1, 1, 32, 32]), ms.float32)
>>> ms.export(network, inputs, file_name="lenet_quant", file_format='MINDIR', quant_mode='AUTO')
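
The rule "add the file name extension based on the format" can be pictured as a small lookup. The sketch below is illustrative only: the suffix mapping and the helper `exported_file_name` are assumptions for this example, and skipping a suffix that is already present is a choice of the sketch rather than documented behavior.

```python
# Assumed mapping from export format to file suffix (illustrative).
_SUFFIX = {"AIR": ".air", "ONNX": ".onnx", "MINDIR": ".mindir"}


def exported_file_name(file_name, file_format):
    """Append the suffix for file_format unless file_name already has it."""
    suffix = _SUFFIX[file_format.upper()]
    if file_name.endswith(suffix):
        return file_name
    return file_name + suffix


print(exported_file_name("lenet_quant", "MINDIR"))
```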


`Dense`, `Conv2dBnAct`, `DenseBnAct`, `DenseQuant` support setting the activation attribute as an instance of a class derived from `nn.Cell` or `Primitive` ([!7581](https://gitee.com/mindspore/mindspore/pulls/7581))

activation (Union[str, Cell, Primitive]): activation function applied to the output of the fully connected layer

1.0.1:

>>> import mindspore.nn as nn
>>> dense = nn.Dense(1, 1, activation='relu')


1.1.0:

>>> import mindspore.nn as nn
>>> import mindspore.ops as ops
>>> dense = nn.Dense(1, 1, activation=nn.ReLU())
>>> dense = nn.Dense(1, 1, activation=ops.ReLU())


`tensor.dim()`, `tensor.size()` have been changed to `tensor.ndim`, `tensor.size` ([!10175](https://gitee.com/mindspore/mindspore/pulls/10175))

Previously, tensor.size() and tensor.dim() were used to check the total number of elements and the number of dimensions in the tensor.
However, from a user's perspective, tensor.size and tensor.ndim (methods -> properties) are better choices, since they follow the numpy naming convention.

1.0.1:

>>> from mindspore import Tensor
>>> Tensor((1,2,3)).size()
>>> Tensor((1,2,3)).dim()


1.1.0:

>>> from mindspore import Tensor
>>> Tensor((1,2,3)).size
>>> Tensor((1,2,3)).ndim
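
The method-to-property change can be sketched with a tiny stand-in class. `ShapeOnlyTensor` is hypothetical and tracks only a shape; it shows what `size` (total number of elements) and `ndim` (number of dimensions) mean under the numpy convention, and that both are read without parentheses.

```python
import math


class ShapeOnlyTensor:
    """Hypothetical stand-in that tracks only a shape tuple."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    @property
    def ndim(self):
        # Number of dimensions.
        return len(self.shape)

    @property
    def size(self):
        # Total number of elements.
        return math.prod(self.shape)


# A tensor built from (1, 2, 3) has shape (3,): three elements, one dimension.
t = ShapeOnlyTensor((3,))
print(t.size, t.ndim)
```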


`EmbeddingLookup` adds a config option to the interface: sparse ([!8202](https://gitee.com/mindspore/mindspore/pulls/8202))

sparse (bool): Using sparse mode. When 'target' is set to 'CPU', 'sparse' has to be true. Default: True.

1.0.1:

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor
>>> from mindspore.nn import EmbeddingLookup
>>> input_indices = Tensor(np.array([[1, 0], [3, 2]]), mindspore.int32)
>>> result = EmbeddingLookup(4,2)(input_indices)
>>> print(result.shape)
(2, 2, 2)


1.1.0:

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor
>>> from mindspore.nn import EmbeddingLookup
>>> input_indices = Tensor(np.array([[1, 0], [3, 2]]), mindspore.int32)
>>> result = EmbeddingLookup(4, 2, sparse=False)(input_indices)
>>> print(result.shape)
(2, 2, 2)


`nn.probability.bijector` change types of attributes from (int, float) to (float, list, numpy.ndarray, Tensor) ([!8191](https://gitee.com/mindspore/mindspore/pulls/8191))

Attributes Type change: (int, float) -> (float, list, numpy.ndarray, Tensor).
Int type is not supported anymore. Parameters of all bijectors should be type float, list, numpy.ndarray or Tensor.

1.0.1:

>>> import mindspore.nn.probability.bijector as msb
>>> power = 2
>>> bijector = msb.PowerTransform(power=power)


1.1.0:

>>> import mindspore.nn.probability.bijector as msb

