- `torch.nn.LayerNorm` and `ane_transformers.reference.layer_norm.LayerNormANE` apply the scale and bias terms in opposite orders. To accurately restore a state_dict trained with the former into the latter, we adjust the bias term. This change slightly improves parity between the outputs of the Hugging Face PyTorch model and the ane_transformers CoreML model.
- Introduce an upper bound of 1.11.0 on the torch version to ensure it pairs well with coremltools-5.2.0. The upper bound will be removed or updated when coremltools-6.0 is released (currently in beta1).
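
The bias adjustment in the first item can be illustrated with a minimal, dependency-free sketch. It assumes that `torch.nn.LayerNorm` computes `normalized * weight + bias` and that `LayerNormANE` computes `(normalized + bias) * weight`, in which case dividing the trained bias by the weight makes the two agree. The helper names below (`normalize`, `scale_then_bias`, `bias_then_scale`, `adjust_bias`) are hypothetical and are not part of either library.

```python
import math


def normalize(x, eps=1e-5):
    # Zero-mean, unit-variance normalization over a single feature vector.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def scale_then_bias(x, weight, bias):
    # Order used by torch.nn.LayerNorm: normalized * weight + bias.
    return [n * w + b for n, w, b in zip(normalize(x), weight, bias)]


def bias_then_scale(x, weight, bias):
    # Assumed order for LayerNormANE: (normalized + bias) * weight.
    return [(n + b) * w for n, w, b in zip(normalize(x), weight, bias)]


def adjust_bias(weight, bias):
    # (n + b / w) * w == n * w + b, so dividing by the weight compensates
    # for the swapped order. Assumes every weight is non-zero.
    return [b / w for w, b in zip(weight, bias)]


x = [0.5, -1.0, 2.0, 3.5]
weight = [1.5, 0.8, -2.0, 0.25]
bias = [0.1, -0.3, 0.7, 1.2]

reference = scale_then_bias(x, weight, bias)
unadjusted = bias_then_scale(x, weight, bias)
restored = bias_then_scale(x, weight, adjust_bias(weight, bias))

# Largest element-wise gap to the scale-then-bias reference, before and
# after adjusting the bias.
max_diff_unadjusted = max(abs(r - u) for r, u in zip(reference, unadjusted))
max_diff_restored = max(abs(r - s) for r, s in zip(reference, restored))
```

Under these assumptions, loading the bias unchanged leaves a per-element offset of `bias * (weight - 1)`, while the adjusted bias reproduces the reference output up to floating-point rounding.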