This is the initial release of Fast Transformer. It implements Fast Transformer as a subclassed TensorFlow model.
Classes
- [FastAttention](https://github.com/Rishit-dagli/Fast-Transformer/blob/d47d4e74e1c84907d4136ef07f7c57c441eaf603/fast_transformer/fast_attention.py#L6): Implements additive attention as a TensorFlow Keras layer, and supports using relative positional encodings.
- [PreNorm](https://github.com/Rishit-dagli/Fast-Transformer/blob/d47d4e74e1c84907d4136ef07f7c57c441eaf603/fast_transformer/fast_transformer.py#L8): Normalizes the activations of the previous layer independently for each example in a batch, then applies a given function to the normalized result. Implemented as a TensorFlow Keras layer.
- [FeedForward](https://github.com/Rishit-dagli/Fast-Transformer/blob/d47d4e74e1c84907d4136ef07f7c57c441eaf603/fast_transformer/fast_transformer.py#L19): Creates a feed-forward network with two `Dense` layers and GELU activation. Implemented as a TensorFlow Keras layer.
- [FastTransformer](https://github.com/Rishit-dagli/Fast-Transformer/blob/d47d4e74e1c84907d4136ef07f7c57c441eaf603/fast_transformer/fast_transformer.py#L37): Implements the FastTransformer model using the classes above. Supports rotary embeddings and weight-tied projections, and converts its output to logits. Implemented as a TensorFlow Keras model.
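
To make the `PreNorm` and `FeedForward` descriptions above concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function names (`layer_norm`, `gelu`, `dense`, `feed_forward`, `pre_norm`), the `eps` value, the absence of a learned scale and offset in the normalization, and the placement of GELU between the two dense layers are all assumptions made for illustration.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one example's features to zero mean and (near) unit variance.

    Assumption: no learned scale/offset, and eps is an illustrative value.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def gelu(v):
    """Exact GELU: v * Phi(v), where Phi is the standard normal CDF."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def dense(x, weights, bias):
    """Affine map: y[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]


def feed_forward(x, w1, b1, w2, b2):
    """Two dense layers with GELU in between (the placement is an assumption)."""
    hidden = [gelu(h) for h in dense(x, w1, b1)]
    return dense(hidden, w2, b2)


def pre_norm(fn, x):
    """Normalize the input first, then apply the wrapped function."""
    return fn(layer_norm(x))


# Hypothetical weights: project 3 features up to 4, then back down to 3.
x = [1.0, 2.0, 3.0]
w1 = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.1, -0.1, 0.2], [0.2, 0.0, 0.1, -0.3]]
b1 = [0.0, 0.0, 0.0, 0.0]
w2 = [[0.1, 0.0, -0.1], [0.2, 0.1, 0.0], [0.0, -0.2, 0.1], [0.1, 0.1, 0.1]]
b2 = [0.0, 0.0, 0.0]

y = pre_norm(lambda h: feed_forward(h, w1, b1, w2, b2), x)
print(y)
```

Each example is normalized on its own, so the sketch works on a single feature vector. The Keras layers apply the same idea across a whole batch of tensors.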