In this release we add several substantial new features.
## Multi-Machine Distributed Training Support
We've added a new tutorial (see [here](https://allenact.org/tutorials/distributed-objectnav-tutorial/)) and the scripts necessary to run AllenAct across multiple machines.
## Improved Navigation Models with Auxiliary Tasks
[Recent work](https://arxiv.org/abs/2007.04561) has shown that certain auxiliary tasks (e.g. inverse/forward dynamics) can be used to speed up training and improve the performance of navigation agents. We have implemented a large number of these auxiliary tasks (see, for instance, the `InverseDynamicsLoss`, `TemporalDistanceLoss`, `CPCA16Loss`, and `MultiAuxTaskNegEntropyLoss` classes in the `allenact.embodiedai.aux_losses.losses` module) as well as a new base architecture for visual navigation (`allenact.embodiedai.models.VisualNavActorCritic`) that makes it easy to use these auxiliary losses during training.
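As a rough sketch of how these pieces fit together, the snippet below registers one auxiliary loss alongside PPO in a training pipeline stage. The loss names, weights, step count, and the assumption that `InverseDynamicsLoss` can be constructed with default arguments are all illustrative; see the module documentation and the navigation experiment configs for authoritative usage.

```python
# Illustrative sketch (not a complete ExperimentConfig): registering an auxiliary
# loss next to PPO in a training pipeline stage. Loss names and weights are
# arbitrary; the actor-critic model (e.g. a VisualNavActorCritic configured with
# the matching auxiliary head) must produce the outputs the auxiliary loss consumes.
from allenact.algorithms.onpolicy_sync.losses.ppo import PPO, PPOConfig
from allenact.embodiedai.aux_losses.losses import InverseDynamicsLoss
from allenact.utils.experiment_utils import PipelineStage

named_losses = {
    "ppo_loss": PPO(**PPOConfig),
    "inv_dyn_loss": InverseDynamicsLoss(),  # assumes default constructor arguments suffice
}

# One stage that jointly optimizes the RL loss and the auxiliary loss, with a
# (purely illustrative) smaller weight on the auxiliary term.
aux_stage = PipelineStage(
    loss_names=["ppo_loss", "inv_dyn_loss"],
    loss_weights=[1.0, 0.05],
    max_stage_steps=int(75e6),
)
```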
## CLIP Preprocessors and Embodied CLIP Experiments
We've added a new `clip_plugin` that makes available preprocessors which use CLIP-pretrained visual encoders. See the `projects/objectnav_baselines/experiments/robothor/clip/objectnav_robothor_rgb_clipresnet50gru_ddppo.py` experiment configuration file, which uses these new preprocessors to obtain state-of-the-art results on the [RoboTHOR ObjectNav leaderboard](https://leaderboard.allenai.org/robothor_objectnav/). These results correspond to [our new paper](https://arxiv.org/abs/2111.09888) on using CLIP visual encoders for embodied tasks.
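For a sense of how these preprocessors slot into an experiment, here is an illustrative sketch; the module path, class name, and constructor arguments below may not match the plugin's exact API, so refer to the experiment config above for the precise usage.

```python
# Illustrative sketch: adding a CLIP visual preprocessor to an experiment's
# preprocessor list. The module path, class name, and constructor arguments
# are assumptions and may differ from the plugin's actual API.
from allenact_plugins.clip_plugin.clip_preprocessors import ClipResNetPreprocessor

preprocessors = [
    ClipResNetPreprocessor(
        rgb_input_uuid="rgb_lowres",    # uuid of the RGB sensor to encode (assumed name)
        clip_model_type="RN50",         # CLIP-pretrained ResNet-50 backbone
        pool=False,                     # keep the spatial feature map for the GRU policy (assumed flag)
        output_uuid="rgb_clip_resnet",  # uuid under which the features reach the model
    ),
]
```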
## New Storage Flexibility
We've substantially generalized the rollout storage class. This is an "advanced" feature, but it is now possible to implement custom storage classes that enable new types of training (e.g. Q-learning) and even mix training paradigms (e.g. training with Q-learning, PPO, and offline imitation learning simultaneously).
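To give a flavor of what this enables, below is a toy sketch of an off-policy-style storage. The class is a hypothetical stand-in rather than a subclass of the generalized storage interface itself; it only illustrates the kind of experience buffer a custom storage could provide.

```python
import random
from collections import deque
from typing import Any, Dict, List, Tuple


class FIFOReplayStorage:
    """Toy FIFO replay buffer of the sort that could back an off-policy
    (e.g. Q-learning) loss. A conceptual stand-in, not AllenAct's actual
    storage interface."""

    def __init__(self, capacity: int = 10_000) -> None:
        self._buffer: deque = deque(maxlen=capacity)

    def add(self, observations: Dict[str, Any], actions: Any, rewards: Any, masks: Any) -> None:
        # Record one step of experience. A real implementation would keep tensors
        # on the correct device and use `masks` to track episode boundaries.
        self._buffer.append((observations, actions, rewards, masks))

    def sample(self, batch_size: int) -> List[Tuple]:
        # Produce a batch for an off-policy loss; collation and device placement
        # are omitted and would depend on the losses being trained.
        return random.sample(self._buffer, min(batch_size, len(self._buffer)))
```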
## Better Habitat Support and Experiments
We've added support for training ObjectNav models in Habitat and include an experiment config that trains such a model with a CLIP visual encoder backbone (see `projects/objectnav_baselines/experiments/habitat/clip/objectnav_habitat_rgb_clipresnet50gru_ddppo.py`).