--------------------
* Many smaller fixes and performance improvements. Also fixes a critical bug
that, in some cases, caused the validation callback to consider only a
subset of the predicted batch when computing validation metrics. This could
make validation metrics noisy, especially for large batch sizes.