The code was modified to make it easier to reproduce the original results. Previously, only the code structure matched the paper: the hyperparameters were different and the optimizer was SGD, since there were difficulties getting RMSProp training to work.
The networks can now be trained successfully with RMSProp and with the same hyperparameters as in the paper.
- Added a reproducibility section to the README
- Changed the hyperparameters to match those from the NAS-Bench-101 paper
- Added support for the TensorFlow version of RMSProp (see the sketch after this list)
- Gradient clipping can now be turned off (see the example below)
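
For context, the main difference between the two RMSProp variants is where epsilon is added: TensorFlow adds it inside the square root, while `torch.optim.RMSprop` adds it after taking the root. Below is a minimal sketch of a TensorFlow-style update written as a PyTorch optimizer, assuming the training code is PyTorch-based; the class name, default values, and exact update form are illustrative and not necessarily the repository's actual implementation.

```python
import torch
from torch.optim import Optimizer


class TFStyleRMSprop(Optimizer):
    """Illustrative RMSProp with TensorFlow-style epsilon placement.

    TensorFlow computes grad / sqrt(ms + eps), whereas torch.optim.RMSprop
    computes grad / (sqrt(ms) + eps). Defaults here are illustrative only.
    """

    def __init__(self, params, lr=0.01, alpha=0.9, eps=1e-8,
                 momentum=0.9, weight_decay=0.0):
        defaults = dict(lr=lr, alpha=alpha, eps=eps,
                        momentum=momentum, weight_decay=weight_decay)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                grad = p.grad
                if group["weight_decay"] != 0:
                    grad = grad.add(p, alpha=group["weight_decay"])

                state = self.state[p]
                if len(state) == 0:
                    state["square_avg"] = torch.zeros_like(p)
                    state["momentum_buf"] = torch.zeros_like(p)
                square_avg = state["square_avg"]
                buf = state["momentum_buf"]

                # ms <- alpha * ms + (1 - alpha) * grad^2
                square_avg.mul_(group["alpha"]).addcmul_(grad, grad,
                                                         value=1 - group["alpha"])
                # Epsilon goes inside the square root, as in TensorFlow.
                denom = (square_avg + group["eps"]).sqrt()
                # mom <- momentum * mom + lr * grad / sqrt(ms + eps)
                buf.mul_(group["momentum"]).addcdiv_(grad, denom,
                                                     value=group["lr"])
                p.sub_(buf)
        return loss
```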
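
The optional gradient clipping can be sketched the same way; `grad_clip` here is a hypothetical parameter name used for illustration, not necessarily the flag exposed by the repository:

```python
import torch


def training_step(model, loss, optimizer, grad_clip=None):
    """One optimizer step; clipping is skipped when grad_clip is None."""
    optimizer.zero_grad()
    loss.backward()
    if grad_clip is not None:
        # Clip the global gradient norm only when a threshold is given.
        torch.nn.utils.clip_grad_norm_(model.parameters(), grad_clip)
    optimizer.step()
```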
Special thanks to [longerhost](https://github.com/longerHost) for helping to reproduce the original training!